Inside .git: Understanding Git Internals and Data Structures

We all grind Data Structures and Algorithms(DSA) for interviews and then we switch to Development and think it’s a totally different world.

But here is the truth I found while digging into Git internals: Git is literally just a Linked List.

When we learn about Linked Lists in college, it feels theoretical. But every time you type git commit, you are adding a Node to the most famous Linked List in the world.

The Behind The Scenes(BTS) Logic

If you look at the raw data inside the .git folder, the mapping is crazy accurate:

The Node: Your Commit is the Node. It holds your metadata or user data (author, message).

The Pointer: Every commit has a Parent Hash. This is literally the prev pointer in a Linked List. It points to the commit that came before it.

The Head: You know how we have a head pointer in DSA to track the start? In Git, HEAD is a pointer that always points to the last node you worked on.

Secret Part: SHA-1 & Trees

This is where it gets cool. How does Git actually save your code inside that Node?

It uses SHA-1 Hashing.

Every time you save a file, Git takes the raw content, runs it through an algorithm, and spits out a 40-character unique ID (like a1b2c3 . . .).

Git doesn't care about the filename here. It only cares about the content.
If two files have the exact same code, they share the exact same Hash, Efficiency level: 100.

So how does it remember folder structures?

Enter the Tree Object.

Think of a Tree Object as a simple directory list. It maps your filenames to those SHA-1 hashes.

"index.html" - > points to Blob e4f2 . . .
"styles/" - > points to another Tree 89a1 . . .

How Changes Spread

When you make a new Commit (node), Git doesn't copy all your files.

It creates a new Tree representing the current state of your folder.
If you only changed one file, Git generates a new SHA-1 for that file.
The new Tree points to the new file hash, but keeps pointing to the old hashes for everything else that didn't change.

The Conclusion

Stop treating Git like magic. It’s just a database of nodes (Commits) pointing to maps (Trees) pointing to data (Blobs).

If you can reverse a Linked List in an interview, you already understand how Git history works. We are just traversing nodes, bro!

How Git Works Internally

The Behind The Scenes(BTS) Logic

Secret Part: SHA-1 & Trees

How Changes Spread

The Conclusion

Comments (1)

Git Exploration

Problems Faced Before Version Control Systems

More from this blog

What coding tutorials don't teach you about pushing to Prod

Chef’s Recipe (Declarations vs. Expressions)

Masterji Logic (If, Else & Switch)

Late-Night Lobby (Operators)

Surviving the Flash Sale: My Journey from a Fragile Monolith to a K8s Distributed System

Command Palette

The Behind The Scenes(BTS) Logic

Secret Part: SHA-1 & Trees

How Changes Spread

The Conclusion

Comments (1)

Git Exploration

Problems Faced Before Version Control Systems

More from this blog