How Git Works Internally

Where does the code GO? How does Git remember everything? What sorcery is this? Today we're going DEEP. We're opening that mysterious .git folder and understanding what the hell is inside.

The `.git` Folder — Where All the Magic Lives

When you run git init, Git creates a hidden folder called .git in your project directory. This folder? It IS Git. Delete this folder and boom — your entire Git history is gone. No more commits, no more branches, nothing. Just regular files. So what's inside this magical folder?

.git/
├── HEAD
├── config
├── description
├── hooks/
├── index
├── info/
├── logs/
├── objects/
├── refs/
│   ├── heads/
│   └── tags/
├── COMMIT_EDITMSG
├── FETCH_HEAD
└── ORIG_HEAD

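Looks intimidating, right? (A few of these, like index, COMMIT_EDITMSG, FETCH_HEAD and ORIG_HEAD, only show up once you've staged, committed, fetched or reset something.) Let me break it down: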

HEAD: tells Git "where am I right now?". It normally points to the current branch (or straight at a commit when you're in detached HEAD state)
config: repo-level configuration such as remote URLs and, if you've set them for this repo, your author name and email
index: your staging area. When you run git add, the file's info is recorded here
objects/: THE MOST IMPORTANT FOLDER. Stores everything: files, trees, commits
refs/: pointers to commits. Branches live in refs/heads/, tags in refs/tags/
logs/: records every time a ref was updated. This is what git reflog reads
hooks/: scripts that run automatically on Git events
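Want to see this for yourself? Here's a minimal sketch that spins up a throwaway repo and checks which pieces exist at each stage. The temp directory, the file name notes.txt and the Demo identity are all made up for illustration, and the exact contents of .git can vary a bit between Git versions.

```shell
set -e
# Throwaway repo in a temp directory (assumes mktemp and git are installed).
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical identity so the commit below works without any global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

ls .git    # HEAD, config, objects, refs, ... (exact list depends on your Git version)

# The index only appears once something has been staged.
if [ -e .git/index ]; then index_before="present"; else index_before="absent"; fi

echo "hello" > notes.txt
git add notes.txt
if [ -e .git/index ]; then index_after="present"; else index_after="absent"; fi

# COMMIT_EDITMSG holds the message of the most recent commit.
git commit -q -m "First commit"
last_message="$(cat .git/COMMIT_EDITMSG)"

echo "index before add: $index_before, after add: $index_after"
echo "COMMIT_EDITMSG: $last_message"
```

Run it somewhere harmless and then ls the folder again yourself. Watching files appear as you stage and commit is the quickest way to stop being scared of .git.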

Git Objects — The Building Blocks

Git is basically a content-addressable filesystem. Fancy words, simple meaning: Git stores everything as objects, and each object's ID is a hash of its content. There are THREE main types of objects (plus a fourth, the tag object, which is used for annotated tags):

1. Blob (Binary Large Object) A blob is basically the content of a file. Just the content, not the filename, not the location, just what's inside. When you add a file to Git, it creates a blob from the file's content and computes a SHA-1 hash over that content (plus a tiny header holding the type and size). Same content = Same hash. Always.

2. Tree A tree is like a directory listing. It contains:

references to blobs (files)
references to other trees (subdirectories)
filenames and permissions

Think of it as Git's way of saying "in this folder, there's file A (blob xyz) and subfolder B (tree abc)."

3. Commit A commit object contains:

pointer to a tree (snapshot of your project)
author and committer information
commit message
pointer to parent commit(s)

On disk these objects are zlib-compressed, which is why raw Git data looks like unreadable shit when you open it. Underneath, it's actually beautifully organized.
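To make the three object types concrete, here's a small sketch that makes one commit in a throwaway repo and then asks Git what each object is. The file name greeting.txt and the Demo identity are hypothetical, picked just for this example.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q

# Hypothetical identity so committing works without any global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Look up the three objects behind this one commit.
commit_id="$(git rev-parse HEAD)"
tree_id="$(git rev-parse 'HEAD^{tree}')"
blob_id="$(git rev-parse HEAD:greeting.txt)"

git cat-file -t "$commit_id"    # commit
git cat-file -t "$tree_id"      # tree
git cat-file -t "$blob_id"      # blob

# The tree maps a filename and mode to a blob; the blob is just the content.
git cat-file -p "$tree_id"      # 100644 blob ce0136...    greeting.txt
git cat-file -p "$blob_id"      # hello
```

Notice that the filename only shows up in the tree, never in the blob. That's the "just the content" part in action.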

The Objects Folder — Let's Get Our Hands Dirty

Remember the objects/ folder? Let's see what's actually inside.

.git/objects/
├── 1b/
├── 05/
├── 5a/
├── 1c/
├── 8a/
├── pack/
└── info/

See those two-letter folders? 1b, 05, 5a? Here's the trick: Git takes the first 2 characters of the hash and makes it a folder name. The rest becomes the filename. So if your commit hash is 2b9b0a8fdec89a1667..., Git stores it as objects/2b/9b0a8fdec89a1667... Why? Performance. Instead of dumping millions of files into one folder, Git spreads them across up to 256 subfolders. (Over time Git also bundles older objects into packfiles under pack/, so not every object sits in its own file.) You can actually READ these objects:

bash

git cat-file -p 2b9b0a8fdec

And you'll see the object's content. For a commit, that's the tree hash, the parent commit hash, the author and committer, and the commit message. The changes themselves aren't stored in the commit; Git works them out by comparing trees.
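Here's a quick sketch of that folder trick, assuming a fresh throwaway repo. It writes one blob and then rebuilds the path where Git put it. The variable names are mine, nothing official.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q

# -w actually writes the object into .git/objects instead of only printing its hash.
blob_id="$(echo "hello" | git hash-object -w --stdin)"
echo "$blob_id"        # ce013625030ba8dba906f756967f9e9ca394464a

# First two characters become the folder, the remaining 38 the filename.
dir="$(printf '%s' "$blob_id" | cut -c1-2)"
file="$(printf '%s' "$blob_id" | cut -c3-)"
object_path=".git/objects/$dir/$file"

ls -l "$object_path"           # the compressed object file on disk
git cat-file -s "$blob_id"     # 6 (size in bytes of "hello" plus the newline)
git cat-file -p "$blob_id"     # hello
```

If you try cat on that object file directly you'll get compressed garbage, which is exactly why git cat-file exists.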

How Git Tracks Changes — The Flow

Let me walk you through what ACTUALLY happens when you use Git.

When You Do git add:

Git reads your file's content
creates a blob object with that content
generates a SHA-1 hash for it
stores the blob in .git/objects/
updates the index (staging area)

Your file is now "staged" — but not committed yet.
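The steps above can be watched happening. This is a minimal sketch in a throwaway repo, with a made-up file called notes.txt; it counts the object files before and after git add and peeks into the index.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q

echo "hello" > notes.txt

# No object files exist yet in a fresh repo.
objects_before="$(find .git/objects -type f | wc -l)"

git add notes.txt

# One blob has been written, and the index now maps notes.txt to it.
objects_after="$(find .git/objects -type f | wc -l)"
staged="$(git ls-files --stage)"
echo "$staged"         # 100644 ce0136... 0    notes.txt

# Staged is not committed: HEAD does not resolve to any commit yet.
if git rev-parse --verify -q HEAD >/dev/null; then has_commit="yes"; else has_commit="no"; fi
echo "objects: $objects_before -> $objects_after, commit exists: $has_commit"
```

So the blob is already safely inside .git/objects before you ever type git commit.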

When You Do git commit:

Git takes everything in the staging area
creates tree objects (one per directory) representing the staged snapshot
creates a commit object pointing to the top-level tree
stores author, committer, message, timestamps and parent commit hash in that commit object
updates the current branch pointer in refs/heads/

And THAT'S IT. Your commit now exists as objects in the .git/objects/ folder.
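Here's a small sketch of that flow, again in a throwaway repo. The branch name main is pinned by hand so the path under refs/heads/ is predictable, and the Demo identity and notes.txt are hypothetical.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q
# Pin the initial branch name so it doesn't depend on your Git defaults.
git symbolic-ref HEAD refs/heads/main

# Hypothetical identity so committing works without any global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "First commit"

first_id="$(git rev-parse HEAD)"
tree_id="$(git rev-parse "$first_id^{tree}")"

# The branch is a file holding the new commit's hash.
branch_file="$(cat .git/refs/heads/main)"

# The commit object starts with a pointer to its tree.
tree_line="$(git cat-file -p "$first_id" | sed -n '1p')"

# A second commit records the first one as its parent.
echo "more" >> notes.txt
git commit -q -am "Second commit"
parent_line="$(git cat-file -p HEAD | sed -n '2p')"

echo "$tree_line"
echo "$parent_line"
```

The first commit has no parent line at all, which is how Git knows where history begins.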

How Git Uses Hashes — Integrity on Steroids

Every object in Git is identified by a SHA-1 hash: a 40-character hexadecimal string like 2b9b0a8fdec89a1667a4c5d8e9f0a1b2c3d4e5f6. Why is this genius?

Content-based addressing: the hash is generated FROM the content, so same content = same hash, always
Integrity verification: if even ONE bit changes, the hash changes completely, so Git can detect corruption whenever it reads the object or you run git fsck
Deduplication: same file in 10 commits? Git stores ONE blob
Distributed trust: clone a repo and verify your copy is EXACTLY the same by comparing hashes
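You can check these properties yourself. A minimal sketch, assuming a throwaway repo and two made-up files a.txt and b.txt. The last part is an optional cross-check that only runs if the sha1sum tool happens to be installed on your machine.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q

# Same content gives the same hash, no matter how many times you ask.
first="$(printf 'hello\n' | git hash-object --stdin)"
second="$(printf 'hello\n' | git hash-object --stdin)"

# Flip a single bit (h vs H) and the hash is completely different.
changed="$(printf 'Hello\n' | git hash-object --stdin)"

# Deduplication: two files with identical content share one blob.
printf 'hello\n' > a.txt
printf 'hello\n' > b.txt
git add a.txt b.txt
blob_count="$(git ls-files --stage | awk '{print $2}' | sort -u | wc -l)"

echo "$first"
echo "$changed"
echo "distinct blobs for two identical files: $blob_count"

# The hash covers a short header ("blob", the size, a NUL byte) plus the content.
if command -v sha1sum >/dev/null 2>&1; then
  manual="$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)"
  echo "$manual"       # should match $first
fi
```

That header is why plain sha1sum on a file won't give you the same hash Git does.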

Branching — It's Just Pointers

Here's a mind-blowing fact: branches in Git are just text files containing a commit hash. That's it. When you create a branch:

bash

git branch feature-xyz

Git just creates a file at .git/refs/heads/feature-xyz containing the current commit hash. When you switch branches with git checkout feature-xyz, Git updates the HEAD file to point to refs/heads/feature-xyz and then updates your index and working directory to match that commit. That's why creating a branch in Git is INSTANT. No copying files, no duplicating data. Just writing a pointer. How does Git know which branch you're on?

bash

cat .git/HEAD

Output:

ref: refs/heads/main

HEAD points to a branch, and that branch points to a commit. Simple.
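Here's the whole pointer story as a sketch you can run in a throwaway repo. The initial branch name is pinned to main by hand, and the Demo identity and notes.txt are hypothetical.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q
# Pin the initial branch name so it doesn't depend on your Git defaults.
git symbolic-ref HEAD refs/heads/main

# Hypothetical identity so committing works without any global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "First commit"
head_commit="$(git rev-parse HEAD)"

# Creating a branch writes one small file containing a commit hash.
git branch feature-xyz
branch_file="$(cat .git/refs/heads/feature-xyz)"

# Switching branches rewrites HEAD to name the other branch.
head_before="$(cat .git/HEAD)"    # ref: refs/heads/main
git checkout -q feature-xyz
head_after="$(cat .git/HEAD)"     # ref: refs/heads/feature-xyz

echo "$branch_file"
echo "$head_before"
echo "$head_after"
```

One caveat: in older or busy repos Git may tuck refs into a single .git/packed-refs file, so don't panic if a branch file isn't where you expect.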

The Dangerous HARD — Reset vs Revert

Now here's where things get spicy.

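git reset --hard <commit> This command moves the current branch (and HEAD with it) back to the specified commit, and overwrites your staging area and working directory to match. The commits after that point drop out of the branch's history. Like, if you have commits A -> B -> C -> D (HEAD) and you do git reset --hard B, commits C and D? Gone from the branch. PUFF. They aren't deleted right away, so you can usually still recover them with git reflog (that's advanced shit), but any uncommitted changes really are gone for good.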

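git reset <commit> (without --hard) This also moves the branch back, but the changes from those commits are kept in your working directory as unstaged modifications (this default mode is called --mixed). Add --soft and they stay in the staging area instead. Either way nothing is deleted abruptly, so there's still scope to commit them again. Much safer.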
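If you're not sure where your changes end up after each kind of reset, this sketch checks it in a throwaway repo. The file app.txt, its contents and the Demo identity are made up. The status codes come from git status --porcelain, where the first column is the staging area and the second is the working directory.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q

# Hypothetical identity so committing works without any global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "A"
echo "stable plus experiment" > app.txt
git commit -q -am "B"
b_id="$(git rev-parse HEAD)"

# --soft: branch moves back, the change stays staged.
git reset -q --soft HEAD~1
soft_status="$(git status --porcelain)"     # "M  app.txt"
git reset -q --hard "$b_id"                 # go back to B for the next test

# Default (--mixed): branch moves back, the change stays in the working directory.
git reset -q HEAD~1
mixed_status="$(git status --porcelain)"    # " M app.txt"
git reset -q --hard "$b_id"

# --hard: branch moves back and the change is wiped from index and working directory.
git reset -q --hard HEAD~1
hard_status="$(git status --porcelain)"     # empty
hard_content="$(cat app.txt)"               # stable

# The reflog still remembers where HEAD was, so commit B can be brought back.
git reset -q --hard 'HEAD@{1}'
recovered="$(git rev-parse HEAD)"
echo "recovered $recovered"
```

The reflog rescue only works for things that were committed. Edits you never committed and then blew away with --hard are not coming back.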

git revert <commit> This is the SAFE way. Instead of deleting commits, it creates a NEW commit that undoes the changes from the specified commit. Say there's a bug in the 3rd commit, and after that you've done new feature updates. Using reset would lose those feature commits too. Using revert creates a new commit that just removes the buggy changes while keeping everything else.

The problem with reset: commits after the reset point are dropped from the branch.

Revert is the Good Boi: it creates a new commit that undoes the buggy commit's changes.

Before revert:  O -- O -- BUG -- O -- O (HEAD)
After revert:   O -- O -- BUG -- O -- O -- FIX (HEAD, new commit that undoes BUG)
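Here's that picture as a runnable sketch. It builds a tiny history with a "bug" in the middle and reverts only that commit. The file names, commit messages and Demo identity are all hypothetical.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q

# Hypothetical identity so committing works without any global config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

echo "oops" > bug.txt
git add bug.txt
git commit -q -m "Introduce bug"
bug_id="$(git rev-parse HEAD)"

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

count_before="$(git rev-list --count HEAD)"    # 3

# Undo only the buggy commit by adding a new commit on top.
git revert --no-edit "$bug_id" >/dev/null

count_after="$(git rev-list --count HEAD)"     # 4
revert_subject="$(git log -1 --format=%s)"     # Revert "Introduce bug"

ls    # app.txt and feature.txt are still there, bug.txt is gone
echo "$count_before -> $count_after commits, latest: $revert_subject"
```

History only grows here, which is why revert is the one to reach for on branches other people have already pulled.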

Building Your Mental Model

Stop memorizing commands. Start understanding the model:

Git is a database of objects — blobs, trees, commits
everything is identified by hashes — content-addressable
branches are just pointers — lightweight, instant
HEAD tells you where you are — points to current branch
staging area (index) is your prep zone — between working directory and commits

Once you understand this, commands start making sense:

git add -> creates blobs, updates index
git commit -> creates tree + commit objects, updates branch pointer
git checkout -> moves HEAD, updates working directory
git reset -> moves the branch pointer (with --hard it also overwrites your working directory, so be careful)
git revert -> creates new commit that undoes changes (safe)
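To really test the model, you can build a commit by hand using Git's low-level "plumbing" commands instead of git add and git commit. This is a sketch under assumptions: a throwaway repo, a pinned branch name main, a made-up file name hello.txt and a hypothetical Demo identity.

```shell
set -e
repo="$(mktemp -d)"    # throwaway repo (assumes mktemp and git are installed)
cd "$repo"
git init -q
# Pin the initial branch name so it doesn't depend on your Git defaults.
git symbolic-ref HEAD refs/heads/main

# Hypothetical identity; commit-tree needs an author and committer.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# What git add does: write a blob, then record it in the index.
blob_id="$(echo "hello" | git hash-object -w --stdin)"
git update-index --add --cacheinfo 100644 "$blob_id" hello.txt

# What git commit does: turn the index into a tree, wrap it in a commit,
# then point the branch at that commit.
tree_id="$(git write-tree)"
commit_id="$(echo "Commit built by hand" | git commit-tree "$tree_id")"
git update-ref refs/heads/main "$commit_id"

git log --oneline             # one commit: "Commit built by hand"
git show HEAD:hello.txt       # hello
```

Nothing here ever touched the working directory, so hello.txt exists in the commit but not on disk. That's a nice reminder that the object database and your files are two separate things.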

Wrapping Up

The .git folder isn't magic. It's just a well-organized database of objects and pointers. Understanding this makes you a BETTER developer because you know WHY commands work the way they do, you can recover from mistakes, and you can debug Git issues instead of just deleting and re-cloning. Next time you run git commit, remember: you're not just "saving". Git takes the blobs you staged with git add, builds a tree, generates a commit object, and moves the branch that HEAD points to. Pretty cool when you think about it. Now go explore your .git folder. I dare you.
