Inside Git: How It Works and the Role of the .git Folder

Most developers use Git every day.

They run commands like git add, git commit, and git log, but few understand what Git is doing internally.
This article goes one level deeper.

How Git Works

At its core, Git is a content-addressable database.

Instead of thinking:

“Git saves files”

Think this instead:

Git stores snapshots of content, linked together by cryptographic hashes

Every piece of data in Git:

  • Is stored once

  • Is immutable

  • Is referenced by a hash (SHA-1 by default; SHA-256 in repositories created with that object format)

Git never tracks “changes” directly.
It tracks objects that represent complete states.
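To make "content-addressable" concrete, here is a minimal sketch of a store whose keys are derived from the values. The ContentStore class is a hypothetical toy, not Git's code, and it hashes the raw bytes only (Git adds a small header first).

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is derived from the value."""

    def __init__(self):
        self._objects = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Storing identical content again is a no-op: same bytes, same key.
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def __len__(self) -> int:
        return len(self._objects)


store = ContentStore()
first = store.put(b"hello\n")
second = store.put(b"hello\n")   # same content, stored once
third = store.put(b"hello!\n")   # different content, new key
print(first == second, first != third, len(store))
```

Nothing in the store can be edited in place: changing the bytes would change the key, which is what "immutable" means here.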

What Is the .git Folder and Why Does It Exist?

When you run:

git init

Git creates a hidden directory called .git.

This folder is the entire Git repository.

Your project files are just the working copy.
The real Git database lives inside .git.

If you delete .git, your project becomes a normal folder again, and its entire history is gone unless a clone or remote still has it.

Structure of the .git Directory

Key components

  • HEAD
    Usually names the current branch (e.g., ref: refs/heads/main); in detached HEAD state it holds a commit hash directly

  • objects/
    Stores all Git objects (blobs, trees, commits, annotated tags)

  • refs/
    Stores branch and tag references

  • index
    Represents the staging area

  • config
    Repository-level configuration

📌 Everything Git knows about your project is stored here.
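As a rough illustration of how these files fit together, the sketch below follows HEAD to a commit id. It builds a stand-in .git directory in a temp folder so it runs anywhere; resolve_head is a hypothetical helper, and it only reads loose ref files (a real repository may also keep refs in a packed-refs file, which this ignores).

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Return the commit id HEAD currently resolves to (loose refs only)."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref, e.g. "ref: refs/heads/main": follow it to the branch file.
        ref_path = git_dir / head[len("ref: "):]
        return ref_path.read_text().strip()
    # Detached HEAD: the file holds a commit id directly.
    return head


# Build a stand-in .git directory for the demo.
demo_git = Path(tempfile.mkdtemp()) / ".git"
(demo_git / "refs" / "heads").mkdir(parents=True)
fake_commit = "a" * 40  # placeholder id, not a real commit
(demo_git / "HEAD").write_text("ref: refs/heads/main\n")
(demo_git / "refs" / "heads" / "main").write_text(fake_commit + "\n")

print(resolve_head(demo_git))
```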

Git Objects: Blob, Tree, Commit

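Git has four object types. Three of them (blob, tree, commit) do almost all the work; the fourth, the annotated tag, is a small object that points at another object, usually a commit.

Understanding these three removes most Git confusion.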

1. Blob (File Content)

A blob stores:

  • The contents of a file

  • Nothing else (no filename, no permissions)

If two files have identical content → Git stores one blob.
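A small sketch of how a blob gets its name, assuming the default SHA-1 object format: Git hashes a short header ("blob", the size, a NUL byte) followed by the file content. The helper name git_blob_id is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 object id Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


readme = b"test content\n"
copy_of_readme = b"test content\n"  # a second file with identical bytes

print(git_blob_id(readme))
print(git_blob_id(readme) == git_blob_id(copy_of_readme))
```

The filename never enters the hash, so two files with the same bytes end up as one blob. You can compare the result with git hash-object on a real file.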

2. Tree (Directory Structure)

A tree represents:

  • A directory

  • Filenames

  • File modes (regular file, executable, symlink, subdirectory), not full Unix permissions

  • Pointers to blobs or other trees

Trees define project structure.
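Here is a sketch of what a tree object holds, assuming the SHA-1 format: each entry is a mode, a name, a NUL byte, and the 20 raw bytes of the object it points to. The file names and helper functions are invented for the example.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash a Git object: '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """entries: iterable of (mode, name, hex_object_id)."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders directory names as if they ended with "/".
        return name.encode() + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, oid in sorted(entries, key=sort_key):
        body += mode.encode("ascii") + b" " + name.encode() + b"\0" + bytes.fromhex(oid)
    return body


blob_id = git_object_id("blob", b"test content\n")
sub_tree = tree_body([("100644", "file3.txt", blob_id)])
sub_tree_id = git_object_id("tree", sub_tree)
root = tree_body([
    ("100644", "file1.txt", blob_id),   # regular file
    ("100755", "run.sh", blob_id),      # executable file
    ("40000", "src", sub_tree_id),      # subdirectory -> another tree
])
print(git_object_id("tree", root))
print(git_object_id("tree", b""))  # id of the empty tree
```

Names and modes live in the tree, not in the blob, which is why renaming a file creates a new tree but no new blob.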

3. Commit (Snapshot + Metadata)

A commit contains:

  • A pointer to a root tree

  • Parent commit(s)

  • Author and committer

  • Timestamps (with time zone) for each

  • Commit message

A commit does not store files directly.

It points to a tree, which points to blobs.
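A commit is plain text under the hood. The sketch below assembles one by hand, assuming the SHA-1 format; the author identity, timestamp, and message are made-up values, and commit_body is a hypothetical helper.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash a Git object: '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree_id, parent_ids, identity, timestamp, tz, message) -> bytes:
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]  # none for a root commit
    lines.append(f"author {identity} {timestamp} {tz}")
    lines.append(f"committer {identity} {timestamp} {tz}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


empty_tree = git_object_id("tree", b"")
who = "Ada Example <ada@example.com>"  # made-up identity

root_commit = commit_body(empty_tree, [], who, 1700000000, "+0000", "Initial commit")
root_id = git_object_id("commit", root_commit)

child_commit = commit_body(empty_tree, [root_id], who, 1700000100, "+0000", "Second commit")
child_id = git_object_id("commit", child_commit)

print(child_commit.decode())
print(child_id)
```

The commit holds no file content at all: just one tree id, its parent ids, and metadata.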

Relationship Between Commits, Trees, and Blobs

Think of it like this:

Commit
 └── Tree (root directory)
     ├── Blob (file1)
     ├── Blob (file2)
     └── Tree (subdirectory)
         └── Blob (file3)

Each object is:

  • Immutable

  • Identified by a hash

  • Reused if content does not change


How Git Tracks Changes (The Truth)

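In its object model, Git does not store diffs.

Instead:

  • Each commit is a complete snapshot

  • Git optimizes storage by reusing unchanged objects (packfiles also delta-compress objects on disk, but that is a storage detail; every object still reads back as complete content)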

If a file does not change:

  • Its blob hash remains the same

  • Git simply references the existing blob

This is why Git is both fast and efficient.
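To see the reuse in action, this sketch compares two snapshots where only one file changed. The file names and contents are invented, and a snapshot is simplified to a path-to-blob-id mapping.

```python
import hashlib


def blob_id(content: bytes) -> str:
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def snapshot(files: dict) -> dict:
    """Map each path to the id of the blob holding its content."""
    return {path: blob_id(data) for path, data in files.items()}


commit_1 = snapshot({
    "README.md": b"# Demo\n",
    "app.py": b"print('v1')\n",
    "LICENSE": b"MIT\n",
})
commit_2 = snapshot({
    "README.md": b"# Demo\n",
    "app.py": b"print('v2')\n",   # the only file that changed
    "LICENSE": b"MIT\n",
})

reused = sorted(p for p in commit_2 if commit_1.get(p) == commit_2[p])
new_blobs = set(commit_2.values()) - set(commit_1.values())
print(reused)
print(len(new_blobs))
```

Both snapshots are complete, yet the second one costs only one new blob.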


What Happens Internally During git add

Step-by-step:

  1. You modify a file in the working directory

  2. git add file.txt is executed

  3. Git:

    • Reads file content

    • Creates a blob object

    • Stores it in .git/objects

  4. Updates the index (staging area) to reference the blob

📌 No commit yet.
📌 The content is already in the object database, but no commit references it. The change is staged, not yet part of history.
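The steps above can be sketched roughly like this, assuming the SHA-1 format and loose objects. The index here is a plain dict standing in for Git's binary index file, and write_blob and stage are hypothetical helpers; the demo writes into a temp folder, not a real repository.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_blob(git_dir: Path, content: bytes) -> str:
    """Store content as a loose blob object and return its id."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # Loose objects live at objects/<first 2 hex chars>/<remaining 38>.
    path = git_dir / "objects" / oid[:2] / oid[2:]
    if not path.exists():  # same content was already stored: nothing to do
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(store))
    return oid


def stage(git_dir: Path, index: dict, name: str, content: bytes) -> None:
    """Simplified 'git add': write the blob, then point the index at it."""
    index[name] = write_blob(git_dir, content)


demo_git = Path(tempfile.mkdtemp()) / ".git"
index = {}  # stand-in for the real binary index file
stage(demo_git, index, "file.txt", b"test content\n")

oid = index["file.txt"]
stored = zlib.decompress((demo_git / "objects" / oid[:2] / oid[2:]).read_bytes())
print(oid)
print(stored)
```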

What Happens Internally During git commit

Step-by-step:

  1. Git reads the staging area (index)

  2. Creates tree objects from the index (one per directory, with a root tree on top)

  3. Creates a commit object that:

    • Points to the tree

    • Points to the previous commit

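  4. Updates the current branch reference (e.g., refs/heads/main) to the new commit

  5. Leaves HEAD pointing at that branch, so HEAD now resolves to the new commit (in detached HEAD state, HEAD itself is updated instead)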

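📌 Now the snapshot is part of history. It stays in the object database for as long as a branch, tag, or other reference can reach it.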
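The reference update at the end of a commit can be sketched as below. It uses a stand-in .git directory in a temp folder and placeholder commit ids; record_commit is a hypothetical helper that only handles loose ref files.

```python
import tempfile
from pathlib import Path


def record_commit(git_dir: Path, new_commit_id: str) -> None:
    """Point the current branch (or a detached HEAD) at a new commit."""
    head_file = git_dir / "HEAD"
    head = head_file.read_text().strip()
    if head.startswith("ref: "):
        # On a branch: HEAD keeps naming the branch; the branch file changes.
        ref_file = git_dir / head[len("ref: "):]
        ref_file.parent.mkdir(parents=True, exist_ok=True)
        ref_file.write_text(new_commit_id + "\n")
    else:
        # Detached HEAD: HEAD itself stores the commit id.
        head_file.write_text(new_commit_id + "\n")


demo_git = Path(tempfile.mkdtemp()) / ".git"
(demo_git / "refs" / "heads").mkdir(parents=True)
(demo_git / "HEAD").write_text("ref: refs/heads/main\n")
(demo_git / "refs" / "heads" / "main").write_text("a" * 40 + "\n")  # placeholder id

record_commit(demo_git, "b" * 40)  # placeholder id of the new commit

print((demo_git / "HEAD").read_text().strip())
print((demo_git / "refs" / "heads" / "main").read_text().strip())
```

The HEAD file is untouched; only the branch file it names now holds the new id.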


Why Git Uses Hashes (Integrity & Trust)

Every Git object is named by a cryptographic hash.

Example:

f374df2a6b...

This ensures:

  • Data integrity (corruption is detected, not silently accepted)

  • Content verification

  • Tamper detection

If content changes → hash changes → object identity changes.

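Each commit's hash covers its tree and its parent hashes, so rewriting any past commit changes the hash of every commit that comes after it. Git cannot alter history without the hashes showing it.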

This is why Git is trusted for:

  • Open source

  • Security-sensitive projects

  • Distributed collaboration
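The tamper detection described above can be sketched with two hand-built commits, assuming the SHA-1 format. The commit texts are shortened (no author lines) to keep the example small, and verify is a hypothetical helper.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def verify(expected_id: str, kind: str, body: bytes) -> bool:
    """An object is valid only if its content still hashes to its name."""
    return object_id(kind, body) == expected_id


tree_id = object_id("tree", b"")  # empty tree, just to have a valid id

# Shortened commit bodies: real commits also carry author/committer lines.
first = f"tree {tree_id}\n\nAdd login check\n".encode()
first_id = object_id("commit", first)
second = f"tree {tree_id}\nparent {first_id}\n\nAdd tests\n".encode()
second_id = object_id("commit", second)

# Someone edits the first commit's message after the fact.
tampered = first.replace(b"Add login check", b"Remove login check")
tampered_id = object_id("commit", tampered)

print(verify(first_id, "commit", first))      # original content matches its name
print(verify(first_id, "commit", tampered))   # edited content does not

# Making the edit "official" means re-parenting the child, which renames it too.
rewritten_second = second.replace(first_id.encode(), tampered_id.encode())
print(object_id("commit", rewritten_second) != second_id)
```

Anyone who already has second_id can tell that the rewritten chain is not the history they knew.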


Mental Model Summary (Remember This)

Instead of memorizing commands, remember this flow:

Files → Blobs → Trees → Commits → History

And this truth:

Git is a database of snapshots, not a diff tracker.


Final Thoughts

Understanding Git internals:

  • Makes you confident during conflicts

  • Helps you debug weird Git behavior

  • Separates senior developers from beginners

  • Makes Git feel logical, not magical

Once this mental model clicks, Git becomes predictable.
