Keywords: Git Diff | Version Control | Code Changes
Abstract: This technical article provides an in-depth exploration of three core methods for reviewing file changes in Git before committing: git diff for comparing working directory with staging area, git diff --staged/--cached for staging area versus latest commit, and git diff HEAD for working directory versus latest commit. Through detailed code examples and workflow analysis, developers learn to accurately track modifications and prevent erroneous commits. The article systematically explains the underlying logic of file tracking states and difference comparisons within Git's architecture.
Core Functionality of Git Diff Command
During software development, programmers frequently need to interrupt their work and may forget specific changes when returning later. Git provides powerful difference comparison tools to address this scenario, with the git diff command serving as the primary tool for examining file modifications.
Fundamentals of File States and Difference Comparison
Files in a Git working directory exist in two primary states: tracked and untracked. Tracked files further divide into unmodified, modified, and staged states. Understanding these states is crucial for properly utilizing difference comparison commands.
When executing the git status command, Git displays the current state of files:
$ git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   README.md
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   example.py
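The same state information is available in a script-friendly form from git status --porcelain, where each line starts with a two-character code: the first character describes the staging area, the second describes the working directory, and ?? marks an untracked file. The following is a minimal sketch of grouping such lines by state. The function name classify_status and the sample text are illustrative assumptions, and renamed entries (which contain "old -> new") are not handled.

```python
def classify_status(porcelain_text):
    """Group paths from `git status --porcelain` (v1) output by state."""
    groups = {"staged": [], "unstaged": [], "untracked": []}
    for line in porcelain_text.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        code, path = line[:2], line[3:]
        if code == "??":
            groups["untracked"].append(path)
        elif code == "!!":
            continue  # ignored files, listed only with --ignored
        else:
            if code[0] != " ":
                groups["staged"].append(path)    # change recorded in the index
            if code[1] != " ":
                groups["unstaged"].append(path)  # change only in the working directory
    return groups


# Hypothetical sample mirroring the status output above, plus one untracked file.
sample = "A  README.md\n M example.py\n?? notes.txt\n"
print(classify_status(sample))
```

A file that appears in both the staged and unstaged groups has been edited again after git add, which is exactly the case where the three comparisons below give different answers.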
Three Core Difference Comparison Methods
Comparing Working Directory with Staging Area
When files have been modified but not yet added to the staging area, use the following command to view specific changes:
git diff [filename]
This command displays differences between files in the working directory and their counterparts in the staging area. For example, after modifying main.py without executing git add:
$ git diff main.py
diff --git a/main.py b/main.py
index 7898192..a1b2c3d 100644
--- a/main.py
+++ b/main.py
@@ -5,7 +5,7 @@ def calculate_sum(a, b):
     return a + b
 
 
 def main():
-    print("Hello World")
+    print("Hello Git Diff")
     result = calculate_sum(5, 3)
     print(f"Sum: {result}")
The output shows that the line printing "Hello World" was removed (prefixed with -) and the line printing "Hello Git Diff" was added (prefixed with +). Lines without a prefix are unchanged context.
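The line beginning with @@ is the hunk header. In "@@ -5,7 +5,7 @@", the -5,7 part means the hunk covers 7 lines starting at line 5 of the old file, and +5,7 gives the same range for the new file; the trailing text names the enclosing function. As a rough sketch, a header like this could be decoded as follows (parse_hunk_header is a hypothetical helper, not a Git API):

```python
import re

# Unified-diff hunk header: @@ -old_start[,old_count] +new_start[,new_count] @@ [section]
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


def parse_hunk_header(line):
    """Return the line ranges described by a unified-diff hunk header."""
    match = HUNK_RE.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count, section = match.groups()
    return {
        "old_start": int(old_start),
        # A missing count means the range is exactly one line long.
        "old_count": int(old_count) if old_count is not None else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count is not None else 1,
        "section": section,
    }


print(parse_hunk_header("@@ -5,7 +5,7 @@ def calculate_sum(a, b):"))
```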
Comparing Staging Area with Latest Commit
When files have been added to the staging area but not yet committed, use the following command to review content about to be committed:
git diff --staged [filename]
Or use the synonym:
git diff --cached [filename]
Assuming git add main.py has been executed:
$ git diff --staged main.py
diff --git a/main.py b/main.py
index 7898192..a1b2c3d 100644
--- a/main.py
+++ b/main.py
@@ -5,7 +5,7 @@ def calculate_sum(a, b):
     return a + b
 
 
 def main():
-    print("Hello World")
+    print("Hello Git Diff")
     result = calculate_sum(5, 3)
     print(f"Sum: {result}")
This command displays differences between the staged file version and the latest committed version, helping verify the correctness of pending commits.
Comparing Working Directory with Latest Commit
To view complete differences between files in the working directory and the latest commit, including both staged and unstaged changes:
git diff HEAD [filename]
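This command provides the most comprehensive view of changes to tracked files, combining staged and unstaged modifications into a single diff. It suits scenarios that require understanding everything that has changed since the last commit. Untracked files are still not shown.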
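The three comparisons differ only in the arguments passed to git diff, which makes them easy to wrap in a script. The sketch below maps a label to the matching argument list. The names DIFF_MODES, build_diff_command, and show_diff, as well as the labels "unstaged", "staged", and "all", are illustrative assumptions; show_diff additionally assumes that git is on the PATH and that the current directory is inside a repository.

```python
import subprocess

# Extra arguments for each comparison described above.
DIFF_MODES = {
    "unstaged": [],          # working directory vs. staging area
    "staged": ["--staged"],  # staging area vs. latest commit
    "all": ["HEAD"],         # working directory vs. latest commit
}


def build_diff_command(mode, path=None):
    """Build the argument list for one of the three comparisons."""
    command = ["git", "diff", *DIFF_MODES[mode]]
    if path is not None:
        # "--" separates revisions and options from paths, avoiding ambiguity.
        command += ["--", path]
    return command


def show_diff(mode, path=None):
    """Run the comparison in the current repository and return its output."""
    result = subprocess.run(
        build_diff_command(mode, path),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


print(build_diff_command("all", "main.py"))
```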
Practical Workflow Examples
Typical Development Scenario Analysis
Consider a typical development scenario: a developer modifies multiple files, with some staged and others remaining in the working directory.
Initial state:
$ git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   config.json
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   app.py
View unstaged changes:
$ git diff app.py
diff --git a/app.py b/app.py
index 1234567..abcdef0 100644
--- a/app.py
+++ b/app.py
@@ -10,3 +10,4 @@ def initialize_app():
     # Application initialization
     setup_database()
     load_config()
+    validate_settings()  # New validation function
View staged changes:
$ git diff --staged config.json
diff --git a/config.json b/config.json
index 1111111..2222222 100644
--- a/config.json
+++ b/config.json
@@ -1,5 +1,6 @@
 {
   "database": {
-    "host": "localhost"
+    "host": "localhost",
+    "port": 5432
   }
 }
Advanced Usage and Best Practices
Directory-Level Difference Comparison
The git diff command supports directory paths and can recursively display changes for all files within a directory:
git diff src/
This command shows all unstaged changes for files in the src directory.
Parameter-Free Global Comparison
When no filename is specified, git diff displays all unstaged changes:
git diff
This is particularly useful for quickly browsing all modifications, especially in large changes involving multiple files.
Integration into Development Workflow
Integrate git diff into regular development workflows:
- Use git diff immediately after modifying files to verify changes
- Use git diff --staged after staging files to confirm what the next commit will contain
- Use git diff HEAD for a final inspection of all changes before committing
Technical Principles Deep Dive
Git's Three-Tree Architecture
Git's difference comparison is based on its three-tree architecture: working directory, staging area (index), and commit history. Different parameters of git diff essentially compare different combinations of these three trees:
- git diff: Working directory ↔ Staging area
- git diff --staged: Staging area ↔ Latest commit
- git diff HEAD: Working directory ↔ Latest commit
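The pairings can be made concrete with a toy model in which three dictionaries stand in for the three trees. This is only a sketch: it uses Python's difflib rather than Git's own diff engine, the file contents are invented for illustration, and the output lacks Git's "diff --git" and "index" header lines.

```python
import difflib

# Toy stand-ins for the three trees: each maps a path to file content.
head = {"main.py": 'print("Hello World")\n'}          # latest commit
index = {"main.py": 'print("Hello Git")\n'}           # staged edit
worktree = {"main.py": 'print("Hello Git Diff")\n'}   # further unstaged edit


def compare(old_tree, new_tree, path):
    """Return unified-diff lines for one path between two trees."""
    old_lines = old_tree.get(path, "").splitlines(keepends=True)
    new_lines = new_tree.get(path, "").splitlines(keepends=True)
    return list(difflib.unified_diff(
        old_lines, new_lines, fromfile=f"a/{path}", tofile=f"b/{path}",
    ))


unstaged = compare(index, worktree, "main.py")  # analogous to: git diff
staged = compare(head, index, "main.py")        # analogous to: git diff --staged
combined = compare(head, worktree, "main.py")   # analogous to: git diff HEAD

print("".join(combined))
```

In this model the combined comparison goes straight from "Hello World" to "Hello Git Diff" and never mentions the intermediate staged text, which mirrors how git diff HEAD reports the net change since the last commit.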
Difference Algorithm Optimization
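By default Git computes line-level differences with the Myers algorithm; the --diff-algorithm option selects patience, histogram, or minimal instead. The resulting output:
- Reports a moved block as a deletion plus an addition, which --color-moved can highlight as a move
- Keeps the set of changed lines small
- Surrounds each change with unchanged context lines (three by default, adjustable with -U<n>) for easier reading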
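The effect of context lines can be illustrated with Python's difflib, whose n parameter plays a role comparable to git diff -U<n>. This is a sketch under that analogy only: difflib is not Git's diff engine, and the two file versions below are shortened, invented stand-ins for main.py.

```python
import difflib

old = [
    "def main():\n",
    '    print("Hello World")\n',
    "    result = calculate_sum(5, 3)\n",
]
new = [
    "def main():\n",
    '    print("Hello Git Diff")\n',
    "    result = calculate_sum(5, 3)\n",
]


def diff_with_context(context_lines):
    """Unified diff of the two versions with the given amount of context."""
    return list(difflib.unified_diff(
        old, new, "a/main.py", "b/main.py", n=context_lines,
    ))


bare = diff_with_context(0)          # only the changed lines
with_context = diff_with_context(3)  # changed lines plus surrounding lines

print("".join(with_context))
```

With no context the hunk contains just the removed and added lines; with context, the neighbouring unchanged lines appear with a leading space so the change can be located by eye.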
Common Issues and Solutions
Analysis of No Output Situations
When git diff produces no output, potential reasons include:
- All changes have been staged (use git diff --staged to view)
- Working directory is clean with no modifications
- Files are not tracked by Git
Binary File Handling
For binary files, git diff does not show content changes. It prints a one-line notice of the form "Binary files a/<path> and b/<path> differ" instead. Use a specialized comparison tool to inspect what actually changed in a binary file.
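Unless a gitattributes entry says otherwise, Git decides whether a file is binary by inspecting its content. The sketch below approximates that check, assuming it amounts to looking for a NUL byte within the first 8,000 bytes; this is an implementation detail that may differ between Git versions, and looks_binary is a hypothetical name.

```python
SNIFF_BYTES = 8000  # assumed size of the prefix that is examined


def looks_binary(data: bytes) -> bool:
    """Approximate the binary check: a NUL byte in the leading bytes."""
    return b"\x00" in data[:SNIFF_BYTES]


text_sample = b'print("Hello Git Diff")\n'
png_sample = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"  # first bytes of a PNG file

print(looks_binary(text_sample))
print(looks_binary(png_sample))
```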
Conclusion
Mastering the three core usages of the git diff command is essential for effective Git utilization. By accurately understanding differences between working directory, staging area, and commit history, developers can: precisely control commit content, avoid accidental erroneous commits, and maintain transparency and traceability of code changes. Integrating these commands into daily development workflows will significantly enhance version control quality and efficiency.