Debugging in Git with Blame and Bisect

Shaumik Daityari

When you are working with a huge code base, you may discover bugs in your code (or worse, in someone else’s code) that prevent you from proceeding any further in your development. You can check out an old commit to see whether the bug was present there, but checking commits one at a time is often the slowest way to find it. Imagine you have a hundred commits to check: how much time would be wasted?

Thankfully, Git has two tools that help you with debugging. We will have a look at both and try to understand their use cases. Let us start by intentionally introducing a bug into our code:

Adding an error line intentionally

I’ve added an unwanted line to the file my_file, and we will assume that this line causes the error. I’ve also added a few commits after that one to bury the faulty commit. Let us verify that the faulty line is in the file by running the following:

cat my_file

Faulty file

Notice the “Unwanted line” that is supposedly causing the error.
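One way to reproduce a history like this is sketched below. The article's own repository is not shown, so the commit messages, the file contents, the add_line helper, and the demo identity are all assumptions made for illustration, and the hashes you get will differ from the ones quoted in this article.

```shell
#!/bin/sh
# Sketch: build a throwaway repository with one buried "bad" commit.
set -e

repo=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$repo"
git init -q

# Local identity so that commits work without a global Git config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: append one line to my_file and commit it.
add_line() {
    echo "$1" >> my_file
    git add my_file
    git commit -q -m "$2"
}

add_line "First line"    "Add first line"
add_line "Second line"   "Add second line"
add_line "Unwanted line" "Add third line"    # the faulty commit
add_line "Fourth line"   "Add fourth line"
add_line "Fifth line"    "Add fifth line"

cat my_file
```

The faulty commit deliberately gets an innocent-looking message, so that reading the log alone does not give the bug away.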

Debugging with Blame

Once you have discovered a bug, you may or may not know the location of the faulty code. Let’s say that you do: in our case, you know that my_file is causing the trouble. We can then run the following command to see, for each line in the file, the commit that last changed it.

git blame my_file

blame contents of a file

If you look at the output of git blame, you can see that commit 0bf63b53 is what introduced the bug (“Unwanted line”). If you want to check what else was changed in that commit, or want more information about it, you can run the following:

git show 0bf63b53

running git show on the bad commit

There you go — we now know which commit caused the error and what else was changed in that commit. We can proceed to fixing the error.
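On a long file, scanning the full blame output by eye gets tedious. The sketch below narrows the blame to a single line. The function name introducing_commit is hypothetical; the Git options it relies on (-L to restrict the line range and --porcelain for machine-readable output) are standard.

```shell
# Hypothetical helper: print the full hash of the commit that last
# changed the first line containing a fixed string.
# Usage: introducing_commit <file> <text>
introducing_commit() {
    file=$1
    text=$2

    # grep -n prints "NUMBER:line"; keep the number of the first match.
    line=$(grep -n -F -- "$text" "$file" | head -n 1 | cut -d: -f1)
    [ -n "$line" ] || return 1    # the text is not in the file

    # -L N,N limits the blame to one line. In porcelain format the
    # first field of the first output line is the commit hash.
    git blame --porcelain -L "$line,$line" -- "$file" | head -n 1 | cut -d' ' -f1
}
```

Inside the repository, introducing_commit my_file "Unwanted line" prints a hash that you can pass to git show. A related option is git log -S"Unwanted line" --oneline -- my_file, which lists the commits that added or removed that string.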

Debugging with Bisect

git blame helps you when you have some idea about what is causing the problem. What if you had no idea what is causing the error, and hundreds of commits separate you from the last working state? This is where git bisect comes into play.

Note that git bisect is overkill for a trivial situation like ours. I am going through the process for demonstration purposes only.

Imagine git bisect as a wizard that takes you through your commits to find out which commit brought in the error. It performs a binary search to look through the commits, before arriving at the one that introduced the bug.

To start off, we need to select a “good” commit, where the bug is not present. We then need to select a “bad” commit, where the bug is present (usually the latest commit). Git then checks out commits between the two, one at a time, and asks you whether each is “good” or “bad”, halving the remaining range with every answer until it finds the first “bad” commit.

A very important thing to note here is that you should be searching for a single bug in this process. If you have multiple bugs, you need to perform a binary search for each of the bugs.

We will work with the same bug as in the git blame example, but this time we will assume that we don’t know which file has the error. To check if the error is present in a certain commit, we will run cat my_file to see if the contents of the file contain the unwanted line.

Start the git bisect Wizard

We will run the following command to tell Git that we are going into binary search mode to find a bug:

git bisect start

Select a Good Commit

After we start the wizard, we need to inform Git about the commit where everything was working. Let’s examine the commit history to find the commit we want.

git log --oneline

Project history

We go with 8dd76fc, which is the oldest one:

git bisect good 8dd76fc

Select a Bad Commit

After we have marked a commit as “good”, we need to find a bad commit so that Git can search between the two and tell us where the bug was introduced. Since we know that the latest commit (1094272) has the error, we go with that one:

git bisect bad 1094272

Assign Commits as “Good” or “Bad”

Once we have assigned our good and bad commits (which serve as the initial and final pointers for our search) Git walks us through the commits and asks us whether each commit contains the bug.

Starting git bisect wizard

Notice in the screenshot that 7 revisions would be covered in roughly 3 steps. The number of steps grows logarithmically with the number of revisions. Since 2^2 < 7 < 2^3, we need three steps. If there were a hundred revisions, we would need roughly 7 steps (2^7 = 128), and if there were a thousand revisions, we would need about 10 steps (2^10 = 1024).
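The arithmetic above can be checked with a few lines of shell. This is only a sketch of the rounded-up base-2 logarithm used in this article, not the exact estimate that Git prints; the function name bisect_steps is hypothetical.

```shell
# Hypothetical helper: the smallest number of halvings that covers
# a given number of revisions, i.e. log2 rounded up.
bisect_steps() {
    count=$1
    steps=0
    span=1                      # revisions covered by "steps" halvings
    while [ "$span" -lt "$count" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

for revisions in 7 100 1000; do
    echo "$revisions revisions: about $(bisect_steps "$revisions") steps"
done
# 7 revisions: about 3 steps
# 100 revisions: about 7 steps
# 1000 revisions: about 10 steps
```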

Now we are presented with commit cc48fb and we need to ascertain if it’s a good or a bad commit. In our case, we check the contents of the file and see if the unwanted line is present:

Checking the contents of my_file

Since the line is not present, we designate it as a good commit.

git bisect good

We continue this process for the next few steps until git bisect finds the first bad commit:

git bisect finds the bad commit

After we have found the bad commit, we need to leave Git’s binary search mode and return to the commit we started from:

git bisect reset

Git bisect reset - coming back to old state

You may want to have a look at this nice screencast on Git Bisect, which takes you through the process that I have discussed.

Automating the Process

We have gone through the process of debugging in Git interactively. If you are familiar with unit testing, you could write a unit test that identifies the bug. In case you want to run the tests automatically, you need to provide Git the test script that you have written.

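git bisect start
git bisect bad 1094272
git bisect good 8dd76fc
git bisect run [location_to_script_file]

Replace [location_to_script_file] with the actual location of the script file, removing the square brackets. Git needs the good and bad commits before it can run the script. For each commit it checks out, Git runs the script and reads its exit status: 0 marks the commit as good, 125 skips it, and any other status from 1 to 127 marks it as bad. When the run finishes, git bisect reset returns you to where you started.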
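For the bug used in this article, such a script could look like the sketch below. The file name check_my_file.sh and the wrapper function find_first_bad are hypothetical, and the script only tests for the unwanted line; a real project would run its unit tests instead.

```shell
# Write the test script to a scratch location outside the repository.
script_dir=$(mktemp -d)
script="$script_dir/check_my_file.sh"

cat > "$script" <<'EOF'
#!/bin/sh
# The exit status tells git bisect run how to mark the current commit.
if [ ! -f my_file ]; then
    exit 125    # nothing to test in this commit: skip it
fi
if grep -q -F "Unwanted line" my_file; then
    exit 1      # the unwanted line is present: bad commit
fi
exit 0          # the unwanted line is absent: good commit
EOF
chmod +x "$script"

# Hypothetical wrapper: print the hash of the first bad commit.
# Usage: find_first_bad <good-commit> <bad-commit>
find_first_bad() {
    good=$1
    bad=$2
    # git bisect start takes the bad commit first, then the good one.
    git bisect start "$bad" "$good" >/dev/null
    git bisect run "$script" >/dev/null
    # Once the search ends, refs/bisect/bad is the first bad commit.
    git rev-parse refs/bisect/bad
    git bisect reset >/dev/null 2>&1
}
```

Run from the top level of the repository, find_first_bad 8dd76fc 1094272 would perform the same search as the manual session, without any prompting.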

Here’s a tutorial on how to mechanize the process of debugging in Git in PHP.

Conclusion

We took an overly simplified case to explain a very powerful concept. If you know what file has the bad code, you should proceed with git blame without a second thought. However, if you have no idea what is causing the error, and your repository is considerably large, with an enormous history, git bisect is definitely the way to go.

How do you debug your code? Do you like binary search in Git? Do you have a better way of doing the same thing? Let us know in the comments below.
