Committing Changes to Your Codebase the Right Way
The difference between a good and a bad commit can be huge. It’s no fun having to ask your colleague — or your past self — what a particular change was about, or what the current state of things is.
This article aims to provide a thorough guide to the best practices of software commits.
If you’re already storing your projects on GitHub, you might assume the files are safe and that whenever you need to update code you’ll pull the changes, and that’s enough. All of that might be true. But let’s see what potential problems you can avoid by going the extra mile, and what additional benefits await if you do.
No Man Is an Island, Either in Teams or Individually
The reasoning above typically comes from a developer used to working alone. But the moment they need to share code with somebody else, we can expect that things are going to get messy and require a lot of explanation. Like, a lot.
Remember that our work doesn’t end at just writing code. We also need to manage things, and that requires a degree of organization and methodology. And while working in teams more readily exposes the problems caused by poor organization, we can also benefit from a better approach when working by ourselves.
Atomic vs Bloated Commits
We’ve all needed to revert a small change and found ourselves looking for it in a massive commit that changes dozens of files and adds multiple features. How much easier would the rollback be if the change lived in one commit that addressed only that specific issue?
The Messy, Bloated Way
git add *
git commit -m "new components"
In this example, we can bet that a large number of files are being affected. Additionally, the message “new components” tells us very little: which components, what functionality they gained, and whether that functionality is new or a refactor. Are any existing bugs being addressed?
That information will be important when we need to change or recover something. We’ll be trying to find a needle in a haystack, and we might just end up looking at the codebase instead and spending valuable time debugging while we’re at it.
The Atomic Way
git add ui/login.html static/js/front-end.js
git commit -m "validate input fields for login"
Now we’re getting somewhere, as we start to have a clearer idea of what’s going on with that one commit.
The trick is to make committing a semi-automatic part of our workflow: do a block of work that achieves something very specific (implementing particular functionality, fixing a bug, optimizing an algorithm), test it (writing a unit test, if need be), add a description while our memories are fresh, and commit right away. Rinse and repeat.
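To make the payoff concrete, here is a minimal sketch of that loop in a throwaway repository, followed by a rollback of a single block of work. The file names, commit messages, and demo identity are made up for illustration, and the script assumes git is installed.

```shell
set -eu

# Throwaway repository, so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Identity used only for this demo (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Starting point for the project.
echo "demo project" > README.txt
git add README.txt
git commit -q -m "other: add README"

# Block of work 1: one specific change, committed on its own.
echo "validate login fields" > login.txt
git add login.txt
git commit -q -m "auth: validate input fields for login"

# Block of work 2: an unrelated change, in its own commit.
echo "new footer" > footer.txt
git add footer.txt
git commit -q -m "UI: add footer"

# Find the login commit by its subject and undo only that change.
login_commit="$(git log --format=%H --grep='validate input fields' -n 1)"
git revert --no-edit "$login_commit" > /dev/null

# The history now shows the revert on top of the three original commits.
git log --oneline
```

Because each commit covers one thing, the revert removes the login change and leaves the footer untouched. With a bloated commit, the same rollback would have meant picking the change out by hand.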
The Structure of a Good Commit
These rules aren’t carved in stone, but can help you estimate what a good commit might look like:
- unambiguous: no second guessing about what those changes do.
- insightful: clearly describing what the code does, even providing links or extra information when necessary, and marking the bugs or issues that are being addressed.
- atomic: addressing one single thing at a time (think of a “block of work”, which could be anything from 20 minutes to 2 hours, or even 2 minutes if it was a quick bugfix).
Let’s look at a template and break it down:
<type/component/subsystem>: <subject>
<BLANK LINE>
<body>
Type, Components, or Subsystem
On my projects I often use the term “component”, with some examples being:
- i18n, l10n
- other, 3rd party
- QA, tests
- UI, GUI
The (Mandatory) Subject
The subject is a simple, straightforward line that describes what the commit does, so that everybody can get a solid idea at first glance.
When it comes to formatting the subject, I often follow these simple guidelines:
- use the imperative (“change” instead of “changed”)
- don’t capitalize the first letter
- no period (.) at the end
- append “(…)” if there’s an optional body available
These would be some valid subjects:
- i18n: support simplified Chinese (zh-hans)
- auth: refactor Google Sign-In
- other: add jQuery 3.4.1
- QA: pass AWS deployment test (…)
As you can see, there’s no guessing involved as to what these commits do, and on the last QA commit we can also see that there’s more information available (perhaps links to relevant documentation, or further explanation for the fix).
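Some of these guidelines are mechanical enough to check automatically. The sketch below is one possible way to do it in POSIX shell; `check_subject` is a hypothetical helper, not part of git, and it only covers the prefix, capitalization, and trailing-period rules (the imperative mood still needs a human eye).

```shell
# Hypothetical helper: returns 0 when a subject line follows the
# formatting guidelines above, and 1 (with a reason on stderr) otherwise.
check_subject() {
    subject="$1"

    # Must look like "<component>: <subject>".
    case "$subject" in
        *": "*) ;;
        *) echo "missing '<component>: ' prefix" >&2; return 1 ;;
    esac

    # Everything after the first ": " is the subject text itself.
    text="${subject#*: }"

    case "$text" in
        "") echo "empty subject" >&2; return 1 ;;
        [[:upper:]]*) echo "subject starts with a capital letter" >&2; return 1 ;;
    esac

    case "$text" in
        *.) echo "subject ends with a period" >&2; return 1 ;;
    esac

    return 0
}

# Example: report on a few candidate subjects.
for candidate in "auth: refactor Google Sign-In" "new components"; do
    if check_subject "$candidate" 2>/dev/null; then
        echo "ok:     $candidate"
    else
        echo "reject: $candidate"
    fi
done
```

A check like this could be called from a `commit-msg` hook, but treat it as a starting point and adapt it to whatever conventions your team settles on.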
The (Optional) Body
Occasionally, we’ll need to give more context than fits in a subject line, such as when fixing a persistent bug or when hacking an algorithm.
In these cases, simply leave a blank line after the subject (so the subject works as a title) and enter as much information as needed.
For our previous QA commit, we could do something like this:
QA: pass AWS deployment test (...)

I added a `DJANGO_SETTINGS_LIVE` environment variable to [AWS Elastic Beanstalk](https://aws.amazon.com/elasticbeanstalk/)'s `django.config` file, so that the synchronization management commands in `db-migrate` are _only_ executed in production.
As you can see, the body can be harder to follow, and that’s okay, as it’s intended for those who are actively looking for more detail. Anyone can get an idea of what the commit does just by reading the subject, and the body will serve for further context, saving us back-and-forth emails or exchanges on Slack!
— “Hey, how did you get to …”
— “Read the commit 😑.”
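The blank line between subject and body is easy to forget, so here is a small sketch of a check for it. `has_blank_separator` is a hypothetical helper name, and the sample messages are shortened for illustration.

```shell
# Hypothetical helper: returns 0 when a message is subject-only or leaves
# its second line blank, and 1 when the body starts directly on line 2.
has_blank_separator() {
    second_line=$(printf '%s\n' "$1" | sed -n '2p')
    [ -z "$second_line" ]
}

# Sample messages (shortened); printf turns \n into real line breaks.
with_gap=$(printf 'QA: pass AWS deployment test (...)\n\nI added an environment variable.')
without_gap=$(printf 'QA: pass AWS deployment test (...)\nI added an environment variable.')

for message in "$with_gap" "$without_gap"; do
    if has_blank_separator "$message"; then
        echo "subject and body are separated"
    else
        echo "missing blank line after the subject"
    fi
done
```

If you prefer to stay on the command line, git also accepts several `-m` options in one `git commit` call and joins them as separate paragraphs, which produces the same layout.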
Don’t Forget to Address Issues!
Finally, there’s the issue of addressing issues (pun!). Any decent mid-to-large software development project should use an issue tracker as a way to keep track of tasks, improvements, and bugs — whether it’s Atlassian Jira, Bugzilla, GitHub’s issue tracker, or another.
In case you didn’t know, with most systems you can manage issues right from the commit message!
- close/resolve an issue
- re-open an issue if it has been closed before
- hold an issue, should a feature be postponed for later
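All it takes is pairing the keyword with the issue’s ID number. The exact keywords vary by tracker (GitHub, for example, closes issues on “close”, “fix”, or “resolve” and their variants), so check what yours recognizes. Some examples: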
- tools: consolidate DB data with cron job; resolve #46
- UI: add routine to serialize user input; bug found, open #32
- auth: comment out Facebook login; hold #12
Additionally, you can still reference an issue as a way of providing context, even if you don’t want to modify its status — for example, “see #12”.
All of these references will be visible to anybody opening that issue on the tracker, which makes it easy to follow the progress for a given task or bug.
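To see how a tool might pick these references out of a message, here is a rough sketch. `issue_actions` is a hypothetical helper, and its keyword list simply mirrors the examples in this article; a real tracker defines its own keywords and matching rules.

```shell
# Hypothetical helper: prints one "<keyword> <issue number>" pair per line
# for every reference found in a commit message.
# The keyword list is an assumption taken from this article's examples.
issue_actions() {
    printf '%s\n' "$1" \
        | grep -oE '(resolve|open|hold|see) #[0-9]+' \
        | sed 's/#//'
}

# Example: list the actions requested by two commit subjects.
issue_actions "tools: consolidate DB data with cron job; resolve #46"
issue_actions "auth: comment out Facebook login; hold #12"
```

The pattern is deliberately naive (it would also match a keyword at the end of a longer word), which is fine for a sketch but worth tightening before relying on it.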
Wrapping It Up
You won’t always get it right (I, for one, don’t!). Things will get messy and, from time to time, you won’t follow the rules you’ve set for yourself or your team. That’s part of the process. But hopefully you now know that you can be much more organized with just a few upgrades to your workflow, saving yourself and your team time over the long run.
I’ve also established from experience that it makes little difference whether a project involves ten developers or if it’s handled entirely by you. Simply put, committing changes to your codebase the right way is a crucial part of good project management.