The difference between a good and a bad commit can be huge. It’s no fun having to ask your colleague — or your past self — what a particular change was about, or what the current state of things is. If you commit changes to your codebase the right way, you can avoid headaches down the road.
This article aims to provide a thorough guide to the best practices of software commits.
If you’re already storing your projects on GitHub, you might assume the files are safe and that whenever you need to update code you’ll pull the changes, and that’s enough. All of that might be true. But let’s see what potential problems you can avoid by going the extra mile, and what additional benefits await if you do.
No Man Is an Island, Either in Teams or Individually
The reasoning above typically comes from a developer used to working alone. But the moment they need to share code with somebody else, we can expect that things are going to get messy and require a lot of explanation. Like, a lot.
Remember that our work doesn’t end at just writing code. We also need to manage things, and that requires a degree of organization and methodology. And while working in teams more readily exposes the problems caused by poor organization, we can also benefit from a better approach when working by ourselves.
Atomic vs Bloated Commits
We’ve all needed to revert a small change, only to find it buried in a massive commit that changes dozens of files and adds multiple features. How much easier would the rollback be if the change lived in one commit that addressed only that specific issue?
The Messy, Bloated Way
git add *
git commit -m "new components"
In this example, we can bet that a large number of files are being affected. Additionally, the message “new components” doesn’t tell us much: which components, what functionality they provide, and whether that functionality is new or a refactor. Are any existing bugs being addressed?
That information will be important when we need to change or recover something. We’ll be trying to find a needle in a haystack, and we might just end up looking at the codebase instead and spending valuable time debugging while we’re at it.
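When messages are descriptive, the history itself becomes searchable. As a rough sketch, the hypothetical `find_commits` helper below (the name is ours, not part of Git) wraps `git log` to list commits whose message matches a pattern, optionally limited to one file:

```shell
# Hypothetical helper: list commits whose message matches a pattern.
# Usage: find_commits <pattern> [path]
find_commits() {
    pattern=$1
    if [ $# -gt 1 ]; then
        # Only consider commits that touched the given path.
        git log --oneline --grep="$pattern" -- "$2"
    else
        git log --oneline --grep="$pattern"
    fi
}
```

Something like `find_commits login ui/login.html` only pays off if the messages say what each commit did, which “new components” does not.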
The Atomic Way
git add ui/login.html static/js/front-end.js
git commit -m "validate input fields for login"
Now we’re getting somewhere, as we start to have a clearer idea of what’s going on with that one commit.
The trick is that we can semi-automatically commit changes as part of our workflow. That is, doing a block of work that does something very specific (implementing particular functionality, fixing a bug, optimizing an algorithm), testing it (and writing a unit test, if need be), adding a description while our memories are fresh, and committing right away. Rinse and repeat.
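One way to make that loop a habit is to wrap it in a small shell function. This is only a sketch: `atomic_commit` is a hypothetical helper, and the test command is whatever your project uses.

```shell
# Hypothetical helper: stage only the named files, run the tests,
# and commit only if they pass.
# Usage: atomic_commit "<message>" "<test command>" <file>...
atomic_commit() {
    message=$1
    test_cmd=$2
    shift 2

    # Stage just the files that belong to this block of work.
    git add -- "$@" || return 1

    # Run the test command; on failure, stop before committing.
    if ! sh -c "$test_cmd"; then
        echo "tests failed, nothing committed" >&2
        return 1
    fi

    git commit -m "$message"
}
```

For the login example, that could be `atomic_commit "validate input fields for login" "npm test" ui/login.html static/js/front-end.js`, where `npm test` stands in for your own test runner. If the tests fail, the files stay staged so we can fix and retry.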
The Structure of a Good Commit
These rules aren’t carved in stone, but can help you estimate what a good commit might look like:
- unambiguous: no second-guessing about what the commit’s changes do.
- insightful: clearly describing what the code does, even providing links or extra information when necessary, and marking the bugs or issues that are being addressed.
- atomic: addressing one single thing at a time (think of a “block of work”, which could be anything from 20min to 2h, or even 2min if it was a quick bugfix).
Let’s look at a template (a type, component, or subsystem prefix, followed by a mandatory subject and an optional body) and break it down:
Type, Components, or Subsystem
On my projects I often use the term “component”, with some examples being:
- i18n, l10n
- other, 3rd party
- QA, tests
- UI, GUI
The (Mandatory) Subject
The subject is a simple, straightforward line that describes what the commit does so that everybody can get a solid idea at first glance.
When it comes to formatting the subject, I often follow these simple guidelines:
- use the imperative (“change” instead of “changed”)
- don’t capitalize the first letter
- no period (.) at the end
- append “(…)” if there’s an optional body available
These would be some valid subjects:
- i18n: support simplified Chinese (zh-hans)
- auth: refactor Google Sign-In
- other: add jQuery 3.4.1
- QA: pass AWS deployment test (…)
As you can see, there’s no guessing involved as to what these commits do, and on the last QA commit we can also see that there’s more information available (perhaps links to the relevant documentation, or further explanation for the fix).
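Some of these guidelines are mechanical enough to check automatically. Here’s a minimal sketch, assuming subjects follow a “component: description” shape like the examples above; `check_subject` is a hypothetical name, and the imperative-mood rule is left to human review because a simple pattern can’t verify it reliably.

```shell
# Hypothetical checker for the subject guidelines.
# Prints one line per problem; returns non-zero if any was found.
check_subject() {
    subject=$1
    status=0

    # Expect "component: description".
    case $subject in
        *": "*) description=${subject#*: } ;;
        *) echo "missing 'component: ' prefix"; return 1 ;;
    esac

    # The description should not start with a capital letter.
    case $description in
        [[:upper:]]*) echo "description starts with a capital letter"; status=1 ;;
    esac

    # No period at the end.
    case $description in
        *.) echo "subject ends with a period"; status=1 ;;
    esac

    return $status
}
```

Calling `check_subject "auth: refactor Google Sign-In"` prints nothing and succeeds, while `check_subject "auth: Refactored Google Sign-In."` reports both the capital letter and the trailing period.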
The (Optional) Body
Occasionally, we’ll need more room than a subject line allows to give context, such as when fixing a persistent bug or hacking an algorithm.
In these cases, you can simply leave a blank line after the subject (so the subject works as a title), and enter as much information as needed.
For our previous QA commit, we could do something like this:
QA: pass AWS deployment test (...)

I added a `DJANGO_SETTINGS_LIVE` environment variable to
[AWS Elastic Beanstalk](https://aws.amazon.com/elasticbeanstalk/)'s
`django.config` file, so that the synchronization management commands
in `db-migrate` are _only_ executed in production.
As you can see, the body can be harder to follow, and that’s okay, as it’s intended for those who are actively looking for more detail. Anyone can get an idea of what the commit does just by reading the subject, and the body will serve for further context, saving us back-and-forth emails or exchanges on Slack!
— “Hey, how did you get to …”
— “Read the commit 😑.”
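If you’d rather not open an editor, a subject-plus-body message can also be assembled on the command line. The sketch below uses two hypothetical helpers: `format_message` builds the text with the blank line in the right place, and `commit_with_body` passes it to `git commit -F -`, which reads the message from standard input.

```shell
# Print a commit message: subject, blank line, then the body.
format_message() {
    printf '%s\n\n%s\n' "$1" "$2"
}

# Commit the staged changes using that message.
commit_with_body() {
    format_message "$1" "$2" | git commit -F -
}
```

Passing `-m` twice, as in `git commit -m "subject" -m "body"`, gives a similar result, since Git treats each `-m` value as a separate paragraph.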
Don’t Forget to Address Issues!
Finally, there’s the issue of addressing issues (pun!). Any decent mid-to-large software development project should use an issue tracker as a way to keep track of tasks, improvements, and bugs — whether it’s Atlassian Jira, Bugzilla, GitHub’s issue tracker, or another.
In case you didn’t know, with most systems you can manage issues right from the commit message!
- close/resolve an issue
- re-open an issue if it has been closed before
- hold an issue, should a feature be postponed for later
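All it takes is using those keywords with the ID number for the issue. The exact keywords depend on the tracker, so check which ones yours recognizes; GitHub, for example, closes an issue on keywords such as “close”, “fix”, and “resolve”.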
- tools: consolidate DB data with cron job; resolve #46
- UI: add routine to serialize user input; bug found, open #32
- auth: comment out Facebook login; hold #12
Additionally, you can still reference an issue as a way of providing context, even if you don’t want to modify its status — for example, “see #12”.
All of these references will be visible to anybody opening that issue on the tracker, which makes it easy to follow the progress for a given task or bug.
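To see which issues a message touches before pushing, we could pull the references out with a short function. This is a sketch only: `issue_refs` is a hypothetical name, and its keyword list simply mirrors the examples above, so adjust it to whatever your tracker recognizes.

```shell
# Print each "keyword #ID" pair found in a commit message, one per line.
# The keyword list is an assumption taken from the examples above.
issue_refs() {
    printf '%s\n' "$1" | grep -oiE '(close|resolve|open|hold|see) #[0-9]+'
}
```

For instance, `issue_refs "tools: consolidate DB data with cron job; resolve #46"` prints `resolve #46`, and a message with no reference prints nothing.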
Wrapping It Up
You won’t always get it right (I, for one, don’t!). Things will get messy, and from time to time you won’t follow the rules you’ve set for yourself or your team, and that’s part of the process. But hopefully you now know that a few upgrades to your workflow can keep you organized, saving yourself and your team time over the long run.
I’ve also established from experience that it makes little difference whether a project involves ten developers or if it’s handled entirely by you. Simply put, commit changes to your codebase the right way — it’s a crucial part of good project management.
Further Reading
- Telling stories with your Git history. A fun piece by Seb Jacobs on FutureLearn.
- Angular’s Commit Message Guidelines. Even if you don’t use Angular, this is a helpful read.
- FreeBSD Committer’s Guide. An in-depth guide on the topic if there is one.
- How to Properly Organize Files in Your Codebase & Avoid Mayhem. We explain how to organize files for both large and small projects, offering some easy-to-follow best practices.
- Jump Start Git. This concise guide is designed to help beginners get up to speed with Git in a single weekend.
- Professional Git. This book from Wiley takes things further, giving developers the deep dive they need to become Git masters.
Frequently Asked Questions (FAQs) about Codebase and Committing Changes
What is the difference between a codebase and source code?
A codebase is the whole collection of source code used to build a particular piece of software or application, typically kept together with its versions and branches in a version control repository. Source code, on the other hand, is the human-readable text written in a programming language, which is then compiled or interpreted to run the program. In short, source code is the material, and the codebase is the complete body of it for a given project.
How does committing changes work in a codebase?
Committing changes in a codebase involves making changes to the source code and then recording those changes in the project’s history. This process is usually done in a version control system like Git. When you commit a change, you are essentially taking a snapshot of your work at that point in time. This allows you to keep track of the changes you’ve made and revert to a previous version if necessary.
What is the importance of committing changes the right way?
Committing changes the right way is crucial for maintaining the integrity of the codebase. It ensures that the codebase remains clean and manageable, making it easier for other developers to understand and work on. It also helps in tracking changes and identifying when and where bugs were introduced into the code.
What are some best practices for committing changes?
Some best practices for committing changes include making small, incremental commits, writing clear and descriptive commit messages, and testing your changes before committing. It’s also important to regularly sync your local codebase with the main codebase to avoid conflicts.
What is a version control system and how does it relate to a codebase?
A version control system is a tool that helps manage changes to a codebase. It keeps track of every modification to the code in a special kind of database. If a mistake is made, developers can turn back the clock and compare earlier versions of the code to help fix the mistake while minimizing disruption to all team members.
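In Git, “turning back the clock” usually comes down to a couple of commands. As a sketch, the two hypothetical wrappers below show the idea: `git revert` creates a new commit that reverses an earlier one, and `git diff` compares an earlier commit with the current state.

```shell
# Undo one commit by adding a new commit that reverses it.
# Usage: undo_commit <commit>
undo_commit() {
    git revert --no-edit "$1"
}

# Show what changed between an earlier commit and the current HEAD.
# Usage: compare_with <commit>
compare_with() {
    git diff "$1" HEAD
}
```

Because a revert adds to the history instead of rewriting it, teammates who already pulled the original commit aren’t disrupted.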
How can I avoid conflicts when committing changes?
Conflicts can’t be ruled out entirely, but you can minimize them by regularly syncing your local codebase with the main codebase. This ensures that you are always working on the latest version of the code. It’s also important to communicate with your team and make sure everyone is aware of the changes being made.
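What syncing looks like depends on your team’s workflow. The sketch below assumes a remote called `origin`, a shared branch called `main`, and a rebase-based workflow; `sync_with_main` is a hypothetical helper, and teams that prefer merging would swap the last command for `git merge`.

```shell
# Bring the current branch up to date with the shared branch.
# Assumed defaults: remote "origin", shared branch "main".
# Usage: sync_with_main [remote] [branch]
sync_with_main() {
    remote=${1:-origin}
    branch=${2:-main}

    # Download new commits without touching the working tree.
    git fetch "$remote" || return 1

    # Replay local commits on top of the updated shared branch.
    git rebase "$remote/$branch"
}
```

Running this before starting a new block of work, and again before pushing, keeps each sync small, so any conflict that does appear is easier to resolve.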
What is the role of a codebase in software development?
A codebase plays a crucial role in software development. It serves as the central repository for all the source code, allowing developers to collaborate and work on different parts of the software simultaneously. It also helps in tracking changes and maintaining the history of the project.
What is the difference between a codebase and a code repository?
A codebase refers to the entire collection of source code for a piece of software, while a code repository is the place where this code is stored and managed. A code repository can contain multiple codebases and is usually managed by a version control system.
How can I ensure that my commits are meaningful and useful?
To ensure that your commits are meaningful and useful, it’s important to make small, incremental commits that each serve a specific purpose. Each commit should represent a single logical change. It’s also important to write clear and descriptive commit messages that explain what changes were made and why.
What is the relationship between a codebase and a build?
A build is the process of converting the source code in a codebase into an executable program. The codebase serves as the input for the build process, and the output is a software product that can be installed and run on a computer. The build process can include compiling the code, linking libraries, and packaging the software for distribution.