Introducing the Centralized Workflow for Git Beginners
If you’re associated with the tech industry, chances are that you’ve used, or at least heard of, version control. You may even have unknowingly used some form of version control with Google Drive or wikis. Version control allows you to make checkpoints in your code and collaborate with others while working on the same code base.
Git is a version control option that has, over the past decade, risen in popularity to become the preeminent choice for developers.
In this post, we’ll discuss one of the Git workflows used by teams working in the business analytics domain—known as the centralized workflow.
Git Considerations for Beginners
When you start with Git, it may be for simple reasons such as storing a backup and retaining historical revisions. If the task isn’t too complex, it may be just you working on the project.
A typical beginner getting started with Git might be a student needing to track their assignments. Or an accountant may be learning Git to keep a structured history of financial statements. Or a lawyer may also wish to use Git to properly manage a diverse set of files.
Someone else might want to store data, charts, visualizations and so on in Excel (.xls, .xlsx) or image files. Storing binary files is generally not recommended in Git, because changes can’t be highlighted, but there can still be benefit in retaining historical revisions.
A beginner may often think of version control and its associated concepts as a new skill that doesn’t add much value to their work. Therefore, it’s important that, when they get started with Git, they use a workflow that’s easy to understand.
The centralized workflow offers the opportunity to learn about the basic functions of Git before getting into the complexities that are introduced by extensive use of branches.
Centralized Workflow Features
A Git workflow is a set of guidelines that teams should adhere to in order to bring about some structure in their development process. The centralized workflow, also known as the trunk workflow, is the simplest of the Git workflows: all development happens in a single branch, typically the master branch. Every team member commits changes directly to this branch, irrespective of the reason for the change (such as a bug fix, a critical patch or a feature addition).
To understand what inspired the centralized workflow, one must look at the early days of version control. In the 1980s and 1990s, popular version control software was “centralized”, such that the bulk of the development happened in one branch. Just as master is the main branch of development in Git, trunk was its counterpart in Subversion. In the early days of centralized development, only a single developer was permitted to work on a file at a time. Other developers were locked out until the changes were completed or cancelled.
The centralized workflow is easy for Git beginners to adapt to, as it doesn’t involve the extensive use of branches. It’s also intuitive to people with a Subversion background, who will feel right at home.
As the main branch for developers to work on is the master branch, you need to have a repository at a location that’s accessible to all developers. Whenever someone makes changes to the code, they first pull from the central repository to make sure their code is up to date, and then push to master.
A little variation may be introduced locally by working on small feature branches temporarily. However, to follow the centralized workflow, this feature branch is merged locally with master before pushing to the server.
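The local feature branch variation can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the temporary directory, the branch name quarterly-chart, the file report.txt and the demo identity are all hypothetical, and a bare repository in a temp folder stands in for the central server.

```shell
set -eu
# Hypothetical throwaway identity and paths, used only for this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
demo=$(mktemp -d)

# A bare repository stands in for the central server.
git -c init.defaultBranch=master init --quiet --bare "$demo/central.git"
git -c init.defaultBranch=master clone --quiet "$demo/central.git" "$demo/work"
cd "$demo/work"

# First commit goes straight to master.
echo "base" > report.txt
git add report.txt
git commit --quiet -m "Add report"
git push --quiet origin master

# Short-lived local feature branch (never pushed).
git checkout --quiet -b quarterly-chart
echo "chart notes" >> report.txt
git commit --quiet -am "Add chart notes"

# Merge locally into master, delete the branch, then push only master.
git checkout --quiet master
git merge --quiet quarterly-chart
git branch -d quarterly-chart
git push --quiet origin master
```

After this runs, the central repository only ever sees master; the feature branch existed purely on the developer’s machine.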
Code reviews in the centralized workflow pose a challenge as they involve an audit of the full codebase. This serves as a disadvantage if your project has grown significantly in complexity and contributors.
Follow a Centralized Workflow
To create a repository that follows a centralized workflow, you need to give write access to the master branch to any potential contributor. Git itself does not support branch-level permissions, so if you implement a centralized workflow on a server, you’ll need to give write access to all branches.
In the cloud, there’s an additional authentication layer. If you’re implementing the centralized workflow on GitHub or Bitbucket, you can therefore give branch-level permissions to each user.
This section discusses briefly how to implement these changes on a server using a terminal and then on the cloud.
Installing Git
The process you use to install Git depends on the operating system. The easiest ways to install Git on the various systems are shown below:
- macOS Installer
- Windows Installer
- On Linux, use the command line to install Git
On Ubuntu, you can use the apt package manager to install Git:
apt install git
Note to Git GUI Client Users
If you’re already using a Git GUI client like GitHub Desktop, you’ll need to install Git separately to be able to access it from the command line.
Implementing Git on the Server
In this section, we’ll look at implementing a centralized workflow on a server using the terminal. The first step is to initialize an empty Git repository. Create a directory and initialize Git in it:
mkdir my_git_project
cd my_git_project
git init
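One detail worth knowing here: by default, Git refuses a push to the branch that is currently checked out in a non-bare repository. For that reason, a repository that only serves as a shared push target is commonly created as a bare repository instead. The sketch below shows that alternative; the temporary path is a placeholder for wherever your central repository would actually live.

```shell
set -eu
# Hypothetical location; substitute the real path of your central repository.
central=$(mktemp -d)/my_git_project.git

# --bare creates a repository with no working tree, which is what a
# shared push target usually looks like.
git init --quiet --bare "$central"

# Prints "true" for a bare repository.
git -C "$central" rev-parse --is-bare-repository
```

Developers then clone this path and push to it, but nobody edits files inside it directly.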
Next, you should also initialize global configuration settings on the server for your Git user account:
git config --global user.name "Admin Name"
git config --global user.email "admin@company.com"
You can check the full list of configuration options using the following command:
git config --list
When a new user wants to clone this central repository, they just need to run the following command:
git clone /path/to/central/repository
When the cloning process is complete, the origin remote will point to /path/to/central/repository and a normal push command to the master branch of the origin remote should suffice. However, you need to ensure that a user has write permissions to the directory where the central repository is stored (/path/to/central/repository).
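To confirm where origin points after cloning, you can list the remotes. The following is a small sketch in which a temporary bare repository stands in for /path/to/central/repository; the directory names are made up for the demo.

```shell
set -eu
demo=$(mktemp -d)
# Stand-in for /path/to/central/repository (hypothetical temporary path).
git -c init.defaultBranch=master init --quiet --bare "$demo/central.git"

# Cloning an empty repository prints a warning, which is expected here.
git -c init.defaultBranch=master clone --quiet "$demo/central.git" "$demo/clone"

# List the remotes with their fetch and push URLs.
git -C "$demo/clone" remote -v
```

Both the fetch and the push line for origin should show the path you cloned from.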
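In Unix-based systems like Linux and macOS, you can use the chmod command to change the permissions of a file or directory. Directories also need the execute bit so that users can enter them, so a mode such as 666 is not enough for a repository directory. The following command grants read and write access recursively and adds the execute bit only where it’s needed (directories and files that are already executable). Replace a with g to limit the change to the directory’s group:
chmod -R a+rwX /path/to/central/repository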
In Windows, you need to edit the security permissions of a file or directory to grant users write permission. Right-click on the directory and select Properties. In the Security tab, add, modify or remove file permissions for the user.
If you give only read access of your root directory to a user, they’ll be able to clone the repository and pull the latest changes. However, if they attempt a push, Git will reject it as the user has no write permissions.
Implementing Git in the Cloud
Working with the centralized workflow in the cloud gives you more control over the admin rights of the repository, and the ability to switch seamlessly to another workflow in the future.
In GitHub, the first step is to create the repository in the cloud (even if you have a working copy of a Git repository on your server). In the next step, you’re asked whether you’d like to initialize an empty repository with a README file, push changes from an existing repository, or import code from a different version control system.
In the Collaborators section of the Settings page of the repository, you’ll be able to set the permissions of each developer.
Once you add a collaborator, they’ll be able to confirm if they’d like to collaborate on the project. When a developer confirms this, you’ll be able to change their permissions.
GitHub’s documentation on setting branch permissions explains the process in detail. The process is similar for Bitbucket and GitLab, as described in their respective documentation.
Considerations for Git Beginners
For a new team member, the introduction to centralized workflow is fairly simple.
Once the repository is cloned and changes are committed to master, the first step is to pull the latest changes from the centralized repository to make sure all new changes are incorporated before a push.
Finally, the changes are pushed to the master branch of the origin remote to incorporate the changes:
git pull origin master
git push origin master
If there are any conflicts, the step where you pull updated code from master will raise a conflict, which you will have to fix and commit before pushing the code.
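A conflict of this kind can be reproduced end to end with two clones of the same central repository. The sketch below is an illustration under stated assumptions: the developers “alice” and “bob”, the file figures.txt and the temporary paths are invented for the demo, and --no-rebase is passed explicitly because recent Git versions ask you to choose how to reconcile diverged branches.

```shell
set -eu
# Hypothetical throwaway identity and paths, used only for this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
demo=$(mktemp -d)
git -c init.defaultBranch=master init --quiet --bare "$demo/central.git"
git -c init.defaultBranch=master clone --quiet "$demo/central.git" "$demo/alice"

# Alice publishes the first version of the file.
cd "$demo/alice"
echo "total: 100" > figures.txt
git add figures.txt
git commit --quiet -m "Add figures"
git push --quiet origin master

# Bob clones; then both edit the same line, and Alice pushes first.
git clone --quiet "$demo/central.git" "$demo/bob"
echo "total: 150" > figures.txt
git commit --quiet -am "Alice updates total"
git push --quiet origin master

cd "$demo/bob"
echo "total: 200" > figures.txt
git commit --quiet -am "Bob updates total"

# The pull stops with a conflict (non-zero exit), so don't abort the script.
git pull --no-rebase origin master || echo "Conflict reported by git pull"

# Resolve by editing the file (here: keep Bob's value), then commit and push.
echo "total: 200" > figures.txt
git add figures.txt
git commit --quiet -m "Merge master, keep updated total"
git push --quiet origin master
```

In real use, you would open the conflicted file, choose between the marked sections by hand, and only then run git add and git commit.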
In the centralized workflow, forking is optional, as code reviews are not required and users have direct access to the master branch. Forking and creating a pull request is a redundant process in the centralized workflow.
For Git beginners, it’s often a challenge to get used to the terminology and processes. Therefore, in this section, let’s talk about the process of undoing errors in Git.
Undo Changes
Let’s assume that some changes have been made in a file since the last commit. Run the following command to revert the file to its state at the last commit:
git checkout -- file_name
This process works when the changes haven’t been staged yet. If instead you’ve staged the changes for a commit using git add, you need to run the following command to tell Git that you’d like to remove the staged changes:
git reset HEAD file_name
Changing the Contents of a File
This command only unstages the changes and doesn’t affect the contents of the file. Once you’ve reset the file, you can use git checkout -- file_name to change the contents of the file back to the state of the last commit.
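Both cases can be tried safely in a scratch repository. The sketch below uses git reset HEAD to unstage and git checkout -- to restore the file contents; the repository path, the file notes.txt and the demo identity are hypothetical. Newer Git versions (2.23 and later) also offer git restore for the same purpose.

```shell
set -eu
# Hypothetical throwaway identity and paths, used only for this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
demo=$(mktemp -d)
git -c init.defaultBranch=master init --quiet "$demo/repo"
cd "$demo/repo"
echo "original" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# Case 1: an unstaged edit is discarded directly.
echo "mistake" > notes.txt
git checkout -- notes.txt
cat notes.txt    # prints "original"

# Case 2: a staged edit is unstaged first, then discarded.
echo "another mistake" > notes.txt
git add notes.txt
git reset --quiet HEAD notes.txt
git checkout -- notes.txt
cat notes.txt    # prints "original"
```

Note that discarding an edit this way is permanent, because the discarded content was never committed.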
Undo Commit
If you’ve committed a change that you weren’t supposed to, you can use a form of the git reset command to revert it back to a state before the commit. The format of the command is as follows:
git reset --option HEAD~N
The HEAD pointer in Git points to the latest commit of a branch. In the command above, you tell Git that you’d like to revert to a state which is N commits behind the current state of a branch. HEAD~1, thus, refers to undoing a single commit.
There are three options that you can use with the reset command:
- --soft: Undoes a commit, but keeps changes to files staged. This doesn’t change the content of any file. You may want to do this in case you want to make another change to your existing changes before a commit.
- --mixed: Undoes a commit and unstages any changes. The contents of the files remain the same too. This option is similar to the --soft option.
- --hard: Undoes a commit and changes the state of files to the earlier commit. This is generally not advised unless you’re absolutely sure you want to lose any changes in the last N commits.
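The difference between the three options is easiest to see by undoing the same commit three times in a scratch repository. This is a sketch only; the path, the file data.txt and the demo identity are hypothetical.

```shell
set -eu
# Hypothetical throwaway identity and paths, used only for this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
demo=$(mktemp -d)
git -c init.defaultBranch=master init --quiet "$demo/repo"
cd "$demo/repo"
echo "v1" > data.txt
git add data.txt
git commit --quiet -m "First version"

# Create a second commit, then undo it with --soft.
echo "v2" > data.txt
git commit --quiet -am "Second version"
git reset --quiet --soft HEAD~1
git status --short    # "M  data.txt": the change is still staged

# Recreate the commit, then undo it with --mixed.
git commit --quiet -m "Second version"
git reset --quiet --mixed HEAD~1
git status --short    # " M data.txt": the change is kept but unstaged

# Recreate the commit once more, then undo it with --hard.
git commit --quiet -am "Second version"
git reset --quiet --hard HEAD~1
cat data.txt          # prints "v1": the v2 edit is gone
```

Only the last variant touches the working files, which is why it is the one to use with care.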
Undo Push
What happens if you push a commit you didn’t intend to push? You need to reset the changes, and then push the changed history of the file again!
First change the commits using the steps explained in the “Undo Commit” section above and commit the new changes. When the changes are done, you’ll need to force push the changes:
git push -f origin master
Note that this may be harmful in case of a centralized workflow. If a developer pushes changes after your wrong push, this will erase those changes.
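One way to reduce this risk is the --force-with-lease option of git push, which refuses to overwrite the remote branch if it has moved since you last fetched. The sketch below simulates exactly the situation described above; the developers “alice” and “bob”, the file names and the temporary paths are invented for the demo.

```shell
set -eu
# Hypothetical throwaway identity and paths, used only for this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
demo=$(mktemp -d)
git -c init.defaultBranch=master init --quiet --bare "$demo/central.git"
git -c init.defaultBranch=master clone --quiet "$demo/central.git" "$demo/alice"

# Alice pushes a good commit, then one she did not mean to push.
cd "$demo/alice"
echo "v1" > data.txt
git add data.txt
git commit --quiet -m "Good commit"
git push --quiet origin master
echo "oops" > data.txt
git commit --quiet -am "Commit pushed by mistake"
git push --quiet origin master

# Meanwhile a teammate pushes new work on top of the mistaken commit.
git clone --quiet "$demo/central.git" "$demo/bob"
cd "$demo/bob"
echo "bob's work" > bob.txt
git add bob.txt
git commit --quiet -m "Bob's work"
git push --quiet origin master

# Alice rewrites her local history and tries to publish it.
cd "$demo/alice"
git reset --quiet --hard HEAD~1
# The push is refused because origin/master moved since Alice last fetched.
git push --force-with-lease origin master || echo "Push rejected; fetch and review first"
```

With a plain git push -f, the same sequence would have silently removed the teammate’s commit from the central repository.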
Centralized Workflow: Wins and Fails
The best part about a centralized workflow is its simplicity! When your team is small and time is critical in pushing new changes quickly, this workflow is perfect for you. A centralized workflow lets you make changes to your code frequently. This is possible because there are no intermediate review steps. Further, this is helpful when you’ve been working with your teammates for a long time; you’d rather trust their code than review it at each merge.
When I started out with Git, my primary task was to make sure I didn’t create files like index.html, index2.html and index_backup.html in a project, all of which pointed to various versions of the same file. Various concepts of version control were new to me, so it made perfect sense to stick to a single branch and get comfortable with Git, before moving to more difficult concepts.
The centralized workflow does come with its fair share of disadvantages, though. When your project matures and the number of team members increases, you need to introduce an additional layer of code review before merges, and this is only possible when you move to a different workflow. The centralized workflow also isn’t the right choice for you if you have a bunch of new team members, as giving everyone write permissions to the central repository isn’t a great idea.
Back when I was still getting to know the basics of Git, I once messed up the history of an academic project and sent a forced push to the central repository. When the initial panic settled down, another team member was able to restore the project’s history from their local repository. This experience shows a major flaw in centralized development, so one must be really careful when working with this workflow.
Conclusion
In this post, we discussed the nuances of the centralized workflow—its features, a step-by-step guide to setting it up for your team, and finally its advantages and disadvantages.
Owing to the structure of typical business analytics teams, the centralized workflow is a good fit. Its simplicity appeals to beginners and enables them to warm up to version control, setting the stage for a more complicated workflow in the future.