How to Properly Organize Files in Your Codebase & Avoid Mayhem

The main library, data, UI, docs and wiki, tests, legacy and third-party components … How do we keep track and maintain order within all of this? Organizing the files in your codebase can become a daunting task.

Relax — we’ve got this! In this article, we’ll review the most common systems for both small and large projects, with some easy-to-follow best practices.

Key Takeaways

Organizing files in a codebase reduces problems and saves time when you need to access and review things later. It’s important to establish basic rules for naming files, addressing project documentation, and organizing an effective workflow.
Every software project should have a README, CHANGELOG, COPYING/LICENSE, and .gitignore file. Depending on the project, additional files like AUTHORS, BUGS, CONTRIBUTING/HACKING, FAQ, INSTALL, NEWS, THANKS, TODO/ROADMAP, and VERSION/RELEASE might also be included.
Files should be organized into folders for components or subsystems, but the creation of directories should be limited to keep things manageable. Certain types of files like data, binary files, and settings should be left out of the project.
Consistency is key in file organization. Whether it’s in the naming of directories or files, or the structure of the project, maintaining consistency makes the codebase easier to navigate and understand.

Why Bother?

As with pretty much all of the tasks related to project management (documentation, software commits, deployment), you’ll benefit from taking a conscious, programmatic approach. Not only will it reduce problems now, but it will also save you and your team valuable time in the future when you need to quickly access and review things.

You can surely recall function names off the top of your head for whatever it is that you’re coding right now, quickly find a file you need to edit, and readily tell what works from what doesn’t (or so you think). But could you say the same about that project you were working on last year?

Let’s admit it: software projects can go through spans of inactivity that last for months, and even years. A simple README file could do a lot for your colleagues or your future self. But let’s think about the other ways you could structure your project, establish some basic rules for naming files and handling project documentation, and to some degree organize an effective workflow that will stand the test of time.

Making Sense of Things

We’ll establish a “baseline” for organizing files in a project — a logic that will serve us for a number of situations within the scope of software development.

As with our rules for committing changes to your codebase the right way, none of this is carved in stone, and for what it’s worth, you and your team might come up with different guidelines. In any case, consistency is the name of the game. Be sure you understand (and discuss or dispute) what the rules are, and follow them once you’ve reached a consensus.

The Mandatory Set

This is a reference list of files that nearly every software project should have:

README: this is what GitHub renders for you right under the sourcetree, and it can go a long way to explaining what the project is about, how files are organized, and where to find further information.
CHANGELOG: to list what’s new, modified or discontinued in every version or revision, normally in reverse chronological order for convenience (latest changes first).
COPYING/LICENSE: a file containing the full text of the license covering the software, including some additional copyright information, if necessary (such as third-party licenses).
.gitignore: assuming you use Git (you most probably do), this is also a must, as it tells Git which files to keep out of the repository. (See Jump Start Git’s primer on .gitignore and the documentation for more info, and have a look at a collection of useful .gitignore templates for some ideas.)
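
A quick way to audit a project against this list is a small script. The following is only a minimal sketch in Python: the function name missing_files is hypothetical, and it assumes that names match case-insensitively and that extensions such as .txt or .md are ignored.

```python
from pathlib import Path

# Acceptable base names for each file in the mandatory set.
# Matching is case-insensitive and ignores the extension
# (both are assumptions of this sketch, not fixed rules).
REQUIRED = {
    "README": {"readme"},
    "CHANGELOG": {"changelog"},
    "COPYING/LICENSE": {"copying", "license"},
    ".gitignore": {".gitignore"},
}


def missing_files(root):
    """Return the mandatory entries that have no matching file directly in root."""
    present = set()
    for entry in Path(root).iterdir():
        if entry.is_file():
            # "readme.txt" -> "readme"; ".gitignore" stays ".gitignore".
            present.add(entry.stem.lower())
    return [label for label, names in REQUIRED.items() if not (names & present)]
```

Calling missing_files(".") from the project root returns the labels that still need a file, or an empty list when the set is complete.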

Supporting Actors

Some additional files you might also consider including, depending on the project:

AUTHORS: credits to those participating in writing the code.
BUGS: known issues and instructions on reporting newly found bugs.
CONTRIBUTING/HACKING: guide for prospective contributors, especially useful for open-source projects.
FAQ: you already know what that is. ;)
INSTALL: instructions on how to compile or install the software on different systems.
NEWS: similar to the CHANGELOG file, but intended for end users, not developers.
THANKS: acknowledgments.
TODO/ROADMAP: a listing for planned upcoming features.
VERSION/RELEASE: a one-liner describing the current version number or release name.

Folders for Components or Subsystems

Often we’ll come across a set of functionalities that can be grouped into a single concept.

Some examples could be:

internationalization (i18n) and localization (l10n)
authentication modules
third-party add-ons
general purpose tools and cron jobs
user interface (UI) and graphical user interface (GUI)

All these can be organized into a single “component” or “subsystem” directory — but don’t go crazy!

We want to limit the creation of directories to keep things manageable, both in the root directory (where the main components will be located) and recursively (inside each directory). Otherwise, we might end up spending a lot of time routinely browsing files in carefully (and excessively) organized directories.
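
To get a feel for whether a tree is growing too deep, you can count its directories and measure how far they nest. This is a rough sketch in Python: directory_stats is a hypothetical helper, and what counts as "too many" or "too deep" is left for you and your team to decide.

```python
import os


def directory_stats(root):
    """Return (number of directories under root, deepest nesting level)."""
    root = os.path.abspath(root)
    count = 0
    max_depth = 0
    for current, dirnames, _filenames in os.walk(root):
        # Skip hidden directories such as .git so they don't skew the numbers.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        count += len(dirnames)
        if current != root:
            # A direct child of root has depth 1, its children depth 2, and so on.
            depth = os.path.relpath(current, root).count(os.sep) + 1
            max_depth = max(max_depth, depth)
    return count, max_depth
```

A sudden jump in either number between releases is a hint that the structure deserves a second look.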

Leave that Out of the Sourcetree, Please

As much as we want the project to be neat and organized, there are certain kinds of files we want to leave entirely out of it.

Data. You might be tempted to have a data/ directory in your sourcetree for CSV files and such, especially if they take up just a few kilobytes. But what if they take up megabytes or even gigabytes (which isn’t unusual these days)? Do you really want to commit that to your codebase as if it were code? No.

Binary files. You don’t want renders of videos or compiled executable files next to source code. These aren’t development files, and they simply don’t belong here. As with data files, they can also end up using a lot of space.

Settings. This is another big NO. You shouldn’t put credentials, passwords, or even security tokens in your codebase. We can’t cover the ways around this here, but if you’re a Python developer, consider using Python Decouple.
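
One common way around this is to read such values from the environment at runtime. The sketch below uses only Python’s standard library (os.environ) rather than Python Decouple, and the helper get_setting and the setting name APP_DEBUG are hypothetical, chosen for illustration.

```python
import os


def get_setting(name, default=None):
    """Read a setting from the environment; fail loudly if it is required but unset."""
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"Missing required setting: {name}")
    return value


# Optional setting with a fallback (APP_DEBUG is a made-up name).
DEBUG = get_setting("APP_DEBUG", default="false").lower() == "true"
```

A required secret would be read with no default, so a missing value stops the app at startup instead of surfacing later as a confusing error.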

Case 1: Web App

Let’s consider a web application — software that runs on a web server and that you can access through the browser, either on your desktop computer or mobile device. And let’s say this is a web app that offers a membership to access a premium service of sorts — maybe exclusive reports, or travel tips, or a library of videos.

File Structure

├── .elasticbeanstalk
├── .env
├── billing
├── changelog.txt
├── locale
│   ├── en
│   └── zh_Hans
├── members
├── readme.txt
├── static
│   ├── fonts
│   ├── images
│   ├── javascript
│   └── styles
├── templates
│   ├── admin
│   └── frontend
├── todo.txt
└── tools

Analysis

This is a basic structure for a web app with support for two languages, English and simplified Chinese for mainland China (see the locale directory). It also has two main components, billing and members.

If you’re a tiny bit familiar with website development, the contents of the static and templates folders might look familiar to you. Perhaps the only unusual elements might be .elasticbeanstalk, which stores deployment files for Amazon Web Services (AWS), and .env, which stores settings for the project, such as database credentials, locally only. The rest, such as README and TODO, we’ve already discussed.

The tools directory is an interesting one. Here we can store scripts that, for example, prune the database, or check the status of a payment, or render static files to a cache — essentially, anything that isn’t the app itself but helps to make it function properly.

Regarding naming, it doesn’t make much of a difference if we name the images directory images/ or img/, or the styles directory styles/ or css/, or the javascript/ directory js/. The main thing is that the structuring is logical, and we always follow something of a convention, either long descriptive names, or short ones.

Case 2: Desktop App

Now let’s consider an application that you can download and install on your computer. And let’s say the app takes some input, such as CSV files, and presents a series of reports afterward.

In this example, we’ll let the sourcetree grow a little larger.

File Structure

├── .gitignore
├── data
├── doc
├── legacy
│   ├── dashboard
│   ├── img
│   └── system
├── LICENSE
├── README
├── tests
├── thirdparty
├── tools
│   ├── data_integration
│   └── data_scraping
├── ui
│   ├── charts
│   ├── css
│   ├── csv
│   ├── dashboard
│   ├── img
│   │   └── icons
│   ├── js
│   ├── reports
│   └── summaries
├── VERSION
└── wiki

Analysis

The ui/ folder is, essentially, the core of the app. The names of the subfolders are pretty much self-descriptive (another good practice). And unlike our web app example, here we’ve opted for shortened names (such as js instead of javascript). Once again, what really matters is that we’re consistent within the project.

Earlier, I suggested leaving data files out of the sourcetree, and yet there’s a data/ folder in there. How come? Think of this tree as a developer’s box that needs data in order to properly test the app. But that data is still excluded from the repository, following the rules set in the .gitignore file.

The legacy/ folder is for a part of the app that’s being discontinued but still provides some functionality that might come in handy until it’s fully refactored into the new system. So it provides a good way of separating old from current code.

Also new here are tests/, which provides a place to do quality assurance with unit tests, and thirdparty/, a place to store external libraries that the software needs.

Notice there are doc/ and wiki/ folders, which might look like duplication. However, it’s also perfectly possible — and even reasonable — to have a documentation folder intended for the end-user, and a wiki for the development team.

Wrap Up

A good message is worth repeating: be organized, even when working individually. Hopefully, this article has given you some ideas that you can start implementing in your workflow right away to prevent a mess as the number of files in your app increases.

As mentioned, the guidelines might change here and there, as (almost) every project is different, and so are teams. Ideally, you or your team will get to decide how you structure the project — adding a little document describing the reasoning for this structure — and you’ll then stay consistent with those rules from now on.

And remember that, with most of the guidelines here, it isn’t all that important if you choose dashes or underscores to name files (to choose one topic among many). Consistency is key.

Frequently Asked Questions (FAQs) on Organizing Project Files

What are the benefits of organizing project files in a structured manner?

Organizing project files in a structured manner has several benefits. Firstly, it improves code readability and maintainability. When files are organized logically, it becomes easier for developers to understand the codebase and make changes without breaking existing functionality. Secondly, it enhances team collaboration. When multiple developers are working on the same project, a well-organized file structure ensures everyone knows where to find specific pieces of code. Lastly, it speeds up the development process. Developers spend less time searching for files and more time writing and optimizing code.

How can I decide on the best structure for my project files?

The best structure for your project files depends on the nature and complexity of your project. For small projects, a simple directory structure might suffice. However, for larger projects with multiple components, you might need a more complex structure. Consider factors like the programming language you’re using, the frameworks or libraries in use, and the team’s preferences. It’s also important to keep the structure flexible so it can evolve as the project grows.

What are some common strategies for organizing code?

There are several strategies for organizing code. One common approach is to group files by feature. This means all files related to a specific feature are kept in the same directory. Another approach is to group files by type, such as separating CSS, JavaScript, and HTML files into different directories. Some developers prefer to use a hybrid approach, combining elements of both strategies. The key is to choose a strategy that makes sense for your project and team.

How can I keep my codebase organized as it grows?

As your codebase grows, it’s important to regularly review and refactor your file structure. This might involve splitting large files into smaller, more manageable ones, or reorganizing directories to better reflect the current state of the project. Automated tools can help identify areas of the codebase that are becoming unwieldy or difficult to maintain. Regular code reviews can also help ensure that new code adheres to the established file structure.

What role do naming conventions play in file organization?

Naming conventions play a crucial role in file organization. Consistent, descriptive file names make it easier to understand what each file contains at a glance. This can significantly speed up the development process, especially when working on large projects or collaborating with a team. Naming conventions should be agreed upon at the start of a project and adhered to consistently.

How can I ensure that my file organization strategy is followed by all team members?

To ensure that your file organization strategy is followed by all team members, it’s important to clearly document your strategy and make this documentation easily accessible. Regular code reviews can also help enforce adherence to the strategy. Additionally, consider using automated tools that can check for compliance with your file organization rules.
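
As an illustration of what such an automated check might look like, here is a small Python sketch that reports whether a tree mixes dashes and underscores in multi-word names. The function names and the two styles it recognizes are assumptions made for this example; a real project would encode its own rules.

```python
import os
import re

# The two naming styles this sketch knows about (an assumption; adapt to your rules).
STYLES = {
    "dashes": re.compile(r"^[a-z0-9]+(-[a-z0-9]+)+$"),
    "underscores": re.compile(r"^[a-z0-9]+(_[a-z0-9]+)+$"),
}


def naming_styles(root):
    """Map each style to the multi-word file and directory names that use it."""
    found = {style: [] for style in STYLES}
    for _current, dirnames, filenames in os.walk(root):
        # Ignore hidden directories such as .git.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in dirnames + filenames:
            stem = os.path.splitext(name)[0].lower()
            for style, pattern in STYLES.items():
                if pattern.match(stem):
                    found[style].append(name)
    return found


def is_consistent(root):
    """True when at most one of the known styles is in use under root."""
    used = [style for style, names in naming_styles(root).items() if names]
    return len(used) <= 1
```

Run from a pre-commit hook or a CI job, a check like is_consistent(".") can flag a mixed style before it spreads through the codebase.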

Can I change my file organization strategy midway through a project?

Yes, you can change your file organization strategy midway through a project, but it should be done carefully to avoid disrupting the workflow. Before making any changes, discuss the proposed new strategy with your team and ensure everyone understands the reasons for the change and how to implement it. It’s also important to update any relevant documentation to reflect the new strategy.

How can I handle dependencies when organizing my project files?

Handling dependencies can be a challenge when organizing project files. One approach is to keep all dependencies in a separate directory. This makes it easier to manage and update them. Some programming languages and package managers also provide tools for managing dependencies, which can automate much of this process.

What are some common mistakes to avoid when organizing project files?

Some common mistakes to avoid when organizing project files include not planning your file structure in advance, not following a consistent naming convention, not documenting your file organization strategy, and not regularly reviewing and refactoring your file structure. Avoiding these mistakes can help keep your codebase clean, organized, and easy to navigate.

How can I learn more about best practices for file organization?

There are many resources available for learning about best practices for file organization. Online tutorials, coding bootcamps, and developer forums can provide valuable insights. Additionally, studying the file structures of open-source projects can provide practical examples of how to effectively organize project files.