PHP
Article
By Tobias Schlitt

Cleaning up Code: Is Refactoring for Aesthetics worth It?

By Tobias Schlitt
Help us help you! You'll get a... FREE 6-Month Subscription to SitePoint Premium Plus you'll go in the draw to WIN a new Macbook SitePoint 2017 Survey Yes, let's Do this It only takes 5 min

Most development teams want to get their codebase into a better, more maintainable state. But what definition of better should be chosen? In many cases, it is not necessary to dig deep into Domain Driven Design (DDD) to achieve this goal. Sometimes, it’s even counter productive. But one of the most basic collections of principles can help each team a lot already: Clean Code.

Vector image of sweeping broom

The Goal

The Clean Code book by Robert C. Martin summarizes many simple and advanced improvements to get better, understandable, and therefore more maintainable code. Let’s take the following code snippet as an example:

public function getJobs($date)
{
    $ret = array();
    $res = $this->db->query(
        'SELECT * FROM jobs WHERE year=' . date('Y', $date) . ' AND month='
            . date('m', $date)
    );
    $res = $res->fetchAll();
    foreach ($res as $data) {
        $j = new Job(/* $data['...'] */);
        $j->rev = $j->time * $j->euro;
        $ret[] = $j;
    }
    return $ret;
}

The purpose of this already simple method seems to be to return jobs for a certain date. While this intention is clear, reading through the method and understanding how it performs its job is not that easy. Once done with it the algorithm appears pretty easy: first it fetches some data from a database, then it prepares job objects, calculates some additional data and returns the objects.

But the time spent to understand this simple code snippet is wasted. As you, a developer, analyze your and other people’s everyday work, you will notice that you spend much more time reading and understanding code than actually writing it.

A low hanging fruit would be to just rename the local variables to improve readability:

public function getJobs($date)
{
    $jobs = array();
    $statement = $this->db->query(
        'SELECT * FROM jobs WHERE year=' . date('Y', $date) . ' AND month='
            . date('m', $date)
    );
    $rows = $statement->fetchAll();
    foreach ($rows as $row) {
        $job = new Job(/* $row['...'] */);
        $job->rev = $job->time * $job->euro;
        $jobs[] = $job;
    }
    return $jobs;
}

The variable previously known as $ret was renamed to $jobs. This name now clearly indicates what we can expect from it: a list ob jobs. Furthermore, this cannot be easily confused with $ret which was renamed into $statement and $rows. We now understand what’s going on much better.

Another step towards improving readability would be to break the code up into logical pieces that belong together and make it read more like a book, or newspaper, where paragraphs are information groups:

public function getJobs($date)
{
    $statement = $this->db->query(
        'SELECT * FROM jobs WHERE year=' . date('Y', $date) . ' AND month='
            . date('m', $date)
    );
    $rows = $res->fetchAll();

    $jobs = array();
    foreach ($rows as $row) {
        $job = new Job(/* $row['...'] */);
        $job->rev = $job->time * $job->euro;
        $jobs[] = $job;
    }
    return $jobs;
}

There are two paragraphs now. In the first one, there is a number of rows fetched from the database. The second paragraph converts these rows into objects. This minor relocation of the declaration of $jobs and the delimiting blank line did not only ease the reading flow of the method, but also prepared a possible subsequent refactoring step of splitting the tasks into private methods with the intention of revealing names.

This is just the tip of the iceberg that Clean Code offers.

--ADVERTISEMENT--

The Methodology

As can be seen above, the refactoring steps are really minimalistic – baby steps. This is not just for the purpose of presentation but should be the way whenever you refactor. The recommended approach is to commit after each and every step:

  1. Choose 1 single refactoring step
  2. Execute the selected step
  3. Verify the code still works
  4. Commit current state and goto 1

Most importantly, this procedure gives you a safety line so you can easily jump one step back. You should do this whenever you feel like you might run in the wrong direction. If you need to start debugging – don’t. It is way too much effort to try to fix a refactoring instead of reverting it. Chances are good that you’ll lose your way in debugging and it will eat up more time than starting over would. If you go with baby steps, a reversion of the latest step would take 5 minutes instead.

Furthermore, this approach helps you stay focused and open minded about what the eventual goal of your refactoring is. Code is complex and our brain often has a biased idea of what the optimal outcome of this complex structure is. By keeping your mind open to adjusting the goal during the process, you allow it to find better solutions along the way.

The process also allows you to stop refactoring at any point in time without losing progress. An urgent bug to fix? A colleague asks for sparring? Lunch? No problem, you can either just merge the working state you have or leave it in a branch to pick it up later again.

Still, there is one draw-back with frequent commits during refactoring: so many small commits might pollute your version control history. If you are using Git, you can mitigate this by squashing the commits of a refactoring into a single one before pushing upstream.

The Preconditions

Taking Clean Code as a goal and applying baby steps are good building blocks for sustainable refactoring, but there is more to take into account.

First of all, it is not a good idea to perform a refactoring just because you can or because you think the current code looks ugly. Changing the code is always a risk and you should take no risk without good reason. One good reason would e.g. be that you often need to work on a single piece of code which is hard to grasp and where changes cascade throughout your system. But there are many other reasons like frequent bugs, unforeseen business requirements, etc. It is important that you analyze the reasons carefully, and not just blindly run into refactoring.

Secondly, it’s important to have at least some degree of automated tests that cover the most important business cases where the code being refactored is used. Legacy systems are typically hard to test with unit or integration test methods. Therefore, it’s recommended not to try to apply these methods through a brute force approach. Instead, we found it quite useful to apply tests through the front-end – in our case using Mink and PHPUnit.

But, as with so many decisions in software development, these are only rules of thumb and every project has its own requirements and priorities. Join me in my Clean Code workshop at the WebSummerCamp 2016 to learn more. Bring a companion, too – there’s a dedicated program for significant others, too, so bring company and make it a work vacation!

Login or Create Account to Comment
Login Create Account
Recommended
Sponsors
Get the most important and interesting stories in tech. Straight to your inbox, daily.Is it good?