Track Your Hacks with CVS

The following is republished from the Tech Times #130.

Quite by coincidence, three times in the past week I have had to hack the code of some open source software that went into the site I was working on. First I had to modify phpBB to include an embedded calendar on the home page of a private forum I administer. Next I made some custom tweaks to the code of the K2 theme for WordPress. Finally, I had to hack phpAdsNew to produce XHTML Strict output.

In each case, the hack required me to actually modify the code of the software. Obviously I prefer not to do this, because when the next release comes along the updated files will overwrite my hacks, and I’ll need to implement them all over again.

Normally I’d just document my hacks someplace and grumble about the lack of customization features in the software, but three times in a week was too much. Let me show you how I solved the problem using a common development tool in an unconventional way!

CVS (Concurrent Versions System) is a system for tracking changes made to files in a project over time, potentially by multiple developers, each working on his or her own copy of the project files at the same time. As it turns out, it’s extremely useful for managing the custom hacks you make to open source software.

In the past few years, Subversion has sprung up as an alternative to CVS that eliminates some of the headaches in CVS to do with things like moving or renaming files. Since such changes don’t usually happen when you’re hacking an existing script, and since SitePoint already has a decent introduction to CVS, I’ll stick with CVS for this discussion. If you know Subversion, you can use it instead.

Setting Up

You first need to create a CVS repository for yourself (if you don’t already have one). Because I use Windows, I set up CVSNT to do this. If you’re on Linux, you can use the original CVS software. You’ll also want to get an easy-to-use client program (I recommend SmartCVS), unless you particularly like working from the command prompt, in which case I’ve included all the commands below.

When your CVS server is set up, store a “clean” copy of the software version that you have hacked for use on your site as a new module (or project) in the repository (e.g. cvs import phpBB2 phpBB2 init_ver). Then check out a working copy of that module (e.g. cvs checkout phpBB2), because cvs tag only works from inside a checked-out copy. From there, tag this “clean” version in the repository to indicate the version number of the software it represents (e.g. cvs tag release-2-0-17 .).

Immediately create a branch in the repository (e.g. cvs tag -b custom-mods-branch .) from this initial version, and then check out a copy of the files from the branch into a convenient working directory (e.g. cvs checkout -r custom-mods-branch phpBB2). This copy is where you’ll keep track of your hacks.
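Put together, the setup steps might look something like the sketch below. It reuses the module and tag names from the examples above; the repository path /home/me/cvsroot, the working directory names phpBB2-clean and phpBB2-hacked, and the run helper with its DRY_RUN switch are assumptions made for illustration. By default it only prints the commands, so you can read them over before running anything.

```shell
#!/bin/sh
# Sketch only. run() and DRY_RUN are illustrative helpers, not part of CVS:
# with DRY_RUN=1 (the default) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
CVSROOT="${CVSROOT:-/home/me/cvsroot}"   # hypothetical repository location

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = directory holding the clean release files (absolute path)
# $2 = directory where the working copies should live (absolute path)
# There is no error handling here; a real script should stop on failure.
setup_hack_tracking() {
    run cd "$1"
    run cvs -d "$CVSROOT" import -m clean-2.0.17 phpBB2 phpBB2 init_ver
    run cd "$2"
    # cvs tag needs a working copy, so check out the trunk first
    run cvs -d "$CVSROOT" checkout -d phpBB2-clean phpBB2
    run cd phpBB2-clean
    run cvs tag release-2-0-17 .
    run cvs tag -b custom-mods-branch .
    run cd ..
    # A second working copy, on the branch, is where the hacks will live
    run cvs -d "$CVSROOT" checkout -r custom-mods-branch -d phpBB2-hacked phpBB2
}
```

Calling setup_hack_tracking /path/to/clean /path/to/work prints the nine commands in order; setting DRY_RUN=0 first would run them for real.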

Copy your site’s (hacked) copy of the software’s files on top of the “clean” copy you just checked out from the branch, and then perform a CVS update to identify the files that have been modified with hacks (e.g. cvs update .). Review these changes to make sure they are all wanted.

Review hacks in your CVS client
Fig 1. See hacks as changes in the working copy
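If you are working from the command prompt rather than a graphical client, the review can be done by reading the status letters that cvs update prints: M for a locally modified file, ? for a file CVS doesn't know about, and C for a conflict. Here is a small sketch that summarises that output; the function name summarise_update is made up for this example.

```shell
# Summarise the output of `cvs -n -q update` read from standard input.
# The status letters (M, C, ?) are CVS's own; everything else is illustrative.
summarise_update() {
    awk '
        /^M /  { modified++ }
        /^C /  { conflicts++ }
        /^\? / { unknown++; print "needs cvs add: " substr($0, 3) }
        END    { printf "modified=%d unknown=%d conflicts=%d\n", modified, unknown, conflicts }
    '
}
```

From the branch working copy you could run cvs -n -q update | summarise_update. The -n option asks CVS not to change any files, so this only reports. Files flagged with ? are ones your hacks added; they need a cvs add before they can be committed.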

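You can now track changes to your hacks as you make them in this branch. Simply hack the files in this working copy of the branch to your liking and commit your changes to the repository. To update your site with these hacks, delete everything in the destination directory and then export the latest version of your branch files to your site. Note that export needs a tag or date to tell it which revisions to fetch, and that the target directory is given with a lowercase -d (an uppercase -D specifies a date). For example, from /home/www: cvs export -r custom-mods-branch -d htdocs phpBB2.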
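As a rough sketch, the delete-and-export step could be wrapped in a small function. The name deploy_branch and the CVS variable are assumptions for this example; setting CVS="echo cvs" makes the function print the export command instead of running it. It also assumes CVSROOT is set in your environment, since export runs outside any working copy.

```shell
CVS="${CVS:-cvs}"   # set CVS="echo cvs" to preview the command without running it

# $1 = branch or tag, $2 = destination directory, $3 = module name
deploy_branch() {
    branch="$1"; dest="$2"; module="$3"
    # Refuse obviously dangerous targets before deleting anything.
    case "$dest" in
        ""|"/") echo "refusing to deploy to '$dest'" >&2; return 1 ;;
    esac
    rm -rf "$dest" || return 1
    # Export from the parent directory so -d can take a simple directory name.
    ( cd "$(dirname "$dest")" &&
      $CVS export -r "$branch" -d "$(basename "$dest")" "$module" )
}
```

For the example in this article, that would be deploy_branch custom-mods-branch /home/www/htdocs phpBB2. Be aware that the old files are deleted even in preview mode, so try it on a scratch directory first.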

Merging with New Releases

When a new version of the software comes out, extract its files into the “clean” copy you made at the beginning, commit all the changes to CVS (using cvs add for any files that are new in the release, and cvs remove for any that have been dropped), and then tag the updated files for the release (e.g. cvs tag release-2-0-19 .). These updates will be stored into the trunk of your repository, so they won’t affect your hacked version (which is tracked in the branch).
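One way to spot added and dropped files is to compare the new release against your “clean” working copy before you overwrite it. The sketch below assumes you have unpacked the new release into a separate directory; the function names list_files and release_diff are invented for this example. It only prints suggested commands and changes nothing.

```shell
# List regular files under a directory, skipping CVS's own admin directories.
list_files() {
    ( cd "$1" && find . -type f ! -path '*/CVS/*' | sed 's|^\./||' | sort )
}

# $1 = trunk working copy, $2 = freshly unpacked release
# Prints a `cvs add` for each file only in the release and a
# `cvs remove -f` for each file only in the working copy.
release_diff() {
    t=$(mktemp); r=$(mktemp)
    list_files "$1" > "$t"
    list_files "$2" > "$r"
    comm -13 "$t" "$r" | sed 's/^/cvs add /'
    comm -23 "$t" "$r" | sed 's/^/cvs remove -f /'
    rm -f "$t" "$r"
}
```

New files that live in new directories need the directory itself added with cvs add first, which this sketch doesn't handle.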

Now, here’s the payoff: to update your hacked version of the software with the changes in the latest official release, just go to your working copy of the branch and merge in all the changes from the trunk (e.g. cvs update -j release-2-0-17 -j release-2-0-19). Your hacked files should be updated seamlessly with changes that were made in the official update(s).

Files where your hacks occurred close to or on the same line as a change in an official release will report conflicts when you perform the merge. You’ll have to open these files and resolve the conflicts yourself (CVS will helpfully include both versions of the code at the point of conflict) before committing the corrected versions to the branch. You should then set a tag on the branch to indicate where you did the merge (e.g. cvs tag merge-2-0-19 .).
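CVS marks each conflict in the file with lines beginning <<<<<<< and >>>>>>>, with ======= between the two versions. Before committing, it can be worth checking that none of these markers are left behind. A minimal sketch, with a made-up function name:

```shell
# Print the files under a directory (default: the current one) that still
# contain conflict markers, skipping CVS's own admin directories.
unresolved_conflicts() {
    find "${1:-.}" -type f ! -path '*/CVS/*' \
        -exec grep -lE '^(<<<<<<<|>>>>>>>) ' {} +
}
```

Run unresolved_conflicts from the top of the branch working copy; if it prints nothing, no markers were found.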

In effect, CVS will perform all the updates that it can for you, and then will pick out those updates that appear to interfere with your hacks so that you can deal with them. If that isn’t useful, I don’t know what is!

A typical CVS tree for hack management
Fig 2. The CVS tree showing merged releases.

The next time an official update comes out, you can perform another merge, but remember to update the starting tag for the merge so you only get the changes since the last time you did a merge (e.g. cvs update -j release-2-0-19 -j release-2-0-21).
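To make the moving start tag harder to forget, you could wrap the merge in a tiny function that takes the last merged release and the new one. This is only a sketch; merge_release and the CVS variable are names invented here, and setting CVS="echo cvs" previews the command without running it.

```shell
CVS="${CVS:-cvs}"   # set CVS="echo cvs" to preview the command without running it

# $1 = release you last merged, $2 = new release (e.g. 2-0-19 and 2-0-21)
# Run from the top of the branch working copy.
merge_release() {
    $CVS update -j "release-$1" -j "release-$2" || return 1
    echo "now resolve conflicts, commit, then run: cvs tag merge-$2 ."
}
```

The first merge in this article would be merge_release 2-0-17 2-0-19, and the next one merge_release 2-0-19 2-0-21.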

A Note on the Vendor Branch

The CVS gurus in the audience may be up in arms at this point. In fact, CVS automatically creates a vendor branch for every module you bring into it with cvs import. You can import each new version of the software into this special branch, and then merge the changes from the branch into your (hacked) trunk.

So why didn’t I use this? The truth is, it’s just as easy to do all this using a normal branch as I described above, and it doesn’t require you to learn all about vendor branches and their pitfalls. Also, you can do it in simpler CVS clients that don’t support vendor branches (like the free version of SmartCVS).
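For the curious, the vendor branch approach would look roughly like the sketch below, which follows the pattern in the CVS manual's section on tracking third-party sources. The vendor tag PHPBB_DIST and the function names are assumptions made for this example, and as before CVS="echo cvs" previews the commands.

```shell
CVS="${CVS:-cvs}"   # set CVS="echo cvs" to preview the commands without running them

# Run from the directory holding a release's files.
# $1 = release tag for this import, e.g. release-2-0-19
vendor_import() {
    $CVS import -m "import-$1" phpBB2 PHPBB_DIST "$1"
}

# Check out the (hacked) trunk with the vendor changes between two
# release tags merged in. $1 = previous release tag, $2 = new release tag
vendor_merge() {
    $CVS checkout -j "$1" -j "$2" phpBB2
}
```

After vendor_import release-2-0-19, running vendor_merge release-2-0-17 release-2-0-19 would give you a working copy to review, fix up, and commit.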

  • http://www.shaunhills.com hillsy

    Given that your CVS article seems targeted (at least in part) at those who don’t currently do source control, skipping over Subversion so quickly seems like a bit of an omission.

    It’s fairly commonly accepted that Subversion is a compelling replacement for CVS in almost all cases because it introduces many new features – e.g. atomic commits – with practically no downside.

    Probably the only good reason to use CVS is if you’re already using it and don’t want the hassle of migrating. Anyone looking at a new source control deployment should seriously consider Subversion instead.

    Harry’s article is four years old this year. Perhaps it’s time for someone to write a Subversion article for SP? ;)

  • http://www.phppatterns.com HarryF

    Great stuff – think this is a very common problem.

    hillsy

    Probably the only good reason to use CVS is if you’re already using it and don’t want the hassle of migrating.

    My (bad) reason is being grumpy and set in my ways – migrating myself is a problem ;). Experience with darcs, thanks to Dokuwiki and distributed version control suggests if I’m going to change, it needs to be for a radically different system, rather than just a better CVS.

    CVS on a Stick

    One side tip of sorts – have found keeping a CVS repository on a USB stick works pretty well (which are getting bigger these days).

    To use CVS doesn’t necessarily require a server as such – you can maintain a repository via the filesystem, via the cvs command line binary.

    For example, to kick off a new CVS repository on a USB stick (assuming Linux with the USB stick mounted to /media/usb);

    $ cvs -d /media/usb/cvsroot init
    

    That will set up a new CVS repository in the subdirectory cvsroot on the USB stick.

    From there, for the first import and initial checkout you need to specify the usb stick e.g.

    $ cd phpBB2
    $ cvs -d /media/usb/cvsroot import phpBB2 phpBB2 init_ver
    

    Then later…

    $ cvs -d /media/usb/cvsroot checkout -r custom-mods-branch phpBB2
    

    After that, when working with this checkout, you will no longer need to specify the CVS root with the -d option.

    On Windows, where your USB stick is mounted to a drive letter like F:, you need to precede the CVS root with the special identifier :local: e.g.;

    C:\> cvs -d :local:F:/cvsroot checkout -r custom-mods-branch phpBB2
    

    Note also since cvshome.org recently had a “makeover” (Sopranos style), the Windows command line cvs.exe had to be found via other sources. One location is here. Alternatively there’s the Cygwin version which may be a saner choice.

    The only problem (which I haven’t dealt with properly yet) is when using a USB stick between Linux and Windows, they typically come, by default, with a FAT filesystem. This means on Linux you lose filesystem permissions. Probably the best way to go there is to reformat the stick with a Linux filesystem (e.g. ext2) then use something like ext2fsd to access it from Windows. That’s only a guess though – haven’t actually tried doing so.

  • http://www.igeek.info asp_funda

    Nice one Kev!! How about an article on setting up SVN & using it, for n00bs like me who’ve never touched it before (& please don’t just look at it from the Linux command line perspective, also consider the Windows folks)?

    One side tip of sorts—have found keeping a CVS repository on a USB stick works pretty well

    looks cool Harry, can that be done with SVN(on windows)? If yes then a pointer would be very much appreciated!! :)

  • Anonymous

    Tortoise SVN has OK windows gui support
    http://tortoisesvn.tigris.org/

    On OS X the gui is not so hot, so I just use terminal.

  • http://www.whitelionsoft.com veslach

    The irony of it all… I spent a good portion of yesterday reading up on how to do vendor branches in subversion & then TechTimes & the SP rss feed appeared in my inbox last night for this article on CVS.

    Here’s hoping that SP will include something similar for SVN.

  • http://www.phppatterns.com HarryF

    can that be done with SVN(on windows)? If yes then a pointer would be very much appreciated!!

    svn seems to be able to do the same thing as CVS – run a repository purely over filesystems.

    For Windows it’s probably worth bearing in mind this faq for coping with drive letters when you create a new repository and import your files into it;

    svnadmin create d:/some/path/to/repos/on/d/drive
    svn import . file:///d:/some/path/to/repos/on/d/drive -m "Initial import"
    

    That’s all guessing though.

  • bramernic

    I currently maintain two phpBB forums. Can I maintain one main trunk with two different custom-mod branches where the files for one set of mods will never need merging with the files from the other? If it can then that would seem to save even more time.

  • http://www.whitelionsoft.com veslach

    that’s what’s great about subversion, is that you can use an external repository & link 2 different projects to the same external “library”. I’ve heard there are ways to do this with CVS, but I’m not sure how…

    if you have a CVS repository with both phpBB forum projects in it, then here is 1 way (not necessarily the best) to organize it (using trunk as in the article rather than the vendor branch) -

    core phpBB code would go in ‘repos/trunk/phpBB’
    phpBB project 1 would go in ‘repos/trunk/project1’
    phpBB project 2 would go in ‘repos/trunk/project2’

    in order to apply vendor updates, you’d take a version diff of ‘repos/trunk/phpBB’ & apply it to both ‘repos/trunk/project1’ & ‘repos/trunk/project2’.

    I’m not sure how to do this with CVS if you’re maintaining a separate repository for each project.

  • bkjones@gmail.com

    I’m not the world’s greatest cvs hacker, but I’m confused, because it says in this article to set up a cvs server, then put this software into a repository, and then immediately tag the clean version, and make a branch, and *then* check out a working copy from the branch.

    Well, I’m running the standard cvs software on a linux box (all CLI), and it doesn’t even let you tag anything until you’ve checked out a working copy. If I set up my repository, cd to the directory where my repository is, and run “cvs tag -b mybranch” it says:

    cvs tag: in directory .:
    cvs [tag aborted]: there is no version here; run ‘cvs checkout’ first

    Clues?

  • ramk8928

    The issue

    cvs [tag aborted]: there is no version here; run ‘cvs checkout’ first

    is due to the CVS directory not being found in the directory where you ran the command.

    e.g., if you have checked out a version of the files from the repository, a default “CVS” directory will be available, which holds the details of the repository; it is required for tagging the source. If this directory is not present, the above error may occur.

    ram