Optimization Made Easy with mod_pagespeed

Big announcement last week from the folks at Google: they’ve released mod_pagespeed, a new module for the Apache web server. mod_pagespeed bundles together a bunch of server-side page speed optimizations into one easy-to-use module. With the appropriate configuration options, it will compress and combine your CSS and JavaScript files, optimize caching settings, remove comments and whitespace from your files, optimize images, and more.

Let’s take it for a quick spin, shall we?

Installation

The mod_pagespeed module is available as a simple installable package. That’s good news: you won’t need to recompile Apache from source to use it. At the moment it’s available for CentOS/Fedora and Ubuntu/Debian systems, in both 32-bit and 64-bit builds. For the purposes of this walkthrough, I’ll be using Ubuntu, but you can find installation instructions for CentOS/Fedora on the downloads page.

Download the .deb file, then run (assuming, of course, you already have Apache 2.2 installed):

sudo dpkg -i mod_pagespeed_*.deb
sudo apt-get -f install

Now you just need to restart Apache to enable the new module:

sudo /etc/init.d/apache2 restart
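
To confirm that the restart actually picked up the module, one option is to look for it in Apache’s list of loaded modules. The following is a minimal sketch rather than part of the official instructions: has_pagespeed is a hypothetical helper name, and it assumes the module registers itself as pagespeed_module (the name used in its own configuration file) and that apache2ctl -M is available on your system.

```shell
# has_pagespeed: reads a module listing on stdin and succeeds (exit 0)
# if the listing mentions pagespeed_module. The helper name is made up
# for this example; the module name is assumed from pagespeed.conf.
has_pagespeed() {
  grep -q 'pagespeed_module'
}

# Usage on a live server (apache2ctl -M prints the loaded modules;
# 2>&1 is there in case your version writes the list to stderr):
#   apache2ctl -M 2>&1 | has_pagespeed && echo "mod_pagespeed is loaded"
```

If the check fails, re-run the two installation commands above and look for errors before restarting again.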

Configuration

By default, all of mod_pagespeed’s filters are turned off, so you won’t actually see it doing anything. To remedy this, we need to edit its configuration file. On Ubuntu and other Debian-based systems, this is located at /etc/apache2/mods-available/pagespeed.conf by default.

At the very top of that file, you’ll see something like:

<IfModule pagespeed_module>
  SetOutputFilter MOD_PAGESPEED_OUTPUT_FILTER
  ModPagespeed on

That’s good! If you see ModPagespeed off, change it to on before going any further.

A little farther down the file (line 29 in the version I have), you’ll see a commented-out line like this:

# ModPagespeedRewriteLevel CoreFilters

As the comment block above that line explains, the default rewrite level of CoreFilters gives you a basic set of optimizations that are safe for most web pages. Uncomment that line and restart your Apache server.

Now, if you load any pages from your server, you should notice a few things happening. For example, multiple small CSS files will be combined into one style tag inline in your page. mod_pagespeed is smart about the way it handles this: it weighs the caching benefit of serving the CSS files separately against the cost of the extra HTTP requests. So, only files below a certain size will be inlined like this (of course, you can configure this size threshold yourself).

Those filters are pretty conservative, so let’s enable some more! Still further down the conf file (line 46 for me), there’s a line like this:

# ModPagespeedEnableFilters collapse_whitespace,elide_attributes

The ModPagespeedEnableFilters declaration simply takes a comma-separated list of filters you’d like to enable. By default, it’s written out with collapse_whitespace (which, as the name implies, collapses superfluous whitespace characters in your HTML files) and elide_attributes. The latter is a tricky one (and an indicator of how serious Google is about byte-counting): it shortens or removes any HTML attributes whose values make no difference to the browser. So, for example, disabled="disabled" works exactly the same as just disabled, so mod_pagespeed will change the former to the latter. type="text" is the default for input elements, so mod_pagespeed will drop it entirely.

Once you’ve uncommented that line, you can experiment with all the other available filters by appending them to the end of it. There’s remove_comments, which strips comments from your HTML (but is intelligent enough not to remove your IE conditional comments). There’s optimize_images, which re-scales, re-compresses, and strips metadata from images loaded via img tags. For the totally obsessive among you, there’s even remove_quotes, which strips unnecessary quotation marks from around HTML attributes (so class="description" will become class=description).

The full list of available filters, with descriptions of how they work and how to use them, is available on the project’s Google Code page.
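
If you’d rather script the uncommenting than open an editor, something along these lines should work. Treat it as a sketch under a few assumptions: uncomment_directive is a hypothetical helper name, the commented lines look like the ones quoted in this section ("# " followed by the directive name), and your sed supports the -E and -i.bak options (GNU and BSD sed both do).

```shell
# Default path as given in this article; override CONF if yours differs.
CONF="${CONF:-/etc/apache2/mods-available/pagespeed.conf}"

# uncomment_directive NAME FILE
# Removes the leading "#" from every line in FILE whose first word after
# the "#" is NAME, keeping any indentation in front of the "#".
# The previous version of FILE is saved as FILE.bak.
uncomment_directive() {
  sed -i.bak -E "s/^([[:space:]]*)#[[:space:]]*($1[[:space:]])/\1\2/" "$2"
}

# Example (run as root, then restart Apache):
#   uncomment_directive ModPagespeedRewriteLevel "$CONF"
#   uncomment_directive ModPagespeedEnableFilters "$CONF"
```

Because the pattern matches any commented line that starts with the directive name, it could also uncomment a line of explanatory prose that happens to begin the same way. It’s worth running grep ModPagespeed on the file afterwards to check the result before restarting.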

What else?

mod_pagespeed also comes with some simple ways to track page performance statistics. To do this, it injects tiny JavaScript snippets into the top and very bottom of your pages, and uses those to report load times back to the server. This way, you can measure the effects of different filters and decide which ones you’d like to keep using.

So, will you be rushing to set up mod_pagespeed on your servers?

  • http://www.cemerson.co.uk Stormrider

    Personally, I don’t like the way Google uses invalid code in order to save a few bytes, so I probably won’t be using it, but it does look interesting nonetheless!

  • Tim

    Agreed, elide_attributes sounds like a bad idea for those of us who try to do the right thing. The last thing I want is the web server destroying all my good intentions. If I run it, I won’t be using that option.

  • Gryffyn

    Been trying to keep an eye on mod_pagespeed since my web host, Dreamhost, has it installed now and it seems to be going well. The only thing they’ve noted having an issue with is the Serf module: some conflict that causes a lot of Serf traffic.

    We’re looking at possibly using it at work, so I appreciate the extra info and thanks to @justdesign for posting this link.

  • M

    @Gryffyn: Thanks for that info. We’re using DH and I was not sure if this article would be of any use to me. Glad to know that it is.

  • http://htmlblox.com samanime

    I think this could be very handy. I’d probably be leery about using the options to get rid of the attribute values and quotation marks…

    Scratch that. Mid-sentence I decided to do a quick test. I have always been taught (and even taught myself) that quotation marks and those attribute values for things like disabled are required, but I just did a test and using HTML 4.01 Strict, this code is perfectly valid:

    Untitled Document
        Blah
    

    I guess it’s only required for XHTML (which makes sense now that I think about it =p).

    This is an all around great module now. =D Hopefully shared hosts that don’t allow us to install our own modules will implement this as well.

  • arts-multimedia

    Not using Google Analytics is a better way to speed up pages. ;-)
    Seriously, Google Analytics can slow down a page view by 20 to 30 seconds if you are unlucky. This is probably why they came up with this half-baked solution.

  • John Crumpton

    Our hosting company told me they ran some tests and the CPU load was far too high for shared hosting. Anyone else like to comment?

  • wally

    Thanks, Google, I like where you are going with this.
    With a few refinements I would use mod_pagespeed.
    I take page load time very seriously and currently optimise each site with minify, mod_deflate and mod_expires (achieving YSlow scores of 92-96).
    Firstly, I would want to be able to switch filters on or off using an .htaccess file. Turning some of these filters on for all accounts on a shared server would be a nightmare (e.g. minifying malformed CSS can lose code).
    Secondly, I would want resized images to be cached, with the cached version served from then on. (It is cruel to ask a server to resize an image more than once; during high load this could quickly use up all available RAM.)
    Thirdly, I would want the concatenated, minified CSS and JS to be zipped and cached (minify already does this for me, again to be kind to the server).
    Fourthly, I would want .htaccess control over which CSS or JS files are concatenated, and which are left as separate files. The reason for this is that often your home page will begin with 13 JS files = 450kb, but the about us page will have 14 JS files = 451kb, so for one extra 1kb file the server reconcatenates, reminifies, rezips and resends all the original 13 files + 1 that are already in the browser cache.
    So, to have achieved the functionality it already has means that some very skilled developers are working on the project. I hope they recognise the need for caching ability and .htaccess control.

  • rolty

    I would have liked to try it but:
    error: Failed dependencies:
    httpd >= 2.2 is needed by mod-pagespeed-beta-0.9.1.1-171.i386
    However the server has apache 2.2.3….
    Sounds like a cool extension though!

  • http://www.reich-consulting.net/ coffee_ninja

    They need to compile this thing for Windows ASAP… I’m very excited to try it but REALLY don’t feel like going through the pain of trying (and failing) to compile it myself :p

  • tablelover

    OK, I’ll admit it: I use tables. I get the whole semantic vs. presentation thing, but if the alternative to tables is CSS, then make CSS actually work consistently. If I need to place some graphic objects on a page in three columns, I can wrestle with CSS and the inevitable workarounds for IE that may or may not work consistently and end up using more characters than just typing in some table tags – that work the first time, in any browser.

    I could use css table-cell, but IE won’t recognize it.

    I know, the table won’t validate. But I try not to use any styling on the table itself, cellpadding, etc., and just use the cells to encapsulate divs and such and put the styling on the divs.

    The whole CSS-is-the-only-way thing strikes me as a religious argument, rather than a pragmatic one. If I’m wrong, educate me.

  • arts-multimedia

    @tablelover, tables are permitted for tabular data; that’s what they are for. Other than that, you use CSS for layout. It actually has nothing to do with validation, but rather with usability, because screen readers have difficulties with tables if they are used for layout.
    For the rest, I agree it is largely a convention to split up content and coding.
    By the way, I regard CSS tables as a travesty invented by CSS purists. You have to write tons of code to produce them, which is counterproductive, apart from compatibility issues.
    Same thing with CSS animation. It is fun as an experiment, but totally pointless since Flash and video do a much better job.