Compress Web Output Using mod_gzip and Apache

Web page compression is not a new technology, but it has recently gained higher recognition in the minds of IT administrators and managers because of the rapid ROI it generates. Compression extensions exist for most of the major Web server platforms, but in this article I’ll focus on the Open Source Apache and mod_gzip solution.

GZIP-Encoding Basics

The idea behind GZIP-encoding documents is very straightforward. Take a file that is to be transmitted to a Web client, and send a compressed version of the data rather than the raw file. Depending on the size and content of the file, the compressed version typically runs from 50% down to 20% of the original file size.
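As a rough illustration of the idea, the following Python sketch compresses a made-up HTML page in memory and reports the size ratio. The sample page is hypothetical and far more repetitive than a real document, so it compresses much better than the 50% to 20% range quoted above; treat the numbers it prints as a demonstration, not a benchmark.

```python
import gzip

# Hypothetical sample page: deliberately repetitive markup.
html = (
    "<html><body>\n"
    + "<p>Hello, compressed world!</p>\n" * 200
    + "</body></html>\n"
).encode("utf-8")

compressed = gzip.compress(html)

# Size of the compressed body as a share of the original.
ratio = len(compressed) / len(html)
print(f"original:   {len(html)} bytes")
print(f"compressed: {len(compressed)} bytes ({ratio:.1%} of original)")

# Decompression restores the original bytes exactly.
restored = gzip.decompress(compressed)
```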

In Apache, this can be achieved using Content Negotiation, which requires that two separate sets of HTML files be generated: one for clients who can handle GZIP-encoding, and one for those who can’t. This solution sends gzip-encoded files to clients who understand them, but does not allow for the compression of dynamically-generated pages.
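Maintaining the second, pre-compressed set of files could be scripted along these lines. This is only a sketch: the `precompress` helper and the `name.html.gz` naming scheme are assumptions made for illustration, and the Apache side of Content Negotiation still has to be configured separately.

```python
import gzip
import shutil
from pathlib import Path


def precompress(doc_root: str) -> list[Path]:
    """Write a .gz copy beside every .html file under doc_root."""
    written = []
    # Collect the file list first so new .gz files do not affect the walk.
    for source in sorted(Path(doc_root).rglob("*.html")):
        target = source.with_name(source.name + ".gz")  # e.g. index.html.gz
        with source.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written
```

The script has to be re-run whenever a page changes, which is one reason this approach does not suit dynamically-generated pages.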

A More Graceful Solution

A more graceful solution is the use of mod_gzip, one of the many additional modules available for Apache. I consider it one of the overlooked gems for designing a high-performance Web server. Using this module, configured file types will be compressed using GZIP-encoding after they’ve been processed by all of Apache’s other modules, and before they’re sent to the client. The compressed data that’s generated reduces the number of bytes transferred to the client, without any loss in the structure or content of the original, uncompressed document.

mod_gzip can be compiled into Apache as either a static or dynamic module -- I've chosen to compile it as a dynamic module in my own server. The advantage of using mod_gzip is that this method doesn't require anything to be done on the client side in order to make it work. As for the server side, all the server or site administrator has to do is:
  • compile the module,
  • edit the appropriate configuration directives that were added to the httpd.conf file,
  • enable the module in the httpd.conf file, and
  • restart the server.

In less than 10 minutes, you can be serving HTML files using GZIP-encoding.

How it Works

When a request is received from a client, Apache determines if mod_gzip should be invoked by noting whether the “Accept-Encoding” HTTP request header has been sent by the client. If the client sends the header (shown below), mod_gzip will compress the output of all configured file types when they’re sent to the client.

Accept-Encoding: gzip

This client header announces to Apache that the client will understand files that have been GZIP-encoded. mod_gzip then processes the outgoing content and includes the following server response headers.

Content-Type: text/html 
Content-Encoding: gzip

These server response headers announce that the content returned from the server is GZIP-encoded, but that when the content is expanded by the client application, it should be treated as a standard HTML file. This works not only for static HTML files, but also for pages that contain dynamic elements, such as those produced by Server-Side Includes (SSI), PHP, and other dynamic page generation methods. You can also use it to compress your Cascading Style Sheets (CSS) and plain text files. My httpd.conf file sets the following configuration for mod_gzip:

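# The patterns are regular expressions, so the dot before each extension is escaped.
# Note: text/css appears in both an exclude rule and an include rule. If exclude
# rules take precedence in your mod_gzip build, CSS will not be compressed.
mod_gzip_item_exclude         file       \.js$
mod_gzip_item_exclude         mime       ^text/css$

mod_gzip_item_include         file       \.html$
mod_gzip_item_include         file       \.shtml$
mod_gzip_item_include         file       \.php$
mod_gzip_item_include         mime       ^text/html$

mod_gzip_item_include         file       \.txt$
mod_gzip_item_include         mime       ^text/plain$

mod_gzip_item_include         file       \.css$
mod_gzip_item_include         mime       ^text/css$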
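The decision described in this section (check the client's Accept-Encoding header, then match the request against the include and exclude rules) can be modelled roughly in Python. This is a simplified sketch, not mod_gzip's actual code: the function names are hypothetical, and it assumes that an exclude rule overrides any include rule, which should be checked against the module's documentation.

```python
import re

# Rules mirrored from the httpd.conf excerpt above: (kind, pattern).
EXCLUDE = [("file", r"\.js$"), ("mime", r"^text/css$")]
INCLUDE = [
    ("file", r"\.html$"), ("file", r"\.shtml$"), ("file", r"\.php$"),
    ("mime", r"^text/html$"),
    ("file", r"\.txt$"), ("mime", r"^text/plain$"),
    ("file", r"\.css$"), ("mime", r"^text/css$"),
]


def client_accepts_gzip(accept_encoding: str) -> bool:
    """True if the Accept-Encoding value lists gzip with a non-zero q-value."""
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def matches(rules, filename: str, mime: str) -> bool:
    """True if any rule's pattern matches the filename or MIME type."""
    for kind, pattern in rules:
        subject = filename if kind == "file" else mime
        if re.search(pattern, subject):
            return True
    return False


def should_compress(accept_encoding: str, filename: str, mime: str) -> bool:
    if not client_accepts_gzip(accept_encoding):
        return False  # client never asked for gzip
    if matches(EXCLUDE, filename, mime):
        return False  # assumption: exclude rules win over include rules
    return matches(INCLUDE, filename, mime)
```

Under that assumption, `should_compress("gzip, deflate", "index.html", "text/html")` is true, while a request for `style.css` served as `text/css` is rejected by the exclude rule even though an include rule also matches it.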

I’ve had limited success compressing other file formats, mainly because Microsoft’s Internet Explorer appears to examine the “Content-Type” header message before it examines the “Content-Encoding” header message. So, say you configure your server to GZIP-encode PDF files using the following mod_gzip directives:

mod_gzip_item_include         file       \.pdf$
mod_gzip_item_include         mime       ^application/pdf$

This will work perfectly in both Mozilla and Opera, as these applications decode the GZIP-encoded content before they pass it along to the PDF reader (most people use Adobe Acrobat Reader).

However, Internet Explorer simply passes the GZIP-encoded content directly to the PDF reader. Once this issue is rectified in the MSIE code, you are likely to see a lot more Web servers serving a broader range of GZIP-encoded content.
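The order of operations that Mozilla and Opera follow can be sketched in a few lines of Python: undo the Content-Encoding first, and only then hand the body to whatever handles the Content-Type. The `decode_body` helper and the sample PDF bytes are hypothetical and exist only to show the ordering.

```python
import gzip


def decode_body(headers: dict, body: bytes) -> bytes:
    """Undo Content-Encoding so the body matches its Content-Type."""
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    return body


# Hypothetical response: a PDF that the server has GZIP-encoded.
pdf_bytes = b"%PDF-1.4 hypothetical document body"
response_headers = {
    "Content-Type": "application/pdf",
    "Content-Encoding": "gzip",
}
wire_body = gzip.compress(pdf_bytes)

# Decoding first gives the PDF reader real PDF data.
for_reader = decode_body(response_headers, wire_body)

# Skipping that step would hand the reader raw gzip data instead,
# which starts with the gzip magic bytes rather than "%PDF".
passed_through = wire_body
```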

Bandwidth Savings

GZIP-encoded documents can produce substantial savings in bandwidth usage. For the two sample pages below, compression cuts the transfer size by roughly 49% and 71%:

http://www.pierzchala.com/bio.html
Uncompressed File Size:  3122 bytes
Compressed File Size:  1578 bytes

http://www.pierzchala.com/compress/homepage2.html
Uncompressed File Size:  56279 bytes
Compressed File Size:  16286 bytes
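The savings implied by these figures can be worked out with a short Python calculation. The byte counts are the ones listed above; the `savings` helper is introduced only for this illustration.

```python
# (uncompressed bytes, compressed bytes) as listed above
pages = {
    "bio.html": (3122, 1578),
    "compress/homepage2.html": (56279, 16286),
}


def savings(uncompressed: int, compressed: int) -> float:
    """Percentage of bytes saved by sending the compressed version."""
    return 100 * (uncompressed - compressed) / uncompressed


for name, (raw, gz) in pages.items():
    print(f"{name}: {savings(raw, gz):.1f}% fewer bytes")
# prints:
# bio.html: 49.5% fewer bytes
# compress/homepage2.html: 71.1% fewer bytes
```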

As a server administrator, you may be concerned that mod_gzip will place a heavy burden on your systems as they compress files on the fly. I’d like to point out that this does not seem to concern the administrators of Slashdot, one of the busiest Web sites on the Internet, who use mod_gzip in their very high-traffic environment.

The mod_gzip project page is located at SourceForge. Try it out for yourself.

Frequently Asked Questions about Mod_gzip and Apache

What is the main function of Mod_gzip in Apache?

Mod_gzip is an external extension module for the Apache web server that is used to compress web content before it is delivered to the client. This compression significantly reduces the size of the data being transferred, thereby improving the speed and efficiency of the web server. It uses the GZIP compression algorithm, which is highly effective for text-based content such as HTML, CSS, and JavaScript files.

How does Mod_gzip compare to Mod_deflate?

Both Mod_gzip and Mod_deflate are modules used for data compression in Apache. While they both serve the same purpose, there are some differences between them. Mod_deflate is generally easier to set up and configure, and it is included by default in Apache 2.0 and later. On the other hand, Mod_gzip offers more configuration options and is considered to be more powerful and flexible.

How can I install and configure Mod_gzip on my Apache server?

To install Mod_gzip, you will need to download the module and compile it into your Apache server. Once installed, you can configure it by adding directives to your httpd.conf file. These directives allow you to control various aspects of the compression process, such as the types of files to compress and the level of compression to apply.

Can Mod_gzip compress all types of web content?

Mod_gzip is most effective for compressing text-based content such as HTML, CSS, and JavaScript files. It is less effective for binary data and multimedia content, which are often already compressed in their native formats. However, you can configure Mod_gzip to compress any type of content by specifying the appropriate MIME types in your configuration file.

What are the potential drawbacks of using Mod_gzip?

While Mod_gzip can significantly improve the performance of your web server, it does require additional CPU resources on the server to compress data (decompression happens in the client). This can potentially slow down your server if it is already under heavy load. Additionally, not all web clients support GZIP compression. Mod_gzip only compresses responses for clients that send an “Accept-Encoding” header listing gzip, so other clients continue to receive uncompressed content.

How can I test if Mod_gzip is working correctly?

You can test if Mod_gzip is working by using online tools such as GIDNetwork’s GZIP Test. This tool will fetch a page from your website and tell you whether it was compressed and how much bandwidth was saved as a result.
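A similar check can be scripted locally. The Python sketch below requests a page with an Accept-Encoding header and reports whether the response came back GZIP-encoded. The function names are hypothetical, and the URL in the usage comment is a placeholder to replace with a page on your own server.

```python
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def looks_gzipped(headers, body: bytes) -> bool:
    """True if the headers declare gzip and the body starts like gzip data."""
    encoding = (headers.get("Content-Encoding") or "").lower()
    return "gzip" in encoding and body[:2] == GZIP_MAGIC


def check_url(url: str) -> bool:
    """Fetch url while advertising gzip support and inspect the response."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        # urllib does not decompress, so read() returns the bytes as sent.
        return looks_gzipped(response.headers, response.read())


# Usage (placeholder URL):
# print(check_url("http://www.example.com/"))
```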

Can I use Mod_gzip with other Apache modules?

Yes, Mod_gzip can be used in conjunction with other Apache modules. However, it is important to ensure that these modules are compatible with Mod_gzip and that they do not interfere with its operation.

How does Mod_gzip affect SEO?

Mod_gzip can have a positive impact on SEO by improving the speed and performance of your website. Faster websites are favored by search engines and can lead to higher rankings in search results. Additionally, by reducing the amount of data that needs to be transferred, Mod_gzip can help to improve the user experience, which is another important factor in SEO.

Is Mod_gzip compatible with all versions of Apache?

Mod_gzip was originally developed for Apache 1.3, but it can also be used with Apache 2.x. However, since Apache 2.0 and later include the Mod_deflate module by default, many users choose to use Mod_deflate instead of Mod_gzip.

Can I use Mod_gzip on a shared hosting plan?

Whether you can use Mod_gzip on a shared hosting plan depends on your hosting provider. Some providers allow you to use Mod_gzip, while others do not. If you are unsure, it is best to contact your hosting provider for more information.

Stephen Pierzchala

Stephen is currently the Principal Technical Trainer and Senior Diagnostic Analyst at Keynote Systems. He has been actively working with, supporting, and analyzing data from Internet technologies since 1994.
