Compress Web Output Using mod_deflate and Apache 2.0.x

In my previous article, we discussed the use of mod_gzip to dynamically compress the output from an Apache 1.3.x server. With the growing use of the Apache 2.0.x family of Web servers, the question arises of how we can perform a similar GZIP-encoding function within this server. With great foresight, the developers of Apache 2.0.x have included in the server's codebase a module that performs this very task.

Compile and Enable mod_deflate

mod_deflate is included in the Apache 2.0.x source package. Compiling it is a simple matter of adding it to the configure command, like this:

./configure --enable-modules=all --enable-mods-shared=all --enable-deflate

Once the server is built and installed, the GZIP-encoding of documents is easily enabled. We simply add three lines to the httpd.conf file:

  SetOutputFilter DEFLATE
  SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
  SetEnvIfNoCase Request_URI \.pdf$ no-gzip dont-vary

What Does It Do?

This enables the automatic GZIP-encoding of all MIME-types, except image and PDF files, as they leave the server.
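As a rough illustration of what "GZIP-encoding as documents leave the server" means, the Python sketch below mimics the decision a compressing server makes for each request. It is not mod_deflate's code: the function name negotiate and the sample page are hypothetical, and the parsing deliberately ignores q-values in the Accept-Encoding header.

```python
import gzip
from typing import Dict, Tuple


def negotiate(body: bytes, accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Return (response headers, payload) roughly as a compressing server might.

    Assumption: q-values such as "gzip;q=0" are ignored to keep the sketch short.
    """
    # Split "gzip, deflate" into lowercase tokens, dropping any ";q=..." suffix.
    offered = {token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")}
    if "gzip" in offered:
        # The client can decode gzip, so send the compressed form and say so.
        return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}, gzip.compress(body)
    # Otherwise send the body untouched.
    return {"Vary": "Accept-Encoding"}, body


# Made-up sample page: repetitive HTML compresses well.
page = b"<html><body>" + b"<p>Hello, compressed world!</p>" * 200 + b"</body></html>"

headers, payload = negotiate(page, "gzip, deflate")
print(headers, len(page), "bytes ->", len(payload), "bytes")
```

A client that sends no Accept-Encoding header receives the original bytes and no Content-Encoding header.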

Image files and PDF files are excluded as they are already in a highly compressed format. In fact, PDFs become unreadable by Adobe’s Acrobat Reader if they’re further compressed by mod_deflate or mod_gzip.

On the server that we used to test mod_deflate for this article, no Windows executables or compressed files were served to visitors. If you deliver these types of files, add the following line to the httpd.conf file to prevent them being sent in a GZIP-encoded format.

  SetEnvIfNoCase Request_URI \.(?:exe|t?gz|zip|bz2|sit|rar)$ no-gzip dont-vary
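To check which request URIs the exclusion patterns would catch, the patterns can be tried out in Python before touching the server. This is only a sketch: it assumes the leading dot in each pattern is meant as a literal dot (written \.), it uses re.IGNORECASE as a stand-in for the case-insensitive matching of SetEnvIfNoCase, and the helper name is_excluded and the sample URIs are made up.

```python
import re

# The exclusion patterns from the configuration, with the dot escaped.
# re.IGNORECASE approximates the case-insensitive match of SetEnvIfNoCase.
NO_GZIP_PATTERNS = [
    re.compile(r"\.(?:gif|jpe?g|png)$", re.IGNORECASE),
    re.compile(r"\.pdf$", re.IGNORECASE),
    re.compile(r"\.(?:exe|t?gz|zip|bz2|sit|rar)$", re.IGNORECASE),
]


def is_excluded(request_uri: str) -> bool:
    """True if any pattern matches, i.e. the response would be flagged no-gzip."""
    return any(pattern.search(request_uri) for pattern in NO_GZIP_PATTERNS)


# Hypothetical request URIs.
for uri in ["/index.html", "/img/logo.PNG", "/docs/manual.pdf",
            "/dl/setup.exe", "/dl/source.tgz"]:
    print(uri, "->", "no-gzip" if is_excluded(uri) else "compress")
```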

For the file-types indicated in the exclude statements, the server is told explicitly not to send the Vary header. The Vary header indicates to any proxy or cache server the particular condition(s) that will cause this response to Vary from other responses to the same request.

If a client sends a request that doesn’t include the Accept-Encoding: gzip header, the GZIP-encoded item stored in the cache cannot be returned to that client, because the Accept-Encoding headers do not match. In this case, the request must be passed directly to the origin server to obtain a non-encoded version. In effect, proxy servers may store two or more copies of the same file, depending on the client request conditions that cause the server response to vary.

Removing the Vary response requirement for objects that are not handled means that, if the objects do not vary due to any other directives on the server (browser type, for example), then the cached object can be served up without any additional requests until the Time-To-Live (TTL) of the cached object has expired.
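The effect of the Vary header on a cache can be sketched with a toy proxy cache in Python. The class TinyCache and the URLs are hypothetical, and a real cache does far more (header normalization, TTL handling, validation); the sketch only shows that a response with Vary: Accept-Encoding is stored per header value, while a response without Vary is stored once per URL.

```python
class TinyCache:
    """Toy proxy cache that honours the request headers named in Vary."""

    def __init__(self):
        self.vary_for = {}  # url -> request header names listed in Vary
        self.entries = {}   # (url, selected header values) -> cached body

    def _key(self, url, request_headers):
        # With no Vary header the URL alone identifies the cached object;
        # otherwise the values of the named request headers join the key.
        names = self.vary_for.get(url, ())
        return (url, tuple(request_headers.get(name, "") for name in names))

    def put(self, url, request_headers, vary, body):
        self.vary_for[url] = tuple(vary)
        self.entries[self._key(url, request_headers)] = body

    def get(self, url, request_headers):
        # Returns None on a miss, meaning the request goes to the origin server.
        return self.entries.get(self._key(url, request_headers))


cache = TinyCache()
gzip_client = {"Accept-Encoding": "gzip"}
plain_client = {}  # sends no Accept-Encoding header

# HTML page: the origin answered with Vary: Accept-Encoding.
cache.put("/index.html", gzip_client, ["Accept-Encoding"], b"<gzip bytes>")
print(cache.get("/index.html", gzip_client))   # hit
print(cache.get("/index.html", plain_client))  # miss

# Image flagged dont-vary: no Vary header, so one copy serves every client.
cache.put("/logo.png", gzip_client, [], b"<png bytes>")
print(cache.get("/logo.png", plain_client))    # hit
```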

How Does It Compare?

In examining the performance of mod_deflate against mod_gzip, the one item that seems to distinguish mod_deflate is the compression algorithm used. The mod_deflate algorithm uses ZLIB and doesn’t seem to be as effective at compressing files as the GZIP method used by mod_gzip for Apache 1.3.x. The examples below demonstrate that the compression algorithm for mod_gzip produces between 4% and 6% more compression than mod_deflate for the same file.

Table 1 – /compress/homepage2.html

Table 2 – /documents/spierzchala-resume.ps

Attempts to increase the compression ratio of mod_deflate using the directives that were provided for this module produced no further decrease in transferred file size. A comment from one of the authors of mod_deflate states that the module was written specifically to ensure that server performance was not degraded by using this compression method. With future releases of this module, the authors of mod_deflate may want to compare their algorithm to the one used in mod_gzip, to see if there are ways to improve the achieved compression ratio in mod_deflate without compromising server performance.
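For readers who want to see how much a compression setting matters for their own content, a quick measurement can be made with Python's zlib module. This is a sketch under stated assumptions, not a reproduction of the tests above: Python's zlib binding stands in for the ZLIB library, the sample markup is made up, zlib.compress emits a zlib-wrapped stream rather than the gzip framing a server sends, and the helper name savings is hypothetical.

```python
import zlib


def savings(original: bytes, level: int) -> float:
    """Percentage by which zlib at the given level shrinks the input."""
    compressed = zlib.compress(original, level)
    return 100.0 * (1 - len(compressed) / len(original))


# Made-up sample: a repetitive HTML table, which compresses well.
sample = b"".join(
    b"<tr><td class='item'>Row %d</td><td>%d</td></tr>\n" % (i, i * i)
    for i in range(500)
)

# Level 1 favours speed, level 9 favours size, level 6 is zlib's usual default.
for level in (1, 6, 9):
    print(f"level {level}: {savings(sample, level):.1f}% smaller")
```

Running the same function over real pages from a site gives a feel for whether a higher level is worth the extra CPU time.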

Conclusions

Despite the fact that the compression algorithm is not as effective as that found in mod_gzip for Apache 1.3.x, using mod_deflate for Apache 2.0.x is still an excellent way to decrease the size of the files that are sent to clients. Anything that can produce between 50% and 80% in bandwidth savings with so little effort should definitely be considered for any and all Apache 2.0.x deployments.
