Compress Web Output Using mod_deflate and Apache 2.0.x
In my previous article, we discussed the use of mod_gzip to dynamically compress the output from an Apache 1.3.x server. With the growing use of the Apache 2.0.x family of Web servers, the question arises of how we can perform a similar GZIP-encoding function within this server. With great foresight, the developers of the Apache 2.0.x servers have included in the codebase a module that performs this very task: mod_deflate.
Compile and Enable
mod_deflate is included in the Apache 2.0.x source package. Compiling it is a simple matter of adding it to the configure command, like this:
./configure --enable-modules=all --enable-mods-shared=all --enable-deflate
When the server is made and installed, the GZIP-encoding of documents is easily enabled. We simply add three lines to the httpd.conf file:
SetOutputFilter DEFLATE
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
SetEnvIfNoCase Request_URI \.pdf$ no-gzip dont-vary
What Does It Do?
This enables the automatic GZIP-encoding of all MIME-types, except image and PDF files, as they leave the server.
Image files and PDF files are excluded as they are already in a highly compressed format. In fact, PDFs become unreadable by Adobe’s Acrobat Reader if they’re further compressed by mod_deflate.
On the server that we used to test mod_deflate for this article, no Windows executables or compressed files were served to visitors. If you deliver these types of files, add the following line to the httpd.conf file to prevent them being sent in a GZIP-encoded format.
SetEnvIfNoCase Request_URI \.(?:exe|t?gz|zip|bz2|sit|rar)$ no-gzip dont-vary
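To see which requests these exclusion patterns catch, the following is a minimal Python sketch rather than anything the server itself runs. It assumes that Python's re module treats these simple patterns the same way the server's regular expression engine does, it writes each pattern with an escaped dot so that only a literal "." before the extension matches, and the helper name is_excluded and the sample URIs are made up for illustration.

```python
import re

# Patterns mirror the SetEnvIfNoCase lines, with the dot escaped so that it
# matches a literal "." before the extension. re.IGNORECASE stands in for
# the "NoCase" part of the directive.
EXCLUDE_PATTERNS = [
    re.compile(r"\.(?:gif|jpe?g|png)$", re.IGNORECASE),
    re.compile(r"\.pdf$", re.IGNORECASE),
    re.compile(r"\.(?:exe|t?gz|zip|bz2|sit|rar)$", re.IGNORECASE),
]


def is_excluded(request_uri: str) -> bool:
    """Return True if any exclusion pattern matches the URI (hypothetical helper)."""
    return any(pattern.search(request_uri) for pattern in EXCLUDE_PATTERNS)


if __name__ == "__main__":
    # Sample URIs are illustrative only.
    samples = [
        "/index.html",
        "/images/logo.PNG",
        "/docs/manual.pdf",
        "/downloads/site.tar.gz",
        "/photos/holiday.jpeg",
    ]
    for uri in samples:
        action = "skip compression" if is_excluded(uri) else "compress"
        print(f"{uri} -> {action}")
```

Running a list of your own URLs through a check like this before editing httpd.conf is a cheap way to confirm that a pattern excludes what you expect and nothing more.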
For the file-types indicated in the exclude statements, the server is told explicitly not to send the Vary header. The Vary header tells any proxy or cache server which request condition(s) will cause this response to vary from other responses to the same request.
For example, suppose a proxy has cached a GZIP-encoded response that was sent with Vary: Accept-Encoding. If a client then sends a request that doesn’t include the Accept-Encoding: gzip header, the cached item cannot be returned, because the Accept-Encoding headers do not match. In this case, the request must be passed to the origin server to obtain a non-encoded version. In effect, proxy servers may store two or more copies of the same file, depending on the client request conditions that cause the server response to vary.
Removing the Vary response requirement for objects that are not compressed means that, if the objects do not vary due to any other directives on the server (browser type, for example), then the cached object can be served up without any additional requests until the Time-To-Live (TTL) of the cached object has expired.
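The effect of the Vary header on a shared cache can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular proxy is implemented: the TinyCache class, its method names, and the sample URLs are hypothetical, and header values are compared as exact strings, which is simpler than what real caches do.

```python
from typing import Dict, Iterable, Optional, Tuple

# Cache key: the URL plus the request's value for each header named in Vary.
CacheKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class TinyCache:
    """Toy shared cache keyed on URL and Vary-selected request headers."""

    def __init__(self) -> None:
        self._vary_for_url: Dict[str, Tuple[str, ...]] = {}
        self._store: Dict[CacheKey, bytes] = {}

    def _key(self, url: str, request_headers: Dict[str, str]) -> CacheKey:
        # Header names are case-insensitive, so normalise them first.
        headers = {name.lower(): value for name, value in request_headers.items()}
        names = self._vary_for_url.get(url, ())
        return url, tuple((name, headers.get(name, "")) for name in names)

    def put(self, url: str, request_headers: Dict[str, str],
            vary: Iterable[str], body: bytes) -> None:
        """Store a response, remembering which headers its Vary header named."""
        self._vary_for_url[url] = tuple(name.lower() for name in vary)
        self._store[self._key(url, request_headers)] = body

    def get(self, url: str, request_headers: Dict[str, str]) -> Optional[bytes]:
        """Return the cached body, or None if the request must go to the origin."""
        return self._store.get(self._key(url, request_headers))

    def __len__(self) -> int:
        return len(self._store)


if __name__ == "__main__":
    cache = TinyCache()
    gzip_client = {"Accept-Encoding": "gzip"}
    plain_client: Dict[str, str] = {}

    # Compressed page: the response carried Vary: Accept-Encoding.
    cache.put("/index.html", gzip_client, ["Accept-Encoding"], b"<gzip bytes>")
    print(cache.get("/index.html", plain_client))  # miss: headers do not match
    cache.put("/index.html", plain_client, ["Accept-Encoding"], b"<html>...</html>")

    # Excluded image: no Vary header, so one copy serves both clients.
    cache.put("/logo.png", gzip_client, [], b"<png bytes>")
    print(cache.get("/logo.png", plain_client))
    print(len(cache), "entries stored")
```

The page ends up stored twice, once per Accept-Encoding value, while the image sent without a Vary header is stored once and served to every client.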
How Does It Compare?
In comparing the performance of mod_deflate with that of mod_gzip, the one item that seems to distinguish mod_deflate is the compression algorithm used. The mod_deflate algorithm uses ZLIB and doesn’t seem to be as effective at compressing files as the GZIP method used for mod_gzip for Apache 1.3.x. The examples below demonstrate that the compression algorithm for mod_gzip produces between 4% and 6% more compression than mod_deflate for the same file.
Table 1 – /compress/homepage2.html
Table 2 – /documents/spierzchala-resume.ps
Attempts to increase the compression ratio of mod_deflate using the directives that were provided for this module produced no further decrease in transferred file size. A comment from one of the authors of mod_deflate states that the module was written specifically to ensure that server performance was not degraded by using this compression method. With future releases of this module, the authors of mod_deflate may want to compare their algorithm to the one used in mod_gzip, to see if there are ways to improve the achieved compression ratio in mod_deflate without compromising server performance.
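The trade-off between compression ratio and server effort can be explored offline with Python's zlib module, which wraps the ZLIB library. The sketch below is only an illustration under assumptions: the levels 1, 6 and 9 and the generated sample page are arbitrary choices, the helper names are hypothetical, and nothing here reproduces the settings used by mod_deflate or mod_gzip in the tests described in this article.

```python
import zlib
from typing import Dict, Iterable


def savings_percent(original_size: int, compressed_size: int) -> float:
    """Percentage of bytes saved by compression."""
    return 100.0 * (original_size - compressed_size) / original_size


def compare_levels(data: bytes, levels: Iterable[int] = (1, 6, 9)) -> Dict[int, int]:
    """Map each zlib compression level to the compressed size in bytes."""
    return {level: len(zlib.compress(data, level)) for level in levels}


if __name__ == "__main__":
    # Stand-in for an HTML page; real pages will compress differently.
    rows = "".join(
        f"<tr><td>Item {i}</td><td>{i * 37 % 101}</td></tr>\n" for i in range(500)
    )
    page = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

    for level, size in compare_levels(page).items():
        saved = savings_percent(len(page), size)
        print(f"level {level}: {size} bytes, {saved:.1f}% saved")
```

Running this against copies of your own pages gives a rough sense of how much extra saving a higher level might offer in exchange for more CPU time per request.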
Despite the fact that the compression algorithm is not as effective as that found in mod_gzip for Apache 1.3.x, using mod_deflate for Apache 2.0.x is still an excellent way to decrease the size of the files that are sent to clients. Anything that can produce between 50% and 80% in bandwidth savings with so little effort should definitely be considered for any and all Apache 2.0.x deployments.