Browser caching: ETag vs. Cache-Control

Hi,

I’m trying to improve our website’s performance, so I decided to start by adding some browser caching.

I have now added the following to my .htaccess file (after final testing I will move it to the vhost.conf file):

AddType application/javascript .js

ExpiresActive On
# Cache only static files
ExpiresByType image/jpeg "access plus 30 days"
ExpiresByType image/gif "access plus 30 days"
ExpiresByType image/png "access plus 30 days"
ExpiresByType application/javascript "access plus 30 days"
ExpiresByType text/css "access plus 30 days"

Running Live HTTP Headers, I get the following output:

http://www.xxxxx.local/images/xxx/jobs.gif

GET /images/xxx/jobs.gif HTTP/1.1
Host: www.xxxx.local
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; de; rv:1.9.2) Gecko/20100115 Firefox/3.6
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: de-de,de;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://www.xxx.local/
Cookie: xxxx
If-Modified-Since: Sat, 21 Feb 2009 23:00:00 GMT
If-None-Match: "a5d24-bc1-46375ba9cbc00"
Cache-Control: max-age=0

HTTP/1.1 304 Not Modified
Date: Tue, 16 Feb 2010 13:04:27 GMT
Server: Apache/2.2.8 (Linux/SUSE)
Connection: Keep-Alive
Keep-Alive: timeout=15, max=100
Etag: "a5d24-bc1-46375ba9cbc00"
Expires: Thu, 18 Mar 2010 13:04:27 GMT
Cache-Control: max-age=2592000

Now I have three questions (and some sub-questions):

  1. There are two parts. One starts with “GET /images/xxx/jobs.gif HTTP/1.1”, the other with “HTTP/1.1 304 Not Modified”. I guess the first is the request and the second is the response?

  2. Why is there a Cache-Control header in both parts? What does the first one say, and what does the second one say?

  3. The headers also use ETag. As far as I understand, ETags are unique identifiers of a file, so when the file is updated, the ETag changes.
     When ETags are used anyway, does using Cache-Control still make sense?
     What does the browser do if the ETag has changed but Cache-Control told it to keep the item for 30 days, so that the cached item would still count as valid?

Thanks,
Flözen

  1. That’s correct

  2. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3

  3. I’m not sure about that one, would have to test
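
A hedged note on question 3: while a cached response is still fresh according to max-age or Expires, the browser normally reuses it without contacting the server at all, so it does not notice a changed ETag until the entry expires or the user forces a reload. The `Cache-Control: max-age=0` in the request above is what a browser typically sends on such a reload. The ETag only matters at that revalidation step, where it allows the cheap 304 answer. One way to balance the two is sketched below, assuming that CSS and JS files may change under the same URL while images do not (the lifetimes are illustrative and not taken from the thread):

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # Assumption: these files may be edited in place, so keep them fresh
    # only briefly and let the browser revalidate with If-None-Match
    ExpiresByType text/css "access plus 1 hour"
    ExpiresByType application/javascript "access plus 1 hour"
    # Assumption: images never change under the same URL
    ExpiresByType image/gif "access plus 30 days"
</IfModule>
# Build the ETag from modification time and size only (no inode),
# so it stays the same when several servers deliver the same file
FileETag MTime Size
```

So the two headers complement each other: Cache-Control decides how long no request is made, and the ETag decides whether a request that is made has to transfer the file again.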

Hi,

I’ve got some questions regarding the use of mod_cache:

When mod_cache (and mem_cache_module or disk_cache_module) is loaded, will it automatically be used for all Virtual Hosts?
Do I need to use

<IfModule mod_cache.c>
    CacheDisable /
</IfModule>

to deactivate it for a single vhost?

Or is it used only when the parameters are defined in a vhost, like

<IfModule mod_cache.c>
    <IfModule mod_mem_cache.c>
        CacheEnable mem /
        MCacheSize 4096
        MCacheMaxObjectCount 100
        MCacheMinObjectSize 1
        MCacheMaxObjectSize 2048
    </IfModule>
</IfModule>

?

How is it controlled what can be cached? Is this done via the headers of each file (Cache-Control)?

We are running a completely database-driven website based on PHP and MySQL, with user accounts etc. The pages themselves should therefore not be cached.
But I believe we could speed up our site by caching images, CSS and JS files.
What type of caching would be better for us:
mod_disk_cache or mod_mem_cache?

Thanks for suggestions!

Regards
Flözen
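
A hedged sketch for the mod_cache questions, assuming Apache 2.2 as shown in the Server header earlier in the thread: loading mod_cache does not by itself cache anything. A URL space is cached only where a CacheEnable directive applies, and what gets stored is then decided by the usual response headers (Cache-Control, Expires, Last-Modified). Note that mod_cache is a server-side cache and is separate from the browser caching configured with mod_expires. The ServerName, the cache directory and the URL prefixes below are hypothetical placeholders:

```apache
<VirtualHost *:80>
    # Hypothetical host name
    ServerName www.example.local

    <IfModule mod_cache.c>
        <IfModule mod_disk_cache.c>
            # Hypothetical directory; it must be writable by the Apache user
            CacheRoot /var/cache/apache2/mod_disk_cache
            CacheDirLevels 2
            CacheDirLength 1
            # Enable the cache only for URL prefixes that hold static files
            CacheEnable disk /images
            CacheEnable disk /css
            CacheEnable disk /js
        </IfModule>
    </IfModule>
</VirtualHost>
```

With this layout the PHP pages outside the listed prefixes are not handled by the cache, so no CacheDisable line is needed for them.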

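Actually, even with PHP and MySQL, the pages can still be cached: the browser simply stores the content that was loaded in case it is requested again. Whether a response is cached depends on its headers rather than on the protocol. Pages accessed through HTTPS are treated more cautiously (some browsers keep them out of the disk cache unless the response is marked Cache-Control: public), but they are not excluded from caching altogether.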

PS: If you want an .htaccess speed-up script, here is the one I use on my site. YSlow gave it an A+ rating when I created it (meaning very optimized).

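# Remove ETags from all responses
Header unset ETag
FileETag None
# Compress text responses with gzip to save bandwidth
<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/text text/html text/plain text/xml text/css application/x-javascript application/javascript
</IfModule>
# Fixed expiry date for static files; it has to be moved forward before it passes
<FilesMatch "\\.(ico|pdf|jpg|jpeg|png|gif|swf)$">
    Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
</FilesMatch>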

This removes ETags, which cuts a little overhead from each response. It uses mod_deflate to gzip content (which uses less bandwidth). It sets the Expires header to a fixed date so that constant content like images and Flash (stuff that doesn’t need refreshing often) is served from the cache when available rather than being requested again. Hope that’s helpful. It might be small, but it works for me, and Yahoo’s speed testing tool (YSlow) gave it the thumbs up too. :)
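
One caveat with the fixed Expires date in the snippet above: once that date has passed, the header marks every matching file as already expired, so it has to be edited by hand from time to time. A minimal sketch of a relative alternative, assuming mod_expires is loaded (the 30-day lifetime is simply the value used earlier in the thread):

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    <FilesMatch "\.(ico|pdf|jpg|jpeg|png|gif|swf)$">
        # Relative lifetime: recalculated for every response,
        # so it never lapses the way a hard-coded date does
        ExpiresDefault "access plus 30 days"
    </FilesMatch>
</IfModule>
```

mod_expires then sends both an Expires header and a matching Cache-Control: max-age value, as in the response headers shown in the first post.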

As I understand it, pages whose content changes depending on, for example, the IP address should not be cached. So if the URL stays the same but the content differs when the user is logged in, the page should not be cached, right?
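
A possible way to express this in the configuration, sketched under the assumption that all user-dependent pages are generated by .php files and that mod_headers is available:

```apache
<IfModule mod_headers.c>
    <FilesMatch "\.php$">
        # "private": shared caches (proxies, mod_cache) must not store the page
        # "no-cache": the browser has to revalidate before reusing its copy
        Header set Cache-Control "private, no-cache"
    </FilesMatch>
</IfModule>
```

Static images, CSS and JS files do not match the pattern, so they keep the long lifetimes set elsewhere.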