How to Solve Caching Conundrums
The web wouldn’t operate without caching. Between you and the server, there is a browser and any number of proxy servers which cache responses. Much of this is handled transparently by applications which dramatically reduce Internet traffic. However, it can also be the cause of bizarre web application quirkiness if you’re not very careful …
Set Your Headers
Caching is controlled by the HTTP status code and the Cache-Control header returned with every response. For subsequent requests to the same URL, the browser/proxy will either:
- retrieve the previous data from its own cache
- ask the server to verify whether the data has changed, or
- make a fresh request.
The Cache-Control header primarily determines this action. It can contain up to three comma-separated values:
no-store or no-cache
no-store stops the browser and all proxy servers caching the returned data. Every request will therefore incur a trip back to the server.
The alternative is no-cache. The browser/proxy may store the response, but it must check with the server before reusing it. The server's original response includes a Last-Modified (date/time) and/or an ETag (response hash/checksum) header. The browser/proxy sends these values back on subsequent requests and, if the response has not changed, the server returns a 304 Not Modified status, which instructs the browser/proxy to use its own cached data. Otherwise, the new data is passed back with a 200 OK status.
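The revalidation round trip can be sketched in a few lines of Node.js. This is a minimal illustration rather than a production handler: the makeEtag and revalidate names are hypothetical, the ETag is assumed to be a SHA-1 hash of the body, and only the If-None-Match request header (the one browsers use to send an ETag back) is checked.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: derive an ETag from the response body (assumed SHA-1 hash).
function makeEtag(body) {
  const hash = crypto.createHash('sha1').update(body).digest('hex');
  return `"${hash}"`;
}

// Decide between 304 and 200 for a no-cache response.
// requestHeaders uses lowercase keys, as Node.js provides them.
function revalidate(requestHeaders, body) {
  const etag = makeEtag(body);
  const headers = { 'Cache-Control': 'no-cache', ETag: etag };

  if (requestHeaders['if-none-match'] === etag) {
    // Unchanged: the browser/proxy reuses its cached copy.
    return { status: 304, headers, body: '' };
  }
  // First request, or the data changed: send the full response.
  return { status: 200, headers, body };
}

const first = revalidate({}, '{"messages":3}');
const repeat = revalidate({ 'if-none-match': first.headers.ETag }, '{"messages":3}');
const changed = revalidate({ 'if-none-match': first.headers.ETag }, '{"messages":4}');
console.log(first.status, repeat.status, changed.status); // 200 304 200
```

Frameworks such as Express typically generate and compare ETags for you, so a hand-rolled check like this is mainly useful for understanding what happens on the wire.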
public or private
public means the response is the same for everyone and the data can be cached in browser or proxy stores. It’s the default behavior, so it’s not necessary to set it.
private responses are intended for a single user. For example, the URL https://myapp.com/messages returns a set of messages unique to each logged-in user, even though all of them use the same URL. Therefore, the browser can cache the response, but proxy server caching is not permitted.
max-age
This specifies the maximum time in seconds a response remains valid. For example, max-age=60 indicates the browser/proxy can cache the data for one minute before making a new request.
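To see how the three choices combine into one header value, here is a small sketch. The cacheControl function and its option names (revalidation, scope, maxAge) are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: assemble a Cache-Control value from the three choices above.
function cacheControl({ revalidation, scope, maxAge } = {}) {
  const parts = [];
  if (revalidation) parts.push(revalidation); // 'no-store' or 'no-cache'
  if (scope) parts.push(scope);               // 'public' or 'private'
  if (Number.isInteger(maxAge) && maxAge >= 0) {
    parts.push(`max-age=${maxAge}`);          // lifetime in seconds
  }
  return parts.join(', ');
}

const perUser = cacheControl({ scope: 'private', maxAge: 30 });
const never = cacheControl({ revalidation: 'no-store' });
console.log(perUser); // private, max-age=30
console.log(never);   // no-store
```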
Your server, language and framework often control these settings, so you rarely need to tinker — but you can. Presume you wanted to cache an individual user’s JSON response to an Ajax request for 30 seconds. In PHP:
header('Cache-Control: private,max-age=30');
echo json_encode($data);
or a Node.js/Express router:
res
  .set('Cache-Control', 'private,max-age=30')
  .json(data);
Differentiate Page and Ajax Data URLs
Setting HTTP headers may not be enough, because browsers work in slightly different ways when you hit the back button.
In Firefox and Safari, hitting back will attempt to show the previous page in its last known state, presuming the URL has been changed with an updated #hash or by intercepting actions with history API events.
In practice, it rarely matters which browser you’re using, but there are some weird edge cases. Presume your application presents a paginated table of records, which the user can search and click page navigation buttons. We’re good developers, so we’ll use progressive enhancement to ensure the system works in all browsers:
- The user enters the page at
- Submitting the form to change filters or navigate to a new page will change the URL and make a new request — for example,
- The Ajax request calls the same URL such as http://myapp.com/list/?search=bob&page=42, but sets the X-Requested-With HTTP header to
In summary, our server can either return HTML or JSON for the same URL, depending on the state of the request’s X-Requested-With header. Unfortunately, this can cause a problem with Chrome and Edge, because either HTML or JSON could be cached.
Presume you randomly navigate around the record list and, at http://myapp.com/list/?search=bob&page=42, you click a link to another (non-list) page, followed by the browser back button to return. Chrome looks at its cache, sees JSON data for that URL and presents it to the user! Hitting refresh will fix the problem because a request will be made without the X-Requested-With header. What’s more bizarre is that Firefox works as expected and restores the actual page’s state.
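The collision is easier to see with a toy model. The sketch below is a deliberate simplification, not a description of Chrome's internals: it assumes a cache keyed only by URL, a hypothetical server function that switches format on the X-Requested-With header, and the conventional header value XMLHttpRequest.

```javascript
// Toy cache keyed only by URL (an assumption made for illustration).
const cache = new Map();

// Hypothetical server: JSON for Ajax requests, HTML otherwise.
function server(url, headers) {
  return headers['X-Requested-With']
    ? { type: 'application/json', body: '{"rows":[]}' }
    : { type: 'text/html', body: '<table>...</table>' };
}

// Every response is stored under its URL, replacing whatever was there.
function request(url, headers = {}) {
  const response = server(url, headers);
  cache.set(url, response);
  return response;
}

// The back button reads straight from the cache.
function backButton(url) {
  return cache.get(url);
}

const url = 'http://myapp.com/list/?search=bob&page=42';
request(url);                                           // full page load: HTML cached
request(url, { 'X-Requested-With': 'XMLHttpRequest' }); // Ajax call: JSON replaces it
const shown = backButton(url);
console.log(shown.type); // application/json
```

Because both responses share one key, whichever arrived last is what the back button finds.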
The fix: ensure your page and data URLs are never the same. When navigating to http://myapp.com/list/?search=bob&page=42, the Ajax call should use a different URL: it can be as simple as http://myapp.com/list/?search=bob&page=42&ajax=1. This ensures Chrome can cache the HTML and JSON responses separately, but JSON is never presented, because the Ajax URL never appears within the browser address bar.
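One way to apply the fix is sketched below using the standard URL API. The toDataUrl and wantsJson function names are hypothetical; the ajax=1 parameter is the one suggested above, and the server is assumed to choose its format from that parameter instead of a request header.

```javascript
// Client side: derive the Ajax data URL from the page URL by adding ajax=1.
function toDataUrl(pageUrl) {
  const url = new URL(pageUrl);
  url.searchParams.set('ajax', '1');
  return url.toString();
}

// Server side: choose the format from the URL rather than a request header.
function wantsJson(requestUrl) {
  return new URL(requestUrl).searchParams.get('ajax') === '1';
}

const pageUrl = 'http://myapp.com/list/?search=bob&page=42';
const dataUrl = toDataUrl(pageUrl);
console.log(dataUrl);            // http://myapp.com/list/?search=bob&page=42&ajax=1
console.log(wantsJson(pageUrl)); // false
console.log(wantsJson(dataUrl)); // true
```

Since the two URLs differ, a URL-keyed cache stores the HTML and JSON under separate entries.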
Unfortunately, there’s a further complication …
Beware Self-signed SSL Certificates
Ideally, your application is using the encrypted HTTPS protocol. However, there’s no need to purchase SSL certificates for all 57 members of your team, because you can use a fake, self-signed certificate and click proceed whenever a browser complains.
Be aware that Chrome (and presumably most Blink-based browsers) refuses to cache page data when a fake certificate is encountered. It’s similar to setting no-store on every request.
Your test sites will work exactly as expected, and you’ll never experience the same page/data URL issues described above. The cache is never used, and all requests return to the server. The same application on a live server with a real SSL certificate will cache data. Your users may report seeing strange JSON responses from Chrome — which you won’t be able to reproduce locally.
These are the sorts of nightmare challenges that continue to plague web development! I hope you found this overview helpful. Feel free to share your own nightmares in the comments. We’re all in this together …