Offline Browsing in HTML5 with ApplicationCache

Offline browsing is becoming increasingly important to web developers and designers. Giving the user the ability to use your website offline has always been a goal, but one that was pretty difficult to reach, to say the least. As we move into the age of HTML5, however, this is changing and you can now take advantage of the ApplicationCache interface.

Using the application cache, you can specify which files the browser can cache and use when the user is offline. Your website will work as if the user is online, and best of all, they won’t notice any difference!

So, how do you specify which files the browser should cache? Well, this is defined in the cache manifest file.

The Cache Manifest

The cache manifest file lives inside your website, and it defines which files can be cached by the browser. By convention the cache manifest has an .appcache extension, and to use it, you need to reference it in the manifest attribute of the html tag of your webpage.

<!DOCTYPE HTML>
<html manifest="offline.appcache">

The catch is that this attribute must be included on every page you want cached. If it’s not there, the browser won’t cache that page unless the manifest lists it. The reverse also holds: any page that references the manifest is implicitly cached by the browser, even if you don’t include that page in the manifest file.

Cache Manifest Gotcha

Here’s something else to watch out for. For the web server to serve the cache manifest properly, a new MIME type of text/cache-manifest must be added for the .appcache extension; otherwise things won’t work as you may expect. To do this in IIS 7, select your website and click MIME Types.

Choose Add and enter the new MIME type.

This needs to be done before any files are to be cached by the browser.
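
If you’re not on IIS, the same rule applies to whatever serves your files: the manifest has to go out with a Content-Type of text/cache-manifest. As a rough illustration only, here is how a small hand-rolled JavaScript server might choose that header. The contentTypeFor helper and the extension table are assumptions made up for this sketch, not part of any framework.

```javascript
// Hypothetical lookup table. Only the .appcache entry matters for the
// application cache; the rest make the sketch usable on its own.
var MIME_TYPES = {
  '.appcache': 'text/cache-manifest',
  '.html': 'text/html',
  '.css': 'text/css',
  '.js': 'application/javascript',
  '.jpg': 'image/jpeg'
};

// Hypothetical helper: pick a Content-Type from the file extension,
// falling back to a generic binary type for anything unknown.
function contentTypeFor(requestPath) {
  var dot = requestPath.lastIndexOf('.');
  var ext = dot === -1 ? '' : requestPath.slice(dot).toLowerCase();
  return MIME_TYPES[ext] || 'application/octet-stream';
}
```

A server would then send contentTypeFor('/offline.appcache') as the Content-Type header when it responds with the manifest.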

The Cache Manifest Structure

The cache manifest is broken up into three sections:

  • cache – defines which resources the browser can cache
  • network – defines which resources require the user to be online
  • fallback – defines what to serve instead when a resource is unavailable offline

The minimum requirement for this file is the opening line CACHE MANIFEST. This is the only required line; the three sections are optional. Cache size limits vary by browser, but around 5MB is typical, which, when you think about it, is quite a lot for a website. Here’s a complete cache manifest file.

CACHE MANIFEST
# Created on 8 October 2011

CACHE:
site.css
site.js

NETWORK:
login.aspx
currency.aspx

# FALLBACK entries are URL prefixes, not wildcard patterns
# offline.jpg will replace any image under site/images that cannot be loaded
# offline.html will replace any other page that cannot be found
FALLBACK:
site/images images/offline.jpg
/ offline.html

The manifest file is not hard to understand. Lines beginning with # are comments and will be ignored by the browser. Each section tells the browser what can be cached, what can’t be cached and what to do when a resource can’t be found. These sections can be listed in any order.
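
To make the structure concrete, here is a rough sketch of how a manifest could be split into its three sections in JavaScript. This is not the browser’s real parser; parseManifest is a hypothetical helper written for this illustration, and it skips details such as URL resolution.

```javascript
// Sketch only: split manifest text into CACHE, NETWORK and FALLBACK lists.
function parseManifest(text) {
  var lines = text.split(/\r?\n/);
  if (lines[0].trim() !== 'CACHE MANIFEST') {
    throw new Error('First line must be CACHE MANIFEST');
  }
  var sections = { CACHE: [], NETWORK: [], FALLBACK: [] };
  var current = 'CACHE'; // entries before any section header are cached
  lines.slice(1).forEach(function (raw) {
    var line = raw.trim();
    // Blank lines and comments carry no entries.
    if (line === '' || line.charAt(0) === '#') { return; }
    var header = line.replace(/:$/, '');
    if (line.slice(-1) === ':' && sections.hasOwnProperty(header)) {
      current = header; // a section header switches the active section
    } else if (current === 'FALLBACK') {
      // Fallback lines are pairs: a URL prefix and its substitute.
      var parts = line.split(/\s+/);
      sections.FALLBACK.push({ from: parts[0], to: parts[1] });
    } else {
      sections[current].push(line);
    }
  });
  return sections;
}
```

Feeding it the manifest above would give you two CACHE entries, two NETWORK entries and two FALLBACK pairs, whatever order the sections appear in.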

Before moving on, there’s one piece of information to remember at this point – and it’s very important. If one resource fails to download, the entire cache process fails. It’s all or nothing. If this happens, the browser will fall back to the old cached files.

Bear that in mind.

Update the Application Cache

Caching resources improves performance, but it can also mean the resources are not current. If a resource is updated on the website, for example, the browser keeps serving the cached copy until one of the following occurs:

  • the cache manifest file has changed
  • the user clears their temporary internet files
  • the application cache is programmatically updated

It’s a good idea to keep a version number in a comment in the manifest file. Changing that number when you deploy changes to the website changes the manifest itself, so the old cached resources are removed and new ones are downloaded and cached.
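
One way to keep that version number fresh is to let a build step rewrite it for you. The sketch below assumes a "# version ..." comment convention and a hypothetical stampManifest helper; neither is part of the application cache itself, so adapt them to your own deployment.

```javascript
// Hypothetical build-step helper: replace (or add) a "# version ..."
// comment so the manifest text changes on every deploy.
function stampManifest(manifestText, version) {
  var stamp = '# version ' + version;
  var versionLine = /^# version .*$/m;
  if (versionLine.test(manifestText)) {
    // Use a function so "$" in the version is never treated specially.
    return manifestText.replace(versionLine, function () { return stamp; });
  }
  // No version comment yet: add one right after the CACHE MANIFEST line.
  return manifestText.replace(/^CACHE MANIFEST[^\n]*/, function (first) {
    return first + '\n' + stamp;
  });
}
```

Because the comment is part of the file, stamping a new version is enough to make the browser treat the manifest as changed.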

Application Cache and JavaScript

The application cache has many events that fire during the caching process. I really can’t think of too many occasions when you’d want to hook into these events, apart from manually forcing the cache to refresh, or writing a demo for a presentation. Nevertheless, here they are:

  • onchecking – the user agent is checking for an update, or attempting to download the manifest for the first time.
  • onnoupdate – the manifest file has not changed.
  • ondownloading – the user agent has found an update and is fetching it, or is downloading the resources listed in the manifest for the first time.
  • onprogress – the user agent is downloading the resources listed in the manifest.
  • oncached – the download is complete and the resources are cached.
  • onupdateready – the resources have been downloaded again and the new cache can be activated by calling swapCache.
  • onobsolete – the manifest is either a 404 or 410 page, so the application cache gets deleted.
  • onerror – caused by a number of things: the manifest is a 404 or 410 page, a resource failed to download, or the manifest changed while an update was being run.

Creating event handlers is a piece of cake.

var appCache = window.applicationCache;

// Log each application cache event as it fires.
function logEvent(e) {
  console.log(e.type, e);
}

// Pass the event as its own argument so the console shows its details
// rather than the string "[object Event]".
function logError(e) {
  console.log('error', e);
}

appCache.addEventListener('cached', logEvent, false);
appCache.addEventListener('checking', logEvent, false);
appCache.addEventListener('downloading', logEvent, false);
appCache.addEventListener('error', logError, false);
appCache.addEventListener('noupdate', logEvent, false);
appCache.addEventListener('obsolete', logEvent, false);
appCache.addEventListener('progress', logEvent, false);
appCache.addEventListener('updateready', logEvent, false);

If you want to refresh the page for the user when a new version of the cache has been downloaded, you can add some extra code to the updateready event to handle that.

appCache.addEventListener('updateready', function (e) {
  appCache.swapCache();
  window.location.reload();
}, false);
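
If you’d rather be a little more defensive, or want to trigger the check yourself, something along these lines may help. It is only a sketch: applyUpdate and checkForUpdate are hypothetical helpers, cache stands for window.applicationCache, and reload is a callback you supply. The status, UPDATEREADY, swapCache and update members are the ones the ApplicationCache interface exposes.

```javascript
// Sketch: only swap when the cache reports UPDATEREADY, and let the
// caller decide how to reload.
function applyUpdate(cache, reload) {
  if (cache.status !== cache.UPDATEREADY) {
    return false; // nothing new has been downloaded
  }
  cache.swapCache();
  reload();
  return true;
}

// Sketch: ask the browser to re-check the manifest. update() throws
// when the page has no application cache to update.
function checkForUpdate(cache) {
  try {
    cache.update();
    return true;
  } catch (err) {
    return false;
  }
}

// In a page you might wire them up like this:
// appCache.addEventListener('updateready', function () {
//   applyUpdate(appCache, function () { window.location.reload(); });
// }, false);
```

Passing the cache and the reload callback in as arguments keeps the helpers easy to try out with a stand-in object, which is handy given how hard the real thing is to inspect.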

As always you can check out the complete API reference here.

This is one area of HTML5 that is a game changer for me. I recommend you come to know it, love it and use it.

  • Tom Ang

    How do you debug this when it doesn’t work? I followed your instructions (it’s not that hard), but the built in android browser gets the page when the wireless is enabled and gives a page not found error when the wireless is off. Shouldn’t visiting the page once with the wireless on have created a cached version? And shouldn’t that version have been displayed when the wireless was off?

  • Tom Ang

    I can see the /offline-1.appcache file being served out. I thought maybe you had to explicitly list all of the allowed files under the CACHE: line but that isn’t doing it either.

  • Malcolm Sheridan

    @Tom

    Debugging this is hard and next to impossible. You can’t see what’s in the cache, so to debug it the best thing I found was to clear the cache and hook into the JavaScript events.

    Yes visiting the site when you had connectivity should have cached the resources. Yes that version should be displayed because you’re offline. I didn’t test this on a mobile browser, so if you find anything, let me know.

  • http://orbital.co.nz Johan

    Nice article Malcolm. Got it working following your steps.

    @Tom Ang: easiest is to use your desktop browser to debug. Chrome displays manifest processing in the console and flags any errors so really easy to see what is going on.

    To clear appcache in Chrome use url chrome://appcache-internals which will list all caches and allows you to delete them individually.

  • Tom Ang

    Strange. I can see the process working by fetching a page, turning off the Apache web server and then refreshing the page. For my test page, I have both text and an image and only the text is cached and redisplayed. But I am using Chrome 14.0.835.202 m Windows and chrome://appcache-internals is claiming that there are no caches to display.

    I don’t think the built in web browser (based on WebKit) on my DroidX is fancy enough yet to do any of this. When I repeat the test turning the wifi off, I just get a “you can’t get there from here error message”.

    Also Dolphin Browser HD 6.2.0 is not honoring it either (although it does ask me about clearing the cache on exit).

    But I’m using Android 2.3.3. It might be fun to try with the emulator and the latest SDK to see if the built in browser in Android 3.2 (SDK 13). or ? (SDK 14) is better.

    This is a great feature and will surely get more notice by web and browser developers. Thanks for the article, by the way.

  • Nelson Monteiro

    Hi,

    The main problem for me using cache manifest is with Firefox. As it asks permission to save data – “This website (websiteurl.com) is asking to store data on your computer for offline use” – cache manifest will not work if you choose “Never for this Site”, and if someone chooses “Not Now” it can be annoying for them to return to the website and get the message again.
    I’m not using Offline Appcache because of that.

    What do you think?

    best regards

  • http://orbital.co.nz Johan

    Mobile:

    I tested on iOS Safari and works seamlessly. You can also see cache and size in the Safari Settings > Advanced where you can delete individually.

    Tested on Android (think it is 2.1) and also worked fine.

    Yes the warning/confirmation in Firefox is not ideal.

  • vince

    I still don’t understand how this enables offline access, doesn’t the browser make a request when you open it from a bookmark a second time?

  • Gaurav Chandra

    One thing I came across which needs to be specified here is that, in Google Chrome, all the html or php pages which have the manifest attribute get cached, irrespective of whether they are included in the cache manifest or not. I learned it the hard way and I can’t seem to find a workaround for it. The sitepoint book also does not mention this.