
Cache Fetched AJAX Requests Locally: Wrapping the Fetch API

By Peter Bengtsson

This article is by guest author Peter Bengtsson. SitePoint guest posts aim to bring you engaging content from prominent writers and speakers of the JavaScript community.

This article demonstrates how to implement a local cache of fetched requests, so that repeated requests for the same resource are read from session storage instead of going over the network. The advantage of this is that you don’t need to have custom code for each resource you want cached.

Follow along if you want to look really cool at your next JavaScript dinner party, where you can show off various skills of juggling promises, state-of-the-art APIs and local storage.

The Fetch API

At this point you’re hopefully familiar with fetch. It’s a new native API in browsers to replace the old XMLHttpRequest API.

Where it hasn’t been perfectly implemented in all browsers, you can use GitHub’s fetch polyfill (and if you have nothing to do all day, here’s the Fetch Standard spec).

The Naïve Alternative

Suppose you know exactly which one resource you need to download and only want to download it once. You could use a global variable as your cache, something like this:

let origin = null
fetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(information => {
    origin = information.origin  // your client's IP
  })

// need to delay to make sure the fetch has finished
setTimeout(() => {
  console.log('Your origin is ' + origin)
}, 3000)

That just relies on a global variable to hold the cached data. The immediate problem is that the cached data goes away if you reload the page or navigate to some new page.

Let’s upgrade our first naive solution before we dissect its shortcomings.

fetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    sessionStorage.setItem('information', JSON.stringify(info))
  })

// need to delay to make sure the fetch has finished
setTimeout(() => {
  let info = JSON.parse(sessionStorage.getItem('information'))
  console.log('Your origin is ' + info.origin)
}, 3000)

The first and immediate problem is that fetch is promise-based, meaning we can’t know for sure when it has finished just by waiting a fixed amount of time. To be certain, we should only use its result once its promise resolves.
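
A minimal sketch of that idea: instead of guessing a delay with setTimeout, hang the dependent code off the promise itself. The describeOrigin name and the fetchImpl parameter are made up for this illustration; the parameter only exists so the snippet can be tried with a stand-in for fetch.

```javascript
// Hypothetical helper: resolves to the message once the data has arrived.
// `fetchImpl` defaults to the global fetch; it is a parameter only so that
// a stand-in can be passed when trying this without a network.
const describeOrigin = (fetchImpl = fetch) =>
  fetchImpl('https://httpbin.org/get')
    .then(r => r.json())
    .then(info => 'Your origin is ' + info.origin)

// No setTimeout needed: the callback runs only after the fetch has finished.
// describeOrigin().then(message => console.log(message))
```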

The second problem is that this solution is very specific to a particular URL and a particular piece of cached data (the key 'information' in this example). What we want is a generic solution that is based on the URL instead.

First Implementation – Keeping It Simple

Let’s put a wrapper around fetch that also returns a promise. The code that calls it probably doesn’t care if the result came from the network or if it came from the local cache.

So imagine you used to do this:

fetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

And now you want to wrap that, so that repeated network calls can benefit from a local cache. Let’s simply call it cachedFetch instead, so the code looks like this:

cachedFetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

The first time that’s run, it needs to resolve the request over the network and store the result in the cache. The second time it should draw directly from the local storage.

Let’s start with the code that simply wraps the fetch function:

const cachedFetch = (url, options) => {
  return fetch(url, options)
}

This works, but is useless, of course. Let’s implement the storing of the fetched data to start with.

const cachedFetch = (url, options) => {
  // Use the URL as the cache key to sessionStorage
  let cacheKey = url
  return fetch(url, options).then(response => {
    // let's only store in cache if the request succeeded and
    // the content-type is JSON or something non-binary
    if (response.status === 200) {
      let ct = response.headers.get('Content-Type')
      if (ct && (ct.match(/application\/json/i) || ct.match(/text\//i))) {
        // There is a .json() instead of .text() but
        // we're going to store it in sessionStorage as
        // string anyway.
        // If we don't clone the response, it will be
        // consumed by the time it's returned. This
        // way we're being un-intrusive.
        response.clone().text().then(content => {
          sessionStorage.setItem(cacheKey, content)
        })
      }
    }
    return response
  })
}

There’s quite a lot going on here.

The first promise returned by fetch actually goes ahead and makes the GET request. If there are problems with CORS (Cross-Origin Resource Sharing) the .text(), .json() or .blob() methods won’t work.

The most interesting feature is that we have to clone the Response object returned by the first promise. If we don’t do that, we’re injecting ourselves too much and when the final user of the promise tries to call .json() (for example) they’ll get this error:

TypeError: Body has already been consumed.
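
To see why the clone matters, here is a small sketch that reads the same response twice: once from a clone (as the cache does) and once from the original (as the caller does). It builds a Response by hand rather than fetching one, and readTwice is a made-up name. It assumes an environment where Response is a global, such as a browser or a recent version of Node.

```javascript
// Hypothetical demo: one body, two readers, thanks to clone().
const readTwice = async () => {
  const response = new Response('{"origin": "127.0.0.1"}')
  // Read a copy for the cache; the original body stays untouched
  const forCache = await response.clone().text()
  // The caller can still consume the original response
  const forCaller = await response.json()
  return { forCache, forCaller, bodyUsed: response.bodyUsed }
}

// readTwice().then(result => console.log(result))
```

Replace response.clone().text() with response.text() and the later response.json() call rejects with a TypeError, because the body can only be read once.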

The other thing to notice is the care taken over the response type: we only store the response if the status code is 200 and if the content type is application/json or text/*. This is because sessionStorage can only store strings.
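
The content-type test can be pulled out into a small predicate, sketched here under the hypothetical name isTextContentType. It uses the same two regular expressions as the code above and treats a missing header (null) as not cacheable.

```javascript
// Hypothetical helper: true for JSON and text/* content types only.
const isTextContentType = ct =>
  Boolean(ct) && (/application\/json/i.test(ct) || /text\//i.test(ct))

isTextContentType('application/json; charset=utf-8') // true
isTextContentType('text/html')                       // true
isTextContentType('image/png')                       // false
isTextContentType(null)                              // false (no header)
```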

Here’s an example of using this:

cachedFetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

cachedFetch('https://httpbin.org/html')
  .then(r => r.text())
  .then(html => {
    console.log('Document has ' + (html.match(/<p>/g) || []).length + ' paragraphs')
  })

cachedFetch('https://httpbin.org/image/png')
  .then(r => r.blob())
  .then(image => {
    console.log('Image is ' + image.size + ' bytes')
  })

What’s neat about this solution so far is that it works, without interfering, for both JSON and HTML requests. And when it’s an image, it does not attempt to store that in sessionStorage.

Second Implementation – Actually Return Cache Hits

So our first implementation just takes care of storing the responses of requests. But if you call cachedFetch a second time, it doesn’t yet bother to try to retrieve anything from sessionStorage. What we need to do is return a promise, and that promise needs to resolve to a Response object.

Let’s start with a very basic implementation:

const cachedFetch = (url, options) => {
  // Use the URL as the cache key to sessionStorage
  let cacheKey = url

  // START new cache HIT code
  let cached = sessionStorage.getItem(cacheKey)
  if (cached !== null) {
    // it was in sessionStorage! Yay!
    let response = new Response(new Blob([cached]))
    return Promise.resolve(response)
  }
  // END new cache HIT code

  return fetch(url, options).then(response => {
    // let's only store in cache if the content-type is
    // JSON or something non-binary
    if (response.status === 200) {
      let ct = response.headers.get('Content-Type')
      if (ct && (ct.match(/application\/json/i) || ct.match(/text\//i))) {
        // There is a .json() instead of .text() but
        // we're going to store it in sessionStorage as
        // string anyway.
        // If we don't clone the response, it will be
        // consumed by the time it's returned. This
        // way we're being un-intrusive.
        response.clone().text().then(content => {
          sessionStorage.setItem(cacheKey, content)
        })
      }
    }
    return response
  })
}

On CodePen

And it just works!

To see it in action, open the CodePen for this code and once you’re there open your browser’s Network tab in the developer tools. Press the “Run” button (top-right-ish corner of CodePen) a couple of times and you should see that only the image is being repeatedly requested over the network.

One thing that is neat about this solution is the lack of “callback spaghetti”. Since the sessionStorage.getItem call is synchronous (aka. blocking), we don’t have to deal with “Was it in the local storage?” inside a promise or callback. And only if there was something there, do we return the cached result. If not, the if statement just carries on to the regular code.
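
The cache-hit branch boils down to wrapping a stored string in a new Response. A minimal sketch of just that step, assuming Response and Blob are available as globals (browsers and recent versions of Node); responseFromCache and demo are hypothetical names:

```javascript
// Hypothetical helper: wrap a cached string in a Response so callers can
// keep using .json() or .text() as if it came from the network.
const responseFromCache = cached => new Response(new Blob([cached]))

const demo = async () => {
  const response = responseFromCache('{"origin": "127.0.0.1"}')
  const info = await response.json()
  // A hand-built Response defaults to status 200
  return { status: response.status, origin: info.origin }
}

// demo().then(result => console.log(result))
```

One thing to keep in mind is that the rebuilt response doesn’t carry the original response’s headers, so code that inspects response.headers can tell a cache hit from a network response.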

Third Implementation – What About Expiry Times?

So far we’ve been using sessionStorage, which is just like localStorage except that it is scoped to a single tab: a freshly opened tab starts with an empty sessionStorage, and the data is wiped when the tab is closed. That means we’re riding a “natural way” of not caching things too long. If we were to use localStorage instead and cache something, it’d simply get stuck there “forever” even if the remote content has changed. And that’s bad.

A better solution is to give the user control instead. (The user in this case is the web developer using our cachedFetch function). Like with storage such as Memcached or Redis on the server side, you set a lifetime specifying how long it should be cached.

For example, in Python (with Flask):

>>> from werkzeug.contrib.cache import MemcachedCache
>>> cache = MemcachedCache(['127.0.0.1:11211'])
>>> cache.set('key', 'value', 10)
True
>>> cache.get('key')
'value'
>>> # waiting 10 seconds
...
>>> cache.get('key')
>>>

Now, neither sessionStorage nor localStorage has this functionality built-in, so we have to implement it manually. We’ll do that by always taking note of the timestamp at the time of storing and use that to compare on a possible cache hit.
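
The timestamp comparison itself is small enough to sketch on its own. The isFresh name and its now parameter are made up for illustration; now only exists so the check can be tried with fixed times.

```javascript
// Hypothetical helper: is an entry stored at `whenCached` (milliseconds,
// possibly a string as returned by getItem) younger than `expirySeconds`?
const isFresh = (whenCached, expirySeconds, now = Date.now()) => {
  if (whenCached === null) return false // nothing was stored
  const ageSeconds = (now - Number(whenCached)) / 1000
  return ageSeconds < expirySeconds
}

isFresh('1000', 5, 3000) // true: the entry is 2 seconds old
isFresh('1000', 5, 7000) // false: the entry is 6 seconds old
```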

But before we do that, how is this going to look? How about something like this:

// Use a default expiry time, like 5 minutes
cachedFetch('https://httpbin.org/get')
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

// Instead of passing options to `fetch` we pass an integer which is seconds
cachedFetch('https://httpbin.org/get', 2 * 60)  // 2 min
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

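// Combined with fetch's options object, using a custom `seconds` key.
// (mode is 'cors' because httpbin.org is a different origin;
// 'same-origin' would make this request fail.)
let init = {
  mode: 'cors',
  seconds: 3 * 60 // 3 minutes
}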
cachedFetch('https://httpbin.org/get', init)
  .then(r => r.json())
  .then(info => {
    console.log('Your origin is ' + info.origin)
  })

The crucial new thing we’re going to add is that every time we save the response data, we also record when we stored it. But note that now we can also switch to the braver storage of localStorage instead of sessionStorage. Our custom expiry code will make sure we don’t get horribly stale cache hits in the otherwise persistent localStorage.

So here’s our final working solution:

const cachedFetch = (url, options) => {
  let expiry = 5 * 60 // 5 min default
  if (typeof options === 'number') {
    expiry = options
    options = undefined
  } else if (typeof options === 'object') {
    // I hope you didn't set it to 0 seconds
    expiry = options.seconds || expiry
  }
  // Use the URL as the cache key to localStorage
  let cacheKey = url
  let cached = localStorage.getItem(cacheKey)
  let whenCached = localStorage.getItem(cacheKey + ':ts')
  if (cached !== null && whenCached !== null) {
    // it was in localStorage! Yay!
    // Even though 'whenCached' is a string, this operation
    // works because the subtraction coerces the
    // string to a number.
    let age = (Date.now() - whenCached) / 1000
    if (age < expiry) {
      let response = new Response(new Blob([cached]))
      return Promise.resolve(response)
    } else {
      // We need to clean up this old key
      localStorage.removeItem(cacheKey)
      localStorage.removeItem(cacheKey + ':ts')
    }
  }

  return fetch(url, options).then(response => {
    // let's only store in cache if the content-type is
    // JSON or something non-binary
    if (response.status === 200) {
      let ct = response.headers.get('Content-Type')
      if (ct && (ct.match(/application\/json/i) || ct.match(/text\//i))) {
        // There is a .json() instead of .text() but
        // we're going to store it in localStorage as
        // string anyway.
        // If we don't clone the response, it will be
        // consumed by the time it's returned. This
        // way we're being un-intrusive.
        response.clone().text().then(content => {
          localStorage.setItem(cacheKey, content)
          localStorage.setItem(cacheKey+':ts', Date.now())
        })
      }
    }
    return response
  })
}

On CodePen
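
As readers point out in the comments below, a cache like this should only concern itself with GET requests. A minimal sketch of such a guard, assuming the method is passed in fetch’s options object (isCacheableRequest is a hypothetical name, not part of the code above):

```javascript
// Hypothetical guard: only GET requests should be served from the cache
// or stored in it. fetch defaults to GET when no method is given.
const isCacheableRequest = options => {
  const method = (options && options.method) || 'GET'
  return String(method).toUpperCase() === 'GET'
}

// Possible use at the top of cachedFetch, before touching localStorage:
//   if (!isCacheableRequest(options)) return fetch(url, options)
```

Because the check runs before any storage access, a POST or PUT goes straight to the network and never reads or writes a cached entry.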

Future Implementation – Better, Fancier, Cooler

Not only are we avoiding hitting those web APIs excessively, the best part is that localStorage is far faster than going over the network. See this blog post for a comparison of localStorage versus XHR: localForage vs. XHR. It measures other things but basically concludes that localStorage is really fast and disk-cache warm-ups are rare.

So how could we further improve our solution?

Dealing with binary responses

Our implementation here doesn’t bother caching non-text things, like images, but there’s no reason it can’t. We would need a bit more code. In particular, we would probably want to store more information about the Blob. Every response is basically a Blob. For text and JSON, the content is just a string, and the type and size don’t really matter because you can figure them out from the string itself. For binary content, the blob has to be converted to an ArrayBuffer and then encoded as a string before it can be stored.

For the curious, to see an extension of our implementation that supports images, check out this CodePen.
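
One possible way to make binary content storable (an assumption here, not necessarily what the linked CodePen does) is to turn the ArrayBuffer into a base64 string, since Web Storage only holds strings. The helper names below are hypothetical; btoa and atob are assumed to be available as globals, as they are in browsers.

```javascript
// Hypothetical helpers: ArrayBuffer <-> base64 string.
const bufferToBase64 = buffer => {
  const bytes = new Uint8Array(buffer)
  let binary = ''
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]) // one character per byte
  }
  return btoa(binary)
}

const base64ToBuffer = base64 => {
  const binary = atob(base64)
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i)
  }
  return bytes.buffer
}

// In cachedFetch, something along these lines could replace .text():
//   response.clone().arrayBuffer().then(buf =>
//     localStorage.setItem(cacheKey, bufferToBase64(buf)))
```

Base64 makes the stored value roughly a third larger than the original bytes, which matters given how limited localStorage space is.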

Using hashed cache keys

Another potential improvement is to save space, at the cost of a little computation, by hashing every URL (which is what we used as the key) to something much smaller. In the examples above we’ve been using just a handful of really small and neat URLs (e.g. https://httpbin.org/get) but if you have really large URLs with lots of query string thingies and you have lots of them, it can really add up.

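A solution to this is to use this neat algorithm, which is fast and produces a 32-bit integer (it’s the same scheme as Java’s String.hashCode). Keep in mind that a 32-bit hash isn’t collision-free, so two different URLs can in principle end up with the same key: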

const hashstr = s => {
  let hash = 0;
  if (s.length == 0) return hash;
  for (let i = 0; i < s.length; i++) {
    let char = s.charCodeAt(i);
    hash = ((hash<<5)-hash)+char;
    hash = hash & hash; // Convert to 32bit integer
  }
  return hash;
}

If you like this, check out this CodePen. If you inspect the storage in your web console you’ll see keys like 557027443.
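
Wiring the hash into the cache key could look like the sketch below. The cacheKeyFor helper and the 'cachedFetch:' prefix are assumptions for illustration; the prefix just keeps these entries apart from anything else the page puts in storage.

```javascript
// Same hash function as above, repeated so the snippet stands alone
const hashstr = s => {
  let hash = 0
  for (let i = 0; i < s.length; i++) {
    hash = ((hash << 5) - hash) + s.charCodeAt(i)
    hash = hash & hash // Convert to 32bit integer
  }
  return hash
}

// Hypothetical helper: a short, namespaced storage key for a URL.
// The hash is a signed integer, so the key may contain a minus sign.
const cacheKeyFor = url => 'cachedFetch:' + hashstr(url)

// In cachedFetch this would replace `let cacheKey = url`:
//   let cacheKey = cacheKeyFor(url)
```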

Conclusion

You now have a working solution you can stick into your web apps, where perhaps you’re consuming a web API and you know the responses can be pretty well cached for your users.

One last thing that might be a natural extension of this prototype is to take it beyond an article and into a real, concrete project, with tests and a README, and publish it on npm – but that’s for another time!

  • M S i N Lund

    “The first an immediate problem is that fetch is promise-based, meaning we can’t know for sure when it has finished”

    So why not just use XMLHttpRequest then?

    What is the point of fetch, if it cant do what XMLHttpRequest is already doing?

    Don’t get me wrong, I LOVE learning all these new syntaxes to do the same old things as i have been doing for years, as much as the next guy….

    But to do less? Nope.

    • Sean McLellan

      Unfortunately, that’s probably a very poorly worded paragraph. We absolutely know when a Promise returned by a proper implementation of fetch is complete, it’s when it calls its success callback. There seems to be a general tone of favoring synchronous code over async.

      There are also a few other things fun with this post 1) fetch (and XMLHttpRequest) uses browser caching based on the response header anyway, so beware double-caching and race conditions 2) the implementation’s behavior doesn’t change based on request method supplied in options — rarely would one want to cache the response of, say, a POST or PUT request. I guess one response would be “don’t use cachedFetch in those scenarios” but why use this impl at all?

      • http://www.peterbe.com Peter Bengtsson

        The point about knowing when the promise returns is that once you introduce an async bit you need to do your next action (e.g. storing the result in a cache) in a callback thing. The warm-up code does something with the data after a little bit of time has passed. Clearly you won’t do that when you build a real application.

        There are times when the browser doesn’t bother to send If-Modified-Since. e.g. a browser refresh. And if the server doesn’t send cache headers appropriately the browser would go out on the network.

        For example, you’re writing an app that talks to an expensive resource and you know that once queried it doesn’t need to be queried for a long time but you want that to persist across tabs or over time.

  • Aleksander Tatarczyk

    There’s also kinda cool library that does caching and batches requests, too – https://github.com/facebook/dataloader

    • http://www.peterbe.com Peter Bengtsson

      Cool! Didn’t know about it. Lumping requests together is hard though. I can see how it can be useful if you have, for example, a proxy server that does the actual fetching of the URLs you really want.

  • hacke2

    nice

  • http://non.co.il Yoni Jah

    You should probably make sure you only cache GET methods and let any other Request method thru

    • http://www.peterbe.com Peter Bengtsson

      Excellent point! I had that originally but it must have slipped out. Annoying.
      Anyway, it’s true. It should only concern itself with GET requests.
