Why The data: URI Scheme Could Help Save Your Slow Site

Whenever you load a web page, your web browser initiates a number of network connections. If any one of these connections times out, you’re going to be looking at your browser’s load progress indicator for a frustratingly long period of time. This is why reducing the number of HTTP requests a browser needs to make to finish loading a page is a fundamental step in optimizing web sites for speed. One (perhaps underutilized) way to do this is by embedding binary data such as images for oft-used icons inline with other assets using the data: URI scheme.

Send it data: direct

On the average web page, the most common kind of external resource is an image file. By using techniques like CSS sprites, which combine many images into a single physical file and then use CSS to display only a portion of the image at a time, you can greatly reduce the number of external resources your web pages call. However, these techniques all still require at least one additional HTTP GET request, along with its overhead and possible new TCP connection.

For the ultimate in performance optimization, we can get rid of even that HTTP request by embedding the image (or other external resource) directly into an HTML or CSS (or other kind of) file by using the data: URI scheme. The data: URI scheme has been defined by the IETF’s RFC 2397 since 1998, though to date only some of the major browsers support it. Specifically, Gecko- and WebKit-based browsers such as Firefox, Safari, and Chrome, as well as Opera, support its use, but Internet Explorer 5–7 do not. Internet Explorer 8, however, is said to finally add some support for it, along with “CSS tables.”

When implementations of the data: URI scheme are finally ubiquitous, web developers—and more likely framework authors—can begin to use it to enhance the performance of their front-end code. Here’s how to do it.

data:’s dirty details

The syntax of the data: scheme looks like this:

data:[<MIME-type>][;encoding],<data>

In other words, an inline resource begins with the string literal data:. Immediately following that is an optional MIME type and a similarly optional encoding. If the encoding is specified, it is preceded by a semicolon (;); the only encoding RFC 2397 defines is base64. Omitting the MIME type makes it default to text/plain;charset=US-ASCII, and omitting the encoding means the data is read as URL-encoded (percent-escaped) text rather than base64. Then, finally, a comma (,) separates these properties from the actual data, which is placed at the end.
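To make those pieces concrete, here is a minimal Python sketch that splits a data: URI into its parts. The function name parse_data_uri is made up for this illustration. It assumes that base64 is the only encoding token that will appear and that any other payload is percent-escaped text. It does not handle every corner of RFC 2397, such as a bare ;charset= parameter with no MIME type.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri):
    """Split a data: URI into (media_type, is_base64, raw_bytes).

    Hypothetical helper for illustration only.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")

    # Everything before the first comma is metadata; the rest is the data.
    header, comma, payload = uri[len("data:"):].partition(",")
    if not comma:
        raise ValueError("data: URI has no comma separator")

    # The optional encoding token comes last, preceded by a semicolon.
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[:-len(";base64")]

    # An omitted MIME type falls back to the default.
    media_type = header or "text/plain;charset=US-ASCII"

    if is_base64:
        raw = base64.b64decode(payload)
    else:
        raw = unquote_to_bytes(payload)  # undo %XX escapes
    return media_type, is_base64, raw
```

For example, parse_data_uri("data:,Hello%2C%20World") falls back to the default MIME type and returns the bytes for "Hello, World".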

Since it’s just a good ol’ fashioned URI, you can use an inlined data: URI anywhere you can put a URI reference in a document. That means you can use one in the <img> element’s src attribute, inside CSS url() values, and more. With the rumored notable exception of IE8, you can even embed any other kind of binary data, not just images. (IE will likely restrict the kinds of binary data it supports inside of a data: URI for security purposes.)

Here’s an excerpt of what embedding a PNG image might look like inside a CSS style sheet. In this example, we set a list item’s list-style-image property to display a custom image in the list item’s marker box. (Of course, the encoded data that begins with iVBORw… in the code below would be longer in order to supply the full image.)


ul li { list-style-image: url(data:image/png;base64,iVBORw…); }

Getting data: from your images

A number of freely available tools will construct part or all of a data: URI for you.

  • The DataURLMaker is a web form that takes an image file as input and produces an HTML <img> element whose src attribute is the data: URI you’ll want.
  • Ian Hickson also hosts a tool called the data: URI kitchen that does much the same thing.

Naturally, if you’ve got access to a server-side programming language, you can also generate these URIs on the fly yourself. Wikipedia has an example of running an image file through the PHP base64_encode() function to produce a data: URI. Other languages have similar functions.
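As a rough sketch of the same idea in Python rather than PHP, the snippet below base64-encodes some bytes and wraps them in a data: URI. The helper names make_data_uri and file_to_data_uri are invented for this example. Guessing the MIME type from the file extension is an assumption that suits common image formats but may not suit every file.

```python
import base64
import mimetypes


def make_data_uri(data, media_type):
    """Build a base64-encoded data: URI from raw bytes (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def file_to_data_uri(path):
    """Read a file and return it as a data: URI (hypothetical helper).

    The MIME type is guessed from the file extension; unknown
    extensions fall back to a generic binary type.
    """
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None:
        media_type = "application/octet-stream"
    with open(path, "rb") as f:
        return make_data_uri(f.read(), media_type)
```

Every PNG file starts with the same eight signature bytes, and make_data_uri(b"\x89PNG\r\n\x1a\n", "image/png") returns data:image/png;base64,iVBORw0KGgo=. That is why the encoded data of an embedded PNG always begins with iVBORw.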

Pros and cons: to data: or not to data:

Most web sites will probably not need this kind of supreme optimization. There are also maintainability problems associated with hardcoding resources as binary data into your web pages, of course, so there is a tradeoff. However, I can foresee many cases where this can still come in handy.

One such case is perhaps in creating a CSS framework. Such client-side frameworks are intended to be cacheable, and can include a standardized graphical vocabulary. Such visual vocabularies are already commonplace on the Web today, as in the case of icons for things like folders on a filesystem, Creative Commons licenses, or the RSS feed symbol. In these instances where change is unlikely or slow, maintainability may not be such a big issue. Therefore, directly encoding binary assets like these images into certain style sheets may make a lot of sense from a performance standpoint.

Another mitigating factor is the ability to post-process your source files, as is often done when minifying JavaScript. The up-front effort of these extra build steps may not seem worth it, but they are exactly the kinds of things forward-thinking developers should be doing to give their future selves a helpful hand.
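One way such a build step might look is sketched below in Python: it scans a style sheet for url() references to local files and swaps each one for an equivalent data: URI, so the source CSS stays readable and only the generated file carries the encoded data. The function name inline_css_images, the 4096-byte size cutoff, and the simple regular expression are all assumptions made for illustration. A real build tool would want a proper CSS parser.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches url(...) with optional quotes and captures the reference inside.
URL_PATTERN = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""")


def inline_css_images(css_text, base_dir, max_bytes=4096):
    """Replace local url() references with data: URIs (hypothetical helper).

    References that are already data: URIs, point at remote hosts,
    do not exist on disk, or exceed max_bytes are left untouched.
    """
    base = Path(base_dir)

    def replace(match):
        ref = match.group(1)
        if ref.startswith(("data:", "http:", "https:", "//")):
            return match.group(0)
        path = base / ref
        if not path.is_file() or path.stat().st_size > max_bytes:
            return match.group(0)
        media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"url(data:{media_type};base64,{encoded})"

    return URL_PATTERN.sub(replace, css_text)
```

The size cutoff is there because inlining only pays off for small, rarely changing assets such as icons. Larger images are probably better left as separate files.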

Of course, since Internet Explorer still doesn’t play nice with the data: URI scheme, we can’t take full advantage of the potential benefits of using it today. Nevertheless, even if you can’t do this now, it’s nice to know about the capability should you need the performance boost in the future.

  • http://www.japan-website.com/ pavan_patil

    That’s true! But I wonder why there’s no comment yet…! It’s my pleasure to comment first.
    I think you have missed something…! Something about the feed
    feedURI = ‘feed:’ absoluteURI | ‘feed://’ hier_part

    the syntax for the ‘absoluteURI’ and ‘hier_part’ are defined in section 3 of [RFC2396]. A “feed” URI is basically the string ‘feed:’ or ‘feed://’ followed by a URI which when accessed over the network returns a representation of the data feed. If the “feed” URI string begins with the string ‘feed://’ then it MUST be followed by an authority with optional path and query string with the scheme for the URI for accessing the data feed being inferred as the “http” scheme. If the “feed” URI string begins with the string ‘feed:’ then it must be followed by an absolute URI which is the network accessible location of the data feed.

  • http://MeitarMoscovitz.com/ Meitar

    @pavan_patil: URI schemes in general have the syntax you describe, yes, but the data: URI scheme doesn’t have any provision for an authority or path section. I think this is because those things are implied by virtue of the data being inline. So, for instance, when you access http://example.com/index.html and it contains an inlined image, then there’s no second HTTP request made, and thus no need to specify an authority or path because those are already known to be example.com and /index.html, respectively.

  • Michael

    Data URIs have some other performance-related downsides. The data is base64-encoded, which makes it roughly a third bigger. Your UI thread is going to be occupied parsing this large chunk of data. Most sites put their images on a separate domain, so the browser will download them in parallel with rendering the page.

  • Drakim

    Also, Data URIs don’t cache like normal images. If you make a small change to the CSS file which has a background image in the Data URI format, then the entire file, including the image, must be downloaded all over again. This, combined with the fact that base64 encoding is very inefficient, can produce quite a slowdown for somebody with a slow internet connection.

    Data URIs have their uses, especially if you are writing a game in JavaScript, but for regular web page optimization, I recommend CSS sprites to reduce the number of HTTP calls, not Data URIs. A plus side of this is that you don’t shut out 70% of the market (IE users).