Why The data: URI Scheme Could Help Save Your Slow Site

    Meitar Moscovitz

    Whenever you load a web page, your web browser initiates a number of network connections. If any one of these connections times out, you’re going to be looking at your browser’s load progress indicator for a frustratingly long period of time. This is why reducing the number of HTTP requests a browser needs to make to finish loading a page is a fundamental step in optimizing web sites for speed. One (perhaps underutilized) way to do this is by embedding binary data such as images for oft-used icons inline with other assets using the data: URI scheme.

    Send it data: direct

    On the average web page, the most common kind of external resource is an image file. By using techniques like CSS sprites, which combine many images into a single physical file and then use CSS to display only a portion of the image at a time, you can greatly reduce the number of external resources your web pages call. However, these techniques all still require at least one additional HTTP GET request, along with its overhead and possible new TCP connection.

    For the ultimate in performance optimization, we can get rid of even that HTTP request by embedding the image (or other external resource) directly into an HTML or CSS (or other kind of) file using the data: URI scheme. The scheme has been defined by the IETF’s RFC 2397 since 1998, though to date only some of the major browsers support it. Specifically, Gecko- and WebKit-based browsers such as Firefox, Safari, and Chrome support it, as does Opera, but Internet Explorer 5–7 do not. Internet Explorer 8, however, is said to finally add some support for it, along with “CSS tables.”

    When implementations of the data: URI scheme are finally ubiquitous, web developers—and more likely framework authors—can begin to use it to enhance the performance of their front-end code. Here’s how to do it.

    data:’s dirty details

    The syntax of the data: scheme, as given in RFC 2397, looks like this:

        data:[<mediatype>][;base64],<data>

    In other words, an inline binary resource begins with the string literal data:. Immediately following that is an optional MIME type and a similarly optional encoding. If the encoding is specified, it is preceded by a semicolon (;). If the MIME type is omitted, it defaults to text/plain;charset=US-ASCII; if the base64 encoding is omitted, the data is treated as URL-encoded text. Then, finally, a comma (,) delimits these properties from the actual encoded data, which is placed at the end.
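
    To make the pieces of that syntax concrete, here is a minimal Python sketch that splits a data: URI into its media type, encoding flag, and decoded payload. The function name parse_data_uri is hypothetical, and the parsing is deliberately simplified: it only recognizes the ;base64 token at the end of the metadata and does not validate the media type.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri):
    """Split a data: URI into (media_type, is_base64, payload_bytes).

    Hypothetical helper for illustration; not a full RFC 2397 validator.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")

    # Everything before the first comma is metadata; the rest is the data.
    header, comma, data = uri[len("data:"):].partition(",")
    if not comma:
        raise ValueError("missing comma delimiter")

    # The optional encoding token is preceded by a semicolon.
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]

    # Apply the defaults used when the media type is omitted.
    if not header:
        media_type = "text/plain;charset=US-ASCII"
    elif header.startswith(";"):
        media_type = "text/plain" + header
    else:
        media_type = header

    # Without ;base64 the data is URL-encoded (percent-encoded) text.
    payload = base64.b64decode(data) if is_base64 else unquote_to_bytes(data)
    return media_type, is_base64, payload
```

    Calling parse_data_uri("data:,Hello%20World") exercises the defaults, since both the media type and the encoding are left out.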

    Since it’s just a good ol’ fashioned URI, you can use inlined data: anywhere you can put a URI reference in a document. That means you can use it in the <img> element’s src attribute, inside of CSS url() values, and more. With the rumored notable exception of IE8, you can even embed any other kind of binary data, not just images. (IE will likely restrict the kinds of binary data it supports inside of a data: URI for security purposes.)

    Here’s an excerpt of what embedding a PNG image might look like inside a CSS style sheet. In this example, we set a list item’s list-style-image property to display a custom image in the list item’s marker box. (Of course, the encoded data that begins with iVBORw… in the code below would be longer in order to supply the full image.)

    ul li { list-style-image: url(…); }

    Getting data: from your images

    A number of freely available tools will construct part or all of a data: URI for you.

    • The DataURLMaker is a web form that takes an image file as input and produces an HTML <img> element whose src attribute is the data: URI you’ll want.
    • Ian Hickson also hosts a tool called the data: URI kitchen that does much the same thing.

    Naturally, if you’ve got access to a server-side programming language, you can also generate these URIs on the fly yourself. Wikipedia has an example of running an image file through the PHP base64_encode() function to produce a data: URI. Other languages have similar functions.
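
    As an illustration of the “similar functions” in other languages, here is a short Python sketch that uses the standard library’s base64 and mimetypes modules. The helper names make_data_uri and file_to_data_uri are hypothetical, and guessing the media type from the file extension is an assumption that will not suit every setup.

```python
import base64
import mimetypes


def make_data_uri(data, media_type):
    """Build a base64-encoded data: URI from raw bytes."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:" + media_type + ";base64," + encoded


def file_to_data_uri(path):
    """Read a file and return it as a data: URI.

    The media type is guessed from the file name; unknown types
    fall back to a generic binary type.
    """
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None:
        media_type = "application/octet-stream"
    with open(path, "rb") as f:
        return make_data_uri(f.read(), media_type)
```

    Every PNG file starts with the same eight signature bytes, which is why base64-encoded PNG data always begins with iVBORw.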

    Pros and cons: to data: or not to data:

    Most web sites will probably not need this kind of supreme optimization. There are also maintainability problems associated with hardcoding resources as binary data into your web pages, of course, so there is a tradeoff. However, I can foresee many cases where this can still come in handy.

    One such case is perhaps in creating a CSS framework. Such client-side frameworks are intended to be cacheable, and can include a standardized graphical vocabulary. Such visual vocabularies are already commonplace on the Web today, as in the case of icons for things like folders on a filesystem, Creative Commons licenses, or the RSS feed symbol. In these instances where change is unlikely or slow, maintainability may not be such a big issue. Therefore, directly encoding binary assets like these images into certain style sheets may make a lot of sense from a performance standpoint.

    Another mitigating factor is the ability to post-process your source files, as is often done when minifying JavaScript. With a build step like this, you keep ordinary image files in your source tree and let a script inline them before deployment. The up-front effort of setting up such extra build steps may not seem worth it, but they are exactly the kinds of things forward-thinking developers should be doing to give their future selves a helpful hand.
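
    One way such a post-processing step might look is sketched below in Python. It rewrites url() references in a style sheet into data: URIs, leaving the source files untouched. The function name inline_images is hypothetical, the regular expression is a simplification rather than a real CSS parser, and the media type is guessed from the file extension.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Simplified matcher for url(...) values, with or without quotes.
URL_PATTERN = re.compile(r"url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)")


def inline_images(css_text, base_dir):
    """Replace local url() references in css_text with data: URIs."""
    base = Path(base_dir)

    def replace(match):
        ref = match.group(1)
        # Leave remote URLs and already-inlined data alone.
        if ref.startswith(("data:", "http:", "https:", "//")):
            return match.group(0)
        path = base / ref
        # Leave references to missing files unchanged.
        if not path.is_file():
            return match.group(0)
        media_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        return "url(data:" + media_type + ";base64," + encoded + ")"

    return URL_PATTERN.sub(replace, css_text)
```

    Because the substitution runs at build time, the maintainable version of the style sheet (with plain file references) stays in source control, and only the generated output carries the encoded data.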

    Of course, since Internet Explorer still doesn’t play nice with the data: URI scheme, we can’t take full advantage of the potential benefits of using it today. Nevertheless, even if you can’t do this now, it’s nice to know about the capability should you need the performance boost in the future.