Introduction to Audio and Video in HTML5

A Bit of History

Until recently, multimedia content on the Web has, for the most part, been placed in web pages by means of third-party plugins or applications that integrate with the web browser. Some examples of such software include QuickTime, RealPlayer, and Silverlight.

By far the most popular way to embed video and audio on web pages is by means of Adobe’s Flash Player plugin. The Flash Player plugin was originally developed by Macromedia and is now maintained by Adobe as a result of their 2005 buy-out of the company. The plugin has been available since the mid-90s, but did not really take off as a way to serve video content until well into the 2000s.

Before HTML5, there was no standard way to embed video into web pages. A plugin like Adobe’s Flash Player is controlled solely by Adobe, and is not open to community development.

The introduction of the video and audio elements in HTML5 resolves this problem and makes multimedia a seamless part of a web page, the same as the img element. With HTML5, there’s no need for the user to download third-party software to view your content, and the video or audio player is easily accessible via scripting.

The Current State of Play

Unfortunately, as sublime as HTML5 video and audio sound in theory, they’re less simple in practice. A number of factors need to be considered before you decide to include HTML5’s new multimedia elements on your pages.

First, you’ll need to understand the state of browser support. At the time of writing, the only browsers with a significant market share that don’t support native HTML5 video and audio are Internet Explorer 8 and earlier. Unfortunately, this is still a sizable slice of most sites’ audiences.

The other major browser makers offer HTML5 video support in versions now in wide use (Chrome 3+, Safari 4+, Firefox 3.5+, and Opera 10.5+). The last version of Chrome without HTML5 video support (version 2) has a nonexistent market share, and the same is true for the nonsupporting versions of Safari and Opera.

Although IE’s market share is significant, you can still use HTML5 video on your pages today. Later on, we’ll show you how the new video element has been designed with backwards compatibility in mind, so that users of nonsupporting browsers will still have access to your multimedia content.
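Whether a given browser supports native video can also be checked from a script. The following is a minimal sketch rather than a definitive test: the function name supportsHtml5Video is our own, and it assumes that a supporting browser exposes a canPlayType method on a newly created video element, while a nonsupporting browser does not.

```javascript
// Returns true when `doc` can create a <video> element that exposes the
// HTML5 media API. The document is passed in as a parameter so the
// function can also be exercised outside a browser with a stand-in object.
function supportsHtml5Video(doc) {
  var el = doc.createElement("video");
  // Nonsupporting browsers create a generic element with no canPlayType.
  return !!el.canPlayType;
}

// Browser usage (hypothetical page code):
// if (!supportsHtml5Video(document)) {
//   // rely on the fallback content for this visitor
// }
```

A check like this is only needed when a script has to behave differently in nonsupporting browsers; the markup-level fallback does not depend on it.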

Video Container Formats

Video on the Web is based on container formats and codecs. A container is a wrapper that stores all the data that makes up the video file being accessed, much like a ZIP file wraps the files it contains. Some examples of well-known video containers include Flash Video (.flv), MPEG-4 (.mp4 or .m4v), and AVI (.avi).

The video container houses data, including a video track, an audio track with markers that help synchronize the audio and video, language information, and other bits of metadata that describe the content.

The video container formats relevant to HTML5 are MPEG-4, Ogg, and WebM.

Video Codecs

A video codec defines an algorithm for encoding and decoding a multimedia data stream. A codec can encode a data stream for transmission, storage, or encryption, or it can decode it for playback or editing. For the purpose of HTML5 video, we’re concerned with the decoding and playback of a video stream. The video codecs that are most pertinent to HTML5 video are H.264, Theora, and VP8.

Audio Codecs

An audio codec in theory works the same as a video codec, except it’s concerned with the streaming of sound, rather than video frames. The audio codecs that are most pertinent to HTML5 video are AAC and Vorbis.

What combinations work in current browsers?

It would be nice if browser support allowed us to choose a single container, video codec, and audio codec to create a standard way of embedding video using the new video element in HTML5. Unfortunately, it’s not quite that simple, although things are improving.

In Table 5.1, we’ve outlined video container and codec support in the most popular browser versions. This chart only includes browser versions that offer support for the HTML5 video element.

Table 5.1. Browser support for HTML5 video

Container/Video Codec/Audio Codec | Firefox | Chrome | IE     | Opera | Safari | iOS Safari | Android
Ogg/Theora/Vorbis                 | 3.5+    | 3+     | -      | 10.5+ | -      | -          | -
MP4/H.264/AAC                     | -       | 3-11   | 9+     | -     | 4+     | 4+         | 2.1+ [a]
WebM/VP8/Vorbis                   | 4+      | 6+     | 9+ [b] | 10.6+ | -      | -          | 2.3+

[a] Versions of Android prior to 2.3 require JavaScript to play the video.
[b] IE9 supports playback of WebM video with a VP8 codec when the user has installed a VP8 codec on Windows.

Opera Mini and Opera Mobile currently offer no support for HTML5 video, but Opera has announced there are plans to include support in upcoming releases.

The Markup

After all that necessary information about containers, codecs, browser support, and licensing issues, it’s time to examine the markup of the video element and its associated attributes.

The simplest way to include HTML5 video in a web page is as follows:

<video src="example.webm"></video>

But, as you’ve probably figured out from the preceding sections, this will only work in a limited number of browsers. It is, however, the minimum code required to have HTML5 video working to some extent. In a perfect world, it would work everywhere — the same way the img element works everywhere — but that’s a little way off just yet.

Similar to the img element, the video element should also include width and height attributes:

<video src="example.webm" width="375" height="280"></video>

Even though the dimensions can be set in the markup, they’ll have no effect on the aspect ratio of the video. For example, if the video in the above example was actually 375×240 and the markup was as shown above, the video would be centered vertically inside the 280-pixel space specified in the HTML. This stops the video from stretching unnecessarily and looking distorted.
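The arithmetic behind this behavior can be sketched in a few lines of JavaScript. The helper name fitVideo is hypothetical, and the sketch assumes the browser scales the video uniformly to fit the box and then centers it, as described above.

```javascript
// Computes the size and position of a video drawn inside a box without
// changing its aspect ratio. All values are in pixels.
function fitVideo(boxWidth, boxHeight, videoWidth, videoHeight) {
  // Use the tighter of the two ratios so the whole frame stays visible.
  var scale = Math.min(boxWidth / videoWidth, boxHeight / videoHeight);
  var width = Math.round(videoWidth * scale);
  var height = Math.round(videoHeight * scale);
  return {
    width: width,
    height: height,
    // Leftover space is split evenly on both sides.
    offsetX: (boxWidth - width) / 2,
    offsetY: (boxHeight - height) / 2
  };
}

var fitted = fitVideo(375, 280, 375, 240);
// fitted is { width: 375, height: 240, offsetX: 0, offsetY: 20 }
```

For the 375×240 video in a 375×280 box, the video keeps its own size and sits 20 pixels from the top and bottom edges.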

The width and height attributes accept integers only, and their values are always in pixels. Naturally, these values can be overridden via scripting or CSS.

Enabling Native Controls

No embedded video would be complete without giving the user the ability to play, pause, stop, seek through the video, or adjust the volume. HTML5’s video element includes a controls attribute that does just that:

<video src="example.webm" width="375" height="280" controls></video>

controls is a Boolean attribute, so no value is required. Its inclusion in the markup tells the browser to make the controls visible and accessible to the user.

Each browser is responsible for the look of the built-in video controls. Figure 5.1 to Figure 5.4 show how these controls differ in appearance from browser to browser.

Figure 5.1. The native video controls in Firefox 4

Figure 5.2. … in IE9

Figure 5.3. … in Opera 11

Figure 5.4. … and in Chrome

The autoplay Attribute

We’d love to omit reference to this particular attribute, since its use will be undesirable for the most part. However, there are cases where it can be appropriate. The Boolean autoplay attribute does exactly what it says: it tells the web page to play the video as soon as possible.

Normally, this is a bad practice; most of us know too well how jarring it can be if a website starts playing video or audio as soon as it loads — especially if our speakers are turned up. Usability best practices dictate that sounds and movement on web pages should only be triggered when requested by the user. But this doesn’t mean that the autoplay attribute can never be used.

For example, if the page in question contains nothing but a video — that is, the user clicked on a link to a page for the sole purpose of viewing a specific video — it may be acceptable to allow it to play automatically, depending on the video’s size, any surrounding content, and the audience.

Here’s how you’d do that:

<video src="example.webm" width="375" height="280" controls autoplay></video>

The loop Attribute

Another attribute that you should think twice about before using is the Boolean loop attribute. Again, it’s fairly self-explanatory: according to the spec, this attribute, when present, will tell the browser to “seek back to the start of the media resource upon reaching the end.”

So if you created a web page whose sole intention was to annoy its visitors, it might contain code like this:

<video src="example.webm" width="375" height="280" controls autoplay loop></video>

Autoplay and an infinite loop! We just need to remove the native controls and we’d have a trifecta of worst practices.

Of course, there are some situations where loop can be useful: imagine a browser-based game, in which ambient sounds and music should play continuously as long as the page is open.

The preload Attribute

In contrast to the two previous attributes, preload could definitely come in handy in a number of cases. The preload attribute accepts one of three values:

auto

A value of auto indicates that the video and its associated metadata will start loading before the video is played. This way, the browser can start playing the video more quickly when the user requests it.

none

A value of none indicates that the video shouldn’t load in the background before the user presses play.

metadata

This works like none, except that any metadata associated with the video (for example, its dimensions, duration, and the like) can be preloaded, even though the video itself won’t be.

This particular attribute does not have a spec-defined default in cases where it’s omitted; each browser decides which of those three values should be the default state, which makes sense. It allows desktop browsers to preload the video and/or metadata automatically (having no real adverse effect) while permitting mobile browsers to default to either metadata or none, as many mobile users have restricted bandwidth and will prefer to have the choice of downloading the video or not.

The poster Attribute

When you go to view a video on the Web, normally a single frame of the video will display in order to provide a teaser of its content. The poster attribute makes it easy to choose such a teaser. This attribute, similar to src, will point to an image file on the server by means of a URL.

Here’s how our video element would look with a poster attribute defined:

<video src="example.webm" width="375" height="280" poster="teaser.jpg" controls></video>

Although the poster attribute is useful, there’s a bug in iOS3 (corrected in iOS4) that prevents playback of the video if this attribute is present. If you know that many of your visitors use iOS 3.x, you should either avoid using the poster attribute, or remove it for those devices specifically.
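One way to remove the attribute for those devices only is with a small script. This is a rough sketch under stated assumptions: the function names are our own, and the user-agent pattern assumes that iOS 3.x devices identify themselves with a device name (iPhone, iPod, or iPad) together with a version token of the form "OS 3_x". User-agent sniffing is fragile, so treat it as a starting point only.

```javascript
// Rough check for iOS 3.x based on the user-agent string.
// Assumption: these devices report "OS 3_x" alongside the device name.
function isIos3(userAgent) {
  return /(iPhone|iPod|iPad)/.test(userAgent) && /\bOS 3_\d/.test(userAgent);
}

// Removes the poster attribute from a video element on iOS 3.x only,
// leaving it in place for every other browser.
function dropPosterOnIos3(video, userAgent) {
  if (isIos3(userAgent)) {
    video.removeAttribute("poster");
  }
}

// Browser usage (hypothetical page code):
// var video = document.getElementsByTagName("video")[0];
// dropPosterOnIos3(video, navigator.userAgent);
```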

The audio Attribute

The audio attribute controls the default state of the audio track for the video element, and currently accepts only a single possible value: muted. The spec states that other values are likely to be added in the future, for specifying the default audio track or volume, for example.

A value of muted will cause the video’s audio track to default to muted, potentially overriding any user preferences. This only controls the default state of the element: the user can still change it through the controls, and so can JavaScript.

Here it is added to our video element:

<video src="example.webm" width="375" height="280" poster="teaser.jpg" audio="muted"></video>

Adding Support for Multiple Video Formats

As we discussed earlier, serving your video in a single container format is not currently an option, even though that’s the ultimate idea behind the video element, and one we hope will be realized in the future. To support multiple video formats, the video element lets you define child source elements, so that each user agent can display the video using the format of its choice. These elements serve the same function as the src attribute on the video element; so if you’re providing source elements, there’s no need to specify a src for your video.

Taking current browser support into consideration, here’s how we might declare our source elements:

<source src="example.mp4" type="video/mp4">
<source src="example.webm" type="video/webm">
<source src="example.ogv" type="video/ogg">

The source element (oddly enough) takes a src attribute that specifies the location of the video file. It also accepts a type attribute that specifies the container format for the resource being requested. This latter attribute allows the browser to determine if it can play the file in question, thus preventing the browser from unnecessarily downloading an unsupported format.

The type attribute also allows a codec parameter to be specified, which defines the video and audio codecs for the requested file. Here’s how our source elements will look with the codecs specified:

<source src="example.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
<source src="example.webm" type='video/webm; codecs="vp8, vorbis"'>
<source src="example.ogv" type='video/ogg; codecs="theora, vorbis"'>

You’ll notice that the syntax for the type attribute has been slightly modified to accommodate the container and codec values. The double quotes surrounding the values have been changed to single quotes, and another set of nested double quotes is included specifically for the codecs.

This can be a tad confusing at first glance, but in most cases you’ll just be copying and pasting those values once you have a set method for encoding the videos (which we’ll touch on later in this chapter). The important point is that you define the correct values for the specified file to ensure that the browser can determine which (if any) file it will be able to play.
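If you would rather generate these values than copy and paste them, the nested quoting can be handled in one place. The sketch below uses hypothetical helper names (buildSourceType and buildSourceTag) and assumes that the file name and codec strings contain no quote characters of their own.

```javascript
// Builds the value of the type attribute: the container's MIME type,
// optionally followed by a double-quoted, comma-separated codecs list.
function buildSourceType(container, codecs) {
  if (!codecs || codecs.length === 0) {
    return container;
  }
  return container + '; codecs="' + codecs.join(", ") + '"';
}

// Builds a complete source tag. The type value is wrapped in single
// quotes because it may itself contain double quotes.
function buildSourceTag(src, container, codecs) {
  return "<source src=\"" + src + "\" type='" +
    buildSourceType(container, codecs) + "'>";
}

var tag = buildSourceTag("example.webm", "video/webm", ["vp8", "vorbis"]);
// tag is: <source src="example.webm" type='video/webm; codecs="vp8, vorbis"'>
```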

Source Order

In our example above, the MP4/H.264/AAC container/codec combination is included first. This is to ensure that the video will play on the iPad. On that device, a bug causes only the first source element to be recognized. It’s safe to assume that this bug is fixed in subsequent versions of the iPad, but for now it’s necessary to include the MP4/H.264 file first to ensure compatibility.

The first source element will be recognized by IE9, Safari, and older versions of Chrome, so that covers quite a large chunk of our HTML5-ready audience.

The next element in the list defines the WebM/VP8/Vorbis container/codec combination. This is supported by later versions of Chrome that will eventually drop support for H.264. In addition to Chrome, WebM video will also play in Firefox 4 and Opera 10.6.

Finally, the third source element declares the Ogg/Theora/Vorbis container/codec combination, which is supported by Firefox 3.5 and Opera 10.5. Although other browsers also support this combination, they’ll be using the other formats since they appear ahead of this one in the source order. The browsers that support only this combination are older versions of browsers whose current versions support other formats, so it will be possible to drop this format once those versions become sufficiently rare.

These three source elements are placed as children of the video element, so with our three file formats declared, our code will now look like this:

<video width="375" height="280" poster="teaser.jpg" audio="muted">
  <source src="example.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
  <source src="example.webm" type='video/webm; codecs="vp8, vorbis"'>
  <source src="example.ogv" type='video/ogg; codecs="theora, vorbis"'>
</video>

You’ll notice that our code above is now without the src attribute on the video element. As well as being redundant, it would also override any video files defined in the source elements.
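To see why source order matters, it can help to model the selection in script. The following is a simplified sketch, not the browser’s actual algorithm: pickSource is a hypothetical helper that walks the sources in document order and returns the first one whose type is reported as playable. The canPlayType argument is assumed to behave like the video element’s own canPlayType method, which returns an empty string for unsupported types and "maybe" or "probably" otherwise.

```javascript
// sources: array of { src, type } objects in document order.
// canPlayType: function(type) returning "", "maybe", or "probably".
// Returns the first source reported as playable, or null if none is.
function pickSource(sources, canPlayType) {
  for (var i = 0; i < sources.length; i++) {
    if (canPlayType(sources[i].type) !== "") {
      return sources[i];
    }
  }
  return null;
}

var sources = [
  { src: "example.mp4", type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
  { src: "example.webm", type: 'video/webm; codecs="vp8, vorbis"' },
  { src: "example.ogv", type: 'video/ogg; codecs="theora, vorbis"' }
];

// Browser usage (hypothetical page code):
// var probe = document.createElement("video");
// var chosen = pickSource(sources, function (type) {
//   return probe.canPlayType(type);
// });
```

A browser that supports both WebM and Ogg would end up with the WebM file here, simply because it appears earlier in the list.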

What about audio?

Much of what we’ve discussed in relation to HTML5 video and its API also applies to the audio element, with the obvious exceptions being those related to visuals.
Similar to the video element, the preload, autoplay, loop, and controls attributes can be used (or not used!) on the audio element.

The audio element won’t display anything unless controls are present, but even if the element’s controls are absent, the element is still accessible via scripting. This is useful if you want your site to use sounds that aren’t tied to controls presented to the user. The audio element nests source tags, similar to video, and it will also treat any child element that’s not a source tag as fallback content for nonsupporting browsers.
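As a minimal sketch of scripted playback without visible controls, the helper below restarts and plays a sound. The name playFromStart and the file name in the usage note are hypothetical; the function assumes it is given an object that behaves like an audio element, with a currentTime property and a play method.

```javascript
// Plays a sound from the beginning. `audio` is expected to behave like
// an audio element (currentTime property, play() method).
function playFromStart(audio) {
  // Rewind only if playback has already advanced, so currentTime is not
  // written before the browser has loaded anything.
  if (audio.currentTime > 0) {
    audio.currentTime = 0;
  }
  audio.play();
}

// Browser usage (hypothetical file name):
// var click = new Audio("click.ogg");
// playFromStart(click);
```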

As for codec/format support, Firefox, Opera, and Chrome all support Ogg/Vorbis; Safari, Chrome, and IE9 support MP3; and every supporting browser supports WAV. Safari also supports AIFF. At present, MP3 and Ogg/Vorbis will be enough to cover you for all supporting browsers.

This is an excerpt from HTML5 & CSS3 for the Real World, by Alexis Goldstein, Louis Lazaris & Estelle Weyl.
