HTML5 Video: Fragments, Captions, and Dynamic Thumbnails

Armando Roggio

Web and application developers who want to do more with online video may find that three lesser-known, or at least less often discussed, HTML5 video features open up creative new ways to integrate video into their projects.

In this article I’ll describe media fragments, the track element, and HTML5 video’s ability to integrate easily with other elements.

Media Fragments

Media fragments or media fragment URIs are a W3C recommendation created to enable some aspects of native video handling in web browsers.

At present, this feature can be used to start or end video playback at a particular instant in time. One could imagine this feature enabling a sort of video sprite that allowed an HTML game developer, as an example, to load a single video file, but easily play different sections in response to some player action.

In its simplest form, the media fragment start time is added to the video source URL. Notice the “#t=20” appended to the source URL in the following example, where the “t” denotes a temporal media fragment.

<video controls>
    <source src="102614-video-sample.mp4#t=20">
</video>

In the code above the video would begin playback at 00:20 (assuming mm:ss). Let’s look at another example:

<video controls>
    <source src="102614-video-sample.mp4#t=6,20">
</video>

The example above would start playing at 0:06 and continue to play until 0:20.

The time values in the src URI may also be specified in hour-minute-second format (hh:mm:ss):

<video controls>
    <source src="102614-video-sample.mp4#t=00:00:20">
</video>
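To make the three forms above concrete, here is a minimal sketch that reads the start and end times out of a URL's temporal fragment. The helper names parseTemporalFragment and toSeconds are hypothetical, not part of any browser API, and the sketch only covers the forms shown in this article (plus the optional "npt:" prefix the recommendation allows).

```javascript
// Convert one time value ("20", "1:30", "00:00:20", "6.5") to seconds.
function toSeconds(value) {
  // "hh:mm:ss" or "mm:ss" splits into parts; a plain number has one part.
  return value.split(':').reduce(function (total, part) {
    return total * 60 + parseFloat(part);
  }, 0);
}

// Hypothetical helper: read the "#t=" fragment from a media URL.
// Returns { start, end } in seconds; end is null when no end is given.
function parseTemporalFragment(url) {
  var match = /#(?:.*&)?t=(?:npt:)?([^&]*)/.exec(url);
  if (!match) {
    return null; // no temporal fragment present
  }
  var parts = match[1].split(',');
  return {
    start: parts[0] ? toSeconds(parts[0]) : 0,
    end: parts.length > 1 && parts[1] ? toSeconds(parts[1]) : null
  };
}

console.log(parseTemporalFragment('102614-video-sample.mp4#t=20'));
console.log(parseTemporalFragment('102614-video-sample.mp4#t=6,20'));
console.log(parseTemporalFragment('102614-video-sample.mp4#t=00:00:20'));
```

The first and third calls describe the same start time, 20 seconds, which is the point of the hh:mm:ss form: it is only a different spelling of the same offset.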

To demonstrate media fragments, I have a 27-second snorkeling video that has three fairly obvious transitions. The first section starts at the beginning of the video (00:00:00), the next section begins at approximately 00:00:06, and the third transition occurs at about 00:00:17.

In the demo, there is a button representing each of the video segments. I have also included two separate source files to ensure the video will play in most browsers.

Below you’ll find the video code along with the navigation:

<video id="frag1" controls preload="metadata" width="720px" height="540px">
    <source src="102614-video-sample.mp4"
            data-original="102614-video-sample.mp4"
            type='video/mp4;codecs="avc1.42E01E, mp4a.40.2"'>
    <source src="102614-video-sample.webm"
            data-original="102614-video-sample.webm"
            type='video/webm;codecs="vp8, vorbis"'>
</video>

<div class="nav">
    <button data-start="0">Section One</button>
    <button data-start="6">Section Two</button>
    <button data-start="17">Section Three</button>
</div>

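Data attributes have been added to the source elements and buttons to make it easier to insert the time-based media fragments with JavaScript: data-original on each source holds the unmodified file URL, and data-start on each button holds that section’s start time in seconds. When a button is clicked, the script sets each source to its original URL plus the matching time-based media fragment and reloads the video.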

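function mediaFragOne() {
    var video, sources, nav, buttons;
    video = document.querySelector('video#frag1');
    sources = video.getElementsByTagName('source');
    // The buttons live in the .nav element that immediately follows the video.
    nav = document.querySelector('video#frag1 + .nav');
    buttons = nav.getElementsByTagName('button');

    for (var i = buttons.length - 1; i >= 0; i--) {
        buttons[i].addEventListener('click', function() {
            // Point every source at its original URL plus the new fragment.
            for (var j = sources.length - 1; j >= 0; j--) {
                sources[j].setAttribute(
                    'src', (sources[j].getAttribute('data-original')
                    .concat('#t=' + this.getAttribute('data-start'))));
            }
            // Reload so the browser picks up the new fragment, then play.
            video.load();
            video.play();
        });
    }
}

mediaFragOne();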
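The script above only sets a start time, so each section plays on to the end of the video. As a hedged sketch of the "video sprite" idea, the hypothetical helper below (sectionFragments is not from the demo) derives a start,end fragment for each section from the list of start times and the video's length.

```javascript
// Hypothetical helper: turn a list of section start times (in seconds)
// and the total duration into "#t=start,end" fragments, one per section.
function sectionFragments(starts, duration) {
  return starts.map(function (start, index) {
    // A section ends where the next begins; the last ends with the video.
    var end = index + 1 < starts.length ? starts[index + 1] : duration;
    return '#t=' + start + ',' + end;
  });
}

// Section starts from the buttons above; 27 is the demo video's length.
var fragments = sectionFragments([0, 6, 17], 27);
console.log(fragments); // [ '#t=0,6', '#t=6,17', '#t=17,27' ]
```

Appending one of these fragments to a source's data-original value, instead of the start time alone, would make playback pause at the end of the chosen section in browsers that honor the end time.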

Adding Captions or Subtitles

HTML5 video includes a built-in means of presenting on-screen text timed perfectly to match the video content. This can be used to add video captioning for better accessibility, offer a translated transcript (a subtitle), provide a description of what is happening, or even present chapter or section titles.

This feature uses a track element to describe what kind of text is being added, and provide a source for the text.

In this example, a video, which includes spoken English, has a Spanish subtitle track that is displayed by default.

<video id="Subtitle" controls preload="metadata">
   <source src="102614-maui-with-words.mp4" type="video/mp4">
   <source src="102614-maui-with-words.webm" type="video/webm">
   <track src="102614-maui-es.vtt"
          label="Español Subtítulos"
          kind="subtitles" srclang="es" default>
</video>

Notice that the track element is placed inside the video element, and that it has several attributes, including src, label, kind, srclang, and default.

  • src provides the URL for the timed text file. It is, for obvious reasons, required.
  • label is the track’s title. It may be presented to the user.
  • kind must have a value of either “subtitles”, “captions”, “descriptions”, “chapters”, or “metadata”.
  • srclang indicates the track text’s language, and is required when kind is set to “subtitles”.
  • default is a Boolean attribute telling the browser to enable this text track initially, unless the user’s preferences indicate that another track is more appropriate.
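The attribute rules in the list above can be expressed as a small check. This is a minimal sketch, assuming the track is described by a plain object; checkTrack and TRACK_KINDS are hypothetical names, not browser APIs.

```javascript
// The kind values listed above.
var TRACK_KINDS = ['subtitles', 'captions', 'descriptions', 'chapters', 'metadata'];

// Hypothetical helper: check a plain object describing a <track> element
// against the rules above and return a list of problems (empty if none).
function checkTrack(track) {
  var problems = [];
  // When kind is omitted, browsers treat the track as "subtitles".
  var kind = track.kind === undefined ? 'subtitles' : track.kind;

  if (!track.src) {
    problems.push('src is required');
  }
  if (TRACK_KINDS.indexOf(kind) === -1) {
    problems.push('unknown kind: ' + kind);
  }
  if (kind === 'subtitles' && !track.srclang) {
    problems.push('srclang is required for subtitles');
  }
  return problems;
}

console.log(checkTrack({
  src: '102614-maui-es.vtt',
  kind: 'subtitles',
  srclang: 'es'
})); // []
console.log(checkTrack({ src: '102614-maui-es.vtt', kind: 'subtitles' }));
// [ 'srclang is required for subtitles' ]
```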

The text track file linked in the src is in Web Video Text Tracks Format (WebVTT). At its most basic, a WebVTT file needs to declare what it is, with WEBVTT on its first line, and provide a series of cues with blank lines in between. Here’s an example:

WEBVTT

00:00:03.000 --> 00:00:04.500
Este material de buceo

00:00:04.600 --> 00:00:07.900
fue filmada en el cráter Molokini

00:00:08.000 --> 00:00:09.500
Maui, Hawaii

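Each cue in the WebVTT file may optionally be preceded by an identifier, a number or a name, on its own line. The interval during which the text should be displayed on the screen is given in hour, minute, second, and millisecond format (hh:mm:ss.ttt), with the start and end times separated by “-->”.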
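Writing the timestamps by hand is error-prone, so here is a minimal sketch that generates WebVTT text from a list of cues with times in seconds. The helper names toVttTime and buildVtt are hypothetical, and the sketch leaves out optional features such as cue identifiers and cue settings.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function toVttTime(seconds) {
  var ms = Math.round(seconds * 1000);
  var pad = function (n, width) {
    var s = String(n);
    while (s.length < width) { s = '0' + s; }
    return s;
  };
  var hours = Math.floor(ms / 3600000);
  var minutes = Math.floor(ms / 60000) % 60;
  var secs = Math.floor(ms / 1000) % 60;
  return pad(hours, 2) + ':' + pad(minutes, 2) + ':' + pad(secs, 2) +
    '.' + pad(ms % 1000, 3);
}

// Hypothetical helper: build WebVTT text from { start, end, text } cues.
function buildVtt(cues) {
  var blocks = cues.map(function (cue) {
    return toVttTime(cue.start) + ' --> ' + toVttTime(cue.end) + '\n' + cue.text;
  });
  // The file starts with "WEBVTT"; a blank line separates each block.
  return ['WEBVTT'].concat(blocks).join('\n\n') + '\n';
}

console.log(buildVtt([
  { start: 3, end: 4.5, text: 'Este material de buceo' },
  { start: 4.6, end: 7.9, text: 'fue filmada en el cráter Molokini' },
  { start: 8, end: 9.5, text: 'Maui, Hawaii' }
]));
```

Run with the three cues above, this prints a file equivalent to the subtitle example in this section.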

Finally, I should also note that in some browsers, including subtitles will add a closed-caption button to the video controls.
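The same choice that the caption button offers can be made from script through the video element's textTracks list, where each track has a language and a mode of "disabled", "hidden", or "showing". The sketch below is one possible approach, not the demo's code; showLanguage is a hypothetical helper, and a page with chapter or metadata tracks would probably want to skip those rather than disable them.

```javascript
// Hypothetical helper: show the text track for one language and disable
// the rest. `tracks` is any array-like list of objects with `language`
// and `mode` properties, such as a video element's textTracks list.
function showLanguage(tracks, language) {
  var shown = false;
  for (var i = 0; i < tracks.length; i++) {
    var match = tracks[i].language === language;
    // "showing" displays the cues; "disabled" turns the track off.
    tracks[i].mode = match ? 'showing' : 'disabled';
    shown = shown || match;
  }
  return shown; // false when no track uses that language
}

// In a page: showLanguage(document.getElementById('Subtitle').textTracks, 'es');
// Plain objects stand in for text tracks here so the sketch runs anywhere.
var tracks = [
  { language: 'es', mode: 'disabled' },
  { language: 'en', mode: 'showing' }
];
console.log(showLanguage(tracks, 'es'), tracks[0].mode, tracks[1].mode);
// true showing disabled
```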

For a more comprehensive look at these features, check out Ankul Jain’s article covering HTML5’s track element.

Dynamic Thumbnails with Canvas

A significant advantage of using HTML5 video is that it can interact with other HTML elements in ways that third-party plugins cannot.

As an example, in 2010, Pete LePage, who works in developer relations for Google, described how to use HTML5 video and canvas together.

In LePage’s example, a video is added to the HTML document, a canvas element is created, and then the current video frame is captured at a regular interval (once per second in the code below) and displayed on the page. Here’s the relevant part of the HTML:

<video id="thumb" controls preload="metadata" width="750px" height="540px">
    <source src="102614-video-sample.mp4"
            type='video/mp4;codecs="avc1.42E01E, mp4a.40.2"'>
    <source src="102614-video-sample-webmhd.webm"
            type='video/webm;codecs="vp8, vorbis"'>
</video>
<canvas id="canvas"
        width="750px" height="540px">
</canvas>
<div id="screenShots"></div>

The JavaScript from LePage’s demonstration includes several event listeners, variables, and functions:

var video = document.getElementById("thumb");
video.addEventListener("loadedmetadata", initScreenshot);
video.addEventListener("playing", startScreenshot);
video.addEventListener("pause", stopScreenshot);
video.addEventListener("ended", stopScreenshot);

var canvas = document.getElementById("canvas");
var ctx = canvas.getContext("2d");
var ssContainer = document.getElementById("screenShots");
var videoHeight, videoWidth;
var drawTimer = null;

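// Size the canvas to match the video's intrinsic dimensions.
function initScreenshot() {
  videoHeight = video.videoHeight;
  videoWidth = video.videoWidth;
  canvas.width = videoWidth;
  canvas.height = videoHeight;
}

// Start capturing one frame per second while the video plays.
function startScreenshot() {
  if (drawTimer == null) {
    drawTimer = setInterval(grabScreenshot, 1000);
  }
}

// Stop capturing when the video is paused or has ended.
function stopScreenshot() {
  if (drawTimer) {
    clearInterval(drawTimer);
    drawTimer = null;
  }
}

// Draw the current frame to the canvas and append it as a thumbnail.
function grabScreenshot() {
  ctx.drawImage(video, 0, 0, videoWidth, videoHeight);
  var img = new Image();
  img.src = canvas.toDataURL("image/png");
  img.width = 120;
  ssContainer.appendChild(img);
}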

In the demo, the canvas element is set to display: none, which means we only see the resized thumbnails, not the original canvas image. The demonstration can take a moment to load, but it does show how relatively simple it can be to get HTML5 video to work with other HTML elements.
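Because each thumbnail is captured at the video's full size and only displayed at 120 pixels wide, every data URL carries a full-resolution frame. One possible refinement, sketched here under assumptions, is to compute the thumbnail's dimensions first and draw the frame at that size. The helper thumbnailSize is hypothetical, and 720 by 540 is an assumed video size used only for illustration.

```javascript
// Hypothetical helper: scale a video's dimensions to a target width
// while preserving the aspect ratio.
function thumbnailSize(videoWidth, videoHeight, targetWidth) {
  var scale = targetWidth / videoWidth;
  return {
    width: targetWidth,
    height: Math.round(videoHeight * scale)
  };
}

// Assumed 720x540 source, scaled to the 120px thumbnail width used above.
var size = thumbnailSize(720, 540, 120);
console.log(size); // { width: 120, height: 90 }
```

In a page, the canvas could be given these dimensions and the frame drawn with ctx.drawImage(video, 0, 0, size.width, size.height), so that toDataURL encodes only the small image.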

So that’s a summary of three HTML5 video features you may not have used yet. If you know of any other interesting and little-known tips on HTML5 video, we’d love to hear about them in the comments.

Credits: Music used in the example videos is Thaiz Itch’s “Etude No.5 – 5. SA-GA-MA-PA-NI-SA”. Video is from my recent trip to Maui, Hawaii.