More HTML5 Semantics: Content Types & New Elements

Louis Lazaris

The following is an extract from our book, HTML5 & CSS3 for the Real World, 2nd Edition, written by Alexis Goldstein, Louis Lazaris, and Estelle Weyl. Copies are sold in stores worldwide, or you can buy it in ebook form here.

Our sample site is coming along nicely. We’ve given it some basic structure, along the way learning more about marking up content using HTML5’s new elements.

In this chapter, we’ll discuss even more new elements, along with some changes and improvements to familiar elements. We’ll also add some headings and basic text to our project, and we’ll discuss the potential impact of HTML5 on accessibility.

Before we dive into that, though, let’s take a step back and examine a few new (and slightly tricky) concepts that HTML5 brings to the table.

A New Perspective on Content Types

For layout and styling purposes, developers have become accustomed to thinking of elements in an HTML page as belonging to one of two categories: block and inline. Although elements are still rendered as either block or inline by browsers, the HTML5 spec takes the categorization of content a step further. The specification now defines a set of more granular content models. These are broad definitions about the kind of content that should be found inside a given element. Most of the time they’ll have little impact on the way you write your markup, but it’s worth having a passing familiarity with them, so let’s have a quick look:

  • Metadata content: This category is what it sounds like—data that’s not present on the page itself, but affects the page’s presentation or includes other information about the page. This includes elements such as title, link, meta, and style.

  • Flow content: This includes just about every element that’s used in the body of an HTML document, including elements such as header, footer, and even p. The only elements excluded are those that have no effect on the document’s flow: script, link, and meta elements in the page’s head, for example.

  • Sectioning content: This is the most interesting—and for our purposes, most relevant—type of content in HTML5. In the last chapter, we often found ourselves using the generic term “section” to refer to a block of content that could contain a heading, footer, or aside. In fact, what we were actually referring to was sectioning content. In HTML5, this includes article, aside, nav, and section. Shortly, we’ll talk in more detail about sectioning content and how it can affect the way you write your markup.

  • Heading content: This type of content defines the header of a given section, and includes the various levels of heading (h1, h2, and so on).

  • Phrasing content: This category is roughly equivalent to what you’re used to thinking of as inline content; it includes elements such as em, strong, cite, and the like.

  • Embedded content: This one’s fairly straightforward, and includes elements that are, well, embedded into a page, such as img, object, embed, video, and canvas.

  • Interactive content: This category includes any content with which users can interact. It consists mainly of form elements, as well as links and other elements that are interactive only when certain attributes are present. Two examples are the audio element when the controls attribute is present, and the input element with a type attribute set to anything but “hidden”.

As you might gather from reading the list, some elements can belong to more than one category. There are also some elements that fail to fit into any category (for example, the head and html elements). Don’t worry if any of this seems confusing. The truth is, as a developer, you won’t need to think about these categories in order to decide which element to use in which circumstance. More than anything, they’re simply a way to encapsulate the different kinds of HTML tags available.
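To make the idea of overlapping categories concrete, here’s a minimal sketch in JavaScript. The lookup table is partial and hand-written for illustration (it is not generated from the spec), and the names CONTENT_CATEGORIES, categoriesOf, and elementsIn are our own hypothetical helpers:

```javascript
// Partial, hand-written lookup for illustration only; not a complete list.
const CONTENT_CATEGORIES = {
  title: ["metadata"],
  p: ["flow"],
  section: ["flow", "sectioning"],
  h1: ["flow", "heading"],
  em: ["flow", "phrasing"],
  img: ["flow", "phrasing", "embedded"],
  head: [], // fits none of the categories listed above
  html: [], // fits none of the categories listed above
};

// Return the categories recorded for a tag name, or null if it isn't in the table.
function categoriesOf(tagName) {
  const key = tagName.toLowerCase();
  return Object.prototype.hasOwnProperty.call(CONTENT_CATEGORIES, key)
    ? CONTENT_CATEGORIES[key]
    : null;
}

// List every tag in the table that belongs to the given category.
function elementsIn(category) {
  return Object.keys(CONTENT_CATEGORIES).filter((tag) =>
    CONTENT_CATEGORIES[tag].includes(category)
  );
}

console.log(categoriesOf("img"));
console.log(elementsIn("flow"));
```

Notice that img appears under three categories, while head and html appear under none.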

More New Elements

In addition to the structural elements we saw in Chapter 2, HTML5 includes a number of other semantic elements. Let’s examine some of the more useful ones.

The figure and figcaption Elements

The figure and figcaption elements are another pair of new HTML5 elements that contribute to the improved semantics in HTML5. The figure element is explained in the spec as follows:

The figure element can […] be used to annotate illustrations, diagrams, photos, code listings, etc. […] A figure element’s contents are part of the surrounding flow.

Think of charts, graphs, images to accompany text, or example code. All those types of content might be good places to use figure and potentially figcaption.

The figcaption element is simply a way to mark up a caption for a piece of content that appears inside of a figure.

In order to use the figure element, the content being placed inside it must have some relation to the main content in which the figure appears. If you can completely remove it from a document, and the document’s content can still be fully understood, you probably shouldn’t be using figure; you might, however, need to use aside or an alternative.

Let’s look at how we’d mark up a figure inside an article:

  <h1>Accessible Web Apps</h1>

  <p>Lorem ipsum dolor … </p>

  <p>As you can see in <a href="#fig1">Figure 1</a>, … </p>

  <figure id="fig1">
    <figcaption>Screen Reader Support for WAI-ARIA</figcaption>
    <img src="figure1.png" alt="JAWS: Landmarks 1/1, Forms 4/5 … ">
  </figure>

  <p>Lorem ipsum dolor … </p>

Using figcaption is fairly straightforward. It has to be inside a figure element, and it can be placed either before or after the rest of the figure’s content. In the example here, we’ve placed it before the image.

The mark Element

The mark element “represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.” Admittedly, there are very few uses we can think of for the mark element. The most common is in the context of a search, where the keywords that were searched for are highlighted in the results.

The spec also mentions using mark to draw attention to text inside a quote. In any case, you want to use it to indicate “a part of the document that has been highlighted due to its likely relevance to the user’s current activity”.

Avoid confusing mark with em or strong; those elements add contextual importance, whereas mark separates the targeted content based on a user’s current browsing or search activity.

To use the search example, if a user has arrived at an article on your site from a Google search for the word “HTML5,” you might highlight words in the article using the mark element like this:

<h1>Yes, You Can Use <mark>HTML5</mark> Today!</h1>

The mark element can be added to the document either using server-side code, or on the client side with JavaScript after the page has loaded. The search term can be derived from a URL such as search.php?query=html5. In that case, your server-side code might grab the content of the variable in the query string, and then use mark tags to indicate where the word is found on the page.
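As a rough sketch of that approach, the JavaScript below reads the term from a query string and wraps each match in mark tags. The helper names (escapeHtml, escapeRegExp, highlightTerm) are our own, and we’re assuming the input is plain text rather than existing markup; highlighting inside real HTML would need a proper parser.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap each case-insensitive match of `term` in <mark> tags.
// `plainText` is assumed to be plain text, not markup.
function highlightTerm(plainText, term) {
  if (!term) return escapeHtml(plainText);
  const pattern = new RegExp(escapeRegExp(term), "gi");
  let result = "";
  let lastIndex = 0;
  for (const match of plainText.matchAll(pattern)) {
    // Escape the text before the match, then wrap the match itself.
    result += escapeHtml(plainText.slice(lastIndex, match.index));
    result += "<mark>" + escapeHtml(match[0]) + "</mark>";
    lastIndex = match.index + match[0].length;
  }
  return result + escapeHtml(plainText.slice(lastIndex));
}

// Read the search term from a query string such as "?query=html5".
const term = new URLSearchParams("?query=html5").get("query");
console.log(highlightTerm("Yes, You Can Use HTML5 Today!", term));
```

Escaping both the surrounding text and the term matters here, because the search term comes from the URL and shouldn’t be trusted.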

The progress and meter Elements

Two new elements added in HTML5 allow for marking up of data that’s being measured or gauged in some way. The difference between them is fairly subtle: progress is used to describe the current status of a changing process that’s headed for completion, regardless of whether the completion state is defined. The traditional progress bar indicating download progress is a perfect example of this.

The meter element, meanwhile, represents a measurement within a known range, meaning it has definite minimum and maximum values. The spec gives the examples of disk usage, or a fraction of a voting population, both of which have a definite maximum value. You would therefore avoid using meter to indicate an age, height, or weight, all of which normally have unknown maximum values.

Let’s look in more detail at progress. The progress element can have a max attribute to indicate the point at which the task will be complete, and a value attribute to indicate the task’s status. Both of these attributes are optional. Here’s an example:

<h1>Your Task is in Progress</h1>
<p>Status: <progress max="100" value="0"><span>0</span>% </progress></p>

This element would best be used with JavaScript to dynamically change the value of the percentage as the task progresses. You’ll notice that the code includes span tags, isolating the number value; this facilitates targeting the number directly from your script when you need to update it.
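Here’s one way that script might look. This is only a sketch: progressState and updateProgress are hypothetical helper names, and we’re assuming the markup above, with a single span inside the progress element and a max greater than zero.

```javascript
// Hypothetical helper: clamp the task's state to 0..max and derive a percentage.
// Assumes max is greater than zero.
function progressState(value, max) {
  const clamped = Math.min(Math.max(value, 0), max);
  return { value: clamped, percent: Math.round((clamped / max) * 100) };
}

// Update a progress element and the span inside it so the two stay in sync.
function updateProgress(progressEl, value) {
  const state = progressState(value, progressEl.max);
  progressEl.value = state.value;
  const label = progressEl.querySelector("span");
  if (label) label.textContent = String(state.percent);
  return state;
}

// In a browser you might call:
//   updateProgress(document.querySelector("progress"), 42);
console.log(progressState(42, 100));
```

Keeping the calculation in its own function means it can be checked without a browser, while updateProgress does the actual DOM work.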

The meter element has six associated attributes. In addition to max and value, it also allows use of min, high, low, and optimum.

The min and max attributes reference the lower and upper boundaries of the range, while value indicates the current specified measurement. The high and low attributes indicate thresholds for what is considered “high” or “low” in the context. For example, your grade on a test can range from 0% (min) to 100% (max), but anything below 60% is considered low and anything above 85% is considered high. The optimum attribute refers to the ideal value. In the case of a test score, the value of optimum would be 100.
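To see how these thresholds divide up a range, here’s a small JavaScript sketch using the test-score numbers above. The meterRegion function is a hypothetical helper that mirrors the description in this section; browsers apply their own rules when drawing a meter, which also take optimum into account.

```javascript
// Hypothetical helper: below `low` is "low", above `high` is "high",
// and anything else inside the range is "medium".
// Defaults follow the attribute defaults: min 0, max 1, low = min, high = max.
function meterRegion(value, { min = 0, max = 1, low = min, high = max }) {
  // Keep the value inside the min..max range before comparing.
  const clamped = Math.min(Math.max(value, min), max);
  if (clamped < low) return "low";
  if (clamped > high) return "high";
  return "medium";
}

// The test-score example: 0 to 100, low below 60, high above 85.
const testScore = { min: 0, max: 100, low: 60, high: 85, optimum: 100 };

console.log(meterRegion(45, testScore));
console.log(meterRegion(70, testScore));
console.log(meterRegion(90, testScore));
```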

Here’s an example of meter, using the premise of disk usage:

<p>Total current disk usage: <meter value="63" min="0" max="320" low="10" high="300" title="gigabytes">63 GB</meter></p>

In the figure below, you can see how the meter element looks by default in Chrome and Firefox.


For better accessibility, when using either progress or meter, you’re encouraged to include the value as text content inside the element. So if you’re using JavaScript to adjust the current state of the value attribute, you should change the text content to match.

The time Element

Dates and times are invaluable components of web pages. Search engines are able to filter results based on time, and in some cases, a specific search result can receive more or less weight by a search algorithm depending on when it was first published.

The time element has been specifically designed to deal with the problem of humans reading dates and times differently from machines. Take the following example:

<p>We'll be getting together for our next developer conference on 12 October of this year.</p>

While humans reading this paragraph would likely understand when the event will take place, it would be less clear to a machine attempting to parse the information.

Here’s the same paragraph with the time element introduced:

<p>We’ll be getting together for our next developer conference on <time datetime="2015-10-12">12 October of this year</time>.</p>

The time element also allows you to express dates and times in whichever format you like while retaining an unambiguous representation of the date and time behind the scenes, in the datetime attribute. This value could then be converted into a localized or preferred form using JavaScript, or by the browser itself (although no browsers at the time of writing support this behavior).
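As a sketch of the JavaScript route, the function below formats a date-only datetime value for a given locale using the built-in Intl.DateTimeFormat. The name localizeDate is our own, and we’re assuming the value is in the YYYY-MM-DD form; other datetime formats would need different handling.

```javascript
// Hypothetical helper: turn a date-only datetime value (YYYY-MM-DD) into
// localized text. Date-only strings are parsed as UTC, so we format in UTC
// as well to avoid the date shifting by a day in some time zones.
function localizeDate(datetimeValue, locale) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(datetimeValue)) {
    throw new RangeError("Expected YYYY-MM-DD, got: " + datetimeValue);
  }
  const date = new Date(datetimeValue);
  if (Number.isNaN(date.getTime())) {
    throw new RangeError("Not a parseable date: " + datetimeValue);
  }
  return new Intl.DateTimeFormat(locale, {
    dateStyle: "long",
    timeZone: "UTC",
  }).format(date);
}

// In a browser, the value would come from timeEl.getAttribute("datetime").
console.log(localizeDate("2015-10-12", "en-US"));
console.log(localizeDate("2015-10-12", "de-DE"));
```

The exact wording of the output depends on the locale data available in the environment running the script.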

In earlier versions of the spec, the time element allowed use of the pubdate attribute. This was a Boolean attribute, indicating that the content within the closest ancestor article element was published on the specified date. If there was no article element, the pubdate attribute would apply to the entire document. But this attribute has been removed from the spec, even though it did seem to be useful. In his in-depth article on the time element, Aurelio De Rosa provides an alternative for the now dropped pubdate attribute, if you want to look at another method for achieving this.

The time element has some associated rules and guidelines:

  • You should not use time to encode imprecise dates or times (for example, “during the ice age” or “last winter”), because the value has to follow one of the precise formats defined in the spec.

  • The date represented cannot be “BC” or “BCE” (before the common era); it must be a date on the Gregorian Calendar.

  • If the time element lacks a valid datetime attribute, the element’s text content (whatever appears between the opening and closing time tags) needs to be a valid datetime value.

Here’s a chunk of HTML that includes many of the different ways to write a datetime value according to the spec:

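<!-- month -->
<time>2015-11</time>

<!-- date -->
<time>2015-11-12</time>

<!-- yearless date -->
<time>11-12</time>

<!-- time -->
<time>14:54:39</time>

<!-- floating date and time -->
<time>2015-11-12T14:54:39</time>

<!-- time-zone offset -->
<time>-0800</time>

<!-- global date and time -->
<time>2015-11-12T06:54:39-0800</time>

<!-- week -->
<time>2015-W46</time>

<!-- duration -->
<time>4h 18m 3s</time>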
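The formats named in the listing above differ enough in shape that a script can tell them apart. The JavaScript below is a rough sketch of that idea: classifyDatetime is a hypothetical helper, the sample values are illustrative, and the patterns only check the general shape of a value. They don’t confirm, for instance, that a month is between 01 and 12, and the duration pattern covers only the “4h 18m 3s” style.

```javascript
// Reusable pattern fragments (shape checks only, not full validation).
const DATE = "\\d{4,}-\\d{2}-\\d{2}";
const TIME = "\\d{2}:\\d{2}(?::\\d{2}(?:\\.\\d{1,3})?)?";
const OFFSET = "(?:Z|[+-]\\d{2}:?\\d{2})";

// Each entry pairs a format name from the listing with a pattern for its shape.
const FORMATS = [
  ["month", new RegExp("^\\d{4,}-\\d{2}$")],
  ["date", new RegExp("^" + DATE + "$")],
  ["yearless date", new RegExp("^(?:--)?\\d{2}-\\d{2}$")],
  ["time", new RegExp("^" + TIME + "$")],
  ["floating date and time", new RegExp("^" + DATE + "[T ]" + TIME + "$")],
  ["time-zone offset", new RegExp("^" + OFFSET + "$")],
  ["global date and time", new RegExp("^" + DATE + "[T ]" + TIME + OFFSET + "$")],
  ["week", new RegExp("^\\d{4,}-W\\d{2}$")],
  ["duration", /^(?:\d+(?:\.\d{1,3})?\s*[wdhms]\s*)+$/i],
];

// Return the name of the first format whose shape matches, or null if none do.
function classifyDatetime(value) {
  const trimmed = value.trim();
  const found = FORMATS.find(([, pattern]) => pattern.test(trimmed));
  return found ? found[0] : null;
}

// Illustrative sample values, one per format.
const samples = ["2015-11", "11-12", "-0800", "2015-W46", "4h 18m 3s", "last winter"];
for (const sample of samples) {
  console.log(sample, "=>", classifyDatetime(sample));
}
```

A value such as “last winter” matches none of the patterns, which is exactly why it can’t be encoded with the time element.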

The uses for the time element are endless: calendar events, publication dates (for blog posts, videos, press releases, and so forth), historic dates, transaction records, article or content updates, and much more.