More HTML5 Semantics: Changes to Existing Features

Louis Lazaris


The following is an extract from our book, HTML5 & CSS3 for the Real World, 2nd Edition, written by Alexis Goldstein, Louis Lazaris, and Estelle Weyl. Copies are sold in stores worldwide, or you can buy it in ebook form here.

While new elements and APIs have been the primary focus of HTML5, this latest iteration of web markup has also brought with it changes to existing elements. For the most part, any changes made have been done with backwards-compatibility in mind, to ensure that the markup of existing content is still usable.

We’ve already considered some of the changes (the doctype declaration, character encoding, and content types, for example). Let’s look at other significant changes introduced in the HTML5 spec.

The Word “Deprecated” is Deprecated

In previous versions of HTML and XHTML, elements that were no longer recommended for use (and so removed from the spec), were considered “deprecated.” In HTML5, there is no longer any such thing as a deprecated element; the term now used is “obsolete.”

Obsolete elements fall into two basic categories: “conforming” obsolete features and “non-conforming” obsolete features. Conforming features will provide warnings in the validator, but will still be supported by browsers. So you are permitted to use them but their use is best avoided.

Non-conforming features, on the other hand, are considered fully obsolete and should not be used. They will produce errors in the validator.

The W3C has a description of these features, with examples.

Block Elements Inside Links

Although most browsers handled this situation well in the past, it was never actually valid to place a block-level element (such as a div) inside an a element. Instead, to produce valid HTML, you’d have to use multiple a elements and style the group to appear as a single block.

In HTML5, you’re now permitted to wrap almost anything in an a element without having to worry about validation errors. The only block content you’re unable to wrap with an a element are other interactive elements such as form elements, buttons, and other a elements.

Bold Text

A few changes have been made in the way that bold text is semantically defined in HTML5. There are essentially two ways to make text bold in most browsers: by using the b element, or the strong element.

Although the b element was never deprecated, before HTML5 it was discouraged in favor of strong. The b element previously was a way of saying “make this text appear in boldface.” Since HTML is supposed to be all about the meaning of the content, leaving the presentation to CSS, this was unsatisfactory.

According to the spec, in HTML5, the b element has been redefined to represent a section of text “to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood.” Examples given are key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.

The strong element, meanwhile, still conveys more or less the same meaning. In HTML5, it represents “strong importance, seriousness, or urgency for its contents.” Interestingly, the HTML5 spec allows for nesting of strong elements. So, if an entire sentence consisted of an important warning, but certain words were of even greater significance, the sentence could be wrapped in one strong element, and each important word could be wrapped in its own nested strong.

Italicized Text

Along with modifications to the b and strong elements, changes have been made in the way the i element is defined in HTML5.

Previously, the i element was used to simply render italicized text. As with b, this definition was unsatisfactory. In HTML5, the definition has been updated to “a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text.” So the appearance of the text has nothing to do with the semantic meaning, although it may very well still be italic—that’s up to you.

An example of content that can be offset using i tags might be an idiomatic phrase from another language, such as reductio ad absurdum, a latin phrase meaning “reduction to the point of absurdity.” Other examples could be text representing a dream sequence in a piece of fiction, or the scientific name of a species in a journal article.

The em element is unchanged, but its definition has been expanded to clarify its use. It still refers to text that’s emphasized, as would be the case colloquially. For example, the following two phrases have the exact same wording, but their meanings change because of the different use of em:

<p>Harry’s Grill is the best <em>burger</em> joint in town.</p>
<p>Harry’s Grill <em>is</em> the best burger joint in town.</p>

In the first sentence, because the word “burger” is emphasized, the meaning of the sentence focuses on the type of “joint” being discussed. In the second sentence, the emphasis is on the word “is,” thus moving the sentence focus to the question of whether Harry’s Grill really is the best of all burger joints in town.

Neither i nor em should be used to mark up a publication title; instead, you should use cite.

Of all the four elements discussed here (b, i, em, and strong), the only one that gives contextual importance to its content is the strong element.

Big and Small Text

The big element was previously used to represent text displayed in a large font. The big element is now a non-conforming obsolete feature and should not be used. The small element, however, is still valid but has a different meaning.

Previously, small was intended to describe “text in a small font.” In HTML5, it represents “side comments such as small print.” Some examples where small might be used include information in footer text, fine print, and terms and conditions. The small element should only be used for short runs of text. So you wouldn’t use small to mark up the body of an entire “terms of use” page.

Although the presentational implications of small have been removed from the definition, text inside small tags will more than likely still appear in a smaller font than the rest of the document.

For example, the footer of The HTML5 Herald includes a copyright notice. Since this is essentially legal fine print, it’s perfect for the small element:

<small>&copy; SitePoint Pty. Ltd.</small>

A cite for Sore Eyes

The cite element was initially redefined in HTML5 accompanied by some controversy. In HTML4, the cite element represented “a citation or a reference to other sources.” Within the scope of that definition, the spec permitted a person’s name to be marked up with cite (in the case of a quotation attributed to an individual, for example).

The earlier versions of the HTML5 spec forbade the use of cite for a person’s name, seemingly going against the principle of backwards compatibility. Now the spec has gone back to a more similar definition to the original one, defining cite as “a reference to a creative work. It must include the title of the work or the name of the author (person, people, or organization) or a URL reference, or a reference in abbreviated form.”

Here’s an example, taken from the spec:

<p>In the words of <cite>Charles Bukowski</cite> -
<q>An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way.</q></p>

Description (not Definition) Lists

The existing dl (definition list) element, along with its associated dt (term) and dd (description) children, has been redefined in the HTML5 spec. Previously, in addition to terms and definitions, the spec allowed the dl element to mark up dialogue, but the spec now prohibits this.

In HTML5, these lists are no longer called “definition lists”; they’re now the more generic-sounding “description lists” or “association lists.” They should be used to mark up any kind of name-value pairs, including terms and definitions, metadata topics and values, and questions and answers.

Here’s an example using CSS terms and their definitions:

  <dd>The element(s) targeted.</dd>
  <dd>The feature used to add styling to the targeted element, defined before a colon.</dd>
  <dd>The value given to the specified property, declared after the colon.</dd>