By Louis Simoneau

Is HTML5 Dirty?

Last week, I wrote about Google’s mod_pagespeed. Some of the module’s features drew negative reactions in the comments, notably the fact that mod_pagespeed can be set to remove the quotes around attribute values in your markup and to strip unnecessary attributes (such as type="text" on an input element).

The commenters noted that they were uncomfortable having the server output invalid markup, or “destroying their good intentions.”

Ah, but it’s not invalid! Or at least, it doesn’t need to be.

One of the more interesting (and at least a little controversial) aspects of the HTML5 specification is its relaxing of several constraints on the exact syntax of your markup. The idea was to reduce as much of the complexity of an HTML document as possible, while maintaining backwards compatibility. As it turns out, browsers have always supported unquoted attributes, and they’ve always defaulted to an input type of text in the absence of a type attribute (or in the presence of a type attribute they don’t understand; this is why the new input types like "number" or "email" are backwards compatible). As a result, the simplest input element that’s fully backwards compatible is:

<input>

This works in every browser and correctly displays a text input box, which is why the authors of the HTML5 spec made it the minimum the specification requires to create that element. The same relaxed approach applies elsewhere. Quotes around an attribute value are only required if the value is empty or contains whitespace or any of the characters ", ', =, <, > or `. Element and attribute names are case-insensitive. And many attributes have a default value that is assumed if the attribute is absent (this is the case with the type attribute of the input element). The spartan HTML5 doctype, <!DOCTYPE html>, follows the same idea: it’s the minimum number of characters required to trigger standards mode in older browsers.
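To make the equivalence concrete, here is a minimal sketch in Python. It uses the standard library's html.parser as a stand-in for a browser's parser, which is an assumption for illustration only, since real browsers use their own parsers. The names StartTagCollector, parse_start_tags, effective_input_type and LEGACY_INPUT_TYPES are hypothetical, and the set of "legacy" types is only a rough guess at what a pre-HTML5 browser might recognise.

```python
from html.parser import HTMLParser

# Hypothetical set of input types a pre-HTML5 browser might recognise.
LEGACY_INPUT_TYPES = {
    "text", "password", "checkbox", "radio", "submit",
    "reset", "file", "hidden", "image", "button",
}


class StartTagCollector(HTMLParser):
    """Records every start tag as a (tag_name, attribute_dict) pair."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and
        # removes any quotes that surrounded the attribute values.
        self.tags.append((tag, dict(attrs)))


def parse_start_tags(markup):
    """Return the start tags found in a string of markup."""
    parser = StartTagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


def effective_input_type(attrs, known_types=LEGACY_INPUT_TYPES):
    """Model the fallback: a missing or unrecognised type means text."""
    requested = (attrs.get("type") or "text").lower()
    return requested if requested in known_types else "text"


verbose = parse_start_tags('<input type="text" name="q">')
terse = parse_start_tags('<INPUT type=text NAME=q>')

# Both spellings should produce the same parsed result.
print(verbose == terse)

# A bare <input> carries no attributes, so the type falls back to text.
bare_tag, bare_attrs = parse_start_tags('<input>')[0]
print(effective_input_type(bare_attrs))

# An input type unknown to the (hypothetical) older browser also falls back.
print(effective_input_type({"type": "email"}))
```

The point of the sketch is that quoting and letter case are surface details that disappear once the markup is parsed, while the type fallback is what lets the newer input types degrade gracefully.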

Even once you understand that, markup like <INPUT type=text> still looks wrong, doesn’t it? But, as Jeremy Keith argues extremely well in his Fronteers 2010 keynote, that’s a question of coding style, and the specification should be style-agnostic.

So, coming back to mod_pagespeed: if it can improve your performance by stripping out a bunch of needless bulk from your code in a way that every browser ever made will be able to parse without problem, then I say let’s do it. The good news is that the HTML5 spec has been built with this in mind, so those of us who care about doing things “the right way” can gain some peace of mind.
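For a sense of what "safe" quote removal involves, here is a rough sketch in Python of the rule an optimizer could apply when deciding whether an attribute value can go unquoted. This is not mod_pagespeed's actual implementation. The function names can_drop_quotes and render_attribute are hypothetical, and the character list follows the HTML5 syntax rules for unquoted attribute values.

```python
import html
import re

# Characters the HTML5 syntax forbids in an unquoted attribute value:
# whitespace, double quote, single quote, equals sign, angle brackets, backtick.
_NEEDS_QUOTES = re.compile(r"[\s\"'=<>`]")


def can_drop_quotes(value):
    """Return True if the value may be written without surrounding quotes."""
    # An empty value must stay quoted, otherwise nothing would follow the "=".
    return value != "" and _NEEDS_QUOTES.search(value) is None


def render_attribute(name, value):
    """Serialise one attribute, quoting the value only when required."""
    if can_drop_quotes(value):
        # Ampersands still need escaping, quoted or not.
        return f"{name}={value.replace('&', '&amp;')}"
    return f'{name}="{html.escape(value, quote=True)}"'


print(render_attribute("type", "text"))
print(render_attribute("class", "search box"))
print(render_attribute("value", ""))
```

Note that the check errs on the side of caution: Python's \s also matches non-ASCII whitespace, so a few values that could legally be unquoted will keep their quotes, which costs a couple of bytes but never produces invalid markup.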
