XHTML – An Introduction
HTML, as we know and love it, has for many reasons become rather lax and unruly. If you’ve diligently tested your Web pages on different browsers only to find that your carefully crafted masterpiece looks great in IE5x but becomes an illegible monster in Netscape 4x, then welcome to the club.
What can we do?
Well, we could spend all of our time whining about browser conformity, proprietary tags and standards. Or we could take a proactive stance and support the World Wide Web Consortium’s first recommendation for XHTML: XHTML 1.0.
This article takes a "quick start" approach aimed at the HTML author who wants to further their skills, and it concludes with links to more detailed information.
What is XHTML?
According to the W3C:
"XHTML 1.0 is a reformulation of HTML 4.01 in XML, and combines the strength of HTML 4 with the power of XML.
"XHTML 1.0 is the first major change to HTML since HTML 4.0 was released in 1997. It brings the rigor of XML to Web pages and is the keystone in W3C’s work to create standards that provide richer Web pages on an ever increasing range of browser platforms including cell phones, televisions, cars, wallet sized wireless communicators, kiosks, and desktops."
Sound good so far? Then read on…
Where do you start?
If you’ve no experience with XML you could be forgiven for being a little intimidated by it. But if you can code your pages in HTML, you’ll be pleased to know that learning XHTML will be extremely easy. It will also provide you with a superb introduction to XML along the way.
Essentially XHTML is just a stricter version of HTML 4.01 with a few considerations that you should be aware of as you mark up your pages.
Three flavors to choose from!
As you may know, the eXtensible Markup Language is not a markup language at all, but a way of defining markup languages by use of a Document Type Definition, or DTD. XHTML is one such language and there are three different DTDs to choose from.
- Strict – disallows use of all deprecated tags and attributes.
- Transitional – is far more forgiving and supports all those deprecated yet browser supported tags you most likely use every day.
- Frameset – is exactly the same as the transitional DTD but replaces the document body with a frameset.
You’ll probably want to use the transitional DTD as it provides the most forgiving environment for an introduction to XML and XHTML.
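For reference, these are the document type declarations for the three flavors as published in the XHTML 1.0 recommendation. Treat this as a quick lookup rather than something to paste in whole: a page carries exactly one of them, placed before the opening <html> tag.

```html
<!-- Strict -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- Transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!-- Frameset -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
```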
The main differences between HTML and XHTML
The specification requires that your documents be "well formed", which means that you have to pay special attention to certain aspects of your code. Below are the key points you need to be aware of.
1. Nested elements
Firstly you need to tidy up the way you treat your page elements. XHTML does not tolerate incorrect nesting so something like this:
<b><p>I'd probably have gotten away with it too if it weren't for you pesky W3C folks</b></p>
won’t pass muster at the W3C’s Validation service, but
<p><b>I'd probably have gotten away with it too if it weren't for you pesky W3C folks</b></p>
will be just fine. The same applies to all your markup tags.
2. Case Sensitivity
Both tags and their attributes are case sensitive in XHTML. The simple and strict rule is that all tags and attributes must be written in lower case. For example,
<A HREF="myPage.html">Some page</a>
will get you roasted alive by the XHTML Validator, but
<a href="wellFormed.html">Well formed page</a>
will work perfectly.
Most HTML designers leave out the end tags of certain elements, such as </p>. In XHTML every element must be closed. If you didn’t know <p> even had an end tag, you’re not alone. Here are the tags most likely to catch you out:
<th> <tr> <td> <li>
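As a small sketch of what this means in practice, here is a list written the old way and then with every element closed, followed by a table treated the same way. The list and table contents are made up purely for illustration.

```html
<!-- HTML habit: end tags left out -->
<ul>
  <li>First item
  <li>Second item
</ul>

<!-- XHTML: every element is closed explicitly -->
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>

<table>
  <tr>
    <th>Browser</th>
    <th>Version</th>
  </tr>
  <tr>
    <td>Netscape</td>
    <td>4.x</td>
  </tr>
</table>
```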
What about images and line breaks?
Good question. These elements are similar, and all require an end tag. That’s the way XML works, and of course XHTML is no exception even if there is no end tag in the HTML equivalent. You deal with this by including the end tag in its opener. Here’s an example:
<p>XHTML is strict but not really hard</p>
<img src="somePic.gif" alt="Some picture" /><br />
<p>See what I mean?</p>
<hr />
The trick is to leave a space before the trailing slash (<br /> rather than <br/>) so as not to confuse non-XHTML browsers.
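The same pattern applies to the other empty elements you are likely to meet, such as <meta>, <link> and <input>. A minimal sketch follows; the file name and field name are hypothetical.

```html
<!-- In the document head -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" type="text/css" href="styles.css" />

<!-- In a form -->
<input type="text" name="email" />
```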
There are a couple of things you should be aware of when you’re dealing with attributes. The first is that all your attribute values must be enclosed in quotes, and "double-quotes" are the usual choice.
The second is that attributes which have no value in HTML, such as the one in <ul compact>, must be given one in XHTML. The value is simply the attribute’s own name. It’s done like this:
<ul compact="compact">
Other attributes to watch for are:
ismap="ismap" declare="declare" nowrap="nowrap" compact="compact" noshade="noshade" checked="checked"
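To see both attribute rules side by side, here is a before-and-after sketch. It assumes the transitional DTD, since width and nowrap on a table cell are deprecated, and the field name is invented for the example.

```html
<!-- HTML habit: unquoted values and valueless attributes -->
<td width=100 nowrap>Price</td>
<input type=checkbox name=agree checked>

<!-- XHTML: every value quoted, every attribute given a value -->
<td width="100" nowrap="nowrap">Price</td>
<input type="checkbox" name="agree" checked="checked" />
```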
I hate to say it, but this is the point where XHTML becomes a bit of a pain. Most of the above is just a matter of disciplining yourself and developing good coding habits, but there are a few problems here that require special mention. They’ll almost certainly cause you trouble if you’re unprepared!
- Ampersands can be a problem within attribute values as well, for example in URLs that carry a query string. As a general rule of thumb, just use the corresponding entities (&amp;, &lt; and &gt;) for the &, < and > characters and make sure that you validate your pages properly.
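A short illustration of the ampersand rule, using a made-up URL and some made-up text:

```html
<!-- Not well formed: a bare ampersand inside the attribute value -->
<a href="search.php?term=xhtml&page=2">Next page</a>

<!-- Well formed: the ampersand is written as an entity -->
<a href="search.php?term=xhtml&amp;page=2">Next page</a>

<!-- The same goes for &, < and > in ordinary text -->
<p>Fish &amp; chips: use &lt;br /&gt; for a line break.</p>
```

The browser still requests search.php?term=xhtml&page=2, because the entity is decoded before the link is followed.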
Use id instead of name
The name attribute is now deprecated in favour of the new and preferred id attribute on elements such as <a>, <img>, <form> and <map>. Although name is still supported, you’ll get warnings when it comes to validation if you use it on a map tag, for instance.
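If you still need named anchors to work in older browsers, one common approach, suggested in the compatibility guidelines of the XHTML 1.0 recommendation, is to supply both attributes with identical values. A minimal sketch, with a hypothetical anchor called section2:

```html
<!-- id is the preferred identifier; name is repeated for older browsers -->
<h2><a id="section2" name="section2">Section two</a></h2>

<!-- Links to the anchor work the same way as before -->
<a href="#section2">Jump to section two</a>
```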
A simple XHTML document
Okay, enough with the do’s and don’ts. If you’re eager to get going here’s a simple XHTML document to get you started.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>XHTML is easier than you thought!</title>
</head>
<body>
<hr />
<p>As long as you remember the rules and guidelines above<br />
you'll soon be writing well formed documents. No really, you will!
</p>
<hr />
</body>
</html>
A detailed explanation of the declarations at the top of the document is beyond the intended scope of this "quick start" guide (and quite unnecessary for most designers), but here’s the simple version.
Line 1 tells the browser that we’re using XML 1.0 and gives the document’s character encoding as UTF-8 (8-bit Unicode).
Lines 2-4 state the DTD we’re using, which in this case is the transitional version.
Line 5 declares the XHTML namespace and the language attributes in the <html> tag.
And there you have it, you’re all set to start writing well-formed, standards-compliant XHTML pages! All you need do is use the code above as your basic template and start getting into good coding habits from the outset. If you find that you can’t validate every page on your site properly then don’t worry, it’s a pretty tough call. As long as you’re making an effort to validate as much as possible you’re doing a good job.
Further information and resources
XHTML 1.0: The Extensible HyperText Markup Language
Tutorials and articles
XHTML on the Mozilla Developer Network
An XHTML Roadmap for Designers