HTML vs XHTML - The conclusion

The web is full of HTML vs. XHTML discussion. But after reading a lot of articles, I still can’t decide on what is right - that is, what’s the “bottom line”.

First, some points about HTML vs XHTML are worth noting…

  1. Always use “strict” doctype, doesn’t matter if it’s HTML 4.01 or XHTML
  2. If you serve XHTML as “text/html”, then follow the guidelines of “Appendix C” of the W3C XHTML spec.
  3. DO NOT serve XHTML as XML (application/xhtml+xml)
  4. Even if you serve XHTML as “text/html”, it is not good because the browser will see it as a “tag soup” and not as valid HTML. (http://hixie.ch/advocacy/xhtml)

These and other related points do not seem to give a definitive answer to the question HTML or XHTML?

The point that I am trying to raise is what should a page author do TODAY - that is, considering that XHTML is dead and HTML 5 is coming, what is THE CORRECT WAY (forward compatible) to write the markup of a MODERN (web 2.0) web page.

That is one of the biggest benefits of XHTML. If only HTML would do that as well then it would be far easier to fix all the errors in web pages before they go live on the web. By having pages refuse to display while they contain errors it is much, much easier to find and fix the errors. The problem with just about all web pages written in HTML is that they contain hidden errors - and that should be unacceptable to any web professional.
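A rough way to see this difference, sketched in Python. The standard library's strict XML parser stands in for a browser handling application/xhtml+xml, and its lenient html.parser stands in for a text/html tokenizer. Neither is a real browser engine, and the BROKEN sample and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample with a typical hidden error: the <p> is never closed.
BROKEN = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hello</body></html>'


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Any well-formedness error rejects the whole document.
        return False


class TagCollector(HTMLParser):
    """Lenient HTML tokenizer that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_as_html(markup):
    """Feed the markup to the lenient tokenizer and return the start tags."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

The strict parser refuses the document outright, while the lenient one carries on and reports the tags it found, so the error stays hidden.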

With IE9 now having added support for XHTML all we need do now is to wait for IE8 and earlier to die and serving pages as XHTML will become the preferred way for web professionals to deliver their web pages. HTML will then be used only by hobbyists who are less concerned with getting the code right than they are with getting their content onto the web…

No, that’s not quite right, Stephen. Whether you use application/xhtml+xml or application/xml doesn’t matter much at all (not at all, in fact, except in some versions of IE). What makes a document XHTML rather than generic XML is the XML namespace declaration in the root element’s start tag:

<html xmlns="http://www.w3.org/1999/xhtml">

You could even use text/xml as the MIME type, although it’s more-or-less deprecated because of its odd rules about character encoding.

The MIME type only says it’s some application of XML. It’s the namespace that specifies which application. If you serve something as application/xhtml+xml and omit the namespace declaration (or don’t get it exactly right), it’s not XHTML.
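To make the namespace point concrete, here is a minimal Python sketch that checks whether a well-formed XML document has an XHTML root element. The function name is made up for illustration, and the check only looks at the root element, nothing else.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def has_xhtml_root(markup):
    """Return True if the root element is <html> in the XHTML namespace."""
    root = ET.fromstring(markup)
    # ElementTree spells namespaced names as "{namespace-uri}localname".
    return root.tag == "{%s}html" % XHTML_NS
```

A document whose root is a plain <html> with no xmlns, or with a namespace URI that is off by even one character, fails this check regardless of the MIME type it was served with.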

That’s a pretty strange statement to make. I know it might technically be true, but I don’t think it’s helpful. Given that more than half the people out there surfing the web have got IE up to and including v8, it’s not good to give any advice that could be misinterpreted as saying it’s OK to use xhtml+xml. And when you say it “will work in all browsers”, what you mean is that in IE it will simply display the source code - I wouldn’t call that “working”.

First, use the simpler HTML 5 doctype so you don’t have to hassle with questions like “HTML or XHTML doctype?” or “Strict or Transitional?”

I guess, but I happen to like a nice, full doctype. Mostly so I can have snarky <!--comments--> saying stuff like “strict, b*tches!” or whatever : ) I’ve only used that doctype for quickie pages and a page I wrote on my own site about using doctypes.

Second, your markup will not be seen as tag soup and you still can use all the best practices of XHTML.

Most of us do that with HTML4, I believe. Sure, HTML4 lets you leave off closing tags for elements such as p and li, but Best Practices says we should properly close them, so we do.
In fact, because of this leniency, I will also send my pages through the W3C validator and tell it to pretend my page is XHTML. I scroll past all the unclosed meta, link and img tags to see if I have any real errors that the HTML4 validator may have ignored.

Q: Do ALL browsers (IE 6 & 7 also) trigger Standards Mode with the HTML 5 doctype?

I dunno if ALL browsers do but I know IE6 and 7 do, as well as the other modern browsers. Browsers only sniff the doctype string to pick a rendering mode; even when it contains a reference URL, they don’t actually bother to fetch or read it… <!DOCTYPE html> therefore doesn’t need anything more than that… browsers just see a doctype that isn’t on their quirks list and that’s good enough for them.
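For reference, the HTML 5 doctype is written <!DOCTYPE html>, in any mix of upper and lower case. A small Python sketch that checks whether a page starts with that short form follows; the function name is made up, and the check says nothing about which rendering mode any particular browser actually picks.

```python
import re

# The short HTML5 doctype; the keyword and the name are case-insensitive.
HTML5_DOCTYPE = re.compile(r"\s*<!DOCTYPE\s+html\s*>", re.IGNORECASE)


def starts_with_html5_doctype(markup):
    """Return True if the markup begins with the short HTML5 doctype."""
    return HTML5_DOCTYPE.match(markup) is not None
```

A full HTML 4.01 or XHTML doctype with a public identifier and URL does not match, which makes this handy for finding pages that still carry the long form.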

XHTML is indeed dead - read this: http://www.w3.org/2009/06/xhtml-faq.html
You’re confused: XHTML2 is dead, officially. XHTML alone is not dead. However, none of us harbour illusions of XHTML-as-real-XML-as-HTML unless we work in a specialised environment (there are people who are storing data as XML files and displaying them on various devices as XHTML… just not regular web pages that may be visited by IE users).
For your viewing pleasure:

@xhtmlcoder

The conclusion was slightly wonky; you cannot [fully] use HTML5…

By “using” HTML 5, I only mean a couple of things…
First, use the simpler HTML 5 doctype so you don’t have to hassle with questions like “HTML or XHTML doctype?” or “Strict or Transitional?”

Second, your markup will not be seen as tag soup and you still can use all the best practices of XHTML.

whenever it evolves into something tangible [2020+] - it will have to support XHTML anyway.

HTML 5 does support XHTML (as XHTML 5) … just that it’s different from earlier XHTMLs in that it is not intended to be sent as “text/html”.

SO…conclusion time…
<conclusion>
USE HTML 5 doctype, send your pages as “text/html” and stick to
all the “best practices” of XHTML (except this one: <br />)
</conclusion>

Q: Do ALL browsers (IE 6 & 7 also) trigger Standards Mode with the HTML 5 doctype?

@Stevie
Thanks for your answers … this is getting interesting and a bit confusing again.

First, take “Tag Soup”

All that XHTML means is making your spot tags self-closing (eg <br />), whereas tag soup is when your code is a complete mess…

The Wikipedia article (http://en.wikipedia.org/wiki/Tag_Soup) has listed a lot of meanings for “Tag Soup” - but the one I am referring to is #5 “Use of XHTML syntax in a document which is served as HTML”.

So, <note #1>
“Tag Soup” is NOT just about messy code - a very well written XHTML markup served as “text/html” IS ALSO tag soup.

Moving on…

Who says that XHTML is dead? XHTML5 is in development in parallel with HTML5.

XHTML is indeed dead - read this: http://www.w3.org/2009/06/xhtml-faq.html

According to W3C, XHTML 5 is actually the XML serialization of HTML 5. This is different from the “legacy” XHTML (1, 1.1 & 2).

XHTML 5 is NOT intended to be sent as “text/html”. There are NO guidelines for XHTML 5 as XHTML 1 had in the infamous appendix C.
Glance at this “official” WHATWG wiki article…
http://wiki.whatwg.org/wiki/HTML_vs._XHTML which says…

"Note that XHTML 1.0 previously defined that documents adhering to the compatibility guidelines were allowed to be served as text/html, but HTML 5 now defines that such documents are HTML, not XHTML. "

So, <note #2>
XHTML (1 & 2) IS DEAD. XHTML 5 is XML serialization of HTML 5 which is entirely different from previous XHTML versions.

W3C Recommendation 26 January 2000, revised 1 August 2002

You call that freshly oven baked??

Man, I’m never visiting you for dinner.

Originally it didn’t cater for latter MIME dietary requirements as there wasn’t one specifically for XHTML hence mainly why the menu got revised…

[ot]Anyway, you usually read fortune cookies using the r command:

This cookie has a scrap of paper inside. It reads:--More--
Spinach, carrot, and jelly -- a meal fit for a nurse!

;)[/ot]

The conclusion was slightly wonky; you cannot [fully] use HTML5; it is non-normative, and in either case [IF] (when)ever it evolves into something tangible [2020+] - it will have to support XHTML anyway.

Yes, client-side scripting differs in various places if you use real x(ht)ml compared to ‘text/html’, as do various other things with the DOM.

Refer to the freshly oven baked tasty treat ‘fortune cookie’ of the day: http://www.w3.org/TR/xhtml1/#C_11

You should always avoid tag soup as far as humanly possible. Tag soup describes bad code, so it’s pretty much guaranteed that anything that can be called tag soup could be done better some other way.

MMmm… tag soup and spaghetti code… making me hungry.

The Tag Soup mention is referring to the fact that, as the browser (which sees “text/html” as the MIME type) goes through the XHTML, it sees things like the closing slashes as errors. So, the browser considers it technically invalid HTML, but browsers are pretty good at ignoring little errors like that, and they render the page fine. The reason we still write our tags like so <br /> (with the space before the slash) is because some older browsers had worse error handling and would choke on <br/> (I don’t believe any modern browser does though).
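As a rough illustration of that tolerance, this Python sketch feeds the three spellings of a line break to the standard library's lenient html.parser and records how each one is reported. This parser is only a stand-in, not a browser engine, and the class and function names are made up.

```python
from html.parser import HTMLParser


class BreakWatcher(HTMLParser):
    """Records how each <br> variant is reported by the tokenizer."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called for a plain start tag such as <br>.
        if tag == "br":
            self.events.append("start")

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style tags that end in "/>".
        if tag == "br":
            self.events.append("self-closed")


def br_events(markup):
    """Return one event per <br> found in the markup, in document order."""
    watcher = BreakWatcher()
    watcher.feed(markup)
    watcher.close()
    return watcher.events
```

All three forms come through without any error being raised; the only difference is that the slashed forms are flagged as self-closed.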

Do we have any more tasty food references? Oh yeah, cookies. Mmmm. Reminds me: as I understand it, if you’re writing “real” XHTML, your JavaScript has to be different from JavaScript that runs on HTML4. Or maybe that’s misinformation I got from some argument on these forums.

It’s difficult to get a definitive answer when you’re asking for people’s opinions!

My view is that, unless you are going to make use of other XML technologies that can’t be integrated into HTML, there is no advantage in using XHTML over HTML, and there’s more work involved. And you can’t do that unless you want to make your site an IE-free zone, which means that in practical terms there is no reason to use XHTML.

Yes, use the Strict doctype, and yes, close all your content-containing elements. But having to close spot elements is nonsense. Why is the construction <br></br> helpful, given that it can’t contain anything? It isn’t - <br> does the job just fine, it’s unambiguous and it’s much harder to get wrong, so use it, and don’t make life unnecessarily complicated for yourself.

There are articles which say that if XHTML is served as “text/html”, then it is seen as “Tag Soup” - so don’t use XHTML. Then there are some others that say it is NOT wrong to use XHTML, just use the “strict” doctype.

I think fauXHTML and tag soup are different things. All that fauXHTML means is making your spot tags self-closing (eg <br />), whereas tag soup is when your code is a complete mess - particularly when you have presentational markup but also when you have squillions of <div>s that are completely unnecessary.

Q: Now that XHTML is dead (as of December 2009), should we at all use it?

Who says that XHTML is dead? XHTML5 is in development in parallel with HTML5.

Q: Is it OK to let browsers see our markup as Tag Soup - I mean from the “Best Practices” viewpoint?

You should always avoid tag soup as far as humanly possible. Tag soup describes bad code, so it’s pretty much guaranteed that anything that can be called tag soup could be done better some other way.

Q: What if we completely abandon both HTML 4.01 and XHTML, and start using new HTML 5 with its new simpler doctype? (Two popular websites I know of, that currently use HTML 5 are Google and LinkedIn)

I’m not sure how you plan to “abandon HTML 4.01” and use HTML 5 … pretty much any code that’s valid as HTML 4.01 Strict is also valid as HTML 5!

Or if you just mean change the doctype, I don’t see any reason why not. I haven’t got round to thinking about that yet, partly because I don’t see any need to, but equally because the doctype is hard-coded into each of the 1000+ pages on my main website and I can’t be bothered to change them! Why don’t I think it is worth using HTML 5? It is only worth using the doctype if you are actually making use of HTML 5 features - and support for those is still somewhat flaky, so you’re in danger of making a site that doesn’t all work properly in some browsers.

And as for Google - Google is not an example of good practice!

Q: I don’t understand how HTML vs XHTML depends on the web page at hand - I mean how and in what ways a web page can dictate whether HTML or XHTML is right for it?

Leaving aside the unlikely issue of using genuine XML technology, whether to use XHTML or HTML comes down to authorial preference. What’s on the page shouldn’t make a difference, and if anyone says it does then it is almost certain that they are using ignorance to justify their own preference.

Q: Isn’t using XHTML all about author’s yearning to write strict markup, to always use “Best Practices”, to always use “latest & greatest” technology?

When I see people exhorting the use of XHTML, my first thought is “Can anyone say ‘bandwagon’?” … and that has been my view for about 10 years. But what once looked like a bandwagon for ‘early adopters’ now looks like it’s become a runaway train. IE’s complete inability to use XHTML properly has scuppered the potential benefits of XHTML (which, even if we could use it as XML, I think are pretty slim and would not be relevant on any of my websites). Now a lot of people use XHTML because they think it sounds better than HTML, and it’s got more buzzword cachet. The problem with being an early adopter is that sometimes you back the wrong horse.

Q: Can’t we use normal HTML 4/5 with all the “Best Practices” that we learned for XHTML?

<barack obama>Yes, we can!</bob the builder>

[Q]: Can I draw this conclusion:
“DO NOT USE XHTML, just use HTML 4.01 Strict, or new HTML 5 with known best practices of XHTML”

Sounds fair to me :cool:

Actually by “definitive answer”, I mean whether or not to use XHTML at all.

There are articles which say that if XHTML is served as “text/html”, then it is seen as “Tag Soup” - so don’t use XHTML. Then there are some others that say it is NOT wrong to use XHTML, just use the “strict” doctype.

Q: Now that XHTML is dead (as of December 2009), should we at all use it?

Q: Is it OK to let browsers see our markup as Tag Soup - I mean from the “Best Practices” viewpoint?

Q: What if we completely abandon both HTML 4.01 and XHTML, and start using new HTML 5 with its new simpler doctype? (Two popular websites I know of, that currently use HTML 5 are Google and LinkedIn)

… but they tend to depend on the side of the bed you woke up on…

Q: I don’t understand how HTML vs XHTML depends on the web page at hand - I mean how and in what ways a web page can dictate whether HTML or XHTML is right for it?

Q: Isn’t using XHTML all about author’s yearning to write strict markup, to always use “Best Practices”, to always use “latest & greatest” technology?

Q: Can’t we use normal HTML 4/5 with all the “Best Practices” that we learned for XHTML?

[Q]: Can I draw this conclusion:
“DO NOT USE XHTML, just use HTML 4.01 Strict, or new HTML 5 with known best practices of XHTML”

These and other related points do not seem to give a definitive answer to the question HTML or XHTML?

Well, think about it: what’s the definitive answer for
coffee or tea?
ford or chevy?
blond or brunette?
dog or cat?
beer or whiskey/whisky?

well, of course there are absolute definitive answers for the above*, but they tend to depend on the side of the bed you woke up on, or what kinds of fun drugs you’ve added to your coffee.

*answers: chocolate milk, toyota, black, cat, depends on the brewery/distillery

Personally I write in HTML4.01 Strict except for those sites of ours that are either legacy or have scripts that have a habit of generating HTML with little /'s everywhere.

…to write the markup of a MODERN (web 2.0) web page.

I’ll disagree with you there.

I believe Stephen was referring to the primary media type for XHTML Family documents actually being ‘application/xhtml+xml’, with ‘text/html’ possibly justifiable for vanilla XHTML in some backwards-compatible cases. That was mainly for legacy browsers or ones that couldn’t cope, so that something loosely resembling a typical HTML page rendered - a transitional fallback.

Also meaning ‘text/html’ would not be suitable for XHTML documents that added/included elements and attributes from ‘foreign namespaces’.

I suppose the answer is still the same: we have Microsoft’s browser dominance mainly to thank for this thread… Leaving HTML 4.01 as the only safe bet.

Thanks for all the responses… exciting

First, @Stomme

You’re confused: XHTML2 is dead, officially. XHTML alone is not dead

XHTML - as we know it and have been using it - is dead.
People use XHTML to “force” themselves to write strict & “best practices” markup. Sadly it ends up being tag soup for the browser.
It is this use of XHTML that is dead. The original intent and purpose of XHTML - to be “extensible” so that we can use other XML technologies with it - is salvaged by XHTML 5.

Some will argue that the original purpose was also to bring some sanity to the HTML markup by enforcing strictness - but I think we can do that with HTML too.

<conclusion>
if you JUST want to write strict markup, use HTML (4/5) and “force” yourself to exercise best practices with it.
</conclusion>

Next, @felgall

Statement 3 is wrong…

Statement #3 only highlights the fact that it is advised not to serve your XHTML as “application/xhtml+xml” - because the slightest error (for example, invalid markup in a comment entered by someone) will break the entire page. And this is not acceptable for pages facing the general public.
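The comment scenario can be sketched in Python, with the strict XML parser from the standard library standing in for a browser handling application/xhtml+xml. The page template, the sample comment and the function names are all hypothetical; escaping user input is shown as one common mitigation, not as something the XHTML spec prescribes.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical page template with a slot for a visitor's comment.
PAGE_TEMPLATE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>{comment}</p></body></html>"
)


def render_page(comment, escape_input):
    """Insert the comment into the page, optionally escaping &, < and >."""
    text = escape(comment) if escape_input else comment
    return PAGE_TEMPLATE.format(comment=text)


def is_well_formed(markup):
    """Return True if a strict XML parser accepts the whole page."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False
```

With a comment such as "Fish & chips <b>rule", the unescaped page is rejected as a whole, while the escaped version parses. One stray ampersand from a visitor is enough to take down a page served as real XML.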

The one to avoid using is serving XHTML as HTML

Instead, why not avoid XHTML altogether?

…unless you have a very good reason for doing so.

Please tell me at least one “very good” reason where it would be absolutely necessary to use XHTML and then send it as “text/html”.

Statement 3 is wrong. That MIME type is serving it as XHTML and not as XML. Serving it as XML will work in all browsers but then it gets treated as XML instead of HTML. Serving it as XHTML also works provided you don’t need it to be able to display in IE8 or earlier.

The one to avoid using is serving XHTML as HTML unless you have a very good reason for doing so.
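One workaround people have used for the IE8-and-earlier problem is content negotiation: send application/xhtml+xml only to clients whose Accept header lists it, and fall back to text/html otherwise. That technique is not proposed anywhere in this thread, so treat the Python sketch below as an assumption-laden illustration. The function name is made up, the sample headers in the comments are illustrative rather than captured from real browsers, and q-values are ignored for brevity.

```python
XHTML_TYPE = "application/xhtml+xml"
FALLBACK_TYPE = "text/html"


def choose_content_type(accept_header):
    """Pick a response MIME type from the request's Accept header.

    Simplification: q-values (such as ";q=0.9") are stripped and ignored.
    """
    offered = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if XHTML_TYPE in offered:
        return XHTML_TYPE
    # Clients that never mention XHTML (or only send */*) get plain HTML.
    return FALLBACK_TYPE


# Illustrative header from a browser that advertises XHTML support:
#   "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
# Illustrative header from a browser that does not:
#   "image/gif, image/jpeg, */*"
```

Note that the fallback branch is exactly the "XHTML served as HTML" case argued about above, so this only moves the question rather than settling it.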

Felgall has an interesting take on the evils of the HTML5 doctype:

The whole thread is worth reading before you get too excited about HTML5.