Choose the correct Doctype

Aarik · March 8, 2012, 5:26pm

Can anyone please tell me how to choose the correct XHTML doctype, and whether I should choose strict or transitional?

RyanReese · March 8, 2012, 7:38pm

Transitional is stupid and not really practical for today’s use… it’s basically for those in a transitional period to strict. So always choose strict doctypes :).

Now, as for which one to pick. You can either use XHTML or HTML. If you use an XHTML doctype, make sure you feed a meta tag which serves the page as text/xhtml because otherwise you aren’t REALLY doing XHTML. It’s FAKE XHTML. And if you DO SERVE IT as text/xhtml, then IE can’t play along. Which means it’s not feasible today.

So based on those facts, you only have one choice really (and the choice I always make). HTML Strict. Enjoy this doctype.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

felgall · March 8, 2012, 7:52pm

Transitional indicates that the page still contains HTML 3.2 tags (the previous standard that was replaced in 1997). Such tags were sometimes necessary as late as 2005 (when Netscape 4 died out) in order to make up for the deficiencies in old browsers with their lack of support for CSS. There has been no reason whatever for creating new web sites using transitional since then. The only sites still using transitional are supposed to be those created before 2005 that are still in the process of reworking the code to get rid of the HTML 3.2. Once all the 3.2 is gone those sites can then change their doctype to strict (which indicates that the site only uses HTML 4 or XHTML 1 tags - depending on which of the two doctypes is used).

XHTML 1 transitional which contains HTML 3.2 tags should be served as HTML since the HTML 3.2 tags are not a part of XHTML. The pages should be switched to use the strict doctype before attempting to serve the pages as XHTML. Serving pages as XHTML is still a few years away from becoming practical though, since there are still too many people using IE8 and earlier, which don’t support XHTML. There is nothing wrong with using an XHTML doctype if you plan on switching to using XHTML once IE8 dies out though, as long as you are aware that there are significant differences between the way JavaScript interacts with XHTML and the way it interacts with HTML. All of your JavaScript will need to be reviewed when you do switch from HTML to XHTML to check that the scripts still work.

Aarik · March 9, 2012, 11:06am

Lots of complication for choosing the doc type!

Stomme_poes · March 10, 2012, 8:36pm

Warning: the following text is all gobblegobblegobble. Thank you for flying!

A note:

First, another vote for Strict. If you are creating a new web page, there is no reason to use anything other than strict. This is for YOUR benefit, though, not the browser’s.
This is because browsers do very little with doctypes. In fact, a browser will only ever do one thing with your doctype: check to see if your doctype indicates that your web page was written in the 1990s or modern times. This is because browser vendors found it convenient to use the state of your doctype to decide if they should use outdated CSS rules when displaying the page or not. This is called “doctype switching.” It’s clever, and a bit silly.
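To make doctype switching a bit more concrete, here is a rough Python sketch of the decision. It is a simplified model, not any browser's actual code: the function name `rendering_mode` and the short table of public identifiers are assumptions for illustration, it assumes the doctype is the first thing in the document, and it only recognises a handful of doctypes (anything else comes back as "unknown").

```python
import re

# Matches <!DOCTYPE html ...> with an optional PUBLIC identifier and an
# optional system identifier (double-quoted only, to keep the sketch short).
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html'
    r'(?:\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?)?'
    r'\s*>',
    re.IGNORECASE,
)

# A deliberately tiny table; real browsers know many more identifiers.
STANDARDS_IDS = {
    "-//w3c//dtd html 4.01//en",
    "-//w3c//dtd xhtml 1.0 strict//en",
}
TRANSITIONAL_HTML4 = "-//w3c//dtd html 4.01 transitional//"
TRANSITIONAL_XHTML1 = "-//w3c//dtd xhtml 1.0 transitional//"


def rendering_mode(document):
    """Guess the rendering mode a browser would pick for this markup."""
    text = document.lstrip()
    match = DOCTYPE_RE.match(text)
    if match is None:
        # A doctype this sketch cannot parse is reported as unknown;
        # no doctype at all means quirks mode.
        if text.lower().startswith("<!doctype"):
            return "unknown"
        return "quirks"

    public_id, system_id = match.group(1), match.group(2)
    if public_id is None:
        # The short <!DOCTYPE html> form.
        return "standards"

    public_id = public_id.lower()
    if public_id in STANDARDS_IDS:
        return "standards"
    if public_id.startswith(TRANSITIONAL_HTML4):
        # HTML 4.01 Transitional depends on whether a system identifier is present.
        return "almost standards" if system_id is not None else "quirks"
    if public_id.startswith(TRANSITIONAL_XHTML1):
        return "almost standards"
    return "unknown"
```

Calling `rendering_mode("<html>")` gives "quirks", while any of the strict doctypes or the short `<!DOCTYPE html>` gives "standards".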

The “new” so-called HTML5 doctype ( <!doctype html> ) is considered “standard” by all the user agents you care about, so this counts as a “modern” doctype. You may safely use it if you’ve been thinking of it.

So what’s a doctype for, and why does it matter if it’s just for doctype switching? It’s for YOU. It’s for the HTML validator(s), which is a tool useful to YOU as a developer. The validator helps check if your document is following the rules proposed by your chosen doctype.

Adding a meta tag will not (necessarily) set how your document is read. You can make a meta tag saying
<meta http-equiv="content-type" content="foo/bar; charset=blah">
and you could still have flowers and puppies.

This is because browsers ignore this line unless the server serving the page isn’t telling the browser what the page is (same with the charset: if the server is setting the charset in a header, the browser can and will ignore the meta tag, because the server has precedence).
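The charset half of that precedence can be sketched in a few lines of Python. This is a simplified model, not what any browser literally runs: the helper names `charset_of` and `effective_charset` are made up for illustration, and the sketch ignores byte-order marks and content sniffing.

```python
from email.message import Message


def charset_of(content_type_value):
    """Pull the charset parameter out of a Content-Type style value."""
    msg = Message()
    msg["Content-Type"] = content_type_value
    # Returns None when the value carries no charset parameter.
    return msg.get_content_charset()


def effective_charset(http_header=None, meta_content=None):
    """Return the charset that wins: the HTTP header first, the meta tag second."""
    for value in (http_header, meta_content):
        if value:
            charset = charset_of(value)
            if charset:
                return charset
    return None
```

So `effective_charset("text/html; charset=utf-8", "foo/bar; charset=blah")` returns "utf-8", and the meta value only matters when the header is missing or has no charset.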

The reason you don’t put gobbledygook in that tag is: what if the server ISN’T sending the right MIME type? Then you’re in trouble. Supposedly, if “foo/bar” or even “foo” is not understood, the agent isn’t supposed to even try to display the information at all. It should give a MIME type error.

If you have “text/foo”, then the agent can fall back to “text/plain” if it has to. If it does, you’ll see the HTML tags on the screen.

And lastly, this is why, even if you wanted to serve a page as XHTML, the server would first have to send the correct MIME type. If it didn’t and you used text/xhtml, any user agent that doesn’t understand that can and will fall back to text/plain. Which you don’t want.

Instead you’d have type="application/xhtml+xml" and here, the agent should understand “application” is something that’s not text, and if it doesn’t understand “xhtml” it can fall back to “xml” which often still does what you want.

So.
If you use an XHTML strict doctype and a meta tag stating this document is application/xhtml+xml, served from a server that sends along the HTTP header stating the document is really text/html, then what you have… is an HTML document with a bunch of silly slashes, which are considered errors, and ignored. Weee…
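One workaround on the server side is to look at the Accept header of the request and only send application/xhtml+xml to agents that actually ask for it. A minimal Python sketch, assuming a hypothetical helper called `choose_mime_type` (not part of any framework) and a simplified reading of the Accept header that ignores wildcards:

```python
def choose_mime_type(accept_header):
    """Pick the Content-Type to send for an XHTML page.

    Returns application/xhtml+xml only when the agent lists it with a
    non-zero quality value; everything else gets text/html.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if media_type != "application/xhtml+xml":
            continue

        q = 1.0  # quality defaults to 1 when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"

    return "text/html"
```

An agent whose Accept header never mentions application/xhtml+xml simply gets text/html, which is the "HTML with silly slashes" case described above.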

What felgall said about how scripts work in real XHTML… they’ve made even more changes if you are using XHTML5 (so far as I can gather, you have XHTML5 if you are sending application/xhtml+xml to a user agent who has an HTML5 parser). It will parse the page as XHTML using some of the new rules added to XML and XHTML. I tried to read what those differences were but me engrish not bestest there.

felgall · March 10, 2012, 11:05pm

Not really.

Unless you have a specific reason to choose otherwise you should use:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">

If you have a reason for using a different doctype then you wouldn’t be asking which one to use.

samanime · March 10, 2012, 11:53pm

Actually,

My vote now is to use:
<!DOCTYPE html>

It’s the HTML5 DOCTYPE, though it’s perfectly acceptable for HTML 4.01 as well. It’s not truly understood by older browsers, but it won’t hurt them.

Stomme_poes · March 11, 2012, 11:19am

I’ll generally agree with samanime, but when I go to the validator, I tell it to act as if the page is XHTML1.0 Strict. Cause I want it to yell at me if I forgot to close a </p> or a </li> or didn’t quote some attribute.

felgall · March 11, 2012, 6:12pm

It is actually the short version of the SGML doctype for HTML 2 through HTML 4. HTML 5 doesn’t have an SGML doctype because it doesn’t follow the SGML standard - instead it currently has an HTML tag that looks like the short version of the HTML 2 doctype. Presumably once HTML 5 gets past the working draft they will add a different doctype to it so as to identify the difference between that and the current version since there are so many tags and attributes there that are alternatives for the same thing and obviously they will not all still be there once the standard is completed.

You can’t validate against a specific version of HTML if you use the short version of the doctype. (That’s a second reason why they’ll need a different doctype once HTML 5 actually exists).

Stomme_poes · March 11, 2012, 7:26pm

They won’t, though. They’ve said they don’t believe in versions anymore and call HTML a “living standard”. Kinda worthless for developers but, there you have it.

You can tell the validator to validate as HTML4 or XHTML1 if you want. I find HTML5 is very loose and allows you to write in 3.2 style. So I validate as XHTML. Works for me.

samanime · March 12, 2012, 3:10pm

I always validate as HTML 4.01 Strict (I don’t add the self closing />). =p

And like Stomme says, they’ve mentioned that the HTML 5 DOCTYPE is going to stay as it is. It’s not going to be given all that extra stuff any more.

system · March 12, 2012, 3:18pm

Which is precisely why I just use XHTML 1.0 Strict in the first place, and you couldn’t pay me to use that HTML 5 idiocy that once again I cannot believe ANYONE is dumb enough to even WANT to use in the first place.

But again, it’s crafted for all the people who never pulled their heads out of 1997’s backside and until recently were just sleazing out HTML 3.2 and slapping a transitional doctype on it. Now they use the HTML 5 lip-service; net change zero, they’re still just whoring it out. It’s certainly not meant for anyone who embraced strict, minimalism, separation of presentation from content, or any other sane coding practice or progress of the past decade!

samanime · March 12, 2012, 5:05pm

It’s not meant for them, but that doesn’t mean they can’t use it. =p

Stomme_poes · March 12, 2012, 8:03pm

Yeah, make the poor stupid browser do all the work of finding those erroneous slashes everywhere and remembering to ignore them as Mostly Harmless.

I kid, I doubt it makes a dent in rendering time (if it did, Standardistas would be b*tching), but I feel it’s a waste of typing for me.

Why other things like closing p’s and li’s and quoted attributes don’t, I dunno.

felgall · March 12, 2012, 8:52pm

Well the HTML 5 doctype is also the short version of the HTML 2 doctype so you’d expect it to be very loose in what it allows since it doesn’t identify what version of HTML the page is written in.

Of course HTML 5 is very much like HTML 2 in many ways so using the same doctype for both is probably quite reasonable.

I’m sure that those writing validators will come up with HTML 5 strict and transitional doctypes that pages can use to avoid having to specify the version of HTML manually even if those throwing tags at the HTML 5 label to see what they can get to stick have stated that they will not be doing so.

As for using an XHTML doctype - it makes perfect sense to do so if you intend to actually use XHTML once all the browsers support it. Once IE8 is dead, using XHTML properly for creating web pages will be possible, and then the closing / will be required in order for the page to display at all.

system · March 13, 2012, 6:13am

For me it’s about clarity and consistency; something that HTML has always lacked, being based on SGML. If I have a big multi-attribute empty tag, using just > means that if I scroll down so all I can see is that closing bracket, I don’t know if that’s a tag opening something or closing something… always using /> on empty tags avoids that. Say I take something like:


<img
  src="images/something.png"
  width="320" height="480"
  alt="some random image"
  class="trailingPlate"
>

While elsewhere I might have:


<a
  href="#someReallyLongPointlessURLMakingMeWantSeparateLines"
  title="say where this link is going because there is something wrong with the content"
  class="trailingPlate"
>

… and scroll down so all I can see is:


  class="trailingPlate"
>

I don’t know if that’s a close or an open to a tag. Simply adding the shorttag slash to the endings of empty elements makes it clearer what’s going on.

The difference between > and /> is important for code clarity for me. It’s one of the things that made me move to XHTML in the first place despite my distaste for extra characters and tags.

Clarity also being why I hate the “let’s stuff all our attributes onto one illegible line” rubbish.

Stomme_poes · March 13, 2012, 7:47am

Since the browser does not care, I see why this doesn’t matter. Since the validator allows the looser rules of HTML5, I would always tell the validator to pretend it’s something stricter. So, you’re right that it is looser, but not because it’s a short version of HTML2 but because that’s what the WHATWG (and later W3C) have stated are the rules. They just went back on a lot of things, explicitly. I’m sure this was easy most of the time since browsers knew how to be easy on the rules deep in their parsers.

Browsers don’t know or care the difference between Strict and Transitional. They don’t know the difference between XHTML and HTML if the server is telling them text/html anyway. They certainly don’t care what version of HTML it is. Versions are for us developers, so when we talk to each other about rules, we are all on the same page. I find Mozilla’s plans to drop versions similarly retarded. Someone has a new bug? “What version of FF are you using?” “Dunno, FF.” Arg.

Aha, now I see why you disliked Perl

Clarity for someone is a good enough reason for me, so that makes sense.