HTML vs XHTML

Hey there. This is Debbie, and this is my first time on Sitepoint.

I am trying to learn web-design and am curious what people think about using HTML versus XHTML.

The book I’m reading talks about HTML 4, HTML Transitional, HTML Strict, XHTML Transitional, and XHTML Strict. (Think I got those right?!) :confused:

What is the best one to learn?

Debbie

Since my laptop is running that inferior OS, I installed IE9b. It does apparently support xhtml as application/xhtml+xml. If it supplies an error message, I don’t know how to see it. Instead of not rendering bad syntax at all, it renders to the first error, then nothing. This can be confusing, to say the least.

I have made four very simple demos (no attempt to extend the DTD). One is error free, one has a missing closing tag, one has interlaced elements, and one I’m not telling. :wink: The last can be difficult to spot even with an error message; IE9b is just plumb confusing.

See xhtml demos.

cheers,

gary

LOL! I’m with you on “the garbage” being added. Although I have to admit being guilty of it myself, up to about 2 months ago.

The fix with the JS encodeURI function will not work in my case, because the url I posted before comes straight from the Indeed.com XML API feed, unconverted.
I’m inserting all the converted XML resultset values into an XHTML <div> template on the server, and then print/echo the headers and resultset back to the browser in the XmlHttpRequest.
For the job url (as posted before), I urlencode only the equal signs, starting from the second one, for each job url in the resultset, because Indeed.com escapes the job url (see the &amp; in the job url) except for the equal signs.
But XHTML forbids more than one equal sign in attribute values, so it breaks the XHTML template.

Whew! Way to make a girl freak out?! :o

The book basically covers XHTML served as HTML, not real XHTML. Apart from the doctype it uses and a few extra slashes, it is actually teaching easy-to-maintain HTML: many things that you can leave out in HTML are required in XHTML, the book teaches you to include them, and leaving them out makes the code harder to maintain.

Oh, okay.

By learning the XHTML syntax for web pages served as HTML first you end up with better HTML than you would if you were to learn HTML along with all the shortcuts it allows (such as leaving out <head>, <body>, </p> etc tags and letting the browser figure out where those tags should be inserted).

That’s why I bought that book, because I have been taught that XHTML is supposed to be better formatted and all than back in the old days of sloppy HTML’ing.

So basically once you have finished writing your page using the XHTML taught in that book simply replace all /> with > and change the doctype at the top to the HTML 4.01 strict version and your page will be using well written HTML 4.01 strict.

Can you please include the exact code that I change and what the new code looks like? (:

Right now I have this…

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

<title>TITLE</title>

<link type="text/css" rel="stylesheet" href="Layouts_02_CreateSectionsUsingDIVs.css" />
</head>

(Sorry, I don’t mean to be so helpless, but I know that the above syntax is VERY touchy and that if you get one thing off it will not work properly!)

Thanks for the clarification! :slight_smile:

Debbie

No. It should be exactly as I wrote it with strict in the second line.

What do you mean exactly?

Can you paste some code here to correct what I am doing wrong in that area?

and removing the closing slash on empty elements.

So instead of this…

<br />

or

<img src="image.jpg" alt="image" />

I should be doing it like this…

<br>

or

<img src="image.jpg" alt="image">

Is that correct?

If there is more, the W3 validator ought to catch it. Syntactically there are only minor differences in the practical sense.

Why didn’t the W3 Validator complain with the way things were written in XHTML Strict?

(Or am I supposed to be checking my code like it is HTML?)

Don’t knock the book on that account; I have a javascript book from an author known to favor html, but the examples are xhtml syntax. :shrug: It doesn’t make any difference for what I’m studying, nor will it for what you’re studying except for the silliness of it.

cheers,

gary

Okay, I just freaked out because a couple of you guys started making me feel like I was doing it all wrong?! :eek:

Debbie

Only worry about using the html DTD, and removing the closing slash on empty elements. If there is more, the W3 validator ought to catch it. Syntactically there are only minor differences in the practical sense.

Don’t knock the book on that account; I have a javascript book from an author known to favor html, but the examples are xhtml syntax. :shrug: It doesn’t make any difference for what I’m studying, nor will it for what you’re studying except for the silliness of it.

cheers,

gary

I don’t see what the big difference is between the two.

About the only differences I see are things like putting “ />” at the end of empty elements, typing attributes in lower-case, and quoting attribute values.

The differences seem pretty insignificant to me, so why not use the newer XHTML?

Debbie

XHTML 1.0 was adopted in January 2000, and HTML 4.01 was adopted in December 1999. Not much difference in age, eh?

The two have different purposes, but both may be served as text/html. For myself, I use xhtml simply because for a long time I supported an intranet site that used real xhtml, served as application/xhtml+xml, and got into the habit of using xhtml syntax, plus all my development tools reflect that syntax.

For someone just starting, there is no good reason to use anything other than html4 strict DTD, or the developing html5 generic DTD.

cheers,

gary

You are saying I should use HTML 4 strict?

XHTML is not supported by Internet Explorer versions prior to IE9.

Really??:eek:

So if I use XHTML then most people using Internet Explorer won’t be able to see my web pages?

(I have been studying from a Sitepoint book and they seemed to imply that I should use XHTML.)

Debbie

HTML 4 strict.

The transitional ones are for old web pages that still have some HTML 3.2 tags that haven’t been replaced yet.

XHTML is not supported by Internet Explorer versions prior to IE9.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

<title>TITLE</title>

<link type="text/css" rel="stylesheet" href="Layouts_02_CreateSectionsUsingDIVs.css">
</head>
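For more than a page or two, the two steps described in this thread (swap the doctype, replace every /> with >) could be scripted. What follows is only a rough Python sketch, a plain text rewrite rather than a real HTML parser. The name xhtml_to_html401 is made up for illustration, and dropping xmlns and xml:lang from the <html> tag is an assumption based on the HTML 4.01 example above.

```python
import re

# Matches the XHTML 1.0 Strict doctype, including the line break before the DTD URL.
XHTML_STRICT_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD XHTML 1\.0 Strict//EN"\s+"[^"]*">'
)
HTML401_STRICT_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '"http://www.w3.org/TR/html4/strict.dtd">'
)


def xhtml_to_html401(source: str) -> str:
    """Hypothetical helper: naive text rewrite of XHTML-style markup."""
    # Step 1: swap the doctype (first occurrence only).
    source = XHTML_STRICT_DOCTYPE.sub(HTML401_STRICT_DOCTYPE, source, count=1)
    # Assumption: the XML-only attributes are dropped as well.
    source = re.sub(r'\s+xmlns="[^"]*"', "", source)
    source = re.sub(r'\s+xml:lang="[^"]*"', "", source)
    # Step 2: turn the " />" that ends an empty element into ">".
    return re.sub(r"\s*/>", ">", source)
```

Run the result through the W3 validator afterwards; a text replace like this knows nothing about scripts or comments that happen to contain "/>".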

IE has had an XML parser for quite a few browser versions now (it was introduced in IE3 or IE4 I think). It is proper handling of XHTML that MS has only just added in IE9.
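Since older IE versions do not handle application/xhtml+xml, one workaround is to pick the MIME type per request from the browser's Accept header. A minimal sketch, assuming the server can see that header; pick_content_type is a hypothetical name, and the sketch deliberately ignores q-values.

```python
def pick_content_type(accept_header: str) -> str:
    """Hypothetical helper: choose a MIME type for an XHTML page."""
    # Split "type/subtype;q=..." entries and keep only the media types.
    offered = [part.split(";")[0].strip().lower()
               for part in accept_header.split(",")]
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    # Browsers that never list the XHTML type fall back to text/html.
    return "text/html"
```

A page sent as text/html is parsed as HTML, so it only gets the strict XML error handling when the first branch is taken.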

The simplest solution then is to update the server side code to encode the URLs before adding them into the template.

They are not just fooling the validator, they are fooling themselves if they do that, since the page is invalid once the attribute is added. It is just that what the validator sees is the valid page prior to the garbage being added.

The querystring in a URL should always be properly encoded for use with XHTML (and doing so with HTML doesn’t hurt either). You don’t have to use the HTML entity codes to do it either as URLs accept a shorter version of encoding than that. The JavaScript encodeURI() function may be the easiest way of quickly encoding those strings - just set up your own simple web page with a form to paste in the URL and a button to convert the content using encodeURI() and then you can copy and paste from there back into the web page source you are fixing.
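For the server-side route, here is a rough sketch using Python's standard library as a stand-in for whatever language builds the template (this is not the JavaScript encodeURI() function itself). The helper name job_link and the example.com URL are made up for illustration.

```python
from html import escape
from urllib.parse import urlencode


def job_link(base_url: str, params: dict, label: str) -> str:
    """Hypothetical helper: build an <a> element for an XHTML template."""
    # urlencode() percent-encodes each value and joins the pairs with "&".
    url = base_url + "?" + urlencode(params)
    # escape() turns the raw "&" separators into "&amp;" (and escapes quotes),
    # which is what an XML parser expects inside an attribute value.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(label))


print(job_link("http://www.example.com/viewjob",
               {"jk": "abc 123", "atk": "x=y"}, "Q&A"))
# prints: <a href="http://www.example.com/viewjob?jk=abc+123&amp;atk=x%3Dy">Q&amp;A</a>
```

Two different encodings are at work here: percent-encoding for the values inside the URL, and &amp; for the separators once the URL is placed in markup.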

XHTML is not supported by Internet Explorer versions prior to IE9.

Really? I could have sworn I read that MS had no intentions of ever implementing an XML parser even in ie9.

A shame if so. I take great pride in my pages breaking in hilarious ways for those curious ignoramuses who still use IE. They can all go suck a fat one as far as I’m concerned. Same goes for all those who claim this to be a valid reason for the superiority of HTML. application/xhtml+xml FTW!

That’s what I’m already doing. That was resolved easily.

I just pointed out another real life issue when using XHTML strict. As per the topic title. The 2+ equal signs in an attribute value.

One more caveat to take into account with XHTML Strict:
More than one equal sign (=) in an attribute value is kissing “well-formed” goodbye.
And accepting “not well-formed” as your new buddy(ette).

E.g. with XHTML Strict, the url in the following code block will not validate. In the href= attribute, an error will be thrown from the 2nd equal sign and those following it. (NOT counting the equal sign glued to href, so the one glued to indpubnum is throwing the first “not well-formed” error.)

<a title="Master security Architect" href="http://www.indeed.com/viewjob?jk=2d503d7424708a28&amp;indpubnum=9092264779851944&amp;atk=15b4e29th0k4159i" rel="external nofollow">Master security Architect</a>

The exact same url will be accepted of course if you fall back to a less strict (non-XHTML) DTD.
Now comes the funny bit:
Simply encoding all 3 equal signs (from ?jk= onwards) as either %3D or &#61; will break the url, resulting in a “This job was not found. Please try another job search.” message.
The problem is caused by encoding the first equal sign following ?jk, as either %3D or &#61;.

If you leave this one alone and encode the last 2 in the url, everything works. The job will be found, and the XHTML document will be well-formed.
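The equal-sign claim is easy to test outside the browser. Below is a minimal sketch that uses Python's bundled XML parser as a stand-in for a browser's XHTML parser, which is an assumption since browsers may report errors differently. It only checks well-formedness, not validity against the Strict DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Job link from the post: three "=" inside href, ampersands written as &amp;.
escaped = ('<a title="Master security Architect" '
           'href="http://www.indeed.com/viewjob?jk=2d503d7424708a28'
           '&amp;indpubnum=9092264779851944&amp;atk=15b4e29th0k4159i" '
           'rel="external nofollow">Master security Architect</a>')

# Same link with the ampersands left raw, as an unescaped feed value would be.
raw = escaped.replace("&amp;", "&")

print(is_well_formed(escaped))  # prints: True
print(is_well_formed(raw))      # prints: False
```

In this check the extra equal signs are accepted as long as every & is written as &amp;. With a raw &, the parser reads &indpubnum as the start of an entity reference and gives up at the = that follows it, which may be why the error seemed to point at the equal sign.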

There is also the “problem” with the discarded target attribute, which leads people to fool the XHTML validity checker by adding the target attribute to the document’s <a> elements in the DOM through JavaScript.

XHTML is a can of worms, most of the time.

The example of valid html could be even more sparse:

<title>title</title>
<p>lorem ipsum

The DTD is not required, except by the validator to know which grammar to check.

The http-equiv is an alternative to the server response header.

The tags for the html, head, and body are optional.

The end tag for p is optional. The p element is required only in the sense that body must contain at least one block element or script.

The title element is required.

My example is valid html, but not good practice. It is invalid xhtml, thus xhtml teaches better habits.
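The XHTML half of that statement can be checked with any XML parser. Here is a small sketch using Python's standard library; it tests well-formedness only (a weaker condition than validity), and the fully tagged version is my own expansion of the two-line example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The sparse example: no html/head/body tags and no </p>.
minimal_html = "<title>title</title>\n<p>lorem ipsum"

# The same content with every optional tag written out.
explicit_html = ("<html><head><title>title</title></head>"
                 "<body><p>lorem ipsum</p></body></html>")

print(is_well_formed(minimal_html))   # prints: False
print(is_well_formed(explicit_html))  # prints: True
```

An HTML parser fills in the missing tags for you; an XML parser stops at the first thing that is not explicitly closed or nested.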

cheers,

gary

Oops!! My bad!! :blush:

The book I spent all weekend reading is “Head First: HTML with XHTML & CSS”. (You’d think I’d know what I was reading?!) :lol:

Well, they definitely imply that you should use XHTML for like 8 reasons.

Don’t be jealous, though, because I also bought a Sitepoint book!! (Just didn’t get to it yet.)

I guess my confusion was because I bought another Sitepoint book a while ago - something about doing design w/o tables - and I think it pretty much said go with XHTML as well, even though the book was about CSS primarily.

From the little I know, HTML 4 is supposed to be pretty ancient. I mean they created XHTML like back in 2000, right?!

And I - for one - am not going to worry about IE6.

Debbie