XHTML 1.0 Transitional = XML Parsing Error: Entity 'nbsp' not defined

The new W3C Validator was released today, and the new version is causing some errors to appear in my pages that weren’t there before.

We are using XHTML 1.0 Transitional as our doctype, and the validator is throwing errors on all of my special characters, like &nbsp; and &copy;, saying “XML Parsing Error: Entity ‘nbsp’ not defined”.

They do not show up as errors if forcing the XHTML 1.0 Strict doctype–which seems backwards to me :expressionless:

Any idea on how to keep this from happening without changing our doctype?

Try using the numerical Unicode character entities rather than the named ones.

For example,


&#169 ;

instead of &copy; and


&#160 ;

instead of &nbsp; (but with no spaces between the number and the semicolon)

I haven’t tested this (since I don’t develop or deploy using a Transitional DOCTYPE), so it may or may not work.
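If there are a lot of named entities to swap out, the replacement can be scripted. Here is a minimal sketch in Python, assuming the markup is available as a string; the function name `to_numeric_entities` is made up for illustration, and the entity table comes from the standard library's `html.entities` module. It leaves the five entities that XML itself predefines alone.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself; these never need replacing.
XML_PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}

def to_numeric_entities(markup):
    """Replace HTML named entity references with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names untouched
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)

print(to_numeric_entities("&copy;&nbsp;2007 &amp; beyond"))
# prints: &#169;&#160;2007 &amp; beyond
```

This is a plain text substitution, so it will also touch anything that merely looks like an entity reference inside scripts or comments; check the result before publishing.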

Also check that the character encoding you’re saving your Web pages in matches the encoding you declared in them, and that both match the HTTP headers sent by your server.

These both validate for me:

Hopefully that helps. I would have to see your code to help further.

Cheers,
Micky

You should never use named entity references with XHTML.

Those are defined in the DTD, and a non-validating XML parser (which is what browsers use) is not required to read a DTD. Using a named entity reference other than &lt;, &gt;, &amp;, &quot; or &apos;* in XHTML may cause a fatal well-formedness error in user agents.

You will get away with this practice if you serve your documents as HTML, but that means you require your XHTML to be served as HTML, which is harmful.

*) You mustn’t use &apos; if you serve your document as text/html; use &#39; instead.
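The well-formedness problem is easy to reproduce outside a browser. The sketch below uses the XML parser from Python's standard library, which does not read any external DTD; that is only a stand-in for what a browser's XML parser may do, not a claim about any particular browser. The helper name `is_well_formed` is made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML without reading any DTD."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p>a&nbsp;b</p>"))    # named HTML entity: False
print(is_well_formed("<p>a&#160;b</p>"))    # numeric reference: True
print(is_well_formed("<p>a &amp; b</p>"))   # predefined XML entity: True
```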

@AutisticCuckoo: I wish I had your knowledge when it comes to XHTML!!!

:slight_smile:

I swear I have read your XvHTML FAQ like a ka-zillion times, and I still find myself serving XHTML Strict pages as HTML.

Seems like most people on the net serve-up harmful pages, myself included.

Sitepoint also breaks these rules?

You know what would be nice: A set of example pages with DTD’s and source that follow the REAL rules of XHTML… I do not think I have ever seen a true XML XHTML strict document.

Edit:

I guess a better question is this: What is wrong with the examples I provided? Could you tell me how one would make them a true XHTML document? Or, I guess I have never fully understood why it is so harmful to serve XHTML as HTML… I mean, I think I understand why you would consider it wrong/harmful, but… I have never encountered any problems when doing things the harmful way.

Am I making any sense?

Hmm, I think it would be great to see a tutorial where someone takes a “valid” but “harmful” XHTML strict, existing, page (maybe from a well-known website)… and converts it to a TRUE XML XHTML document… that would be interesting to see.

I personally learn best by example/visuals.

Visit my blog (there’s a link in my sig) with a browser that supports XHTML and says so, e.g., Opera 9 or Firefox. I’m using silly content negotiation to serve XHTML as application/xhtml+xml to user agents that claim to prefer that, and HTML as text/html to the rest.
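For anyone curious what that kind of content negotiation looks like, here is a rough sketch in Python. It is not the code my blog uses; the function names are hypothetical, and the rule it applies (serve XHTML only when the Accept header names application/xhtml+xml explicitly, with a q-value at least as high as text/html) is one reasonable assumption among several. A bare `*/*` is deliberately not treated as a claim to prefer XHTML.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"

def parse_accept(header):
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q-value as "not acceptable"
        prefs[media_type] = q
    return prefs

def pick_content_type(accept_header):
    """Choose XHTML only for clients that explicitly list it at least as high as HTML."""
    prefs = parse_accept(accept_header)
    xhtml_q = prefs.get(XHTML, 0.0)
    html_q = prefs.get(HTML, 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return XHTML
    return HTML

print(pick_content_type("application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
print(pick_content_type("*/*"))
```

The first call picks application/xhtml+xml and the second falls back to text/html. Whatever the script chooses, the markup sent has to match it: real XHTML in the first case, HTML-compatible markup in the second.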

Try serving them as an application of XML and look at them in different browsers. Since you’re using PHP, you can do this by adding this before your doctype declaration:

<?php
  header('Content-Type: application/xhtml+xml');
  echo '<?xml version="1.0" encoding="iso-8859-1"?>', "\n";
?>

Your samples may work in some browsers (IIRC, Firefox uses a hard-coded list of HTML entities even for XHTML and Opera 9 seems to do the same), but they will fail in others (Opera 8?).

It’s not harmful as such. But if you write your ‘XHTML’ markup, CSS and/or JavaScript in such a way that the document will work only if served as HTML, then you’re engaging in a practice that I would label harmful. If you claim it’s XHTML but it doesn’t work when served and interpreted as XHTML, then you’re doing something wrong.

As long as your documents comply fully with Appendix C of the XHTML 1.0 specification, you can serve them as text/html. That’s not harmful, only pointless (from a technical point of view).

Ahhh, thanks for the clarification and tips! :slight_smile:

So, would it be wise of me to convert my DTD’s to HTML 4.01 strict? I guess I would have to read-up on the differences between that and the XHTML strict/transitional ways that I have been currently coding in… This is when your FAQ comes in handy! :smiley:

Kinda frustrating because it seems like many books, professors, and other folks have taught me to code using XHTML dtd.

I guess my main concern: Will my CSS render differently using HTML strict DTD? Will that avoid quirks mode?

Sorry for such noobie questions. :frowning:

Thanks a billion AutisticCuckoo!
Cheers,
Micky.

Just thought I would post a couple of examples of confusing info to me:

http://forum.mootools.net/viewtopic.php?id=1673&p=3#post-21696

A couple of things. First - and this may or may not be related to your problem - use a Strict doctype. Never use a Transitional doctype. Let me explain why:

The Transitional DTD includes everything in the Strict DTD plus deprecated elements and attributes: Transitional DTD = Quirks mode while Strict DTD = Strict mode. Quirks mode and strict mode are the two ‘modes’ modern browsers use to interpret CSS (visual layout or styles). Documents with Transitional DOCTYPEs (or no DOCTYPE at all) are displayed using the “quirks” mode. This mode emulates legacy bugs and behaviors of version 4 browsers. Documents with Strict DOCTYPEs are displayed using the “strict” mode. This mode follows W3C specifications as closely as possible.

When Netscape 4 and Explorer 4 implemented CSS, their support did not match the W3C standard (or each other). To make sure that their websites rendered correctly in the various browsers, web developers had to implement CSS according to the wishes of these browsers against the W3C specifications (remember “This website is optimized for Netscape 4”). When standards compliancy became important browser vendors faced a tough choice. They had to:

  • Allow web developers who knew their standards to choose which mode to use.
  • Continue displaying old pages according to the old (quirks) rules.

In other words, all browsers needed two modes: quirks mode for the old rules, strict mode for the standard. Choosing which mode to use requires a trigger, and this trigger was found in ‘doctype switching’ (AKA Transitional vs Strict).

[End XHTML Lesson]

A lot of Web Standards books will recommend that you begin coding with Transitional and when you are more comfortable with the syntax rules can move to Strict. That assumes that you 1) Code your pages for only 1 browser and 2) Code complete garbage. Since that is rarely the case that is bad information. If you code in Transitional your pages will look different in every browser as they attempt to emulate legacy behaviour and you will go crazy writing hacks. Code in Strict and you can actually debug your pages effectively.

Anyway, right the second thing, it looks like you have a syntax error on line 13 of your HTML: it looks like the trailing “,” after “type: ‘fade’” (Firebug throws this warning: trailing comma is not legal in ECMA-262 object initializers).

And:

http://stickmanlabs.com/lightwindow/

Your DOCTYPE, just use STRICT or TRANSITIONAL, anything else is just lazy. I have no intention of making this thing work any other way, that would be promoting what I think is bad CSS.

(Note: I think the above folks are referring to XHTML DTD’s and not HTML 4.01 strict DTD’s???)

The lightwindow site fails validation horribly… He also uses custom attributes for certain elements (“caption=” and “params=” within the A tag)… the author says:

As for the custom attributes, xhtml is xml, if you need it to validate you can always add your own dtd based off of your normal doctype and then you would be fine.

Seems like all these Ajax coders prefer the strict XHTML DTD’s… why? Will the code not work as HTML 4.01 strict?

Arrrrrrgh!!!

Lol, ok… sorry… had to vent.

Cheers,
Micky

It doesn’t matter much whether you use HTML 4.01 Strict or XHTML 1.0 Strict as long as the latter complies with Appendix C. I personally prefer HTML 4.01 Strict because it feels more ‘honest’, but that’s just me.

If you’ve been writing XHTML markup and serving it as text/html, the only things you need to change are:

  • remove any XML declaration
  • replace the doctype declaration with the one for HTML 4.01 Strict
  • remove any xml:lang attribute
  • remove the ‘/’ from self-closing tags like <br />
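Those four changes are mechanical enough to script. The following is only a sketch in Python using regular expressions, with a made-up function name `xhtml_to_html401`; it assumes fairly tidy markup and will also rewrite any “/>” that happens to appear inside scripts, comments or attribute values, so treat it as a starting point rather than a converter.

```python
import re

HTML401_STRICT = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                  '"http://www.w3.org/TR/html4/strict.dtd">')

def xhtml_to_html401(markup):
    """Apply the four changes listed above to XHTML markup served as text/html."""
    # 1. Remove any XML declaration at the start of the document.
    markup = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", markup)
    # 2. Replace the doctype declaration with the one for HTML 4.01 Strict.
    markup = re.sub(r"<!DOCTYPE[^>]*>", HTML401_STRICT, markup,
                    count=1, flags=re.IGNORECASE)
    # 3. Remove any xml:lang attribute (the plain lang attribute stays).
    markup = re.sub(r"""\s+xml:lang=("[^"]*"|'[^']*')""", "", markup)
    # 4. Remove the '/' from self-closing tags like <br />.
    markup = re.sub(r"\s*/>", ">", markup)
    return markup

sample = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html lang="en" xml:lang="en"><body><p>Line<br />break</p></body></html>'
)
print(xhtml_to_html401(sample))
```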

The differences between Strict and Transitional are far more profound. Basically, most purely presentational element types and attributes have been removed in the Strict DTDs. The behavioural target attribute is also gone.

It frustrates me, too, when alleged experts recommend XHTML indiscriminately.

Since you have served your XHTML markup as HTML, there will be no difference at all. You’ve been using HTML all the time. There are very few cases when XHTML vs HTML affects CSS rendering and you’re not likely to have encountered them. (It’s possible that some of your pages wouldn’t render as intended if you were to serve them as an application of XML, though.)

As for standards mode vs quirks mode, that depends on your doctype declaration. Make sure to include the FSI – the URI to the DTD – at the end:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

LOL :slight_smile:
The only thing left is the Frameset DTD, and that’s only applicable for frameset pages anyway. The interior pages within a frameset normally need a Transitional DTD, which allows the target attribute.

Since these coders generally want their pages to work in Internet Explorer, they’re serving their pretend-XHTML pages as HTML. Being honest and using HTML markup won’t affect the Ajax code.

AutisticCuckoo… THANKS!!! Great info!

You have really helped clear-up some confusion for me… I can not thank you enough for your time. :smiley: :tup: :spf:

Your knowledge is golden. :wink:

Now that I have a better understanding of what-is-what, I will start doing things the right way.

I sure do wish this knowledge was more wide-spread. Glad there are folks like you spreading the word.

Have a great day/night!
Cheers,
Micky

Thanks, Micky! I’m glad I could be of help. :slight_smile: