I can’t get a page to validate as W3C HTML 4 Strict, and I think the reason is that I’m using PHP includes (in files with an HTML extension, although the validation still failed when I tested with the extension changed to .php). Trying different DTDs hasn’t helped, but removing the first PHP include makes the page validate.
The error message is:
"Sorry, I am unable to validate this document because on line 24 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.
The error was: utf8 “\x96” does not map to Unicode"
But I can’t see anything in that line that looks wrong. It’s just a regular <p>text</p>.
But if it’s a bad char, then how come the page validates when I remove the line that calls the PHP include?
Also, when I completely remove that line it still fails at line 23, so I don’t think the problem is in that line itself. I think it’s being caused by something earlier in the code, which is where the PHP include is.
<div id="navigationTabs">
<?php include('phpincludes/navinc.php'); ?>
</div>
</div>
<div id="mainContent">
<div id="leftColumns">
<div id="leftleftColumns">
<h1 class="largerRedText">Algeria</h1>
**This is line 23 -----> <p>There is great potential in Algeria for UK companies because the market is just beginning to open up. Companies that persevere and decide to enter the market now will reap the dividends. The economic fundamentals are strong and there are plans for significant government spending over the next 5 years, in particular on infrastructure projects. </p>
</div>
You are not providing all the code, and what you do provide validates.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title></title>
<meta name="created" content="Tue, 26 Oct 2010 15:02:41 GMT" >
<meta http-equiv="content-type" content="text/html;charset=utf-8" >
<meta name="description" content="" >
<meta name="keywords" content="" >
</head>
<body>
<div>
<div id="navigationTabs">
<?php include('phpincludes/navinc.php'); ?>
</div>
</div>
<div id="mainContent">
<div id="leftColumns">
<div id="leftleftColumns">
<h1 class="largerRedText">Algeria</h1>
**This is line 23 -----> <p>There is great potential in Algeria for UK companies because the market is just beginning to open up. Companies that persevere and decide to enter the market now will reap the dividends. The economic fundamentals are strong and there are plans for significant government spending over the next 5 years, in particular on infrastructure projects. </p>
</div></div></div>
</body>
</html>
I had to add some div open and close tags, but this validates.
I would suspect something else in your code has an issue, or running something through my text editor eliminated an invalid character.
according to an app called unicodeChecker, the decimal 150/hex 96 is:
• Character Name: <control>
• Unicode 1.0 Name: START OF GUARDED AREA
• Block: Latin-1 Supplement
• Designated in Unicode 1.1
• start of guarded area
• guarded area, start of
• area, start of guarded
my guess is that this isn’t the character that’s actually in the document, but for one reason or another a byte is getting interpreted as that. anyway, i don’t think the character itself matters that much.
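One way to see why a single byte can show up as different characters is to decode it under a few encodings. The sketch below is in Python rather than PHP, purely as a scratch check outside the site. It assumes nothing about the actual file and only looks at the byte 0x96 from the validator message. Under Latin-1 the byte becomes the control character U+0096 listed above; under Windows-1252 it is an en dash, which is a plausible candidate if text was pasted from a word processor; as UTF-8 it is not valid on its own.

```python
raw = b"\x96"  # the byte reported by the validator

# Latin-1 maps each byte straight to the code point with the same number,
# so 0x96 becomes U+0096, the control character described above.
as_latin1 = raw.decode("latin-1")

# Windows-1252 assigns the same byte to the en dash (U+2013).
as_cp1252 = raw.decode("cp1252")

# In UTF-8, 0x96 can only appear as a continuation byte inside a multi-byte
# sequence, so on its own it cannot be decoded.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(hex(ord(as_latin1)), hex(ord(as_cp1252)), valid_utf8)
```

If the included file was saved as Windows-1252 while the page declares utf-8, a dash in the navigation text would produce exactly this kind of validator error. That is an assumption about the file, not something the thread confirms.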
the line you’ve marked as 23 may well be line 23 in your unprocessed PHP file, but it won’t be line 23 in the HTML that PHP outputs, which is what the validator sees and is therefore talking about. access the webpage in your browser, view source, select all the source and paste it into your text editor. now see what line 23 is.
a weird byte is getting in there somehow, and it’ll probably be coming from your phpincludes/navinc.php file. for example, i find that copying and pasting code from web pages sometimes leaves odd characters in there. in BBEdit, the text editor i use, there’s a menu option called zap gremlins, which replaces all odd characters with a bullet point. that’s the sort of problem you’ve got.
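The "find what is really on that line" step can also be done with a small script instead of by eye. This is a minimal sketch in Python, assuming you have the raw bytes of the page as served (for example saved to disk without the editor re-encoding them). The function name `find_invalid_utf8` and the sample markup are made up for illustration and are not from the original page.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, byte_offset_in_line, hex_byte) for each byte
    that cannot be decoded as UTF-8."""
    problems = []
    # Splitting on the newline byte is safe: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        offset = 0
        while offset < len(line):
            try:
                line[offset:].decode("utf-8")
                break  # the rest of the line is clean
            except UnicodeDecodeError as err:
                bad = offset + err.start
                problems.append((line_number, bad, f"0x{line[bad]:02x}"))
                offset = bad + 1  # keep scanning after the bad byte
    return problems


# Invented sample standing in for the PHP output; the second line carries
# a stray 0x96 byte.
sample = b"<ul>\n<li>News \x96 events</li>\n</ul>\n"
print(find_invalid_utf8(sample))
```

To run it against a real page you would replace `sample` with the bytes read from a saved copy, opened in binary mode (`open(path, "rb").read()`). Running the same check on navinc.php itself shows whether the byte originates there.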
> But if it’s a bad char, then how come the page validates when I remove the line that calls the PHP include?
because that odd byte will be coming from the code that’s being included. navinc.php is where the problem is coming from.
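Once the byte is located in the include, one possible repair is to rewrite the file as clean UTF-8. The sketch below assumes the stray bytes are Windows-1252 characters such as an en dash, which the thread does not confirm. It keeps every valid UTF-8 sequence as it is and reinterprets only the undecodable bytes. The names `repair_to_utf8` and `cp1252_fallback` are hypothetical.

```python
import codecs


def _cp1252_fallback(err):
    # Reinterpret only the bytes that failed to decode as Windows-1252.
    # Bytes that are undefined in Windows-1252 become U+FFFD.
    bad = err.object[err.start:err.end]
    return bad.decode("cp1252", errors="replace"), err.end


# Register the handler under a made-up name so bytes.decode() can use it.
codecs.register_error("cp1252_fallback", _cp1252_fallback)


def repair_to_utf8(data: bytes) -> bytes:
    """Return data as valid UTF-8, treating invalid bytes as Windows-1252."""
    return data.decode("utf-8", errors="cp1252_fallback").encode("utf-8")


print(repair_to_utf8(b"News \x96 events"))
```

Work on a copy of the file and check the result in an editor before uploading. Replacing the character with an HTML entity such as `&ndash;` is an alternative that avoids the encoding question altogether.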