Weird characters show on basic web page

I am just starting a web site, using the 2008 version of Build Your Own Web Site The Right Way. I typed in the markup, and as far as I can tell there are no errors. However, when I view it as a web page, there are weird characters preceding the heading. They show up in the page source too, but not in Notepad when I edit the file. Any ideas where they come from?

My money is on your website/server being at fault (especially if you don’t see the problem off-line). Do you have a link to the website in question?

Usually, it is not necessary to signal UTF-8 encoding via the BOM, as you could use a META element, <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />, or an XML declaration, <?xml version="1.0" encoding="UTF-8"?>, if you write XHTML grammar, etc.

That said, it sounds like your server is misbehaving or is set up wrong. I’ve seen somebody else post a similar question on these forums: their web server was set up wrong, so they asked the web site administrators to fix it, and then it worked.

The only real downside of not using UTF-8, if you are just writing English, is that you cannot enter certain special characters, e.g. € or £, directly in your code; you would have to use a ‘Character Entity Reference’ instead. Some languages, such as CJK (Chinese, Japanese and Korean), need multi-byte characters rather than plain ASCII, so you would basically need something like UTF-8 for those.
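For anyone who wants to see what those references look like, here is a rough Python sketch that swaps non-ASCII characters for entity references. The function name `to_entities` is made up for this example; it uses a named reference where the standard HTML table has one and a numeric reference otherwise, and it leaves plain ASCII (including & and <) untouched.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML character reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII is passed through unchanged.
            out.append(ch)
        elif cp in codepoint2name:
            # Named reference, e.g. the euro sign becomes &euro;
            out.append(f"&{codepoint2name[cp]};")
        else:
            # No named form available: fall back to a numeric reference.
            out.append(f"&#{cp};")
    return "".join(out)


print(to_entities("Price: \u20ac5 or \u00a34"))  # Price: &euro;5 or &pound;4
```

A page written this way contains only ASCII bytes, so it displays the same whichever of the common encodings the browser assumes.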

Now, if it is the server doing this, then you should ask for it to be fixed; you shouldn’t have to suffer if it’s their fault, which, from what you have said, is what it sounds like to me.

Don’t worry, Steve, about posting the reply in the wrong place. I’ve asked the Moderators to move it back here; there was no need to start a new topic. :wink:

Yes, it’s the BOM as I suspected: http://www.w3.org/International/questions/qa-utf8-bom I assume you are writing all your HTML within Notepad. Are you using Windows XP or higher?
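To illustrate why a BOM can turn into visible junk, here is a small Python sketch. It assumes the page is being read as Windows-1252 instead of UTF-8, which is only a guess at the mismatch in this case; under that assumption the three BOM bytes come out as the characters ï»¿.

```python
import codecs

# The three bytes a UTF-8 BOM consists of: EF BB BF.
bom = codecs.BOM_UTF8

# Decoded as UTF-8, the BOM is a single invisible character, U+FEFF.
as_utf8 = bom.decode("utf-8")

# Decoded as Windows-1252 (the assumed wrong encoding), each byte
# becomes its own visible character: U+00EF, U+00BB, U+00BF.
as_cp1252 = bom.decode("cp1252")

print([hex(ord(c)) for c in as_utf8])    # ['0xfeff']
print([hex(ord(c)) for c in as_cp1252])  # ['0xef', '0xbb', '0xbf']
```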

Well, you don’t really need the BOM. If you are following the book (which I don’t own), I think it asks you to save as ‘Encoding UTF-8’; that shouldn’t really cause these issues, though.

Does this only happen when you upload the pages to your website or when you have them off-line too?

You would possibly benefit from a better editor than Notepad for making the pages or removing the BOM. My brain is completely dead at the moment due to the joys of my keyboard and mouse malfunctioning.
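If switching editors isn’t an option, the BOM can also be stripped after saving. A minimal Python sketch, assuming the file really is UTF-8; the helper names `strip_utf8_bom` and `strip_bom_from_file` are invented for this example:

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(path: str) -> bool:
    """Rewrite the file in place without its BOM; return True if it changed."""
    p = Path(path)
    original = p.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        p.write_bytes(cleaned)
        return True
    return False
```

Calling `strip_bom_from_file("index.html")` (a placeholder file name) before uploading would leave the markup untouched apart from the first three bytes. It would be sensible to keep a copy of the original file first.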

But if you can expand a little upon what I’ve asked regarding your setup and web browser, that would be good. :slight_smile:

Hi xhtmlcoder,

I am writing the HTML within Notepad in Windows XP, and this only happens when I upload the pages to my website. In Notepad, I save it as UTF-8, which is probably causing the problem, but the book says to select Unicode (UTF-8). In the Importance of UTF-8 paragraph, though, the author says that if I neglect to save it as UTF-8, I am unlikely to notice any difference, but someone else viewing the site (say, a Korean reader) will. Well, this site is going to Native Americans, who probably use English or at least know it. Will they see it any differently? If you think not, I won’t save it as UTF-8 and will see what happens.

xhtmlcoder asked for a code sample. Here is the start of my source from Notepad:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>

and here is the start of the source behind the site, with the weird characters:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>

Thanks,

Steve

I suspect it’s down to character encoding. Are you saving the files as UTF-8 via Notepad? I suspect either a BOM has somehow been placed there or you have an encoding mismatch.
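One way to tell those two cases apart is to look at the raw bytes of the saved file. Here is a rough Python sketch; the function name `describe_encoding` and its return strings are made up for illustration, and it only distinguishes "has a BOM", "decodes as UTF-8" and "does not decode as UTF-8".

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Give a rough description of how a file's bytes are encoded."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 with BOM"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes such as a lone 0xE9 (an accented e in Windows-1252) land here.
        return "not valid UTF-8"
    return "valid UTF-8 (or plain ASCII), no BOM"


print(describe_encoding(b"\xef\xbb\xbf<html>"))  # UTF-8 with BOM
print(describe_encoding(b"<html>"))              # valid UTF-8 (or plain ASCII), no BOM
print(describe_encoding(b"caf\xe9"))             # not valid UTF-8
```

In practice the bytes would come from reading the saved page in binary mode, for example `open("index.html", "rb").read()`, where the file name is just a placeholder.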

Do you have an example of your faulty ‘source code’ you could show us?