This is unlikely to be an Apache problem (tho I’m not ruling it out).
The text itself: what was it created in (basically I’m asking if it was made in Word or some other Windows product), and how was it saved?
When a browser cannot show a character outside US ASCII, it usually means one of these three things doesn’t match the others:
the way the document was saved (a text editor can save as a certain charset)
the way the HTML page states the charset
the HTTP header the server states
The server will always be able to override the meta tag in the HTML page, so even if the meta tag were wrong, you’d be ok if Apache’s sending the page out as UTF-8 or whatever you wanted.
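To make that precedence concrete, here’s a rough Python sketch of how the choice gets made. The function name `effective_charset` and the `windows-1252` fallback are just my assumptions for illustration (real browsers also sniff for a byte order mark and have locale-dependent defaults), so treat it as a simplification rather than what any browser literally does:

```python
import re

def effective_charset(content_type_header, html_text, default="windows-1252"):
    """Guess which charset wins: HTTP header first, then the meta tag."""
    # 1. A charset in the Content-Type header overrides the meta tag.
    if content_type_header:
        m = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type_header, re.I)
        if m:
            return m.group(1).lower()
    # 2. Otherwise fall back to whatever the page's meta tag declares.
    m = re.search(r'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', html_text, re.I)
    if m:
        return m.group(1).lower()
    # 3. Nothing declared anywhere: assumed fallback (hypothetical default).
    return default

page = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
header_wins = effective_charset("text/html; charset=UTF-8", page)
meta_used = effective_charset("text/html", page)
```

With the header present, the wrong meta tag is simply ignored; the meta tag only matters when the server sends no charset at all.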
However, if the page was saved in some Windows program as one of the Windows charsets and Apache says UTF-8 (or anything other than that Windows charset), the browser will have trouble. It will decode the page using the charset stated in the HTTP header, and the non-ASCII bytes won’t mean what the author intended.
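If it helps to see what the browser is up against, this small Python sketch plays both roles: the editor saving the file and the browser decoding it. The sample text and variable names are mine, and `cp1252` is Python’s name for Windows-1252:

```python
text = "På svenska"

# What the editor writes to disk, depending on the save-as charset.
saved_as_windows = text.encode("cp1252")  # å is the single byte 0xE5
saved_as_utf8 = text.encode("utf-8")      # å is the two bytes 0xC3 0xA5

# Header says UTF-8 but the file is Windows-1252:
# 0xE5 followed by a space is not valid UTF-8, so it becomes U+FFFD.
shown_1 = saved_as_windows.decode("utf-8", errors="replace")

# Header says ISO-8859-1 but the file is UTF-8:
# each of the two bytes is shown as its own (wrong) character.
shown_2 = saved_as_utf8.decode("iso-8859-1")

print(shown_1)
print(shown_2)
```

The first case is the classic question-mark diamond, the second is the classic "Ã¥" garbage. Neither can be fixed by the browser, because the bytes and the label simply disagree.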
If PHP is serving the page then I would think you’d have to go make sure PHP isn’t doing anything weird with the encoding.
But check the documents first. One thing you can test is manually typing out character entities:
I write in UTF-8. I had some Swedish on a page, På svenska, and while I normally manually write out special chars, this one time I didn’t (for the å).
My colleague put everything on the server and set the server and the database to ISO-8859-1. The å wouldn’t show up, but all my other characters did, because they were written like &# 229; (without the space). Once I changed the å to a decimal character entity too, it appeared! Yes, the db and server should have been utf-8 but for some reason everyone @work thinks Latin 1 is an excellent choice for international-language pages (they want to include Greek. hahahahahaha).
If your chars show up when you type them out like that, then definitely it’s a charset mismatch, likely with the document itself.
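If you don’t fancy typing the entities by hand for the test, Python’s standard library can generate them. This is only a sketch with my own sample text; the point is that the result is pure ASCII, so it reads the same under any ASCII-compatible charset:

```python
import html

text = "På svenska"

# Replace every non-ASCII character with a decimal character reference.
ascii_safe = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Decoding the references again gives back the original text,
# which is what the browser does when it renders the page.
round_trip = html.unescape(ascii_safe)

print(ascii_safe)
```

If the page displays correctly with the output of this pasted in, but not with the raw character, that points at the saved encoding rather than at Apache.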
Ah! It was indeed the document itself. I was using an application called EditPlus, and I guess the default encoding (at least, I’ve never touched the encoding before) was Ansi. So, I changed it to utf-8 and that did the trick.
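For anyone with a pile of files already saved the old way, re-saving can be scripted instead of opening each one in the editor. A minimal Python sketch, assuming "Ansi" here means Windows-1252 (it does on Western European Windows installs, but not everywhere) and using function names I made up:

```python
from pathlib import Path

def to_utf8(raw, source_encoding="cp1252"):
    """Re-encode bytes from the assumed source charset to UTF-8."""
    # Raises UnicodeDecodeError on bytes undefined in the source charset,
    # which is a hint that the guess about the source encoding is wrong.
    return raw.decode(source_encoding).encode("utf-8")

def resave_as_utf8(path, source_encoding="cp1252"):
    """Rewrite one file in place as UTF-8 (keep a backup before running)."""
    p = Path(path)
    p.write_bytes(to_utf8(p.read_bytes(), source_encoding))
```

Only run it once per file: converting a file that is already UTF-8 would decode it with the wrong charset and produce exactly the garbage you were trying to get rid of.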
Great pickup! I knew it was in the language specification but, as an English-only adult (after studying three other languages through my school years), I really don’t have a clue about language specification (and have to look it up all the @#$% time!).
You’re right about that being an error, and I think the HTML validator would have complained, but so long as Apache’s sending out a charset header, the browser wouldn’t choke or anything. It would just ignore the meta tag either way.
Yes Live Headers is one of my few plugins, it’s very nice : )
Linda: glad it worked. If you’re getting files from others, you’ll need to ensure they too save as the correct charset. We have that trouble still with people sending us files written in Word : (
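One cheap way to screen incoming files is to check whether they even decode as UTF-8 before they go on the server. A small Python sketch (the function name is mine); note that a plain-ASCII file passes too, so this catches wrongly saved files rather than proving anything is UTF-8:

```python
def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

good = is_valid_utf8("På svenska".encode("utf-8"))
bad = is_valid_utf8("På svenska".encode("cp1252"))
```

A file with accented characters saved in a Windows charset will almost always fail this check, because those single bytes rarely happen to form valid UTF-8 sequences.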