Character Encoding - IIS6 vs IIS7

During the migration from IIS6 to IIS7.5 I am losing a war with character encoding. There are about 50 sites being migrated, and for the most part there are few problems, but I do have a multi-lingual site that is showing some major differences between the two servers.

I am performing some tests, using a single page with the following meta tag:
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">

Auto Detection by Browser

•Firefox on IIS6 - Western ISO8859-1
•Firefox on IIS7 - Western ISO8859-1 (but the BOM shows at the top of the page)
•IE 8 on IIS6 - Western European (ISO)
•IE 8 on IIS7 - Unicode (UTF-8)
Both browsers display the page from IIS6 properly and correctly identify the charset. But IIS7 seems to be changing something, and I can’t locate the problem. IE8 is detecting UTF-8, and while Firefox is obeying the “http-equiv” tag, the presence of the BOM suggests it is a Unicode page.
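
A BOM that "shows" at the top of a page is consistent with the three UTF-8 BOM bytes being decoded as ISO-8859-1, as the meta tag requests. The following is only a minimal Python sketch of that effect; the sample body text is made up for illustration and is not taken from the site in question.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8

# Hypothetical page body: a BOM followed by ordinary ASCII markup.
body = bom + b"<html>"

# Decoded as ISO-8859-1, each BOM byte becomes its own visible character.
as_latin1 = body.decode("iso-8859-1")

# Decoded with the BOM-aware UTF-8 codec, the mark is consumed silently.
as_utf8 = body.decode("utf-8-sig")

print(repr(as_latin1))
print(repr(as_utf8))
```

Under ISO-8859-1 the decoded text begins with the three characters ï»¿, while a browser that treats the page as UTF-8 hides the mark entirely.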

•Next I opened the page in Notepad++ to make sure it was not Unicode. Sure enough, it was plain text.
•I then opened any included pages (connection scripts, etc…) and they too are all plain text.
•Content is pulled from MySQL, where it is stored as LATIN1
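
Opening files one at a time in an editor makes it easy to miss one. A byte-level scan of the whole site tree is another way to run the same check. This is a rough Python sketch, not a definitive tool: the function names and the list of file suffixes are assumptions chosen for illustration.

```python
from pathlib import Path

# The UTF-8 byte order mark as raw bytes.
UTF8_BOM = b"\xef\xbb\xbf"


def has_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 BOM bytes."""
    with open(path, "rb") as fh:
        return fh.read(3) == UTF8_BOM


def find_bom_files(root, suffixes=(".php", ".inc", ".html")):
    """Yield files under `root` with a matching suffix that start with a BOM.

    The suffix list is an assumption; adjust it to the site's file types.
    """
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            if has_utf8_bom(path):
                yield path
```

Iterating over find_bom_files() with the site's root folder as the argument lists every matching file whose first three bytes are EF BB BF, including files that produce no visible output of their own.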

Why is the BOM making an appearance?

Server Differences

•IIS6 > IIS7.5 (Windows 2008 R2)
•PHP 5.2.11 (Isapi) > PHP 5.2.14 (FastCGI)
•MySQL 5.0.91 (Manual Install) > MySQL 5.0.91 (installer version)
(Both database servers are set up with UTF-8 as the default; however, this database is set up as Latin1)

Why does the problem only exist on the IIS7 box?
Where is the BOM coming from? PHP, FastCGI, IIS7 itself, MySQL Installer?

Does anyone have any suggestions? I’m stumped! I don’t have any time restrictions - I can keep the IIS6 box alive as long as I need to, but I would like to get to the bottom of this.

Check the HTTP headers – those could be overriding the declaration. Everything in IIS is running through ASP.NET which “thinks” in UTF-8.

I have checked the header, tried to set the headers, all to no avail, however your second line is something I never really considered!

I double-checked all my sites on that server: IE8 is declaring everything as UTF-8, despite the ISO8859-1 meta being added. Firefox is reading the META and adjusting accordingly, but YES, the BOM is appearing on all those sites, so even Firefox is being told it is UTF-8.

OK - Likely an ASP.NET issue. At least that is something to go on! Thanks for that.

What a nightmare.

Using LiveHeaders with Firefox, there is no sign of UTF-8 being sent in the header, yet the BOM is still there. I have opened every single included file with Notepad++ and there is no BOM. These are ASCII files.

I went into IIS7 config, and looked at ASP.NET Globalization. Sure enough - it was all set to UTF-8. I changed it all to ISO-8859-1, restarted IIS7, no luck. I did this at both the site level and the global/server level without success.

Surely IIS7 can serve up ISO-8859-1 encoded pages and not force everything into UTF-8 by including the BOM?

I’d really doubt it – MS is about as good as they come with globalization. That said, have you checked your IE settings? Or checked on a different computer. Might as well eliminate all variables. In addition, perhaps you could post a url for people to look at to see if they have similar problems.

EDIT:

Re-read. Have you checked PHP & fastCGI?

wwb_99: Thanks for your input. Based on what you were saying, and the fact that there isn’t much talk on the net about it, I figured I “must” have done something differently.

Sure enough - I found it and the problem is now solved.

Turns out it was my “db connection” script. I never thought to check that, since it does not have any output (and has never caused me grief in the past). All the sites use a DB, so all sites were using a connection script.

When I moved the sites from the IIS6 server to the IIS7 server, the database info had changed. Rather than open the scripts on the server to edit them, I used a Control Panel File Manager to open them, make a couple changes and re-save them. That File Manager was re-saving them as UTF-8 and entering the Byte Order Mark (BOM)!
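
For anyone hitting the same thing, a file that was re-saved with a BOM can be repaired by dropping its first three bytes. The Python sketch below is one possible way to do that under simple assumptions (the file fits in memory and is rewritten in place); the function names are made up for illustration, and it is worth backing up the files first.

```python
import codecs


def strip_utf8_bom(data):
    """Return `data` without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path):
    """Rewrite the file at `path` without its BOM.

    Returns True if the file was changed, False if it had no BOM.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False
    with open(path, "wb") as fh:
        fh.write(cleaned)
    return True
```

Only the leading mark is removed; the rest of the file's bytes are left exactly as they were, so a script that was plain ASCII apart from the BOM ends up plain ASCII again.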

All is good. Thanks for your assistance.