Charsets, Headers, and AJAX

I am having a little trouble getting characters to come through correctly when they are received through AJAX.

I have a simple PHP script that outputs the character “—” (an em dash) when executed. When I visit the page directly, the character displays fine, but when the page is loaded through AJAX, a “?” (as in unknown character) appears in its place. This occurs in all browsers tested so far (Firefox, Opera, and Safari).

The only difference I can think of between the two requests is the headers. The response headers are the same, but the request headers are different.

Request headers for direct page load:

Accept:application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Cache-Control:max-age=0
Referer:http://localhost/path/to/my/file.php
User-Agent:blahblah

Request headers for AJAX page load:

Cache-Control:max-age=0
Content-Type:application/x-www-form-urlencoded; charset=UTF-8
Origin:http://localhost
Referer:http://localhost/path/to/my/ajax/loading/file.html
User-Agent:blahblah

Response headers (same on both):

Cache-Control:no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection:Keep-Alive
Content-Type:text/plain
Date:Sat, 03 Jul 2010 05:38:41 GMT
Expires:Thu, 19 Nov 1981 08:52:00 GMT
Keep-Alive:timeout=5, max=94
Pragma:no-cache
Server:Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8l DAV/2 PHP/5.3.1
Transfer-Encoding:Identity
X-Powered-By:PHP/5.3.1

What could be causing this problem, and how could it be fixed?

Hmm. Oddly enough, when I explicitly serve the PHP file with the Latin-1 charset (before, I didn’t specify a charset at all), the character comes through correctly. But when I first tried serving it as UTF-8, it didn’t appear correctly.
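For anyone who hits the same thing, here is a rough sketch of what is probably going on at the byte level. It is written in Python only because that makes the bytes easy to inspect, and it rests on two assumptions that the thread does not confirm: that the script emits the dash as the single byte 0x97 (its Windows-1252 value, which is what browsers actually use for pages labelled Latin-1), and that the AJAX response text is decoded as UTF-8 when the Content-Type header carries no charset.

```python
# Assumption: the PHP script sends the em dash as one Windows-1252 byte.
raw = b"\x97"

# Decoded with the charset the bytes were actually written in: a real em dash.
as_cp1252 = raw.decode("cp1252")

# Decoded as UTF-8: 0x97 is not a valid UTF-8 sequence on its own, so the
# decoder substitutes U+FFFD, the replacement character shown as "?".
as_utf8 = raw.decode("utf-8", errors="replace")

# What the same character looks like once the data really is UTF-8.
utf8_bytes = "\u2014".encode("utf-8")

print(repr(as_cp1252))
print(repr(as_utf8))
print(utf8_bytes)
```

If that is the cause, either fix works: declare the charset the bytes are really in, or convert the data to UTF-8 and declare UTF-8.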

Then I decided to convert everything from Latin-1 to UTF-8, including the database the character was loaded from and the HTML encoder. (There are lots of places where you have to change the encoding/charset!) Now it all works well :)

Yes, there sure are a lot of places: the database charset and collation, your text editor, the web page, and the HTTP headers.

But as long as you pick one that works for what you need and stay consistent, you should be OK.
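To show what "staying consistent" means for the HTTP part, here is a small sketch in Python. The names build_response and read_response are made up for this illustration and are not from any library. The point is simply that the charset named in the Content-Type header should be the same one used to turn the text into bytes, so the receiver can decode it without guessing.

```python
def build_response(text, charset="utf-8"):
    """Encode the body and declare the same charset in the header."""
    body = text.encode(charset)
    headers = {"Content-Type": f"text/plain; charset={charset}"}
    return headers, body


def read_response(headers, body, default_charset="utf-8"):
    """Decode the body using the charset parameter from Content-Type."""
    charset = default_charset
    # Parameters follow the media type, separated by semicolons.
    for part in headers["Content-Type"].split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset":
            charset = value.strip('"')
    return body.decode(charset)


headers, body = build_response("before \u2014 after")
text = read_response(headers, body)
print(headers["Content-Type"])
print(text == "before \u2014 after")
```

The same round trip works for other charsets that can represent the text, as long as the header and the bytes agree.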

I’m sure it’s a character encoding issue. See: The Definitive Guide to Web Character Encoding

How to fix? Not so sure. Are you testing this with localhost files? Is everything UTF-8?