I am trying to store the following string in a MySQL table using an Ajax call (the MySQL database is set up to use UTF-8):
Academia-Gate — the Nanny State & The Professors: My Brief Email Exchange With The Co-Chair of the “Cry Wolf” Project
The string arrives at my PHP Ajax handler function looking like this:
Academia-Gate — the Nanny State & The Professors: My Brief Email Exchange With The Co-Chair of the “Cry Wolf” Project
When I try to store it in my MySQL table, it is stored like this:
Academia-Gate ??" the Nanny State & The Professors: My Brief Email Exchange With The Co-Chair of the “Cry Wolf” Project
That is, with ??" in place of the em dash near the beginning.
I have tracked this down to something that happens during a call to mysql_real_escape_string.
The following code:
$str = 'Academia-Gate — the Nanny State & The Professors: My Brief Email Exchange With The Co-Chair of the “Cry Wolf” Project ';
$str = mysql_real_escape_string($str, $conn_id);
…changes the contents of $str to:
Academia-Gate ‚Ä\\" the Nanny State & The Professors: My Brief Email Exchange With The Co-Chair of the ‚ÄúCry Wolf‚Äù Project
That is, it replaces >>>‚Äî<<< with >>>‚Ä\"<<<, so the string is then stored incorrectly in the MySQL table.
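To see exactly which bytes change, it may help to hex-dump the string just before and just after the call to mysql_real_escape_string. The following is only a minimal sketch: hex_bytes is a hypothetical helper name, and the characters are written as explicit UTF-8 byte sequences so the result does not depend on how the PHP file itself is saved.

```php
<?php
// Hypothetical helper: show each byte of a string as two hex digits.
function hex_bytes($s)
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// UTF-8 encodings of the characters in question, written as explicit bytes.
$em_dash     = "\xE2\x80\x94"; // U+2014 EM DASH
$left_quote  = "\xE2\x80\x9C"; // U+201C LEFT DOUBLE QUOTATION MARK
$right_quote = "\xE2\x80\x9D"; // U+201D RIGHT DOUBLE QUOTATION MARK

echo hex_bytes($em_dash), "\n"; // prints "e2 80 94"
echo hex_bytes($left_quote . 'Cry Wolf' . $right_quote), "\n";
```

Calling hex_bytes($str) before and after the escaping step and comparing the two dumps should show whether the third byte of the em dash (0x94) is the one being altered. The em dash and the two curly quotes share their first two bytes (E2 80) and differ only in the third.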
It appears that mysql_set_charset('utf8'); is a PHP function rather than a MySQL command. Running it does not appear to permanently change the values of the MySQL variables referenced by SpacePhoenix.
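For what it's worth, mysql_set_charset is documented as setting the character set for the current connection only, which would be consistent with the server variables not changing permanently. It also tells the client library which character set mysql_real_escape_string should assume, so it may be worth calling it on the same connection right before escaping. A minimal sketch, assuming the legacy ext/mysql functions are available (they were removed in PHP 7) and that $conn_id is an open connection; escape_as_utf8 is a hypothetical helper name:

```php
<?php
// Hypothetical helper; assumes the legacy ext/mysql extension is loaded
// and that $conn_id is an already-open connection resource.
function escape_as_utf8($str, $conn_id)
{
    // Set the character set for this connection only; server-wide
    // settings are left unchanged.
    if (!mysql_set_charset('utf8', $conn_id)) {
        return false;
    }
    // The escaping function works with the connection's current character set.
    return mysql_real_escape_string($str, $conn_id);
}
```

Usage would be $str = escape_as_utf8($str, $conn_id); and afterwards mysql_client_encoding($conn_id) can be used to check which character set the connection reports.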
Good question. That was my first thought as well. I asked about this on the jQuery forum and was advised to send the same string back to my JavaScript calling routine in the Ajax response. I did so, and the string came back to my JavaScript code perfectly correct. I was told this means that the string was not, in fact, corrupted in transit.
I am using UTF-8. My framework (CodeIgniter) is configured to use UTF-8 for all database transactions. The MySQL SHOW VARIABLES; command shows that the database is also set up as UTF-8.
The odd thing is that the encoding of the double quotes around “Cry Wolf” works fine; only the encoding of the em dash causes an anomaly.