This query is fast; I run it on each page load for every site that has non-Latin characters. Right now you should not be trying to change the collation in the DB, as it might break something. Just try it and see if it works.
Are you using the MySQL or MySQLi interface? They provide mysql_set_charset() and mysqli_set_charset() respectively, which should be called right after the connection to MySQL has been established. It sets the character set used for that connection.
I’m using mysqli. I was wondering how phpBB does it. Russian characters come out right in phpBB. I’m formatting the output of the topic_title using htmlspecialchars(), and I’m using UTF-8 as the $charset parameter. So I don’t think that would be causing me any problems.
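For reference, a minimal sketch of the escaping step described above. The title text is a made-up example, and the variable names are assumptions rather than the poster's actual code:

```php
<?php
// Hypothetical topic title containing Cyrillic text and some markup.
$topic_title = 'Привет <b>мир</b>';

// Passing 'UTF-8' tells htmlspecialchars() how the input bytes are encoded,
// so multibyte characters pass through untouched and only < > & " ' are escaped.
$safe_title = htmlspecialchars($topic_title, ENT_QUOTES, 'UTF-8');

echo $safe_title, PHP_EOL;
```

If the string reaching this call is already mangled, the escaping is not the culprit; the damage happened earlier, between the database and PHP.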
As a side note, mysqli::set_charset() is the recommended way to set the character set for communication with the MySQL server, in preference to running a SET NAMES query.
This is the preferred way to change the charset. Using mysqli::query() to execute SET NAMES … is not recommended.
I guess going forward I’ll incorporate a call to mysqli::set_charset() every time I make a connection to the DB just as a precaution to prevent any problems. Live and learn!
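A minimal sketch of what that could look like, assuming the mysqli extension is loaded. The helper name open_utf8_connection() and the credentials are hypothetical, and 'utf8mb4' is an assumption (older servers may only offer 'utf8'):

```php
<?php
// Hypothetical helper: opens a mysqli connection and switches it to UTF-8.
// Host, credentials and database name are placeholders supplied by the caller.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $mysqli = new mysqli($host, $user, $pass, $db);

    // Set the connection character set right after connecting.
    // 'utf8mb4' is an assumption; older servers may only offer 'utf8'.
    $mysqli->set_charset('utf8mb4');

    return $mysqli;
}

// Usage (placeholder credentials):
// $db = open_utf8_connection('localhost', 'user', 'secret', 'forum');
// echo $db->character_set_name();
```

Wrapping the call in one helper means every connection gets the same character set, which is the "just as a precaution" habit described above.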
Thanks for the advice. I did read elsewhere that it is a good idea to set the header to UTF-8 explicitly. But I also read somewhere that Apache will examine the HTML for the charset and then send it to the browser based on whatever is specified in the code.
The HTML is generated by PHP. UTF-8 is specified in the HTML head as mentioned above and according to Firefox’s Page Info (Tools -> Page Info) the page is encoded as UTF-8. It’s reaching the browser as UTF-8.
My hunch is that the database collation is what is screwing things up and the data connection is not transmitting in UTF-8.
It is a little confusing. We have the charset in the HTML, the possibility of an HTTP header specifying the charset, then we have the MySQL server character set, the database collation, and finally the field collation. There are a lot of places where something can go amiss.
I’d be interested in hearing where that came from – I’ve never ONCE seen Apache by default send UTF-8 just because you output it in the meta. In fact I’ve never heard of Apache parsing the files it’s sending unless it was for SHTML. That’s REALLY none of its business in a way… It’s also why on all of my servers I put in this .htaccess when doing UTF-8 templates for people:
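A minimal sketch of such an .htaccess, assuming Apache with mod_mime and AllowOverride FileInfo enabled. These directives are a common choice and not necessarily the exact file referred to here:

```apache
# Assumed .htaccess contents; not necessarily what the poster used.
# Adds "charset=utf-8" to text/html and text/plain responses.
AddDefaultCharset utf-8

# mod_mime: declare the charset for specific static file extensions.
AddCharset utf-8 .html .css .js
```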
Hell, by the time Apache HAS the markup it’s too late to send the Content-Type header. Part of why in PHP you HAVE to call the header function before you output anything else.
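A minimal sketch of sending the charset from PHP itself, before any output. The helper content_type_header() is a hypothetical name introduced for illustration:

```php
<?php
// Hypothetical helper that builds the Content-Type header line.
function content_type_header(string $mime = 'text/html', string $charset = 'utf-8'): string
{
    return sprintf('Content-Type: %s; charset=%s', $mime, $charset);
}

// header() must run before any output, so check headers_sent() first.
if (!headers_sent()) {
    header(content_type_header());
}

echo "<!DOCTYPE html>\n<title>Тест</title>\n";
```

Anything echoed before the header() call, even a stray blank line ahead of the opening PHP tag, can make it too late to set the header.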
I also suggested Opera because it will show you the ACTUAL MIME type in the Content-Type header… the View -> Character Encoding menu is pretty useless in FF… and with the web developer toolbar being broken on reporting header info on anything newer than FF 3.5… Though its Information > Response Headers should tell you what’s going on…
You are correct though, there are SO MANY possible points of failure along the chain it’s ridiculous.
A long time back, in my case, it was the encoding of the add file that was causing that problem…
The problem was there when I copied and pasted the SQL, but when I imported the SQL file using the UTF-8 setting it worked fine.
There is not as much documentation on how to use Opera as I would like. I went into View -> Developer Tools -> Page Information and that said the encoding used by Opera was UTF-8 and that the MIME type was text/html.
Encoding (used by Opera): utf-8 (utf-8)
Is that what you are looking for? Or should I be looking for something else and if so, where do I find it? I’ve looked all over and can’t find anything else.
Another site on the same server is served in ISO-8859-1 as specified in the HTML meta according to Opera’s Page Information.
If I’m not looking at the right thing, let me know.
I expect that this:
will work to serve PHP, CSS, and JS as UTF-8, too?
Let me know where.
I checked a couple HTTP Header checker websites and it looks like there is no content-type header being sent by the server I am on.
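The same check can be scripted rather than done through a checker website. This is a rough sketch: charset_from_headers() is a hypothetical helper, and the URL in the usage note is a placeholder:

```php
<?php
// Hypothetical helper: finds the charset declared in a list of raw
// response header lines, such as the array returned by get_headers().
function charset_from_headers(array $headers): ?string
{
    foreach ($headers as $line) {
        // Match e.g. "Content-Type: text/html; charset=UTF-8".
        if (preg_match('/^Content-Type:.*?charset=\s*"?([^\s;"]+)/i', $line, $m)) {
            return strtolower($m[1]);
        }
    }
    return null; // no charset was declared in any Content-Type header
}

// Usage against a live server (URL is a placeholder):
// $headers = get_headers('http://example.com/');
// var_dump(charset_from_headers($headers));
```

A null result means the server declared no charset in the header, so the browser is left to guess from the meta tag or its own defaults.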
That was what I was suggesting looking at… do you have the web dev toolbar installed for FF? What does it report?
NEVER heard of Apache much less PHP sending content-type other than iso-8859-1 as the default unless you edit it to do so. The header checkers saying none is effectively the same thing… if “information>Response headers” in the web dev toolbar for FF doesn’t list content-type, you’re sending none, so the browser is best guessing… Smarter browsers like Opera and Chrome may actually switch to UTF-8 because of the Meta – FF it’s probably a coin toss given it’s still Netscape 4’s sweetly retarded cousin (life is like a box of open source) … but if there’s no header IE will default to ISO-8859-1 on IE6/newer assuming not in quirks mode, and IE5/earlier will actually default to windows-1252.
Do you have a link to the page in question?
Oh, and yes, that change will cover all those files – NOT that I’d bother with .CSS since by the specification it’s only supposed to have ASCII7 in it in the first place even if you change the encoding.