Most discussions about character encoding recommend that you set your pages’ encoding on the server, with a meta declaration as a backup. When checking the response headers on many sites, I see this:
Content-Type: text/html; charset=UTF-8
On my own server, I just see
So I’m wondering how to ensure that the server is sending out pages with the charset also defined.
I have a managed VPS with WHM/cPanel. I’ve looked everywhere for a way to set this, but haven’t found much intelligible info. Am I barking up the wrong tree? This is not a desperate situation, just something I’d like to understand better. My pages work fine as they are, but since I often read that the response headers are important I thought I’d follow up on this.
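One way to check what a server is actually declaring is to parse the Content-Type value for a charset parameter. The following is a minimal Python sketch, not a definitive tool: the function names `declared_charset` and `charset_sent_by` are made up for illustration, and the bare `text/html` value is only an assumed example of a header that carries no charset.

```python
from email.message import Message
from urllib.request import urlopen


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value (lowercased), or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset()


def charset_sent_by(url):
    """Fetch a URL and return the charset its response header declares, or None.

    Hypothetical helper: needs network access, so it is not called below.
    """
    with urlopen(url) as response:
        return response.headers.get_content_charset()


# The header quoted in the post above declares a charset:
with_charset = declared_charset("text/html; charset=UTF-8")

# An assumed example of a header without one:
without_charset = declared_charset("text/html")
```

Calling `charset_sent_by` with one of your own URLs should return `None` if the server is not sending a charset, and a name such as `utf-8` once it is.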
Thanks logic_earth. I’ve seen that page before, but what I’m really after is info on how to access httpd.conf and how to edit it. I’ve sort of found out how to do this in WHM, via the Includes Editor. There’s an option to add new stuff at the top or bottom of the file, and I’m not sure which is best. Also not sure if I can just add
Content-Type: text/html; charset=utf-8
and be done with it, or if there’s some other code I need to add around it. You can take code from a .htaccess file and place it here. I’m not sure if the line above would be enough in a .htaccess file, or if something else is needed.
My aim really is to target all sites on the server in one hit. I’m not sure what you mean about loading once. Isn’t a header sent out each time a page is requested? (I’m just stabbing in the dark here, I admit. But I’d love to understand this better. :D)
In the case of targeting all sites on the server, using the AddDefaultCharset directive in your httpd.conf is the best place to put it. This makes Apache send UTF-8 as the default character set for pages on all virtual hosts on the server. Keep in mind that you must also save your html pages as UTF-8 encoded. It is also recommended that you set the language for your pages, as automatic browser detection will use this if specified. To add the language do:
If you use a database, then its character set and collation also need to be set to UTF-8. The default collation of MySQL is ‘latin1_swedish_ci’ (in versions before 8.0). If you forget to set the collation to UTF-8 before importing data, you will need to go back, set it to UTF-8, and then re-import the data. This article may assist your understanding of MySQL’s UTF-8 needs.
One final point: httpd.conf or a vhost.conf file is loaded once, when Apache starts up, whereas .htaccess files are read again on every request. The header itself is still sent with every response; it is only the configuration that is loaded once. So if you have an Apache directive such as AddDefaultCharset that does not need to be re-read each time a page is requested, put it into your httpd.conf or a vhost.conf file. Directives such as rewrite rules can also go into the main configuration; .htaccess is the place for them when you need per-directory control or cannot edit the main configuration. Putting directives like AddDefaultCharset into .htaccess files is considered wasteful (memory and processing).
Thanks Steve. The article is over my head, I’m afraid. As a starting point, is there an easy way to tell what encoding (if any) the server may be using for the databases? It’s easy to check the html page headers, but I’m not so sure about the databases.
I’m not sure how important all of this is. Everything works fine on my sites, but I’ve always heard that it’s the server’s job to send out the encodings, so thought I should check up on this.
In phpMyAdmin or MySQL Workbench, or using MySQL on the command line, you can run the following SQL:
SHOW VARIABLES LIKE 'character_set_database';
and then run
SHOW VARIABLES LIKE 'character_set_client';
The first shows the current default character set for your MySQL server (or for the selected database, once you have issued USE), and the second shows the character set configured for the client you are using. When you USE a database you can then run this SQL:
SHOW CHARACTER SET;
SHOW COLLATION LIKE 'utf8%';
The first lists the character sets available on your server. You will know that a database is in use when a SELECT statement on one of its tables runs correctly; at that point, re-running the character_set_database query shows the character set of that database. The second SQL tidbit will output all UTF-8 collations currently supported by your server.
I don’t use phpMyAdmin, but I do use MySQL Workbench. In Workbench you can see the collation that is assigned to each table when you issue a Create or Alter command (via the GUI). This is a pretty easy way to see it, although running the SQL commands I’ve shown you above will give a complete picture in just a few lines of SQL.
I don’t know if this is telling me what the server is determining or whether it’s just the setup for the current database (which is basically set up by the CMS I’m using, as DBs are outside my ken at this stage).
I’ve (nervously) connected to the host account via the command line before, but am not sure how to get to the point you described above. Presumably there are commands first to connect with MySQL …?
Just to go back a bit, what did you mean by that second sentence? I always include the meta element, so is that what you meant? I wasn’t sure how important that is if the server has specified the encoding. I thought it was really for when the page is offline etc.
Certain editors allow you to save documents in a particular encoding such as UTF-8. Notepad on Windows is a poor fit for this (older versions prepend a byte order mark when saving as UTF-8), but if you search for text encoding Mac then you should find an application that you can use to save your .html/.php files as UTF-8 encoded. If you don’t do this, it won’t matter that you’ve set UTF-8 in the header: the document will not be in UTF-8. The meta element simply tells the browser what encoding to switch to rather than auto-detecting it; however, just because you set this does not mean that the html document is encoded as UTF-8. That is why you need to save your html/php files with UTF-8 encoding. Once you do, the header, Apache, the html file, the database and the text encoding of the html file will all be aligned in UTF-8 when the page is served up.
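To make the difference between a declared encoding and the actual bytes in a file concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the helper names `is_valid_utf8` and `convert_to_utf8` are hypothetical, and the sample word "café" and the Latin-1 source encoding are chosen just for the demonstration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# The same word ("café") saved in two different encodings.
saved_as_latin1 = "caf\u00e9".encode("latin-1")  # one byte for the accented letter
saved_as_utf8 = "caf\u00e9".encode("utf-8")      # two bytes for the accented letter

# A file saved as Latin-1 is not valid UTF-8, whatever the header says.
latin1_ok = is_valid_utf8(saved_as_latin1)
utf8_ok = is_valid_utf8(saved_as_utf8)

# Reading UTF-8 bytes as Latin-1 produces garbled text rather than an error.
garbled = saved_as_utf8.decode("latin-1")

# Re-saving with the right source encoding fixes the file.
repaired = convert_to_utf8(saved_as_latin1, "latin-1")
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `is_valid_utf8`. A file that fails the check needs to be re-saved as UTF-8, and you need to know its current encoding to convert it correctly.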