Collation for Spanish characters and PHP

Hello,

I am building a web application in PHP and MySQL that expects the use of Spanish letters (á, é, í, ó, ú, ñ, ü). I have my collation set to latin1_general_ci and the default charset of Apache is utf8. Is this a problem? Should I use utf8 as a collation in MySQL? PHP doesn’t support utf8, so if I need to modify data from the database, will this be a problem? Any comments on how to avoid having random characters in my web site are welcome.

Thank you!

That combination should be fine, except if someone enters a UTF-8 character that latin1_general doesn’t support. If you want to avoid that situation, you need to set the character set of the web page.
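To make the "character that latin1 doesn’t support" case concrete, here is a small sketch. It is in Python only because that runs standalone without a database. It assumes Python's latin-1 codec is a reasonable stand-in for MySQL's latin1 (the two are not identical), and the helper name fits_in_latin1 is made up for this example.

```python
def fits_in_latin1(text: str) -> bool:
    """Return True if every character in text has a latin-1 encoding."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # The Spanish letters from the question all exist in latin-1.
    print(fits_in_latin1("áéíóúñü"))
    # A Polish "ł" does not, so a latin1 column could not store it faithfully.
    print(fits_in_latin1("ł"))
```

So plain Spanish text is safe under this assumption; the trouble starts only when a user pastes something from outside that range.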

This is not exactly the answer but I’m posting here in order to inform and help future readers of the solution to a problem you might encounter along that path.

I too went through storing accented characters in a MySQL db from a web app in PHP.
You need to set the DB charset and collation, and indicate the encoding in your web pages. I went with utf8 everywhere.
I hadn’t realized that the default charset was latin1, which might be fine, but I recreated my db specifying utf8 and a utf8 collation (spanish_ci for me).
Things were OK except for 2 problems:

  • when browsing the db with phpMyAdmin the accented characters would not display correctly (they were fine when retrieved and displayed on my web pages with PHP), which I found suspect
  • searches (SELECT statements) with MATCH AGAINST or LIKE would not be accent-insensitive, which was a problem (they were supposed to be with the chosen collation).

After a couple of hours I found this post that explained something I had not looked into at all:
forums dot mysql dot com slash read dot php?103,46870,47245

Even though everything had the correct encoding (utf8) : web pages, tables, etc. ONE thing was still in latin1 :

character_set_server

You can check that by running the following in mysql :

show variables like 'character_set%';

So what fixed ALL my problems listed above was just to add the following line after I created the db connection in PHP (it switches the client, connection and results character sets of that connection to utf8):

mysql_query("SET NAMES 'utf8'");

Once you do that, you will have to RE-ENTER the data that was already in your DB, since it was stored through the old latin1 connection.

Note: if you want to verify that your table has the expected encoding you can run the following in phpmyadmin:

show create table your_table_name;

I hope this will help. I lost a couple of hours on that before I got it right so I’m posting it hoping it will help people get it right faster.

I am copying the content of the post where I found that explanation in case it becomes unavailable:

As is often the case, I guess your program is using the default mysql client library, which is compiled with the latin1 character set.

In that case, even if you enter valid utf8 values, the client interprets the values as latin1. It then tells the server ‘Hey, these values are encoded in latin1’. When the values are stored in the table, MySQL thinks ‘Oh, the values passed from the client are latin1 but the table is utf8. I have to convert them!’ That’s why the values become non-utf8.

Executing “set names utf8” sets the client character set variables to utf8. So the values sent from the client program are stored ‘as is’ in the table, because the client character set and the table’s character set are the same.

This is a FAQ. I think this behavior and workaround should be documented somewhere in the manual. Or client libraries compiled with different character sets should be downloadable.

forums dot mysql dot com slash read dot php?103,46870,47245
[site doesn’t let me post url]
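
The conversion described in the quoted post can be reproduced outside MySQL. This is only a rough sketch, in Python because it runs without a database. It assumes Python's latin-1 codec stands in for MySQL's latin1, and the function names are hypothetical, not part of any MySQL or PHP API.

```python
def store_via_latin1_client(text: str) -> bytes:
    """Mimic a utf8 page talking to a utf8 table through a latin1 connection."""
    utf8_bytes = text.encode("utf-8")        # what the web page actually sends
    misread = utf8_bytes.decode("latin-1")   # client treats each byte as one latin1 char
    return misread.encode("utf-8")           # server converts "latin1" text to utf8


def store_via_utf8_client(text: str) -> bytes:
    """With matching charsets nothing is converted: bytes are stored as sent."""
    return text.encode("utf-8")


if __name__ == "__main__":
    print(store_via_latin1_client("ñ"))                   # double-encoded bytes
    print(store_via_latin1_client("ñ").decode("utf-8"))   # what a utf8 reader sees
    print(store_via_utf8_client("ñ"))                     # the original two bytes
```

In this sketch the single "ñ" ends up as four bytes that read back as two unrelated characters, which is the kind of garbage phpMyAdmin showed, and it is also why rows written before the fix have to be entered again.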

I know this thread is old, but it was very useful for me today. I had the exact same problem with Brazilian Portuguese, and the suggestion of adding the line mysql_query("SET NAMES 'utf8'"); after the connection string was a perfect and easy solution.

So I want to say “thank you” to banzai301!

Marcelo