Unicode & Collation Issue

Hello,

I am trying to create a database with lyrics, which will contain songs from many different languages. Now i am trying to make a “mini DB” just to make sure i’ve got everything right before proceeding with the actual database.

So i have created a new database from phpmyadmin named “test” with collation utf8_unicode_ci and MySQL connection collation utf8_unicode_ci.

I create one single table in it with only one column:

CREATE TABLE test(
  title VARCHAR(100) NOT NULL
);

I enter some values in the database:

$strings[] = mysql_real_escape_string('çadër Ç Γιώργος партλάρας Латинопартизани');
$strings[] = mysql_real_escape_string('zomg asάρας Латинопа');
$strings[] = mysql_real_escape_string('zomër Ç Γιώργος Νταλ');
$strings[] = mysql_real_escape_string('zoартизаниmg asdf natën μπίνς çλάρα');

foreach($strings as $string){
  mysql_query("INSERT INTO test (title) VALUES ('$string')");
}

All fine so far, the values are stored in that funky utf-8 format. If i try to retrieve them they display fine too:

$query = mysql_query("SELECT * FROM test");

while($row = mysql_fetch_array($query)){
  echo $row['title'] . '<br /><br />';
}

Results:

zo&#1072;&#1088;&#1090;&#1080;&#1079;&#1072;&#1085;&#1080;mg asdf nat&#235;n &#956;&#960;&#943;&#957;&#962; &#231;&#955;&#940;&#961;&#945;

zomg as&#940;&#961;&#945;&#962; &#1051;&#1072;&#1090;&#1080;&#1085;&#1086;&#1087;&#1072;

zom&#235;r &#199; &#915;&#953;&#974;&#961;&#947;&#959;&#962; &#925;&#964;&#945;&#955;

&#231;ad&#235;r &#199; &#915;&#953;&#974;&#961;&#947;&#959;&#962; &#1087;&#1072;&#1088;&#1090;&#955;&#940;&#961;&#945;&#962; &#1051;&#1072;&#1090;&#1080;&#1085;&#1086;&#1087;&#1072;&#1088;&#1090;&#1080;&#1079;&#1072;&#1085;&#1080;

Now i add a small form to search the database:

<form action="test.php" method="post">
  <input type="text" name="search" />
  <input type="submit" value="Search">
</form>

And the search code:

$search = mysql_real_escape_string($_POST['search']);
echo 'search: ' . $search . '<br><br>';

$query2 = mysql_query("SELECT * FROM test WHERE title LIKE '&#37;$search%'");
if(mysql_num_rows($query2) > 0){
  while($row2 = mysql_fetch_array($query2)){
    echo 'result: ' . $row2['title'] . '<br /><br />';
  }
}

Now the problem is that things seem to be case sensitive. If i search for “ç”, i don’t get result #3 above that has Ç, its uppercase letter.

search: &#231;

result: zo&#1072;&#1088;&#1090;&#1080;&#1079;&#1072;&#1085;&#1080;mg asdf nat&#235;n &#956;&#960;&#943;&#957;&#962; &#231;&#955;&#940;&#961;&#945;

result: &#231;ad&#235;r &#199; &#915;&#953;&#974;&#961;&#947;&#959;&#962; &#1087;&#1072;&#1088;&#1090;&#955;&#940;&#961;&#945;&#962; &#1051;&#1072;&#1090;&#1080;&#1085;&#1086;&#1087;&#1072;&#1088;&#1090;&#1080;&#1079;&#1072;&#1085;&#1080;

Same happens with “Ν” and “ν”, which are the same letter in Greek, and i’m sure with other characters as well. In the header of the page i have this line:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

I’m using:
Apache/2.2.14 (Win32)
MySQL client version: 5.1.41
PHP 5.3.1

Any help is much appreciated
Thanks

Hi again, basically i figured it out. Immediately after DB connection added SET NAMES ‘utf8’.

But i have another question now. Am i storing the data in the proper way? Now it is stored normally in human readable form, while earlier it was stored in funky characters. What is the proper way it should be stored? I’m guessing the 2nd? Do i at least have the storage method right?

Thanks