  1. #1
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450

    utf8 encoding: Can I really have it all?

    Hello!

    I'm developing my first application that will have some non-English characters, and I've been investigating the world of encoding. My accented characters look fine in my web application both before and after putting them in the database, but the characters actually stored in the database don't look as they should. On the one hand, I don't see this as a problem (I can only assume that my database understands that it's UTF-8, converts it to latin1, then converts it back to UTF-8 when I call for the data); however, it would be nice if the data appeared clean in both places.
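
    One plausible reading of this symptom, sketched in Python rather than MySQL: the application sends UTF-8 bytes over a connection that the server treats as latin1, so every byte is stored as a separate latin1 character and only looks right again when the application reads the same bytes back as UTF-8. The sample word and variable names are illustrative assumptions, and Python's latin-1 codec stands in for MySQL's latin1.

```python
# Hypothetical sample value; any accented character behaves the same way.
typed_in_app = "café"

# The application encodes its text as UTF-8 before sending it.
bytes_on_wire = typed_in_app.encode("utf-8")

# A latin1 connection reads every byte as one character, which is what
# a database browser would then show for the stored value.
seen_in_database = bytes_on_wire.decode("latin-1")

# Reading the value back over the same connection returns the same bytes,
# and the UTF-8 page decodes them correctly again.
seen_in_app = seen_in_database.encode("latin-1").decode("utf-8")

print(repr(seen_in_database))
print(repr(seen_in_app))
```

    Under these assumptions nothing is converted "to latin1 and back"; the bytes are simply mislabeled on the way in and on the way out, which is why the two errors cancel in the application but not in the database view.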

    After doing some googling, I found that I can change a table's encoding with:

    ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

    However, after doing so, I had the opposite problem: manually entering accented characters in the database gave me junk on the screen (PHP's mb_detect_encoding() showed that my application is in fact using UTF-8 encoding for the accented characters).
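
    A possible explanation for this opposite problem, again as a Python sketch and not as MySQL's actual code: if the application's connection is still latin1 (an assumption, since the post does not say), a correctly stored character is sent to the application as a single latin1 byte, which is not valid UTF-8 on its own.

```python
# Entered by hand and stored correctly in the converted utf8 table.
stored_character = "é"

# Assumption: the application's connection still uses latin1, so the
# server converts the character to its single latin1 byte before sending it.
bytes_sent_to_app = stored_character.encode("latin-1")

# The page is treated as UTF-8, where a lone 0xE9 byte is not a valid
# sequence, so it is rendered as the replacement character.
shown_on_page = bytes_sent_to_app.decode("utf-8", errors="replace")

print(bytes_sent_to_app)
print(repr(shown_on_page))
```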

    Any help in getting my characters to look correct in both places would be great.

    Thanks so much,

    Eric

  2. #2
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    When you connect to a database, the client connection uses a predefined default character set. This can be changed after connecting to the server using:

    SET NAMES UTF8;

    Try this and see if it helps.
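
    To illustrate why the connection character set matters, here is a small Python model of a write into a utf8 column. It is a simplification, not MySQL code, and the function name is made up for this example: the server decodes the incoming bytes using the connection character set and re-encodes the result as UTF-8 for storage.

```python
def store_in_utf8_column(client_bytes: bytes, connection_charset: str) -> bytes:
    """Model (not MySQL code) of a write into a utf8 column.

    The incoming bytes are decoded using the connection character set,
    then the resulting characters are re-encoded as UTF-8 for storage.
    """
    characters = client_bytes.decode(connection_charset)
    return characters.encode("utf-8")


# The application always sends UTF-8 in this scenario.
client_bytes = "é".encode("utf-8")

# Connection left at latin1: each UTF-8 byte becomes its own character.
mismatched = store_in_utf8_column(client_bytes, "latin-1")

# Connection switched to utf8: the bytes are stored unchanged.
matched = store_in_utf8_column(client_bytes, "utf-8")

print(mismatched.hex())  # c383c2a9
print(matched.hex())     # c3a9
```

    Note that SET NAMES applies only to the connection that issues it, so the application would need to run it on its own connection (or configure the character set in its database driver).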

  3. #3
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Thank you for the quick reply. I logged into phpMyAdmin and ran SET NAMES UTF8; however, the characters still look the same. Any other thoughts would be appreciated.

  4. #4
    SitePoint Enthusiast
    Join Date
    Jul 2007
    Location
    San Sebastian, Spain
    Posts
    93
    Unfortunately, the characters that have already been added were stored using a different character set. If you change the connection character set to latin1 and then issue your select (I am assuming latin1 was the previous default), you will see the entries correctly. I have put together a small demo of what I mean below.

    Code:
    mysql> create table charsettest(name varchar(20) character set utf8);
    Query OK, 0 rows affected (0.04 sec)
    
    mysql> desc charsettest;
    +-------+-------------+------+-----+---------+-------+
    | Field | Type        | Null | Key | Default | Extra |
    +-------+-------------+------+-----+---------+-------+
    | name  | varchar(20) | YES  |     | NULL    |       | 
    +-------+-------------+------+-----+---------+-------+
    1 row in set (0.00 sec)
    
    mysql> set names latin1;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> insert into charsettest(name) values ('AllÚ');
    Query OK, 1 row affected (0.00 sec)
    
    mysql> select * from charsettest;
    +-------+
    | name  |
    +-------+
    | AllÚ | 
    +-------+
    1 row in set (0.00 sec)
    
    mysql> set names utf8;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> select * from charsettest;
    +---------+
    | name    |
    +---------+
    | All├ę | 
    +---------+
    1 row in set (0.00 sec)
    
    mysql> insert into charsettest(name) values ('Logro˝o');
    Query OK, 1 row affected (0.00 sec)
    
    mysql> select * from charsettest;
    +----------+
    | name     |
    +----------+
    | All├ę  | 
    | Logro˝o | 
    +----------+
    2 rows in set (0.00 sec)
    
    mysql> set names latin1;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> select * from charsettest;
    +---------+
    | name    |
    +---------+
    | AllÚ   | 
    | Logro?o | 
    +---------+
    2 rows in set (0.00 sec)
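
    The first row of the session above can be reproduced outside MySQL. This Python sketch rests on two assumptions that are inferred from the output rather than stated in the post: the column holds the characters "Allé", and the terminal displays incoming bytes using code page 852. The function names are made up for illustration.

```python
# Assumed content of the utf8 column, as characters.
stored = "Allé"


def server_reply(text: str, connection_charset: str) -> bytes:
    """Bytes the server would send for a given SET NAMES value (a model)."""
    return text.encode(connection_charset)


def terminal_display(raw: bytes, terminal_codepage: str = "cp852") -> str:
    """What a single-byte console shows for those bytes (assumed cp852)."""
    return raw.decode(terminal_codepage)


# After "set names latin1": one byte (0xE9) for the accented letter.
latin1_view = terminal_display(server_reply(stored, "latin-1"))

# After "set names utf8": two bytes (0xC3 0xA9) for the same letter.
utf8_view = terminal_display(server_reply(stored, "utf-8"))

print(latin1_view)
print(utf8_view)
```

    Under these assumptions the stored data is the same in both selects; only the bytes sent to the client change with SET NAMES, and the console shows one character for the latin1 reply and two for the UTF-8 reply.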

  5. #5
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Many thanks, now I understand...


