  1. #1
    PHP Guru lampcms.com's Avatar
    Join Date
    Jan 2009
    Posts
    921
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Which collation to choose for a column?

    Hello! I am using MySQL 5.0.45.
    By default, the collation of text columns is set to latin1_swedish_ci.

    I am wondering whether I should change these to utf8, in particular on one column where I know for sure that the text will be UTF-8 encoded. I parse RSS feeds and know that the output text will be in UTF-8.

    Should I change the collation to utf8? Also, why are there so many utf8 variations, like utf8_danish_ci, utf8_latvian_ci, etc.?
    I thought UTF-8 was supposed to be a universal standard encoding that solves this exact problem of choosing an encoding: almost any character from almost any language can be represented in it. Then why are there so many utf8 collation choices in MySQL, and which one should I choose?


    Last question: what if I just stick with latin1_swedish_ci? Would it really make any difference?
    My project: Open source Q&A
    (similar to StackOverflow)
    powered by php+MongoDB
    Source on github, collaborators welcome!

  2. #2
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,578
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Collation specifies the ordering of characters. It is used whenever MySQL compares or sorts strings, most visibly when your query has an ORDER BY clause. When it is set to latin1_swedish_ci, you're telling MySQL: when I ask you to sort some words in this table, sort them according to the ordering used for the Latin alphabet under Swedish conventions.

    Collation and character set are not the same, but are related.
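
    To make the ORDER BY point concrete, here is a minimal sketch. The table `people` and its column `name` are made up for illustration, and it assumes the connection charset matches the encoding in which the literals are typed. It sorts the same rows under two collations of the same character set:

    ```sql
    -- Hypothetical table; `name` is stored as utf8 with the default utf8 collation.
    CREATE TABLE people (
        name VARCHAR(50)
    ) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

    INSERT INTO people (name) VALUES ('Zorn'), ('Ärlig'), ('Adam');

    -- Uses the column's own collation (utf8_general_ci),
    -- which compares Ä as if it were A.
    SELECT name FROM people ORDER BY name;

    -- Overrides the collation for this query only.
    -- Swedish rules place Ä after Z.
    SELECT name FROM people ORDER BY name COLLATE utf8_swedish_ci;
    ```

    The stored bytes are identical in both queries; only the comparison rules change. That is why one character set can have many collations.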

  3. #3
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    lampcms.com, you have mixed up collation and encoding.
    The encoding of the column must match the actual encoding of the data,
    so setting the right column encoding is the most important thing.
    It is better to set it for the whole table at once, or even for the whole database.

    Collation is responsible for sorting data and depends on the encoding.
    A collation name always starts with the encoding name, and each encoding has a default collation; for utf8 it is utf8_general_ci.
    So you shouldn't need to worry about collation at all. After you set the encoding to utf8, the collation will automatically be set to utf8_general_ci, which will suit you very well.

    Edit: oops, Dan already stated it.
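
    As a rough sketch of setting the encoding at each level: the database `mydb`, the table `feed_items`, and the column `title` below are placeholders, not names from this thread, and the column type is an assumption.

    ```sql
    -- Default for tables created in this database from now on
    -- (existing tables are not changed by this statement).
    ALTER DATABASE mydb DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

    -- Change the table default and convert its existing character columns.
    ALTER TABLE feed_items CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

    -- Or change a single column only, restating its type.
    ALTER TABLE feed_items
        MODIFY title VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci;
    ```

    Take a backup first. CONVERT TO transcodes the stored data on the assumption that it really is in the column's declared character set. If UTF-8 bytes were written into a latin1 column, a plain conversion is likely to garble them, so check a few rows before and after.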

  4. #4
    PHP Guru lampcms.com's Avatar
    Join Date
    Jan 2009
    Posts
    921
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thank you. I understand it now. I see that phpMyAdmin says:
    MySQL charset: UTF-8 Unicode (utf8)

    I guess this means I am already using UTF-8 for all my tables by default.

  5. #5
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, for tables you create with no encoding specified.
    But previously you mentioned that it is latin1_swedish_ci.

    To see the actual encoding, you can execute this query in phpMyAdmin:
    SHOW CREATE TABLE tablename
    If there is 'utf8' in the last line, everything is OK. Otherwise you have to change it,
    and you need to do that with caution if there is already some data in the table.

    Also, don't forget to change the client's default encoding with a 'SET NAMES utf8' query.
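
    The checks above could look something like this. `feed_items` is a placeholder table name. Every statement except SET NAMES only reads information, and none of them creates or alters a table:

    ```sql
    -- Prints the CREATE TABLE statement that describes the existing table.
    -- The table default appears at the end (DEFAULT CHARSET=...); a column
    -- whose charset differs from that default shows its own CHARACTER SET.
    SHOW CREATE TABLE feed_items;

    -- Per-column view: the Collation column shows each column's collation.
    SHOW FULL COLUMNS FROM feed_items;

    -- Table-level view: the Collation column shows the table default.
    SHOW TABLE STATUS LIKE 'feed_items';

    -- Tell the server that this connection sends and expects utf8.
    SET NAMES utf8;

    -- Confirm what the connection is actually using.
    SHOW VARIABLES LIKE 'character_set%';
    ```

    SET NAMES applies only to the current connection, so an application would normally issue it right after connecting.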

  6. #6
    PHP Guru lampcms.com's Avatar
    Join Date
    Jan 2009
    Posts
    921
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks again. Last question: how do I find out the charset of an already existing table? I am worried that since a table was imported from an old mysqldump, maybe it was set to something other than utf8, but I don't see any way to view the table charset in phpMyAdmin.

  7. #7
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You missed it in my previous message:
    Originally posted by Shrapnel_N5:
    to see actual encoding you can execute this query in phpmyadmin:
    SHOW CREATE TABLE tablename

  8. #8
    PHP Guru lampcms.com's Avatar
    Join Date
    Jan 2009
    Posts
    921
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I think SHOW CREATE TABLE tablename will attempt to create a new table, but I want to see the encoding of an already existing table.

  9. #9
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That's what this query does: it only displays the statement that describes the existing table and does not create anything.
    Is three times enough for you to understand?

