  1. #1
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255

    Reading the Euro symbol

    Hi Everybody,
    I am reading the Euro symbol (€) from a data stream that I am parsing and inserting into a MySQL database. The problem is that whenever the symbol occurs, garbage is written to the field in the database. I presume I need to do a str_replace, but when I search for the character, it's not found.

    I guess it's encoded differently? How do I search for it and replace it with '€'?

    Thanks in advance.

  2. #2
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    You probably need to set up the proper encoding.

  3. #3
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    Quote Originally Posted by Shrapnel_N5
    You probably need to set up the proper encoding.
    What do you mean?

  4. #4
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    I mean that the MySQL database must know which encoding the data it holds is in, and which encoding the data you send is in.
    I posted a small checklist here to make sure you set all encodings properly.

  5. #5
    SitePoint Wizard kyberfabrikken
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Quote Originally Posted by livewire1974
    I am reading the Euro symbol (€) from a data stream I am parsing ...
    Which encoding does this data stream have?

  6. #6
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    Quote Originally Posted by kyberfabrikken
    Which encoding does this data stream have?
    I'm not sure. I'm reading from an Excel file. How do I find out?

  7. #7
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    It doesn't really matter.
    You already assume that the data is in utf8, don't you?
    So tell MySQL that your data has this encoding.

  8. #8
    SitePoint Wizard kyberfabrikken
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Quote Originally Posted by livewire1974
    I'm not sure. I'm reading from an Excel file. How do I find out?
    How do you read the Excel file? Which library are you using?

  9. #9
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    Quote Originally Posted by Shrapnel_N5
    It doesn't really matter.
    You already assume that the data is in utf8, don't you?
    So tell MySQL that your data has this encoding.
    Yes, I do this with MySQL:

    Code:
    mysql_query("SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'");

  10. #10
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    Quote Originally Posted by kyberfabrikken
    How do you read the Excel file? Which library are you using?
    I am using this project to read the file:

    http://sourceforge.net/projects/phpe..._Excel_Reader/

  11. #11
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    How do you check that symbol in the database?
    Is it done with proper encoding set?

  12. #12
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    Quote Originally Posted by Shrapnel_N5
    How do you check that symbol in the database?
    Is it done with proper encoding set?
    To check the database, I am using the MySQL Query Browser. The table is set up as utf8 with utf8_general_ci collation.

  13. #13
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Does this MySQL Query Browser support UTF-8?
    Your "garbage" seems strange to me, because if UTF-8 is used along the whole data path, no recoding is involved and every symbol must remain the same, no matter the source encoding.

    How does your garbage look?

  14. #14
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    Quote Originally Posted by Shrapnel_N5
    Does this MySQL Query Browser support UTF-8?
    Your "garbage" seems strange to me, because if UTF-8 is used along the whole data path, no recoding is involved and every symbol must remain the same, no matter the source encoding.

    How does your garbage look?
    I'm making a little bit of progress. Basically, it seems Excel uses the cp1250 charset. So I told my browser to display the output in that charset, and it does.

    So I guess I now need to convert the cp1250 code for the symbol to UTF-8?

  15. #15
    SitePoint Addict
    Join Date
    Jul 2008
    Posts
    255
    I sorted this out using the following line to convert the charset:

    PHP Code:
    $product = iconv('Windows-1252', 'UTF-8//TRANSLIT', $product);
    Fixed now.
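
For readers following along, here is a minimal sketch of what that conversion does at the byte level, and why the earlier search for the character found nothing. It assumes the Excel reader hands back Windows-1252 bytes, as the thread concluded; the sample value and the variable names other than `$product` are made up for illustration.

```php
<?php
// Hypothetical cell value as Windows-1252 bytes: the Euro sign is the single byte 0x80.
$product = "Price: \x80 10";

// In UTF-8 the Euro sign is three bytes (E2 82 AC), so searching the raw
// Windows-1252 data for the UTF-8 character finds nothing.
$euroUtf8 = "\xE2\x82\xAC";
$foundBefore = strpos($product, $euroUtf8);

// Convert the value to UTF-8 before sending it to MySQL.
$converted = iconv('Windows-1252', 'UTF-8//TRANSLIT', $product);
$foundAfter = strpos($converted, $euroUtf8);

echo bin2hex($product), "\n";    // raw bytes contain 80
echo bin2hex($converted), "\n";  // converted bytes contain e282ac
var_dump($foundBefore, $foundAfter);
```

After the conversion, a UTF-8 connection and a utf8 column store the symbol unchanged, so no str_replace is needed.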

  16. #16
    SitePoint Wizard kyberfabrikken
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    You already figured it out, but yes, you need to convert to UTF-8 manually. You don't need the //TRANSLIT part, since UTF-8 is capable of representing all the characters that exist in cp-1252.
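
A quick way to check the point about //TRANSLIT is to convert a range of Windows-1252 bytes both ways and compare the results. This is only a sketch: it assumes an iconv implementation that accepts the //TRANSLIT suffix (glibc does), and it deliberately sticks to byte values that are defined in Windows-1252.

```php
<?php
// 0x80 is the Euro sign; every byte from 0xA0 to 0xFF is also defined in Windows-1252.
$sample = "\x80";
for ($byte = 0xA0; $byte <= 0xFF; $byte++) {
    $sample .= chr($byte);
}

// Same input, with and without the transliteration suffix.
$plain    = iconv('Windows-1252', 'UTF-8', $sample);
$translit = iconv('Windows-1252', 'UTF-8//TRANSLIT', $sample);

// Transliteration only matters when the target charset lacks a character,
// which cannot happen when the target is UTF-8.
var_dump($plain === $translit);
```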

  17. #17
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Fixed now
    It can be done with the proper client charset setting too.
    SET NAMES cp1250
    should do the trick.
    SET NAMES itself was invented to do such things.

  18. #18
    From space with love
    SpacePhoenix
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,019
    The PHP manual lists the use of:

    PHP Code:
    mysql_set_charset('utf8',$link_name); 
    as the preferred way to set the character encoding for a MySQL database connection. It should be called right after the connection to the server is established, but before selecting a database to work with.
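
The mysql_* functions used in this thread were removed in PHP 7. With mysqli, the equivalent call is `set_charset`. The sketch below shows one way to wrap it; the function name and its parameters are hypothetical, and the connection details are placeholders to be supplied by the caller.

```php
<?php
// Sketch only: open_utf8_connection() is an illustrative helper, not a library
// function. Host, user, password and database name are placeholders.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $link = new mysqli($host, $user, $pass, $db);

    // Set the connection charset through the client library, so that the
    // escaping functions also know which encoding is in use.
    if (!$link->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $link->error);
    }

    return $link;
}
```

Calling the helper once at startup keeps the charset setting in one place, instead of issuing SET NAMES by hand after each connect.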

  19. #19
    SitePoint Wizard kyberfabrikken
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Quote Originally Posted by Shrapnel_N5
    It can be done with the proper client charset setting too.
    SET NAMES cp1250
    should do the trick.
    SET NAMES itself was invented to do such things.
    Technically yes, but I would recommend doing the conversion in PHP, as livewire figured out himself. The connection charset is a global setting, so by setting it to cp1252, you would have to make everything in the application use this charset.
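
One way to keep the conversion local to the import, as recommended above, is to convert each row right after it is read and leave the connection in UTF-8 for the rest of the application. The helper below is a hypothetical sketch; its name and the sample row are invented for illustration, and it assumes the imported strings are Windows-1252.

```php
<?php
/**
 * Convert every string in an imported row from Windows-1252 to UTF-8.
 * Hypothetical helper; non-string values are passed through unchanged.
 */
function import_row_to_utf8(array $row): array
{
    return array_map(
        static function ($value) {
            return is_string($value)
                ? iconv('Windows-1252', 'UTF-8', $value)
                : $value;
        },
        $row
    );
}

// Made-up row as it might come out of the spreadsheet reader.
$row = import_row_to_utf8(['name' => "Caf\xE9", 'price' => "\x80 5", 'qty' => 3]);
echo bin2hex($row['price']), "\n";
```

Only the imported data passes through the helper, so strings that are already UTF-8 elsewhere in the application are never converted twice.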

