  1. #1
    Umm. PHP Guru....Naaaah jaswinder_rana's Avatar
    Join Date
    Jul 2004
    Location
    canada
    Posts
    3,193
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Encoding Problem

    PHP5/MSSQL/IIS/XP/ADODB

    I am trying to store a string extracted from a URL in a database.
    On the web page, the string looks like this:
    Unless you’re slightly crazy

    But when I store it in the database, it becomes this:
    Unless you’re slightly crazy


    How can I correct this?

    Thanks
    ---------------------------
    Errors = Improved Programming.
    My Site
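
A minimal sketch of one common way this symptom arises, written in Python rather than the PHP used in the thread, since the byte-level behaviour is the same. It assumes the page really is UTF-8 and that some layer between PHP and MSSQL (the driver, the column's code page, or the tool used to view the data) reads the bytes as Windows-1252. That assumption is not confirmed by the thread.

```python
# The right single quotation mark, U+2019, as it appears in "you’re".
original = "Unless you\u2019re slightly crazy"

# A UTF-8 page sends that character as three bytes: E2 80 99.
utf8_bytes = original.encode("utf-8")

# Assumption: some layer treats those bytes as Windows-1252,
# one byte per character, which turns one character into three.
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# Undoing the wrong step recovers the text, provided no bytes were lost.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

If the stored value shows three odd characters where the quote should be, this kind of double interpretation is a likely suspect.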

  2. #2
    SitePoint Wizard siteguru's Avatar
    Join Date
    Oct 2002
    Location
    Scotland
    Posts
    3,629
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    What charset/encoding is set for the database?

    What charset/encoding is set for the page submitting/receiving this data?
    Ian Anderson
    www.siteguru.co.uk

  3. #3
    Umm. PHP Guru....Naaaah jaswinder_rana's Avatar
    Page has Meta tag as

    <META HTTP-EQUIV='Content-Type' CONTENT='text/html; charset=UTF-8'>

    So UTF-8 for page.


    I am not sure how to check encoding on MSSQL Server.

  4. #4
    SitePoint Wizard siteguru's Avatar
    From http://dbcrusade.blogspot.com/

    To check for the existing collation, check the properties of database and look for a row with heading "Collation"
    by default collation is set to "SQL_Latin1_General_CP1_CI_AS". This collation renders database as case-insensitive.
    I suspect your database is set up as above, so UTF-8 characters could cause problems. It might be easier to change the code page/encoding of the sending page.
    Ian Anderson
    www.siteguru.co.uk

  5. #5
    Umm. PHP Guru....Naaaah jaswinder_rana's Avatar
    I know that if I change the collation of the DB or the TABLE, I'll have to rebuild it.

    I can do that, as I haven't entered any data in the table yet. So I can rebuild the table and change the collation for THAT column.

    But I am not sure which collation will support UTF-8.

    I mean, if "SQL_Latin1_General_CP1_CI_AS" doesn't support UTF-8, which one does?

    Thanks

  6. #6
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Collation is completely irrelevant - it governs the sorting order of characters.

    A charset declared in the HTTP Content-Type header always trumps META tags, so when your web server sends one, the META tag is ignored. Open your page and select View > Character Encoding from the menu (in Firefox) to see the actual encoding.
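
The precedence described above can be sketched as follows. This is an illustration only, in Python: `effective_charset` is a hypothetical helper, not something from the thread or from any browser, and it ignores other inputs a real browser also considers, such as a byte order mark.

```python
from email.message import Message


def effective_charset(content_type_header, meta_charset):
    """Return the header's charset if it declares one, else the META charset.

    Hypothetical helper: models only the header-over-META rule.
    """
    msg = Message()
    if content_type_header:
        msg["Content-Type"] = content_type_header
    # get_content_charset() returns the charset parameter lowercased,
    # or None when the header is absent or has no charset parameter.
    header_charset = msg.get_content_charset()
    return header_charset or meta_charset.lower()


# The server declares ISO-8859-1, so the META tag's UTF-8 is ignored.
print(effective_charset("text/html; charset=ISO-8859-1", "UTF-8"))
# No charset in the header, so the META tag decides.
print(effective_charset("text/html", "UTF-8"))
```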

  7. #7
    Umm. PHP Guru....Naaaah jaswinder_rana's Avatar
    It says "UTF-8".

    I tried to use iconv() to convert the text to ISO-8859-1, but it returns the error "Detected an illegal character in input string".
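
One plausible reason for that error is that the target encoding simply has no slot for the character. The sketch below shows the same situation in Python, not PHP: `try_encode` is a hypothetical helper, and the premise that the offending character is U+2019 is an assumption based on the sample string.

```python
quote = "\u2019"  # right single quotation mark, as in "you’re"


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding lacks a character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ISO-8859-1 has no right single quotation mark, so the conversion fails.
latin1_result = try_encode(quote, "iso-8859-1")
# Windows-1252 does have it, at byte 0x92.
cp1252_result = try_encode(quote, "cp1252")

print(latin1_result, cp1252_result)
```

In PHP terms this suggests trying CP1252 as the iconv() target, or appending //TRANSLIT to the target so the character is approximated instead of rejected. Whether either fits depends on what the database column actually stores.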

  8. #8
    Umm. PHP Guru....Naaaah jaswinder_rana's Avatar
    Or I could use str_replace to replace this specific character, but it doesn't work.

    Do you know the ASCII code for that character? It's not a single quote.

    Unless you’re slightly crazy
    The character between "you" and "re".
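
A quick way to identify the character is to list every non-ASCII character in the string with its code point and UTF-8 bytes. This is a sketch in Python rather than PHP, assuming the string arrives as correctly decoded UTF-8.

```python
import unicodedata

text = "Unless you\u2019re slightly crazy"

# Collect every character outside the 7-bit ASCII range.
non_ascii = [ch for ch in text if ord(ch) > 127]

for ch in non_ascii:
    # Show the code point, the official Unicode name, and the UTF-8 bytes.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex(" "))

# Replacing it with a plain ASCII apostrophe is one blunt workaround.
plain = text.replace("\u2019", "'")
print(plain)
```

A byte-oriented replace such as PHP's str_replace only matches if the search string is written in the same encoding as the data, which may be why the attempt above failed. For UTF-8 data the search string has to be the three bytes E2 80 99.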

  9. #9
    SitePoint Zealot
    Join Date
    Sep 2007
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

  10. #10
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    That character doesn't exist in the ASCII range. If I'm not mistaken, it's a nasty little bugger, because it's one of the few characters that differ between ISO-8859-1 and cp-1252. Windows' native encoding on western European systems is cp-1252, but PHP expects ISO-8859-1. Most of the time they are interchangeable, so it won't matter, but for a few characters it does.
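
The difference between the two encodings is confined to bytes 0x80 to 0x9F: ISO-8859-1 maps them to invisible control codes, while cp-1252 maps most of them to printable characters such as curly quotes and the euro sign. A small Python sketch that lists them, where `cp1252_only_characters` is a hypothetical helper name:

```python
def cp1252_only_characters():
    """Map each byte in 0x80-0x9F to its cp1252 character, skipping gaps."""
    table = {}
    for value in range(0x80, 0xA0):
        try:
            table[value] = bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's cp1252 codec leaves a few of these bytes unassigned.
            continue
    return table


table = cp1252_only_characters()
for value, char in table.items():
    print(f"0x{value:02X} -> U+{ord(char):04X} {char}")
```

Byte 0x92 is the right single quotation mark from the sample string. Outside this range the two encodings assign the same character to every byte.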

