  1. #1
    SitePoint Enthusiast
    Join Date
    Sep 2004
    Location
    Scotland
    Posts
    40

    How to convert a UTF-8 (hex) value to a Unicode character? I have been trying for over a week, please help

    UTF-8 (hex) -> Unicode character?

    Example 1
    Unicode character = "黪"
    UTF-8 (hex) = "e9bbaa"

    Example 2
    Unicode character = "ａ" (U+FF41)
    UTF-8 (hex) = "efbd81"


    Notes:
    I have a MySQL database containing 7000+ records. One column stores the hex value of each record's Unicode character, as in the examples above. I need to search the database by passing a hex value as my search string in order to get back the corresponding Unicode character.
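    A minimal sketch of the hex-to-character direction, assuming the hex string is the UTF-8 byte sequence of the character (as in the two examples). The helper name utf8_hex_to_char is hypothetical; pack() with the 'H*' format is a standard PHP call that turns pairs of hex digits into raw bytes.

```php
<?php
// Hypothetical helper (not from the thread): turn a UTF-8 hex string such as
// "e9bbaa" back into the bytes of the character it encodes.
function utf8_hex_to_char($hex) {
    // Accept only a non-empty, even-length run of hex digits.
    if (!preg_match('/^(?:[0-9a-fA-F]{2})+$/', $hex)) {
        return false;
    }
    // 'H*' reads the whole string as hex digits, high nibble first.
    return pack('H*', $hex);
}

echo utf8_hex_to_char('e9bbaa'), "\n"; // writes the bytes E9 BB AA
echo utf8_hex_to_char('efbd81'), "\n"; // writes the bytes EF BD 81
```

    The output only displays as the expected characters if the page or terminal is also set to UTF-8.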

  2. #2
    SitePoint Member
    Join Date
    Jun 2006
    Location
    Chicago or Urbana/Champaign, depending
    Posts
    14
    Check out both http://www.ltg.ed.ac.uk/~richard/utf...fbd81&mode=hex and http://www.ltg.ed.ac.uk/~richard/utf...9bbaa&mode=hex. According to those, efbd81 & e9bbaa aren't valid hex representations of a UTF-8 character (nor a UTF-16 character, for that matter).

    But overall, I'm not quite sure what you're asking. If you're trying to convert efbd81 to a Unicode character, then you're having problems because efbd81 isn't a valid hex representation of a UTF-8 character. EF, BD and 81 are representations of three separate UTF-8 characters. EFB and D81 are valid hex representations of a UTF-8 character as well. If you need to convert EF, BD, 81, EFB or D81 to a Unicode character, you would need to convert it to decimal first (http://php.net/manual/en/function.hexdec.php), and then you could use this function to convert it to a Unicode character: http://ftzdomino.blogspot.com/2009/0...uivalents.html.

  3. #3
    SitePoint Enthusiast
    Join Date
    Sep 2004
    Location
    Scotland
    Posts
    40
    Thanks for your reply. The following two are Unicode characters: ａ and 黪. For reference, see http://www.fileformat.info/info/unic...ff41/index.htm and please take a look under Encodings. For 黪, the UTF-8 (hex) value is 0xE9 0xBB 0xAA (e9bbaa).
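    The claim can be checked by hand: a three-byte UTF-8 sequence has a lead byte of the form 1110xxxx followed by two continuation bytes of the form 10xxxxxx, and the x bits joined together give the code point. The sketch below applies that rule; the helper name utf8_hex_to_codepoint is hypothetical, and it is a simplified decoder that does not reject overlong forms or surrogates.

```php
<?php
// Hypothetical helper: decode the hex of ONE UTF-8 sequence to its code point.
// Returns false for input that is not a single well-formed sequence.
function utf8_hex_to_codepoint($hex) {
    if (!preg_match('/^(?:[0-9a-fA-F]{2})+$/', $hex)) {
        return false;
    }
    // unpack() indexes from 1, so re-index the byte values from 0.
    $bytes = array_values(unpack('C*', pack('H*', $hex)));
    $lead = $bytes[0];

    // The lead byte says how long the sequence is and carries the top bits.
    if ($lead < 0x80)                { $cp = $lead;        $need = 1; }
    elseif (($lead & 0xE0) === 0xC0) { $cp = $lead & 0x1F; $need = 2; }
    elseif (($lead & 0xF0) === 0xE0) { $cp = $lead & 0x0F; $need = 3; }
    elseif (($lead & 0xF8) === 0xF0) { $cp = $lead & 0x07; $need = 4; }
    else { return false; }

    if (count($bytes) !== $need) {
        return false;
    }
    // Each continuation byte must be 10xxxxxx and adds six more bits.
    for ($i = 1; $i < $need; $i++) {
        if (($bytes[$i] & 0xC0) !== 0x80) {
            return false;
        }
        $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
    }
    return $cp;
}

printf("U+%04X\n", utf8_hex_to_codepoint('efbd81')); // U+FF41
printf("U+%04X\n", utf8_hex_to_codepoint('e9bbaa')); // U+9EEA
```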

    How and where did I get those two hex values?

    MySQL has a built-in function, HEX(), for example HEX('黪'). I used this function to generate the hex value of every Unicode character stored in a varchar column with collation utf8_unicode_ci, so I have a unique UTF-8 (hex) value for each character.

    So I am looking for a function that can convert my input Unicode characters to this format, UTF-8 (hex) 0xEF 0xBD 0x81 (efbd81), so that I would be able to search for any Unicode character stored in the DB.
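    One possible sketch of such a function, assuming the input string is already UTF-8 encoded: PHP's built-in bin2hex() returns the hex of a string's raw bytes, and strtoupper() matches the uppercase digits that MySQL's HEX() produces. The helper name utf8_hex is hypothetical.

```php
<?php
// Hypothetical helper: hex of the raw bytes of $text, in uppercase.
// Assumes $text is already UTF-8; no encoding conversion is done here.
function utf8_hex($text) {
    return strtoupper(bin2hex($text));
}

// Byte escapes are used so the example does not depend on the file's encoding.
echo utf8_hex("\xEF\xBD\x81"), "\n"; // EFBD81
echo utf8_hex("\xE9\xBB\xAA"), "\n"; // E9BBAA
```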

  4. #4
    SitePoint Member
    Join Date
    Jun 2006
    Location
    Chicago or Urbana/Champaign, depending
    Posts
    14
    I apologize, I was wrong. However, I can't find a way to do this using PHP. My only suggestion is to use MySQL itself for the conversion:

    PHP Code:
    // if you're using PDO or mysqli, change the function names
    $result = mysql_query("SELECT HEX('黪')");
    $row = mysql_fetch_row($result);
    $hex = $row[0];
    This is not a pretty or efficient solution, in my opinion. Maybe someone else has a better idea?

  5. #5
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Check the comments of http://php.net/chr or http://php.net/ord for what you need.

  6. #6
    SitePoint Enthusiast
    Join Date
    Sep 2004
    Location
    Scotland
    Posts
    40

    Thanks for everyone's help. Done, finally. Let me share what I have. The following PHP functions work well for converting MySQL utf8_unicode_ci data to UTF-8 (hex).

    Conclusion:
    1. Use JavaScript "encodeURI" to post the value to PHP.
    2. In PHP, read the submitted value without any conversion.
    3. Use the following functions to convert the request value.
    4. This returns the UTF-8 (hex) string, which is the unique reference for the value in the database.
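    Steps 1 to 4 can be sketched end to end as follows. JavaScript's encodeURI percent-encodes non-ASCII characters as their UTF-8 bytes, so the fullwidth ａ arrives as %EF%BD%81. PHP already decodes values in $_GET and $_POST, so rawurldecode() is used here only to simulate that step on a literal string. The helper name percent_encoded_to_hex is hypothetical.

```php
<?php
// Hypothetical helper: take a percent-encoded value (as produced by
// JavaScript's encodeURI) and return the uppercase hex of its UTF-8 bytes.
function percent_encoded_to_hex($encoded) {
    // Turn %XX escapes back into raw bytes; other characters pass through.
    $bytes = rawurldecode($encoded);
    // Hex of the raw bytes, uppercase to match MySQL's HEX() output.
    return strtoupper(bin2hex($bytes));
}

echo percent_encoded_to_hex('%EF%BD%81'), "\n"; // EFBD81
echo percent_encoded_to_hex('%E9%BB%AA'), "\n"; // E9BBAA
```

    With a real request, the value read from $_GET or $_POST is already decoded and can be passed to bin2hex() directly.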


    PHP Code:
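    // Return the uppercase hex of every byte in $data, e.g. "EFBD81".
    function hex_chars($data) {
        $hex = '';
        for ($i = 0; $i < strlen($data); $i++) {
            // Take one byte at a time (strlen and substr count bytes here).
            $c = substr($data, $i, 1);
            //$hex .= '{'. hex_format(ord($c)). '}';
            $hex .= hex_format(ord($c));
        }
        return $hex;
    }

    // Format a byte value (0-255) as two uppercase hex digits.
    function hex_format($o) {
        $h = strtoupper(dechex($o));
        $len = strlen($h);
        if ($len == 1)
            $h = "0$h";
        return $h;
    }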


