  1. #1

    Good practice for character encoding?

    Can someone tell me what is considered good practice when dealing with character conversion on the server side?

    A couple of days ago I had to create a swap function because a character wasn't being displayed the way a user expected.

    PHP Code:
        function render() {
            // Each entry is a pair: array(regex pattern, replacement).
            $SearchReplace = array(
                array('//i', 'é'),
                array('//i', ''), // word
            );

            // Split the pairs into the parallel arrays preg_replace() expects.
            $search = $replace = array();
            foreach ($SearchReplace as $pair) {
                $search[]  = $pair[0];
                $replace[] = $pair[1];
            }

            $this->input = preg_replace($search, $replace, $this->input);
        }


    Also, I will be receiving text copied from MS Word and pasted into a form. How do I handle the special characters that MS Word adds?

    Putting the full string through htmlentities isn't helpful, as I accept HTML formatting mixed in with the text. I've also heard rumors that htmlentities is old and dying, so how should I handle these common issues?



    My current plan:
    Expand my swap function: basically add all the characters listed in the following document: http://www.evolt.org/article/ala/17/21234/
    Do I have to look up the hex value for each of these characters, or can I copy the weird character from the website and paste it into my regex?
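
    One possible shape for the expanded swap function is sketched here. It is not the poster's code: the function name replaceWordCharacters and the choice of characters are illustrative assumptions, and it assumes the input is already UTF-8. It uses strtr() with a lookup table instead of regular expressions, and writes each character as a "\u{...}" code point escape (PHP 7 and later), so nothing depends on pasting the character into the source file. Pasting the literal character can also work, but only when the PHP file is saved in the same encoding as the incoming text.

```php
<?php
// Hypothetical helper (not from the original post): maps common MS Word
// "smart" punctuation to plain ASCII. Assumes $text is already UTF-8.
function replaceWordCharacters(string $text): string
{
    // "\u{...}" escapes (PHP 7+) name each code point explicitly, so the
    // result does not depend on how this source file itself is saved.
    static $map = [
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '--',  // em dash
        "\u{2026}" => '...', // horizontal ellipsis
    ];

    // strtr() does plain string replacement; no regex is needed.
    return strtr($text, $map);
}

// Usage: curly quotes and an ellipsis become their ASCII equivalents.
echo replaceWordCharacters("\u{201C}It\u{2019}s done\u{2026}\u{201D}"), "\n";
```

    Whether to flatten these characters to ASCII at all is a design choice; if the page is served as UTF-8 they can usually be stored and displayed unchanged.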



    Links from my research that others may find useful:
    http://www.greywyvern.com/code/php/utf8_html
    http://us2.php.net/manual/en/functio...ties.php#59886
    http://shiflett.org/archive/178 - wtf ??
    http://www.alistapart.com/stories/emen

  2. #2
    kyberfabrikken
    I know it's been a while since you posted this, but I just stumbled upon it during a search.
    http://www.phpwact.org/php/i18n/charsets is a really good resource on UTF-8 in PHP. It might answer some of your questions.
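
    As a rough illustration of the UTF-8 approach (a sketch under stated assumptions, not something taken from the linked page): incoming form text can be normalised to UTF-8 once, so that characters such as é or Word's curly quotes no longer need to be swapped one by one. The helper name toUtf8 is hypothetical, and treating every non-UTF-8 submission as Windows-1252 is a guess that fits text pasted from MS Word on Western systems but will be wrong for other encodings.

```php
<?php
// Hypothetical helper (not from the thread): returns $input as UTF-8.
// Assumption: anything that is not valid UTF-8 was sent as Windows-1252.
function toUtf8(string $input): string
{
    // With the /u modifier PCRE validates the subject as UTF-8 first;
    // preg_match() returns false (not 0) when the bytes are invalid.
    if (preg_match('//u', $input) === 1) {
        return $input;
    }

    // iconv() returns false if it meets a byte it cannot convert.
    $converted = iconv('Windows-1252', 'UTF-8', $input);
    if ($converted === false) {
        throw new RuntimeException('Input could not be converted to UTF-8');
    }
    return $converted;
}

// Usage: these are Windows-1252 bytes for e-acute and curly double quotes.
$fromWord = "caf\xE9 \x93quoted\x94";
echo toUtf8($fromWord), "\n";
```

    This only helps if the page itself is also declared as UTF-8, for example with a Content-Type header that includes charset=utf-8, so the browser submits and displays the same encoding.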

