  1. #1
    SitePoint Addict
    Join Date
    Jun 2004
    Location
    Montreal
    Posts
    275
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Language Encoding

    Hi,

    Is there a way to detect a string's encoding so I can tell whether it is Japanese or Russian?

  2. #2
    Team SitePoint santouras's Avatar
    Join Date
    Jul 2006
    Location
    planet earth
    Posts
    276
    Mentioned
    16 Post(s)
    Tagged
    0 Thread(s)
    my utility belt tells me it's to the bar, Batman

    read the manual then google it then do a search THEN post....

  3. #3
    Unobtrusively zen silver trophybronze trophy
    paul_wilkins's Avatar
    Join Date
    Jan 2007
    Location
    Christchurch, New Zealand
    Posts
    14,729
    Mentioned
    104 Post(s)
    Tagged
    4 Thread(s)
    Here's another toolset for your utility belt:

    A composite approach to language/encoding detection
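
    An encoding such as UTF-8 can carry any language, so encoding detection alone cannot tell Russian from Japanese. For that narrower question, a minimal sketch, assuming the input is already valid UTF-8, could look at Unicode script properties instead. The function name guessScript() and its return labels are made up for illustration; it detects scripts, not languages.

    ```php
    <?php
    // Hypothetical helper (not from the thread): guesses the script of a UTF-8
    // string from Unicode script properties. Assumes valid UTF-8 input.
    function guessScript(string $utf8): string
    {
        // The /u modifier makes PCRE treat pattern and subject as UTF-8.
        if (preg_match('/\p{Cyrillic}/u', $utf8) === 1) {
            return 'cyrillic';   // Russian, but also Ukrainian, Bulgarian, ...
        }
        if (preg_match('/[\p{Hiragana}\p{Katakana}]/u', $utf8) === 1) {
            return 'japanese';   // kana strongly suggest Japanese
        }
        if (preg_match('/\p{Han}/u', $utf8) === 1) {
            return 'han';        // ideographs alone could be Japanese or Chinese
        }
        return 'other';          // includes invalid UTF-8, where preg_match() fails
    }

    echo guessScript("\u{41F}\u{440}\u{438}\u{432}\u{435}\u{442}"), "\n"; // "Привет"
    echo guessScript("Bonjour"), "\n";
    ```

    The order of the checks is a design choice: Japanese text usually mixes kana with kanji, so kana are tested before Han ideographs.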

  4. #4
    SitePoint Addict
    Join Date
    Jun 2004
    Location
    Montreal
    Posts
    275
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well, is there an easy way to scan a string and see whether it contains any non-ISO-8859-1 characters?

    mb_detect_encoding() is OK, but it does not separate Russian from French, since both are detected as UTF-8.

    What I need is to allow only ISO-8859-1 characters!

  5. #5
    Unobtrusively zen silver trophybronze trophy
    paul_wilkins's Avatar
    Join Date
    Jan 2007
    Location
    Christchurch, New Zealand
    Posts
    14,729
    Mentioned
    104 Post(s)
    Tagged
    4 Thread(s)
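    Code PHP:
    // Convert a string to ISO-8859-1 after detecting its current encoding.
    // Characters with no ISO-8859-1 equivalent are substituted ("?" by default), not rejected.
    public function encodeToIso($string) {
         $from = mb_detect_encoding($string, "UTF-8, ISO-8859-1, ISO-8859-15", true);
         return mb_convert_encoding($string, "ISO-8859-1", $from ?: "UTF-8"); // fall back to UTF-8 if detection fails
    }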
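
    Note that encodeToIso() converts rather than validates. If the goal is to allow only ISO-8859-1 characters, one possible approach is to check that no code point lies above U+00FF. The sketch below assumes the input arrives as UTF-8; the name isLatin1Only() is hypothetical and not part of any library.

    ```php
    <?php
    // Hypothetical helper (not from the thread): true when every character of a
    // UTF-8 string has an ISO-8859-1 equivalent, i.e. a code point <= U+00FF.
    function isLatin1Only(string $utf8): bool
    {
        // Reject input that is not valid UTF-8 rather than guessing its encoding.
        if (!mb_check_encoding($utf8, 'UTF-8')) {
            return false;
        }
        // With the /u modifier, this class matches any code point above U+00FF.
        return preg_match('/[\x{100}-\x{10FFFF}]/u', $utf8) === 0;
    }

    var_dump(isLatin1Only("Caf\u{E9}"));                 // accented Latin letter
    var_dump(isLatin1Only("\u{41F}\u{440}\u{438}"));     // Cyrillic letters
    ```

    French text mostly passes this check, but not entirely: "œ" (U+0153) and "€" (U+20AC) are missing from ISO-8859-1 and only appear in ISO-8859-15, so they would be rejected here.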

