  1. #1
    SitePoint Enthusiast
    Join Date
    Feb 2007
    Location
    Los Angeles
    Posts
    97

    Anyone Familiar with Foreign Language Encoding?

    I recently received an email from one of the contact pages of a website, and it came through not as "garbled" characters but as plain nonsense characters. I know this is happening because of the character set I used for the page, so I switched it to UTF-8 and tested it. This time the email came through as "garbled characters", which is correct; however, I am now faced with another situation where I can't read it because I don't have the right encoding. Does anyone have the same issue? I know this would only pertain to anyone who is anticipating visitors from different ethnic communities.

    There must be a way around this. Please note that all emails come to my POP3 account in Microsoft Outlook 2007. As far as I can tell, there are no encoding settings in this version. I then tried forwarding it to my Yahoo email account and viewing it online, as I know Yahoo "DOES" have encoding options, but without success: the encoding options still did not make the message readable.

    Please, if anyone knows a good solution or answer to this, I am all ears.

  2. #2
    bronze trophy
    Join Date
    Dec 2004
    Location
    Sweden
    Posts
    2,670
    Could you include the scrambled text here? (And say which encoding was used when decoding it...)
    Simon Pieters

  3. #3
    SitePoint Enthusiast
    Join Date
    Feb 2007
    Location
    Los Angeles
    Posts
    97
    Without UTF-8 it looks like this:
    "#46356;#51144;#51064;#51012;" with an ampersand in front of each # symbol.

    With UTF-8 it looks like this:
    我è¦�去玩游戲。 ä½ å�¯ä»¥åŽ»å—Žï¹–

    I tried every encoding setting I could find and still can't read it.

  4. #4
    bronze trophy
    Join Date
    Dec 2004
    Location
    Sweden
    Posts
    2,670
    Quote Originally Posted by NRG77
    Without UTF-8 it looks like this:
    "#46356;#51144;#51064;#51012;" with an ampersand in front of each # symbol.
    What does "without UTF-8" mean? "Automatic selection"?

    In any case, if & is placed before each one and you then parse the result as HTML, you get "디쟈인을", which looks like Korean and translates to "D [cya] person" with Google Translate.
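
    The decoding step described above can be reproduced outside a browser. This is a minimal sketch using Python's standard `html.unescape`; the variable names are only illustrative, and the input is the reported string with the ampersands restored.

```python
import html

# The reported string, with an ampersand restored in front of each '#'.
raw = "&#46356;&#51144;&#51064;&#51012;"

# html.unescape turns numeric character references into the characters they name.
decoded = html.unescape(raw)

print(decoded)
# Show the code point behind each decoded character.
print([f"U+{ord(ch):04X}" for ch in decoded])
```

    Each reference is simply the decimal Unicode code point of one character, so no guess about the byte encoding is needed to read this form.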

    Quote Originally Posted by NRG77
    With UTF-8 it looks like this:
    我è¦�去玩游戲。 ä½ å�¯ä»¥åŽ»å—Žï¹–
    Seems like it was encoded as EUC-KR. (Decoding the Korean text encoded in EUC-KR as UTF-8 results in the above.)
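
    Another possibility that may be worth checking, offered as an assumption and not as a confirmed diagnosis: the sample has the shape of UTF-8 bytes displayed as Windows-1252. The sketch below reproduces that effect for a short phrase whose garbled form matches a run of characters in the quoted sample, then reverses it. The function names `mojibake` and `repair` are hypothetical helpers written for this illustration.

```python
def mojibake(text: str) -> str:
    """Encode text as UTF-8, then mis-decode those bytes as Windows-1252."""
    # Bytes with no Windows-1252 mapping become U+FFFD, the replacement character.
    return text.encode("utf-8").decode("cp1252", errors="replace")


def repair(garbled: str) -> str:
    """Undo mojibake() by re-encoding as Windows-1252 and decoding as UTF-8.

    Raises UnicodeEncodeError if the text already contains U+FFFD,
    because the original byte is lost at that point.
    """
    return garbled.encode("cp1252").decode("utf-8")


phrase = "去玩游戲"
garbled = mojibake(phrase)

print(garbled)
print(repair(garbled))

# A character whose UTF-8 bytes include 0x81 cannot survive the round trip.
print("\ufffd" in mojibake("要"))
```

    If the text in the mailbox already contains replacement characters, the original bytes for those positions are gone and the message would have to be recovered from the raw email source instead.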
    Simon Pieters

  5. #5
    SitePoint Enthusiast
    Join Date
    Feb 2007
    Location
    Los Angeles
    Posts
    97
    Right, but the problem remains that I can't read them in my Outlook mailbox. The only way around this is to copy and paste the &#23423 characters into HTML and parse it again, but that is a hassle. So I was wondering whether there is a correct way to accept foreign-language input from any user so that it can be read in Outlook.
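
    One thing that could be tried on the sending side is to label the outgoing message explicitly as UTF-8, so the mail client does not have to guess. The thread does not say what the contact form is written in, so this is only a sketch in Python using the standard `email.message.EmailMessage` class; the function name `build_contact_email` and the addresses are placeholders, and whether Outlook 2007 then displays the text correctly is not something this confirms.

```python
from email.message import EmailMessage


def build_contact_email(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text message whose body is declared as UTF-8."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Non-ASCII subjects are encoded as MIME encoded-words when the message is serialized.
    msg["Subject"] = subject
    # charset="utf-8" puts the charset parameter into the Content-Type header.
    msg.set_content(body, charset="utf-8")
    return msg


# Placeholder addresses; the body is the Korean text from the thread.
msg = build_contact_email("form@example.com", "owner@example.com", "Contact form", "디쟈인을")

print(msg["Content-Type"])
print(msg.get_content_charset())
```

    The form page itself would also need to be served and submitted as UTF-8, otherwise the browser may fall back to sending numeric character references like the ones quoted earlier.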

  6. #6
    bronze trophy
    Join Date
    Dec 2004
    Location
    Sweden
    Posts
    2,670
    Well, to me it sounds like a bug in Outlook. My only advice is thus either to file a bug report with MS and hope they fix it in the next release, or to switch to another email client.
    Simon Pieters
