  1. #1
    SitePoint Guru
    Join Date
    Sep 2007
    Posts
    971

    Language display problem

    Does anyone know why squares display on my website when it uses text in a different language, such as "Nome de usuário"?

  2. #2
    Force Flow
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    We need more information.

    Are you viewing the website through Google Translate?

    Does your website have built-in multilingual features?

    Is a visitor trying to use those features, or is their computer simply set to another language?

  3. #3
    Stomme poes
    Join Date
    Aug 2007
    Location
    Netherlands
    Posts
    10,269
    Quote: "Does anyone know why squares display on my website when using different language, such as Nome de usuário"

    Usually this means your browser is assuming one charset yet the supplied text or document is another one. Any of the below could be causing this:

    - the server might not send headers stating the charset, and your browser made a bad guess

    - the server may be sending a header stating a charset, but the document was saved under a different charset (the browser will listen to the server, except sometimes IE6 will use a complicated set of ingenious Soviet-based heuristics to make an edjumacated guess at the saved document's charset, which is the basis of the UTF-7 exploit)

    - the server and document may agree, but you stubbornly set an incompatible charset within your browser (I'm not sure where this setting is anymore on browsers, used to be under View->Character set or similar but everyone's so damned pleased with themselves for having found nefarious ways of hiding menus for the lawlz it wouldn't surprise me if the likes of Mozilla have actually just removed it "because it's too complicated for users and only web developers need it so we recommend you use a plugin").
    It should *NOT* matter what language the browser is set to, unless that setting also sets the expected charset to Stubborn.
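
    To make the mismatch concrete, here's a rough Python sketch (my own illustration, nothing from the original poster's site) that takes the sample text from the question and reads it back with the wrong charset in both directions. The variable names are made up for the example.

```python
# The sample text from the original question.
text = "Nome de usuário"

# Case 1: the document was saved as UTF-8 but is read as Latin-1 (ISO 8859-1).
utf8_bytes = text.encode("utf-8")
misread_as_latin1 = utf8_bytes.decode("latin-1")
print(misread_as_latin1)  # the single "á" turns into two junk characters

# Case 2: the document was saved as Latin-1 but is read as UTF-8.
latin1_bytes = text.encode("latin-1")
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")
print(misread_as_utf8)  # the "á" becomes U+FFFD, the replacement character
```

    Whether U+FFFD shows up as a square, a diamond, or a question mark depends on the browser and font, which is probably why everyone describes this problem differently.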

    So if I have a document saved as Latin-1 (ISO 8859-1) and I go to a Japanese-language website, copy some text there, and paste it into my document, I may get squares, question marks, or diamonds instead of the correct characters, because Latin-1 has no representation for Japanese characters.
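
    A quick way to see this for yourself, as a minimal Python sketch (the Japanese sample string is just one I picked for the example):

```python
japanese = "日本語"  # hypothetical sample text, not from the original thread

# Latin-1 only covers code points U+0000 to U+00FF, so a strict encode fails.
try:
    japanese.encode("latin-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False

# A lenient encode "succeeds" by swapping each character for a question mark.
lossy = japanese.encode("latin-1", errors="replace")

print(fits_in_latin1)
print(lossy)
```

    An editor that quietly does the lenient version is how you end up with question marks in a saved file without ever seeing an error.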

    Saving the file as ISO-2022-JP (or any of the other Japanese charsets) afterwards might not fix the Japanese text. If my editor made sh*t up when it originally received the Japanese text, it is now just representing garbage as ISO-2022-JP, and I am made of fail.
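
    Here's a small Python sketch of why re-saving doesn't help, assuming the editor replaced the characters it couldn't store with question marks (editors differ, so treat that as an assumption). Once the original characters are gone, no charset brings them back.

```python
original = "日本語"  # hypothetical sample text

# Simulate pasting into a Latin-1 document: unencodable characters are lost.
damaged = original.encode("latin-1", errors="replace").decode("latin-1")

# Re-saving the damaged text as ISO-2022-JP stores the question marks faithfully.
resaved = damaged.encode("iso2022_jp").decode("iso2022_jp")

# For contrast, text that was never damaged survives an ISO-2022-JP round trip.
clean_round_trip = original.encode("iso2022_jp").decode("iso2022_jp")

print(damaged, resaved, clean_round_trip)
```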

    Usually the solution to all our problems, including hunger, AIDS, and peace in the Middle East, is to encode and save and store and represent everything as UTF-8. It is an encoding of Unicode, so it can represent the letters of whichever language you want. Most other charsets are limited and specific to one or a few alphabets. UTF-8 by its powers combined turns into a giant mecha, fights evil and the NSA and WINS.
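
    As a rough illustration in Python (the sample strings are ones I picked, apart from the Portuguese one from the question), the same encoding round-trips several alphabets without losing anything. The only cost is that non-ASCII characters take more than one byte each.

```python
samples = {
    "Portuguese": "Nome de usuário",
    "Japanese": "日本語",
    "Russian": "Привет",
    "Greek": "Ελληνικά",
}

round_trips = {}
byte_lengths = {}
for language, text in samples.items():
    encoded = text.encode("utf-8")
    # Decoding the bytes again should give back exactly the same text.
    round_trips[language] = encoded.decode("utf-8") == text
    byte_lengths[language] = len(encoded)
    print(language, len(text), "characters ->", len(encoded), "bytes")
```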

    Make sure your document is saved as UTF-8. No BOM and nothing weird.
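    Make sure your meta tag in the HTML matches, for example <meta charset="utf-8"> (this only exists in case your server is being silly and not sending out a header).
    Make sure your server is correctly saying "WITNESS! THIS DOCUMENT IS UTF-8 YO", which in HTTP terms is a header like Content-Type: text/html; charset=utf-8.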
    Make sure your browser isn't set to stubborn. Usually auto-detect is fine.
    Make sure your database, if you're using one, is storing and returning text as UTF-8 too, rather than mangling the charset somewhere along the way.
    Don't blindly copy and paste strange things and add them to the document.
    Don't mix stuff from Windows editors and stuff from *nix. This has caused me no end of grief, though it has nothing to do with character boxes and everything to do with whitespace hell. If you're getting documents from others, maybe written in Word or something, find a program to sanitize the Windowy-ness out of them first if possible. I think someone sells a kind of bug spray program that kills off smart quotes and control characters and ^Ms, but you'd have to check your local MegaMall to see.
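
    If you can't find the bug spray, a couple of the checks above are easy to sketch yourself. This is a minimal Python version under my own assumptions: check_utf8 and sanitize are hypothetical helper names, and the list of characters to replace is just a starting point, not a complete inventory of what Word can produce.

```python
import codecs

# Assumed mapping: curly "smart" quotes to their plain ASCII equivalents.
SMART_PUNCTUATION = {
    "\u2018": "'",
    "\u2019": "'",
    "\u201c": '"',
    "\u201d": '"',
}


def check_utf8(raw: bytes) -> dict:
    """Report whether the raw file bytes start with a UTF-8 BOM and decode as UTF-8."""
    has_bom = raw.startswith(codecs.BOM_UTF8)
    try:
        raw.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return {"has_bom": has_bom, "valid_utf8": valid}


def sanitize(text: str) -> str:
    """Normalize Windows line endings, smart quotes, and stray control characters."""
    # Turn CRLF (the ^M at the end of each line) and bare CR into plain newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    for fancy, plain in SMART_PUNCTUATION.items():
        text = text.replace(fancy, plain)
    # Drop control characters below U+0020, keeping newlines and tabs.
    return "".join(ch for ch in text if ch in "\n\t" or ord(ch) >= 32)


print(check_utf8("Nome de usuário".encode("utf-8")))
print(repr(sanitize("\u201cNome de usuário\u201d\r\n")))
```

    Reading the file in binary mode and passing the bytes to check_utf8 tells you whether it was really saved as UTF-8 and whether a BOM snuck in. Running sanitize on the decoded text before saving covers the Windowy-ness.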

    Also amphetamines.

