  1. #1
    SitePoint Addict
    Join Date
    Mar 2004
    Posts
    260

    Language settings for Russian website

    Hello there, it's the first time that I'm going to do a small site in Russian (a translation of an existing English site). I've trawled the Internet for a definitive answer on what the language settings have to be in the HTML code, but I'm not so sure now...

    I think that I need:

    <META http-equiv="content-type" content="text/html; charset=utf-8">

    Do I also need:
    <html lang="ru">
    If so, what exactly does this do?
    I've never put <html lang="en"> in any of my English sites...

    Many thanks

  2. #2
    SitePoint Author

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Quote Originally Posted by spirelli
    I think that I need:
    <META http-equiv="content-type" content="text/html; charset=utf-8">
    This is only correct if you actually save your HTML files in the UTF-8 encoding. The charset in this tag is also ignored if your web server sends the encoding information in the real Content-Type HTTP header, because the header takes precedence.
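
    To see why the saved encoding matters, here is a rough sketch in Python (Python is only my choice for illustration, and the sample word is made up). It stores the same text both ways and then decodes it the way a browser would if it trusted a charset=utf-8 declaration:

```python
# Hypothetical page content: the Russian word for "Hello".
text = "Привет"

# The bytes that end up on disk under two different "save as" choices.
saved_as_utf8 = text.encode("utf-8")
saved_as_cp1251 = text.encode("windows-1251")

# Declared UTF-8, saved as UTF-8: the text round-trips.
print(saved_as_utf8.decode("utf-8"))

# Declared UTF-8, but saved as windows-1251: the declaration is wrong.
# errors="replace" stands in for a browser substituting U+FFFD
# (the replacement character) for bytes that are not valid UTF-8.
garbled = saved_as_cp1251.decode("utf-8", errors="replace")
print(garbled)
```

    The second print shows replacement characters instead of Cyrillic letters, which is roughly what visitors would get when the declaration and the file disagree.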


    Quote Originally Posted by spirelli
    Do I also need:
    <html lang="ru">
    If so, what exactly does this do?
    It declares that the natural language of the document is Russian. Yes, you should definitely use this attribute.

    Quote Originally Posted by spirelli
    I've never put <html lang="en"> in any of my English sites...
    You should! It can be used by screen readers to select the correct synthesizer library, and it is also used by search engines to classify your content so that users can search for pages in a particular language.
    Birnam wood is come to Dunsinane

  3. #3
    SitePoint Addict
    Join Date
    Mar 2004
    Posts
    260
    Many thanks.

    I've also seen:
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1251">

    What's the difference to the utf-8? Which one should I use?
    Thanks!

  4. #4
    om nom nom nom Stomme poes
    Join Date
    Aug 2007
    Location
    Netherlands
    Posts
    10,266
    With the Windows charset, any character outside the windows-1251 range that you put on your web site can end up looking like a big ugly box with four squiggles in it (that's what FF shows anyway), while Safari has a little box with a ?, other browsers just show the ?, and some show a black diamond with a ? in it. And we don't all have a Windows machine, so we don't all have whatever Windows has in its charset.
    Though as I understand it, the lower half of the Windows charset is plain ASCII, which is also a subset of UTF-8. UTF-8 just covers far more characters: Cyrillic (what you're using), Greek, and the Slavic letters with the thingies on the s's : )
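
    If you want to check the ASCII part yourself, here is a small Python sketch (the sample strings are just ones I made up). It compares the bytes each encoding produces for plain English text and for a Russian word:

```python
# Plain ASCII text: both encodings should produce the same bytes.
ascii_text = "Language settings"
ascii_same = (
    ascii_text.encode("windows-1251")
    == ascii_text.encode("utf-8")
    == ascii_text.encode("ascii")
)
print("ASCII bytes identical:", ascii_same)

# Cyrillic text: windows-1251 uses one byte per letter,
# UTF-8 uses two bytes per Cyrillic letter.
russian = "Русский"
cp1251_len = len(russian.encode("windows-1251"))
utf8_len = len(russian.encode("utf-8"))
print("windows-1251 bytes:", cp1251_len)
print("UTF-8 bytes:", utf8_len)
```

    So the English parts of a page are byte-for-byte the same either way. Only the non-ASCII characters differ, which is exactly where a wrong charset declaration bites.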

    Although, Tommy, is "ru" the right one? I might be confusing it with something else, but I thought the Russian one would be py? Or рф?

    *edit n/m, it's "ru"

    *edit2 you may also want to have a meta tag with the language as well. I've heard filthy rumours that some user agents look at the lang attribute on the html tag (like you have) but that others only look for the meta tag. To cover my butt on my sites I've had this (excuse the XHTML, legacy):
    Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="nl" lang="nl">
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <meta http-equiv="content-language" content="nl" />
        <title>Verzekering</title>
    I put it and the charset meta before the title because my titles are (usually) also in Dutch.
    As my copy of JAWS has Finnish and two sorts of English, but no Dutch, I can attest to the difficulty of listening to a language pronounced really wrong. Even when you can see and read where you are, you have trouble following : ( I actually have to translate my forms into English to check functionality : (

  5. #5
    SitePoint Author

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Quote Originally Posted by spirelli
    What's the difference to the utf-8?
    UTF-8 is an international standard encoding that lets you write any Unicode character literally. You won't need any entity references (like &hellip;) or numeric character references (like &#8230;).

    Windows-1251 is a Microsoft-specific 8-bit encoding. It covers much the same Cyrillic repertoire as the standard ISO 8859-5, but the two are not interchangeable, because they assign different byte values to the letters. It only allows a small number of characters (about 220) to be represented literally. For anything else you need to use entity references or NCRs.
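
    As a rough illustration of that last point, here is a Python sketch. The helper name encode_for_page is hypothetical, not anything a server or browser provides. It encodes text for a given charset and falls back to numeric character references for whatever the charset cannot represent:

```python
def encode_for_page(text: str, charset: str) -> bytes:
    """Encode text for a page, using NCRs for characters the charset lacks."""
    return text.encode(charset, errors="xmlcharrefreplace")


ellipsis = "\u2026"     # horizontal ellipsis
greek_alpha = "\u03b1"  # Greek small letter alpha
cyrillic_a = "\u0410"   # Cyrillic capital letter A

# UTF-8 can represent every character literally.
print(encode_for_page(ellipsis, "utf-8"))
print(encode_for_page(greek_alpha, "utf-8"))

# The 8-bit encodings need an NCR for anything outside their repertoire.
print(encode_for_page(ellipsis, "iso-8859-5"))
print(encode_for_page(greek_alpha, "windows-1251"))

# The same Cyrillic letter gets a different byte in each 8-bit encoding,
# so declaring one while saving in the other garbles the text.
print(encode_for_page(cyrillic_a, "windows-1251"))
print(encode_for_page(cyrillic_a, "iso-8859-5"))
```

    With UTF-8 the fallback never triggers, which is why you can stop thinking about entities and NCRs for ordinary text.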

    Quote Originally Posted by spirelli
    Which one should I use?
    If at all possible, use UTF-8. If not, use ISO 8859-5 rather than the proprietary Microsoft encoding.

