  1. #1
    SitePoint Evangelist english-test.net's Avatar
    Join Date
    Jul 2003
    Location
    Leipzig, Germany, Germany
    Posts
    438

    How to convert cyrillic letters?

    Hello, could anybody please tell me how I can use Cyrillic letters within my HTML document? How can I convert Cyrillic letters? Many thanks.
    Torsten

  2. #2
    gingham dress, army boots... silver trophy redux's Avatar
    Join Date
    Apr 2002
    Location
    Salford / Manchester / UK
    Posts
    4,838
    if i'm not mistaken, you need to either specify a character set for the page that covers the cyrillic letters, or encode said letters as numeric character references (those work whatever the charset is): http://www.w3.org/TR/REC-html40/charset.html

    but i'll admit i'm a bit hazy on this...
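
    For the "convert" part of the question, here is a minimal Python sketch of the numeric-reference route. The function name to_numeric_refs is made up for illustration; the "xmlcharrefreplace" error handler and html.unescape are standard-library features.

```python
import html


def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#1084;."""
    # "xmlcharrefreplace" is a built-in error handler for str.encode.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_numeric_refs("мне хорошо")
print(encoded)

# A browser (or html.unescape here) turns the references back into letters.
print(html.unescape(encoded))
```

    The result is plain ASCII, so it can be pasted into a page regardless of the charset the page declares.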
    re·dux (adj.): brought back; returned. used postpositively
    [latin : re-, re- + dux, leader; see duke.]
    WaSP Accessibility Task Force Member
    splintered.co.uk | photographia.co.uk | redux.deviantart.com

  3. #3
    SitePoint Evangelist english-test.net's Avatar
    Join Date
    Jul 2003
    Location
    Leipzig, Germany, Germany
    Posts
    438
    Hey Redux, Many thanks for the info.

  4. #4
    SitePoint Zealot
    Join Date
    Jan 2002
    Posts
    103
    I'm fairly certain that utf-8 also supports cyrillic letters. Anyone know what languages utf-8 doesn't support yet?

    Code:
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
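
    One thing worth keeping in mind: the meta tag only labels the page, so the file itself still has to be saved as utf-8. A small Python sketch of what that means at the byte level (the page content is an invented sample):

```python
# A sample page: the meta tag declares utf-8, so the bytes must really be utf-8.
page = (
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
    "<p>мне хорошо</p>\n"
)

data = page.encode("utf-8")  # the bytes that end up on disk or on the wire

# Each Cyrillic letter takes two bytes in utf-8; ASCII characters stay one byte.
cyrillic_bytes = "м".encode("utf-8")
print(cyrillic_bytes)
print(len(page), len(data))
```

    The byte count is larger than the character count by exactly the number of Cyrillic letters in the sample.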

  5. #5
    gingham dress, army boots... silver trophy redux's Avatar
    Join Date
    Apr 2002
    Location
    Salford / Manchester / UK
    Posts
    4,838
    CRA is right, of course.
    nice little unicode chart for cyrillic http://www.unicode.org/charts/PDF/U0400.pdf
    Edit:

    probably even more useful, as it gives the decimal code as well as the hex one: http://orwell.ru/info/cyr.htm
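
    If you'd rather not look the numbers up in a chart, the hex and decimal codes can be computed. A minimal Python sketch (code_points is a hypothetical helper name, not something from the linked pages):

```python
def code_points(text):
    """Return (character, hex code point, decimal code point) for each character."""
    return [(ch, f"U+{ord(ch):04X}", ord(ch)) for ch in text]


for ch, hex_code, dec in code_points("Жя"):
    # The decimal value is what goes into a reference like &#1046;
    print(ch, hex_code, f"&#{dec};")
```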

  6. #6
    SitePoint Evangelist S7even's Avatar
    Join Date
    Jun 2002
    Posts
    481
    The ISO encoding for Russian (Cyrillic) is iso-8859-5

    What are the advantages/disadvantages of using UTF-8 instead of the iso encodings?
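
    One practical difference can be checked directly. A rough Python sketch, with invented sample strings: iso-8859-5 is more compact for pure Russian text, but it has no room for characters from other scripts, while utf-8 handles the mix.

```python
text = "мне хорошо"
as_iso = text.encode("iso-8859-5")   # one byte per character
as_utf8 = text.encode("utf-8")       # two bytes per Cyrillic letter
print(len(as_iso), len(as_utf8))

# German and Russian in the same string.
mixed = "Grüße, мир"
try:
    mixed.encode("iso-8859-5")
    iso_ok = True
except UnicodeEncodeError:
    iso_ok = False  # "ü" and "ß" have no slot in iso-8859-5
print(iso_ok)

utf8_ok = mixed.encode("utf-8").decode("utf-8") == mixed
print(utf8_ok)
```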

  7. #7
    SitePoint Zealot
    Join Date
    Jan 2002
    Posts
    103
    Quote Originally Posted by S7even
    What are the advantages/disadvantages of using UTF-8 instead of the iso encodings?
    I like utf-8 because:

    1) Multiple languages on the same document: http://www.w3.org/TR/2000/REC-xml-20001006#sec-lang-tag

    You'd define the primary language in the html tag
    Code:
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
    but if you have other languages in your document, you can also define them as they appear.

    2) Going forward, if you're working in xhtml (which is xml), utf-8 and utf-16 are the only encodings that xml processors are required to support; everything else is optional. And the xml declaration (otherwise optional), with its encoding attribute, is required if you use an encoding other than utf-8 or utf-16
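
    Point 2 can be tried out with Python's standard xml.etree.ElementTree parser. This is only a sketch with made-up sample documents, and it assumes the parser in use happens to support iso-8859-5, which, as noted, xml processors are not required to do.

```python
import xml.etree.ElementTree as ET

# utf-8 bytes parse fine without any declaration.
utf8_doc = "<p>мне хорошо</p>".encode("utf-8")
print(ET.fromstring(utf8_doc).text)

# Another encoding has to be announced in the xml declaration.
iso_doc = (
    '<?xml version="1.0" encoding="iso-8859-5"?><p>мне хорошо</p>'
).encode("iso-8859-5")
print(ET.fromstring(iso_doc).text)

# The same bytes without the declaration are read as utf-8 and rejected.
undeclared = "<p>мне хорошо</p>".encode("iso-8859-5")
try:
    ET.fromstring(undeclared)
    parsed = True
except ET.ParseError:
    parsed = False
print(parsed)
```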
    Last edited by CRA; Aug 18, 2003 at 03:07.

  8. #8
    SitePoint Evangelist english-test.net's Avatar
    Join Date
    Jul 2003
    Location
    Leipzig, Germany, Germany
    Posts
    438
    If your general content is in English and you want to post only a few words in Cyrillic you can use a simple coding system which I can send you via email.

  9. #9
    SitePoint Evangelist S7even's Avatar
    Join Date
    Jun 2002
    Posts
    481
    Ok, so if you had a paragraph with russian you would write something like: <p xml:lang="ru"> ?

    Aren't there any disadvantages? All platforms/browsers support it?

  10. #10
    SitePoint Evangelist english-test.net's Avatar
    Join Date
    Jul 2003
    Location
    Leipzig, Germany, Germany
    Posts
    438
    мне хорошо

  11. #11
    SitePoint Zealot
    Join Date
    Jan 2002
    Posts
    103
    Quote Originally Posted by S7even
    Ok, so if you had a paragraph with russian you would write something like: <p xml:lang="ru"> ?

    Aren't there any disadvantages? All platforms/browsers support it?
    Yep. You can even do languages in sentence fragments:
    Code:
    I'd like to see a map, <span xml:lang="de">bitte</span>.
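
    To see how a tool might read those language tags, here is a minimal Python sketch using the standard html.parser module. LangCollector is an illustrative class, it assumes "en" as the document language, and it ignores details such as void elements like <br>.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect (language, text) pairs; untagged elements inherit the enclosing language."""

    def __init__(self, default_lang="en"):
        super().__init__()
        self.stack = [default_lang]  # language in effect, innermost last
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        lang = attrs.get("xml:lang") or attrs.get("lang") or self.stack[-1]
        self.stack.append(lang)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.fragments.append((self.stack[-1], data))


collector = LangCollector()
collector.feed('I\'d like to see a map, <span xml:lang="de">bitte</span>.')
collector.close()  # flushes any text still buffered by the parser
print(collector.fragments)
```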
    Off Topic:

    And if you have a website on a popular topic, defining the language can help your Google rankings, since language is one of the popular ways people narrow down their search results.


    Should work across platforms. Browsers are supposed to respect the character encoding specified in the document; only when none is specified do they fall back to the user preference or the default (which is usually either mac encoding, windows encoding, or iso-8859-1). And I know netscape 4+, windows ie 4+ and the modern browsers support utf-8.
    Code:
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
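
    The fallback order described above can be sketched in a few lines of Python. This is a simplification for illustration only: pick_encoding and its parameters are invented names, and real browsers also look at things like the HTTP headers.

```python
import re

# Looks for charset=... near the top of the raw page bytes.
META_CHARSET = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE)


def pick_encoding(raw, user_preference=None, default="iso-8859-1"):
    """Document declaration first, then the user preference, then the default."""
    match = META_CHARSET.search(raw[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return user_preference or default


raw = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
print(pick_encoding(raw))
print(pick_encoding(b"<p>no declaration here</p>", user_preference="koi8-r"))
print(pick_encoding(b"<p>no declaration here</p>"))
```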
    Now, where utf-8 support is really gonna degrade is with the fonts your visitors will have installed in their systems.

    A sampler page to see if your browsers' fonts support multiple languages: http://www.columbia.edu/kermit/utf8.html

    And a page on browser/fonts: http://register.consilium.eu.int/utf...lp/htechEN.htm

