  1. #1
    lord (SitePoint Zealot)

    Right declaration for language ...

    Heya

    I've finished my first page, which I converted from HTML 4.01 to XHTML 1.1.
    I managed to validate it, but I have one question about the language declaration I use in my page.

    this is my code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1250" />
    .......

    As you can see, the language is set to EN (English) and the charset is windows-1250.
    I'm from Slovenia (Central Europe), and I will also use some characters that aren't used in English, like: š, ž, č

    I looked at the W3C site but didn't find anything useful. Could anybody tell me if this is OK, or would it be better to use some other declaration?

    thanx
    maxx

  2. #2
    blufive (SitePoint Guru)

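    If your content is multilingual, I believe you can use the language attribute to mark the relevant bits. For example, <p xml:lang="sl">[paragraph in Slovenian]</p>. Since your page is XHTML 1.1, the attribute to use is xml:lang, as on your html element; plain lang is the HTML 4 form.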

    In terms of character encoding, you need to know which encoding the software you use to edit the HTML files actually saves them in, because it has to match the charset you declare. If it's saving as windows-1250 (which seems likely, given that it's the main Central European character set as far as Windows is concerned), then you're probably sorted.
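
    One rough way to check that a given encoding can hold the characters you need is to round-trip them through it. The sketch below uses Python's built-in "cp1250" codec (its name for windows-1250); the helper name fits_encoding and the sample string are made up for illustration and are not from the thread.

```python
def fits_encoding(text: str, encoding: str = "cp1250") -> bool:
    """Return True if every character in text can be stored in the given encoding."""
    try:
        # Encode to bytes and decode back; a lossless round trip means the
        # encoding has a slot for each character.
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # Raised when a character has no representation in the encoding.
        return False


sample = "č š ž Č Š Ž"  # hypothetical sample of Slovenian letters
print("windows-1250:", fits_encoding(sample, "cp1250"))
print("latin-1:     ", fits_encoding(sample, "latin-1"))
```

    This only tells you what the encoding can represent. Whether your editor really saves in that encoding is still something to confirm in the editor's own settings.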

    You could also look at using character entities for unusual (to American eyes) characters. Of the three you list, only one appears to have named entities - &scaron; ("š") and &Scaron; ("Š") - see http://www.htmlhelp.com/reference/ht...s/special.html

    The other two can be accessed using the numeric form ("č" is &#269; or &#x10D;, for instance). You'll need to look stuff up, but the Unicode website (http://www.unicode.org/charts/) will help out here. You probably want "Latin Extended-A", and maybe some way of converting hex to decimal.
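
    If looking things up by hand gets tedious, a few lines of Python can do the hex-to-decimal conversion and produce the numeric references for you. This is only a sketch; the function names to_numeric_refs and to_hex_ref are made up for illustration, while ord, int and the "xmlcharrefreplace" error handler are standard Python.

```python
def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference like &#269;."""
    # The "xmlcharrefreplace" handler substitutes &#NNN; for anything
    # that cannot be encoded as plain ASCII.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_hex_ref(ch: str) -> str:
    """Return the hexadecimal reference for a single character, like &#x10D;."""
    return f"&#x{ord(ch):X};"


print(int("10D", 16))          # hex code from the Unicode chart, as decimal
print(to_numeric_refs("čšž"))  # decimal references for all three letters
print(to_hex_ref("č"))         # hexadecimal reference for one letter
```

    Either form of reference works regardless of the charset the page declares, which is why it is a handy fallback when you are unsure what your editor saves.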

