  1. #1
    SitePoint Zealot
    Join Date
    Jul 2005
    Location
    Osoyoos BC Canada
    Posts
    178
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Adding Japanese text in a page of English.

    I've been asked to include some Japanese characters in a page of a site I wrote a while back using charset=ISO-8859-1.

    It isn't possible/practical to rewrite the site using another charset (UTF-8 / 16 /whatever) just to be able to include a couple of Japanese characters, so can I change charsets for just a <span> or is there a better way to achieve this?

    I've thought of creating a graphic of the characters and just plonking that in the page at the appropriate place, (I realize it wouldn't 'scale' with control+ and -) but would rather do it properly!

    Thanks for any input.

  2. #2
    Dan Schulz (In memoriam)
    Join Date
    May 2006
    Location
    Aurora, Illinois
    Posts
    15,495
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No, you can't, since the character encoding applies to the entire page. Now, why isn't it possible/practical to re-save the page as UTF-8?
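A side note on the original question: although the charset cannot change for a single <span>, HTML numeric character references are written in plain ASCII and refer to Unicode code points regardless of the page's declared encoding, so a handful of Japanese characters can be placed in an ISO-8859-1 page that way. A minimal Python sketch of generating them (the helper name `to_ncr` is made up for illustration and is not from the thread):

```python
def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    # "xmlcharrefreplace" is a standard Python error handler that emits &#NNNN;
    # for any character the target encoding (here ASCII) cannot represent.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


snippet = to_ncr("<p>Japanese: 日本語</p>")
print(snippet)

# The result is pure ASCII, so it can be saved in an ISO-8859-1 file unchanged.
snippet.encode("iso-8859-1")
```

This is practical for a few characters; for whole paragraphs of Japanese, re-saving the page as UTF-8 is usually easier to maintain.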

  3. #3
    SitePoint Zealot
    Thanks Dan,

    I'm afraid this is an area where I have to admit my ignorance.

    The Japanese characters I was sent to insert are in a Word document and seem to be using charset=Shift_JIS. I'm totally at a loss as to how to map these to UTF-8. (I use Open Office and don't actually have a copy of Word.) If I open the doc in an actual copy of Word, does Word have an option to save as a different charset?

    I could switch the charset for the one page that needs to include these characters to Shift_JIS, but I presume(?) that will mess up the English characters.

    I think that changing the charset from ISO-8859-1 to UTF-8 will keep the English text intact since, as far as I can see, the basic ASCII character set is mapped the same in both. So, thinking about it more, my problem is converting Shift_JIS to UTF-8. I haven't a clue what the Japanese characters mean, so I might end up writing something totally different!
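The two points in the post above can be checked with a short Python sketch. It assumes the source bytes really are Shift_JIS (the sample bytes are generated here for illustration rather than taken from the actual Word document). Converting between encodings changes only the bytes, not the characters, so the text does not turn into something different.

```python
# Bytes as they would appear in a Shift_JIS-encoded file (here: the word 日本語).
sjis_bytes = "日本語".encode("shift_jis")

# Convert by decoding with the source encoding, then encoding to the target.
text = sjis_bytes.decode("shift_jis")
utf8_bytes = text.encode("utf-8")

# ASCII text is byte-for-byte identical in ISO-8859-1 and UTF-8 ...
ascii_same = "Hello".encode("iso-8859-1") == "Hello".encode("utf-8")

# ... but accented Latin-1 characters such as é are encoded differently,
# so a file containing them has to be re-saved, not just relabelled.
latin1_differs = "é".encode("iso-8859-1") != "é".encode("utf-8")

print(ascii_same, latin1_differs)
```

If the existing pages contain only unaccented English text, changing the declared charset to UTF-8 is enough; any accented characters would need the file itself converted.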

    Am I on the 'right' track?

  4. #4
    Dan Schulz
    From what I can tell you are on the right track. One thing you could probably do is use UTF-8 for the character encoding and then write the page in Japanese, then use the lang attribute for those sections of the page that are in English (or the other way around). I'm not sure if it would work in your particular case though.

    Only one way to find out, of course.

  5. #5
    SitePoint Zealot
    "Only one way to find out, of course."

    It's on my 'to do' list for tomorrow; I'll post my thoughts after the experiment(s).

    Mahalo!

  6. #6
    SitePoint Zealot
    Worked..

    Changed the charset to utf-8 and couldn't see any corrupted characters in the existing site.

    First, I tried to cut/paste the Japanese characters from the Word doc into my text editor (Notepad++), which just produced question marks. (I haven't yet found a switch to tell Notepad++ to use UTF-8 as its default character set, but presume there is one somewhere?)
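The question marks are what one would expect if the editor was holding the document in a single-byte Western encoding. A small Python sketch of the effect, assuming the "ANSI" default corresponds to Windows-1252 (which is typical on Western-language Windows installs, but is an assumption here):

```python
japanese = "日本語"

# Windows-1252 has no code points for Japanese characters, so each one
# is substituted with "?" when the text is forced into that encoding.
as_ansi = japanese.encode("cp1252", errors="replace")
print(as_ansi)  # b'???'

# UTF-8 can represent every Unicode character, so the round trip is lossless.
round_trip = japanese.encode("utf-8").decode("utf-8")
print(round_trip == japanese)
```

Once the characters have been replaced this way the original text cannot be recovered from the file, so it has to be pasted again after switching the editor to UTF-8.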

    Then I tried in Dreamweaver, and the Japanese characters showed up.

    Uploaded as a new include file and it works.

    http://www.issw2008.com/index.php?p=japanese

    I've checked the result in Firefox/Opera/IE on Windows and Safari on a Mac, and all looks good so far.



    Thanks!

  7. #7
    SitePoint Zealot
    the lang attribute.

    The site doesn't have a lang attribute, but googling/reading about it I see that it (perhaps, there seems to be much dissent) helps user agents, search engines and AU text readers.

    Again I'm finding the actual 'implementation' of this hard to get firm facts about. The site is XHTML, and it appears that I can set the lang of the whole site by adding to the <html xmlns="http://www.w3.org/1999/xhtml"> tag thus:

    <html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">


    Now the whole site is defined as being in English, so I can indicate that the Japanese 'bits' are different thus (note that the language code for Japanese is "ja"; "jp" is the country code for Japan):


    <p lang="ja" xml:lang="ja">日本語</p>

    or

    <p>Japanese <span lang="ja" xml:lang="ja">日本語</span>.</p> (If 'within' a containing element, such as this <p></p>)

    or

    <div lang="ja" xml:lang="ja">
    <p>日本語</p>
    <p>日本語</p>
    </div> (if more than one consecutive Japanese element)

    I haven't tried to add this yet, but please let me know if I have got the right 'idea'.

    Thanks.

    (ps. Are there any speech readers out there that can read Japanese out loud...?)
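One detail worth checking when adding these attributes: the language subtag for Japanese is `ja`, while `jp` is the country code for Japan and is not a valid language subtag. The following is a rough Python sketch of a checker that collects `lang` / `xml:lang` values from markup and flags any that are not in a known list. The class and function names, and the tiny `KNOWN_LANGUAGES` set, are invented for illustration; a real check would consult the full IANA language subtag registry.

```python
from html.parser import HTMLParser

# Small illustrative subset of language codes; NOT a complete list.
KNOWN_LANGUAGES = {"en", "ja", "fr", "de"}


class LangCollector(HTMLParser):
    """Collects (tag, attribute, value) for every lang / xml:lang attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("lang", "xml:lang"):
                self.found.append((tag, name, value))


def suspicious_lang_values(markup):
    """Return the lang declarations whose value is not in KNOWN_LANGUAGES."""
    collector = LangCollector()
    collector.feed(markup)
    return [item for item in collector.found if item[2] not in KNOWN_LANGUAGES]


print(suspicious_lang_values('<p lang="jp" xml:lang="jp">日本語</p>'))
print(suspicious_lang_values('<p lang="ja" xml:lang="ja">日本語</p>'))
```

The first call reports both attributes on the `<p>` element; the second returns an empty list.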

  8. #8
    Dan Schulz
    Try this:

    HTML Code:
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
    Then declare the languages as needed in the copy.

  9. #9
    SitePoint Zealot
    Isn't that exactly what I said above?

  10. #10
    Dan Schulz
    I swapped the order around. If I recall correctly (and I could be wrong here since it's been a while since I read the spec) the order the attributes appear in is VERY important.

  11. #11
    SitePoint Zealot
    'My' order didn't seem to cause problems and seemed to work in Firefox/IE/Safari/Opera on Windows and Safari on a Mac, and the validator at the W3C didn't throw a wobbly; it declared the page(s) to be valid XHTML.

    Just in case, I've revised my attribute order to match yours. Looking at loads of sites, they all seem to use 'your' order, but I haven't found 'chapter and verse' as to why one is better than the other. Presumably a spec doc somewhere defines the parser as looking for the namespace declaration as soon as it enters the html element.

    Anyway, thanks for your help!

  12. #12
    Dan Schulz
    Is it working now? (From what you said, it seems like it.)

  13. #13
    SitePoint Zealot
    Yes, Dan,

    Thank you it all seems to be working.

    And I've also found that Notepad++ uses ANSI encoding as the default for a new document, but by going to Settings>New Document> one can set the default to UTF-8 without BOM or to UTF-8.

    I set it to UTF-8 without BOM (since I recall that Byte Order Marks can trip up some browsers) and can now cut and paste new Japanese characters from a Word document into a Notepad++ document.
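For anyone wondering what the "with BOM" and "without BOM" options actually change in the file, here is a short Python sketch. It uses Python's `utf-8-sig` codec as a stand-in for the editor's "UTF-8 with BOM" setting, which is an assumption about how the editor behaves rather than something confirmed in this thread.

```python
import codecs

text = "日本語"

without_bom = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")  # prepends the 3-byte UTF-8 signature

# The only difference is the three leading bytes EF BB BF.
print(with_bom[:3] == codecs.BOM_UTF8)
print(with_bom[3:] == without_bom)

# A decoder that is not BOM-aware keeps the signature as a U+FEFF character,
# which is how stray output can end up at the very top of a page.
naive = with_bom.decode("utf-8")
aware = with_bom.decode("utf-8-sig")
print(repr(naive[0]), aware == text)
```

Since this page is assembled from include files, a signature at the start of an included file could land in the middle of the output, which is a commonly cited reason to prefer the "without BOM" setting.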

    Got there...

  14. #14
    Dan Schulz
    Cool. I use Edit+ myself.

  15. #15
    SitePoint Author
    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,159
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by Dan Schulz:
    "If I recall correctly (and I could be wrong here since it's been a while since I read the spec) the order the attributes appear in is VERY important."
    The order of the attributes has no importance whatsoever.
    Birnam wood is come to Dunsinane
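The point that attribute order has no significance can be checked with any conforming XML parser. A minimal Python sketch using the standard library's ElementTree, with the two `<html>` start tags from this thread written as empty elements so that each parses on its own:

```python
import xml.etree.ElementTree as ET

# Same element, attributes in the two orders discussed in the thread.
first = ET.fromstring(
    '<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"/>'
)
second = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"/>'
)

# The parser resolves both to the same namespaced tag and the same attributes.
same = first.tag == second.tag and first.attrib == second.attrib
print(first.tag)
print(same)
```

Both forms resolve to the same element in the XHTML namespace with identical attributes, which matches the validator accepting either order.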

  16. #16
    Dan Schulz
    Hence the disclaimer, boss.

