  1. #1
    SitePoint Member
    Join Date
    Apr 2009
    Location
    S E Asia
    Posts
    12
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Czech character display problem

    I have a frustrating display problem concerning Czech characters.

    Users' notes are saved to a MySQL database in UTF-8. The Czech characters are stored in the correct format in the database.

    They display correctly on one page, but on another page, which uses exactly the same code both to retrieve them from the database and to display them in a textarea field, they do not display correctly.

    Other Czech characters on the page outside the textarea field on both pages do display correctly, leading me to believe the page encoding is correct. It is only the characters inside the textarea field that display incorrectly.

    I can give urls but you would need to register to see the pages in question.

    site url: Acupuncture For The Mind - Akupunkturapromysl

    Register (free) then in the user home page (Acupuncture For The Mind - Course Control Center) scroll to the bottom and you will see the notes field. This displays Czech characters as it should.

    If you go to module one (Acupuncture For The Mind - Akupunkturapromysl) and scroll down the notes field at the bottom of the page does not display Czech characters correctly.

    Any help or suggestions gratefully received before what little hair I do have left is pulled out!

  2. #2
    Cups (SitePoint Wizard)
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    I do not know the answer, but what I would try is this:

    Echo the text onto the page outside of the <textarea>. Does that still display bad characters?

    That would indicate whether the problem lies with the textarea (or the code that generates the textarea).

    Then echo the offending text on the first (comparison) page you mentioned, which should hopefully prove that the database handling code is not at fault.

  3. #3
    SitePoint Member
    Cups thanks for the suggestion. Here's what happened:

    echo the characters after retrieving from the database = same error

    copy the actual characters and echo anywhere = no problem - so the Czech characters display correctly on all pages

    retrieving from the database causes the characters to display incorrectly on the one page, but not on the other

    checked the data in the database and it is saved and displays correctly there as Czech characters

    both pages encoded as utf-8 and confirmed in the http headers

    both pages use identical code to retrieve and display the data

    any ideas?

  4. #4
    Cups (SitePoint Wizard)
    You said originally:

    "Other Czech characters on the page outside the textarea field on both pages do display correctly, leading me to believe the page encoding is correct. It is only the characters inside the textarea field that display incorrectly."

    This suggested that it was only the text inside the textarea that was displaying incorrectly.

    I suggested:

    "echo the text onto the page outside of the <textarea> - does that still display bad characters?"

    I meant on that same page: leave off the textarea and display the offending text with the good text around it.

    Are you saying you did this and you have a mixture of good and bad text on the same page, as straight html?

    Where is the text for the textarea originally generated from?

    I recall having an issue like yours, where users could upload a text file (for a translation) that was not UTF-8, and so the text was tainted all the way through its life cycle until I displayed it.

    Edit:

    I suspect the same applies if they copy from a text file which is not UTF-8 and paste into an input screen. You may have to explicitly set the encoding to UTF-8 prior to saving.
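
    As a rough, language-neutral illustration of "set the encoding prior to saving", here is a minimal Python sketch (not the site's actual code). The function name to_utf8_text and the cp1250 fallback are assumptions for the example; Windows-1250 is simply a common legacy encoding for Czech text. It accepts the input if it is already valid UTF-8 and otherwise transcodes it from the assumed legacy encoding.

```python
def to_utf8_text(raw: bytes, fallback: str = "cp1250") -> str:
    """Return text decoded from raw bytes.

    Tries UTF-8 first; if the bytes are not valid UTF-8, assumes they
    are in the `fallback` encoding (hypothetical default: cp1250).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # The input was not UTF-8, so reinterpret it using the fallback.
        return raw.decode(fallback)


sample = "Příliš žluťoučký kůň"

# Input that arrives as UTF-8 is accepted unchanged.
from_utf8 = to_utf8_text(sample.encode("utf-8"))

# Input pasted or uploaded as Windows-1250 is transcoded before saving.
from_cp1250 = to_utf8_text(sample.encode("cp1250"))

# This is what would then be written to the database.
bytes_to_store = from_cp1250.encode("utf-8")
```

    Trying UTF-8 first is only a heuristic: a strict decode failure is a reliable sign the input is not UTF-8, but the fallback encoding still has to be guessed or supplied by the user.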

  5. #5
    Cups (SitePoint Wizard)
    Here are some quality i18n and l10n links which might help you turn up the answer.

    Character Sets / Character Encoding Issues [Web Application Component Toolkit]
    PHP charset/encoding FAQ - Kore Nordmann - PHP / Projects / Politics
    Charset vs. Encoding - Kore Nordmann - PHP / Projects / Politics
    Internationalisation Gotchas - Internationalisation Tips

    The best advice, though, is to enforce UTF-8 at every single step of the way, from where the text is generated (not from where it is copied, as mentioned previously) through to the server encoding, database I/O, HTML and browser.
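
    To show why every step matters, here is a small Python sketch (an illustration only, not a diagnosis of this particular site). It assumes, hypothetically, that one step in the chain reads the stored UTF-8 bytes as Latin-1, which is one way a mismatched database connection character set could behave. The stored data is identical in both cases; only the interpretation differs.

```python
text = "Příliš žluťoučký kůň"

# The bytes as they would be stored when the data is saved as UTF-8.
stored = text.encode("utf-8")

# A step that treats the bytes as UTF-8 recovers the original text.
good_page = stored.decode("utf-8")

# A step that wrongly treats the same bytes as Latin-1 (assumed here)
# turns each two-byte Czech character into two unrelated characters.
bad_page = stored.decode("latin-1")

# The damage is reversible as long as nothing was lost in between:
# re-encode with the wrong charset, then decode with the right one.
repaired = bad_page.encode("latin-1").decode("utf-8")

print(good_page == text)           # the correctly handled path matches
print(bad_page == text)            # the mismatched path does not
print(len(text), len(bad_page))    # the garbled string is longer
```

    If two pages run the same retrieval code but one shows garbled text, comparing what each page sets before querying (for example a SET NAMES statement or an equivalent connection charset call) would be one thing to check under this assumption.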

  6. #6
    SitePoint Member
    Cups hi and thanks for your help.

    When the data is pulled from the database it displays badly inside or outside the form textarea field.

    If I copy the same Czech characters and echo them straight to the page they display fine. It is pulling them from the database that precipitates the bad display.

    The Czech characters display and are therefore saved correctly inside the database and they also display inside another page correctly when pulled from the database.

    So on one page all is fine. On another with exactly the same code, it is not! That is the puzzle.

    Thanks for the links, going through them now.

