  1. #26
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Something is improperly trying to encode the input into HTML entities:

    HTML Code:
    <div class="top"> 
    <b><a href="mailto:-">&aelig;&micro;&egrave;&macr;</a></b> posted this message on: 07-05-2009. 
     </div> 
    <div class="mid"> 
    "&aelig;&micro;&egrave;&macr;�" 
    </div> 
    
    <div class="bodem"> 
    <a href="http://" target="_blank"></a> 
    </div> 
    <div class="top"> 
    <b><a href="mailto:-">Bulevardi</a></b> posted this message on: 07-05-2009. 
     </div> 
    <div class="mid"> 
    "This is a test in Charset UTF-8.
    <br>
    <br>&Atilde;&copy; &Atilde;&nbsp; &Atilde;&para;" 
    </div> 
    <div class="bodem"> 
    <a href="" target="_blank"></a> 
    
    </div> 
    Did you leave in a htmlentities() or something?

    The ISO-8859-1 guestbook does the same (needlessly).

    Edit:

    Can you post the code?

  2. #27
    SitePoint Addict
    Join Date
    Dec 2008
    Location
    Brussels
    Posts
    377
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by sk89q View Post
    Something is improperly trying to encode the input into HTML entities: Did you leave in a htmlentities() or something?

    The ISO-8859-1 guestbook does the same (needlessly).
    To get it from the form into the database:
    PHP Code:
    $bericht = $_POST['bericht'];
    $bericht = strip_tags($bericht);
    $bericht = htmlentities($bericht);

    $email = $_POST['email'];
    $email = strip_tags($email);
    $email = htmlentities($email);

    $website = $_POST['website'];
    $website = strip_tags($website);
    $website = htmlentities($website);

    $naam = $_POST['naam'];
    $naam = strip_tags($naam);
    $naam = htmlentities($naam);
    $bericht = str_replace("\n", "<br>", $bericht);
    To show the guestbook out of the database:
    PHP Code:
    echo "<div class=\"top\"> \r\n";
    echo "<b><a href=\"mailto:" . $email . "\">" . $naam . "</a></b> posted this message on: " . $datum . ". \r\n ";
    echo "</div> \r\n";

    echo "<div class=\"mid\"> \r\n";
    echo "\"" . $bericht . "\" \r\n";
    echo "</div> \r\n";

    echo "<div class=\"bodem\"> \r\n";
    echo "<a href=\"" . $website . "\" target=\"_blank\">" . $www . "</a> \r\n";
    echo "</div> \r\n";
    I'm probably making lots of mistakes... I'm a beginner in PHP and my code isn't polished yet. I don't do this professionally, just as a hobby.

  3. #28
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Those htmlentities() calls are causing the problem. htmlentities() assumes that the string is encoded in ISO-8859-1, so it treats every byte as a separate character and converts the ones it recognizes to HTML entities, which utterly destroys the string. It's cutting UTF-8 characters in half (or into smaller parts) and replacing the pieces with HTML entities. You don't actually need htmlentities(). You can just use htmlspecialchars(), because you do not need to encode every possible character (htmlspecialchars() just escapes <, >, &, and " by default).
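To illustrate the explanation above, here is a minimal sketch (not the guestbook's actual code). It assumes the input is UTF-8 and passes 'ISO-8859-1' explicitly to reproduce the behaviour described in this thread; the variable names are made up for the example. The character "é" is written as its two UTF-8 bytes so the result does not depend on how the script file itself is saved.

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// Treating those bytes as ISO-8859-1 turns each byte into its own entity:
// 0xC3 is "Ã" and 0xA9 is "©" in ISO-8859-1.
$broken = htmlentities($utf8, ENT_COMPAT, 'ISO-8859-1');

// Naming the real encoding leaves the character intact and only escapes
// the characters that are significant in HTML.
$safe = htmlspecialchars('<b>' . $utf8 . '</b>', ENT_QUOTES, 'UTF-8');

echo $broken, "\n"; // &Atilde;&copy;
echo $safe, "\n";   // &lt;b&gt;é&lt;/b&gt;
```

The first result matches the "&Atilde;&copy;" seen in the guestbook output earlier in the thread. Passing the charset explicitly to both functions avoids depending on the PHP version's default.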

  4. #29
    SitePoint Addict
    Join Date
    Dec 2008
    Location
    Brussels
    Posts
    377
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hmm, I see.
    But I put those htmlentities() calls into the code for a reason: I remember that some characters weren't being displayed as they should.
    Anyway, it could be that they're no longer needed in the script... but now that it finally works I'm not going to change it on that site.

  5. #30
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    NONASCII characters on an ISO-8859-1 form
    ~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~

    From: Michel Merlin
    Posted on http://bulevardi.be/gbISO88591/guestbook.php
    (HTTP Headers:
    Content-Type: text/html
    Content-Type: text/html;charset=ISO-8859-1)
    (Source, brackets removed:
    meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1" /)
    ---

    Following bulevardi's Post www.sitepoint.com/forums/showthread.php?t=613859#post4247403 of Thu 07 May 2009 19:25 GMT:

    « Hélène garçons œuvre 1€ ± 5‰ »

    - Please make a reply in UTF-8, another in ISO-8859-1
    - If possible one pair in OE, one in TB
    - Leave the present Parent Message reproduced as usual
    - Write (outside the Parent Message field; by typing, NOT by copy-pasting) a character string, short and simple, but with EACs (European Accentuated Characters)
    - Repeat the same « Hélène garçons œuvre 1€ ± 5‰ » inside the Subject of your email reply
    - Add yourself in "Cc:"

    Versailles, Fri 8 May 2009 01:16:45 +0200
    -> Posted:

    Michel Merlin posted this message on: 08-05-2009.
    "NONASCII characters on an ISO-8859-1 form
    ~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~
    From: Michel Merlin
    Posted on http://bulevardi.be/gbISO88591/guestbook.php
    (HTTP Headers:
    Content-Type: text/html
    Content-Type: text/html;charset=ISO-8859-1)
    (Source, brackets removed:
    meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1" /)
    ---

    Following bulevardi's Post www.sitepoint.com/forums/showthread.php?t=613859#post4247403 of Thu 07 May 2009 19:25 GMT:

    « Hélène garçons œuvre 1€ ± 5‰ »

    - Please make a reply in UTF-8, another in ISO-8859-1
    - If possible one pair in OE, one in TB
    - Leave the present Parent Message reproduced as usual
    - Write (outside the Parent Message field; by typing, NOT by copy-pasting) a character string, short and simple, but with EACs (European Accentuated Characters)
    - Repeat the same « Hélène garçons œuvre 1€ ± 5‰ » inside the Subject of your email reply
    - Add yourself in "Cc:"

    Versailles, Fri 8 May 2009 01:16:45 +0200"
    Copied on SitePoint 01:32:25

  6. #31
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    NONASCII characters on an UTF-8 form
    ~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-
    From: Michel Merlin <michel.merlin@laposte.net>
    Posted on http://bulevardi.be/gbUTF8/guestbook.php
    (HTTP Headers:
    Content-Type: text/html
    Content-Type: text/html;charset=utf-8)
    (Source, brackets removed:
    meta http-equiv="Content-Type" content="text/html;charset=utf-8" /)
    ---

    Following bulevardi's Post www.sitepoint.com/forums/showthread.php?t=613859#post4247403 of Thu 07 May 2009 19:25 GMT:

    « Hélène garçons œuvre 1€ ± 5‰ »

    - Please make a reply in UTF-8, another in ISO-8859-1
    - If possible one pair in OE, one in TB
    - Leave the present Parent Message reproduced as usual
    - Write (outside the Parent Message field; by typing, NOT by copy-pasting) a character string, short and simple, but with EACs (European Accentuated Characters)
    - Repeat the same « Hélène garçons œuvre 1€ ± 5‰ » inside the Subject of your email reply
    - Add yourself in "Cc:"

    Versailles, Fri 8 May 2009 01:18:50 +0200
    -> Posted:

    Michel Merlin posted this message on: 08-05-2009.
    "NONASCII characters on an UTF-8 form
    ~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-
    From: Michel Merlin
    Posted on http://bulevardi.be/gbUTF8/guestbook.php
    (HTTP Headers:
    Content-Type: text/html
    Content-Type: text/html;charset=utf-8)
    (Source, brackets removed:
    meta http-equiv="Content-Type" content="text/html;charset=utf-8" /)
    ---

    Following bulevardi's Post www.sitepoint.com/forums/showthread.php?t=613859#post4247403 of Thu 07 May 2009 19:25 GMT:

    « Hélène garçons Å?uvre 1â?¬ ± 5â?° »

    - Please make a reply in UTF-8, another in ISO-8859-1
    - If possible one pair in OE, one in TB
    - Leave the present Parent Message reproduced as usual
    - Write (outside the Parent Message field; by typing, NOT by copy-pasting) a character string, short and simple, but with EACs (European Accentuated Characters)
    - Repeat the same « Hélène garçons Å?uvre 1â?¬ ± 5â?° » inside the Subject of your email reply
    - Add yourself in "Cc:"

    Versailles, Fri 8 May 2009 01:18:50 +0200"
    Copied on SitePoint 01:33:10

  7. #32
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Both tests above (copied here Thu 07 May 2009 23:32 and 23:33 GMT) were posted in forms using IE6 and OE6 set for ISO-8859-1, as detailed in "For Long URLs, Accentuated Chars, encode as Quoted-Printable, Western European (ISO), use "EUR" for Euro symbol".

    I will redo the tests with the settings for UTF-8.

    Versailles, Fri 8 May 2009 01:43:45 +0200

  8. #33
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Link to these forms?

  9. #34
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by sk89q View Post
    Link to these forms?
    They are in bulevardi's post of Thu 07 May 19:25 GMT, and repeated in the header of each respective test.

    Versailles, Fri 8 May 2009 02:26:35 +0200

  10. #35
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I didn't get an email. :/

  11. #36
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Here's a test form:
    http://sk89q.therisenrealm.com/testground/utf8email/

    Needed key: sptest

  12. #37
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    sk89q's charsetless form and UTF-8 email reply conveyed NONASCII sans corruption
    Quote Originally Posted by sk89q View Post
    Here's a test form:
    http://sk89q.therisenrealm.com/testground/utf8email/

    Needed key: sptest
    Thanks. I posted this, prepared with OE6 set for ISO-8859-1:
    Thx "sk89q" for your www.sitepoint.com/forums/showthread.php?t=613859&page=2#post4248360
    OE6 ISO-8859-1 michel.merlin@laposte.net
    Posted on http://sk89q.therisenrealm.com/testground/utf8email
    (HTTP Headers: none for "Content-Type")
    (Source: none for "UTF" or "CHARSET"or "ISO")

    « Hélène garçons œuvre 1€ ± 5‰ »
    Příšerně žluťoučký kůň úpěl ďábelské ódy

    Fri 8 May 2009 21:09:50 +0200
    and this, prepared with TB 2.0.0.21 (charset auto, probably UTF-8):
    Thx "sk89q" for your www.sitepoint.com/forums/showthread.php?t=613859&page=2#post4248360
    TB 2.0.0.21 auto (should send in UTF-8) michel.merlin@laposte.net

    Posted on http://sk89q.therisenrealm.com/testground/utf8email
    (HTTP Headers: none for "Content-Type")
    (Source: none for "UTF" or "CHARSET"or "ISO")

    « Hélène garçons œuvre 1€ ± 5‰ »
    Příšerně žluťoučký kůň úpěl ďábelské ódy

    Fri 8 May 2009 21:12:30 +0200
    The two UTF-8 email replies I received reproduced the sample (« Hélène garçons œuvre 1€ ± 5‰ », Příšerně žluťoučký kůň úpěl ďábelské ódy) without error.

    PS. To make sure we are all talking about the same appearance on our screens (regardless of the fonts installed, the charsets selected, etc.), here is how the sample above should appear:
    Versailles, Fri 8 May 2009 21:38:25 +0200, edited (PS) Sat 09 May 01:15:00
    Attached Images
    Last edited by Michel Merlin; May 8, 2009 at 15:15.

  13. #38
    SitePoint Member
    Join Date
    May 2009
    Posts
    14
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Any time you modify the string, odds are you're breaking the encoding if you're not using the mb_* (multibyte string) functions.
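A small sketch of what that looks like in practice, assuming UTF-8 input and that the mbstring extension is installed; the sample word and variable names are only for illustration. The byte-oriented functions count and cut bytes, while the mb_* versions work on whole characters.

```php
<?php
// "Hélène" as UTF-8 bytes: é = 0xC3 0xA9, è = 0xC3 0xA8.
$name = "H\xC3\xA9l\xC3\xA8ne";

$byteLength = strlen($name);             // counts bytes
$charLength = mb_strlen($name, 'UTF-8'); // counts characters

// substr() cuts after the second byte, i.e. in the middle of "é".
$cutBytes = substr($name, 0, 2);
// mb_substr() cuts after the second character.
$cutChars = mb_substr($name, 0, 2, 'UTF-8');

echo $byteLength, " bytes, ", $charLength, " characters\n"; // 8 bytes, 6 characters
var_dump(mb_check_encoding($cutBytes, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($cutChars, 'UTF-8')); // bool(true)
```

Functions that only look for ASCII bytes (for example replacing "\n" with "<br>") do not split UTF-8 characters, so not every string operation needs an mb_* replacement.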

