  1. #1
    SitePoint Wizard
    Join Date
    Mar 2008
    Location
    United Kingdom
    Posts
    1,285
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Weird chars on top of my site (ï»¿)

    Hi,

    I'm getting these strange characters at the top of one page on my site: ï»¿ ... </head><body>ï»¿

    Now I think I know why. I'm sure it's related to the £ symbols I'm taking from the db.

    Browser: Firefox 3.
    Charset = UTF-8

    It only happens when I first visit the page (or after a few hours of not going to it): it displays the ugly chars at the top and a literal £ in the source code, where it should be &#163;. Every other time I visit thereafter, it has &#163; in the source code, which is the correct value as taken from the db.

    Can anyone help me with this?



    The only other solution I can think of is to do a str_replace on-the-fly, but not sure if this would work heh.


    Really stuck on this annoying issue.


    Many thanks for any ideas.

  2. #2
    SitePoint Evangelist
    Join Date
    Aug 2007
    Posts
    566
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It is an encoding problem.
    ï»¿ is known as the BOM (Byte Order Mark), and describes the physical encoding of the file.
    http://unicode.org/faq/utf_bom.html#BOM
    Are you sure that you serve the page with a utf-8 charset, and not latin or, worse, ascii?

    I've found a few threads about this on Google, generally related to the charset not being set to utf-8:
    http://www.google.com/search?hl=en&l...BF&btnG=Search
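To make the BOM a bit more concrete: in UTF-8 it is the three bytes EF BB BF at the very start of the file, and those bytes show up as ï»¿ when they are read as ISO-8859-1. A minimal sketch for checking whether a string starts with it (the helper name `has_utf8_bom` is made up for illustration, it is not a built-in PHP function):

```php
<?php
// The UTF-8 BOM is the three bytes EF BB BF.
// Read as ISO-8859-1 they are displayed as "ï»¿".
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if $text begins with the UTF-8 BOM.
function has_utf8_bom(string $text): bool
{
    // strncmp() is binary safe and compares only the first 3 bytes.
    return strncmp($text, UTF8_BOM, 3) === 0;
}

var_dump(has_utf8_bom(UTF8_BOM . "<html>")); // bool(true)
var_dump(has_utf8_bom("<html>"));            // bool(false)
```

You could feed it the result of file_get_contents() on a suspect source file to see whether the editor added a BOM.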

  3. #3
    felgall (Programming Since 1978)
    Join Date
    Sep 2005
    Location
    Sydney, NSW, Australia
    Posts
    16,836
    Mentioned
    25 Post(s)
    Tagged
    1 Thread(s)
    The fix for where the BOM appears at the start of the file is to change the charset you are using with the page to UTF-8. When set to the correct charset the BOM will not appear.
    Stephen J Chapman


  4. #4
    SitePoint Wizard
    Join Date
    Mar 2008
    Location
    United Kingdom
    Posts
    1,285
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the speedy replies guys! OK.

    If it helps, I can PM you a link to the incriminating page to give you an idea what is happening.

    My PHP file is set up in the following way:

    PHP Code:
    <?

    // db connection

    // set page title, description variables

    // include header file, which has charset of utf-8 set

    // display db data

    // include footer

    ?>
    So should I also put a charset of UTF-8 in the main PHP file (before the header include statement), as well as in the header file?

    I will first try saving both files with UTF-8 encoding and see what happens. I had read this could be an issue.

    Will let you know.....post haste!

  5. #5
    SitePoint Author

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Your server is probably sending an HTTP header like this:
    Code:
    Content-Type: text/html; charset=iso-8859-1
    You can override that in your PHP code:
    Code PHP:
    <?php header('Content-Type: text/html; charset=utf-8'); ?>
    (Note that this must come before you write anything to the response stream. It might be a good idea to have it at the top of your PHP code.)
    Birnam wood is come to Dunsinane

  6. #6
    SitePoint Enthusiast
    Join Date
    Aug 2008
    Posts
    84
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    The header() call should be placed on the first line of the PHP file.

  7. #7
    SitePoint Author

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, that's a good place for it. As long as it comes before anything is written to the output stream (e.g., an echo() statement or HTML markup outside <?php ... ?>) it's fine. Putting it at the very top guarantees that, although the BOM might cause a problem for you. If at all possible, try saving the files as UTF-8 without a BOM. The BOM is unnecessary for UTF-8 anyway.
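If you want to see whether something has already been written before the header call, PHP's headers_sent() can report the file and line where output started. A rough sketch, assuming a made-up wrapper called `send_utf8_header` (not part of PHP, just for illustration):

```php
<?php
// Hypothetical helper: send the charset header only if output has not started.
// Returns true if the header call was made, false if it was too late.
function send_utf8_header(): bool
{
    // headers_sent() fills $file and $line with the place output began.
    if (headers_sent($file, $line)) {
        // Something (possibly a BOM before "<?php") was already output.
        error_log("Output already started at $file:$line");
        return false;
    }
    header('Content-Type: text/html; charset=utf-8');
    return true;
}

send_utf8_header();
echo '<p>Price: &#163;10</p>';
```

If the logged location is line 1 of a file that visibly starts with <?php, an invisible BOM in front of the opening tag is a likely suspect.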

  8. #8
    SitePoint Wizard
    Join Date
    Mar 2008
    Location
    United Kingdom
    Posts
    1,285
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    OK, in my PHP file, the first couple of lines are:

    PHP Code:
    <?php header('Content-Type: text/html; charset=utf-8');

    $id = intval($_GET['id']);

    // ...and so on
    But when I load it, I get the error message:

    PHP Code:
    Warning: Cannot modify header information - headers already sent by (output started at [script name]) in [script name] on line 1.

  9. #9
    SitePoint Evangelist
    Join Date
    Aug 2007
    Posts
    566
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You probably have a space or some other whitespace character before the "<?php".

  10. #10
    SitePoint Author

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That's probably because you have saved the file as UTF-8 with a BOM. The PHP parser starts reading the file, finds the BOM and outputs it to the response stream, which means you can't send any headers at all. (Headers must precede the body, so once you start outputting the body you've lost your chance to send headers.)

    You must re-save the source file as UTF-8 without BOM, or tweak your server to send the correct encoding attribute in the Content-Type HTTP header.
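If your editor has no "UTF-8 without BOM" option, the BOM can also be stripped with a small script. This is only a sketch under the assumption that the file really is UTF-8 and small enough to read into memory; the function name `strip_utf8_bom_from_file` is invented for this example, and you should keep a backup of any file you run it on.

```php
<?php
// Hypothetical helper: rewrite $path without a leading UTF-8 BOM.
// Returns true if a BOM was found and removed, false otherwise.
function strip_utf8_bom_from_file(string $path): bool
{
    $bom = "\xEF\xBB\xBF"; // the three BOM bytes
    $contents = file_get_contents($path);
    if ($contents === false || strncmp($contents, $bom, 3) !== 0) {
        return false; // unreadable, or no BOM present
    }
    // Write everything after the first three bytes back to the same file.
    return file_put_contents($path, substr($contents, 3)) !== false;
}

// Usage on a throwaway temp file (not a real project file).
$tmp = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($tmp, "\xEF\xBB\xBF<?php echo 'hi';");
var_dump(strip_utf8_bom_from_file($tmp)); // bool(true)
var_dump(strip_utf8_bom_from_file($tmp)); // bool(false), BOM already gone
unlink($tmp);
```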

  11. #11
    SitePoint Wizard
    Join Date
    Mar 2008
    Location
    United Kingdom
    Posts
    1,285
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    @tripy: There's no space before <?php

    @AC: How do I save the file UTF-8 without BOM? Thanks again

  12. #12
    SitePoint Wizard
    Join Date
    Mar 2008
    Location
    United Kingdom
    Posts
    1,285
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yeesh, I opened the file in HTML-Kit and sure enough the code was in there, but not in my Notepad file. Thanks guys!

  13. #13
    SitePoint Author

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Don't use Notepad with UTF-8!
    It always saves with a BOM.

  14. #14
    SitePoint Wizard
    Join Date
    Mar 2008
    Location
    United Kingdom
    Posts
    1,285
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Oh man, news to me. I always use Notepad for my webby tings. Moving over to Notepad++ though

