  1. #1
    stef25 (SitePoint Evangelist)

    Strange characters in my XHTML (ex: &#160;)

    Text that should read "Africa, an Overcrowded Continent" is being generated by my CMS as Africa, an Overcrowded Continent. This is how it appears in the client. When I try to validate the page, the W3C validator says there are characters it cannot interpret as UTF-8.

    Obviously this is an issue with character encoding. I'm just wondering why it only generates these characters between certain words and not others.

    Edit: in my original posting, the error did not appear. & # 160; is what appears between "Overcrowded" and "Continent". I cannot google that string (no results appear) and it's not rendered on the page. It does appear in the source, which makes me believe it's actually equal to a space, but the W3C validator nevertheless cannot validate my page because of it.

    So how can I stop this stuff from appearing in my source? It appears in an article that the author entered into the CMS, and he entered nothing I can see that would have caused these characters to appear.
    Last edited by stef25; Feb 5, 2007 at 11:59. Reason: corrections
    I need someone to protect me from
    all the measures they take in order to protect me

  2. #2
    Dan Schulz (In memoriam)
    Got a link?

  3. #3
    stef25 (SitePoint Evangelist)
    You've got a PM. Is the link any use, though, since you can't see the backend CMS and database? It's all generated dynamically.

  4. #4
    SitePoint Author

    The &#160; (non-breaking space) will be a problem if it occurs as a single octet with value 160 (A0). That is not valid UTF-8; it must be encoded as two octets (C2 A0).
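
    To make the byte-level difference concrete, here is a small Python sketch (not part of the original reply). It encodes the no-break space U+00A0 both ways and checks whether a byte string decodes as UTF-8. The helper name is_valid_utf8 and the sample byte strings are assumptions made up for illustration.

```python
# U+00A0 NO-BREAK SPACE, the character that &#160; refers to.
nbsp = "\u00a0"

# One octet in ISO-8859-1, two octets in UTF-8.
latin1_bytes = nbsp.encode("iso-8859-1")
utf8_bytes = nbsp.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone A0 octet is a continuation byte with no lead byte, so it is rejected.
print(is_valid_utf8(b"Overcrowded\xa0Continent"))
# The two-octet form C2 A0 is accepted.
print(is_valid_utf8(b"Overcrowded\xc2\xa0Continent"))
```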

    Either change the declared encoding to ISO-8859-1 or make sure that the content delivered from your CMS is properly encoded as UTF-8. If it's not UTF-8 from the database, you'll have to add an encoding filter that fixes the problems.
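
    As a rough idea of what such an encoding filter could look like, here is a minimal Python sketch. It assumes the CMS hands over raw bytes and that anything that is not already valid UTF-8 is in a single-byte encoding such as ISO-8859-1 or Windows-1252. The function name to_utf8 and its fallback parameter are hypothetical, and the actual CMS may need the fix at the database or connection level instead.

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw as UTF-8 bytes (hypothetical filter, not from the thread).

    Bytes that already decode as UTF-8 are returned unchanged; anything
    else is assumed to be in the fallback encoding and is re-encoded.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


# A lone A0 octet becomes the two-octet sequence C2 A0.
fixed = to_utf8(b"Overcrowded\xa0Continent")
```

    This is a heuristic: a short Latin-1 string can occasionally happen to be valid UTF-8 too, so knowing the real source encoding is safer than guessing. If the content was pasted from a word processor, passing fallback="cp1252" would also cover curly quotes.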
    Birnam wood is come to Dunsinane

  5. #5
    Dan Schulz (In memoriam)
    Looking at the link...
    Code:
    <!-- start excerpt -->
    <img src="/images/443t.gif" alt="Hunter Thompson - Gonzo" class="thumb_article" />
    We were somewhere around Barstow on the edge of the desert when the drugs began to take hold…
    
    	<p>So Hunter Thompson began his crowning work “Fear and Loathing in Las Vegas”, ...</p>
    <span class="continue_reading"><a rel="bookmark" href="/article/964/hunter-thompson-gonzo-journalist" title="Permanent link to this article">Continue reading &gt;&gt;</a></span>
    
    </div><!-- ends .middle_content_box_2 --><!-- THIS FORM LISTS ARTICLES ON A COUNTRY PAGE -->
    <div class="middle_content_box_2">
    
    <!-- start article title -->
    <h2 class="content_box_header"><a href="/article/1064/india-an-overcrowded-continent" >India - An Overcrowded& #160;Continent</a></h2>
    Replace the quotation marks with their appropriate character entities, and replace the & #160; (note: space added by me in both instances) with a regular space between "Overcrowded" and "Continent".
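
    If the cleanup has to happen in code rather than by hand, the two replacements could be sketched in Python roughly as follows. The function name clean_markup and the ENTITY_MAP table are assumptions for illustration; the ellipsis and single-quote entries go beyond the quotation marks mentioned above and are included only because the excerpt contains an ellipsis as well.

```python
# Typographic characters mapped to numeric character references
# (hypothetical table; extend or trim it to suit the content).
ENTITY_MAP = {
    "\u201c": "&#8220;",  # left double quotation mark
    "\u201d": "&#8221;",  # right double quotation mark
    "\u2018": "&#8216;",  # left single quotation mark
    "\u2019": "&#8217;",  # right single quotation mark
    "\u2026": "&#8230;",  # horizontal ellipsis
}


def clean_markup(text: str) -> str:
    """Swap no-break spaces for regular spaces and curly quotes for references."""
    # Handle the numeric reference, the named entity, and the literal character.
    for token in ("&#160;", "&nbsp;", "\u00a0"):
        text = text.replace(token, " ")
    for char, ref in ENTITY_MAP.items():
        text = text.replace(char, ref)
    return text


title = clean_markup("India - An Overcrowded&#160;Continent")
```

    Note that this replaces every no-break space, including any the author put in on purpose, so it may be too blunt for some content.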

