  1. #1
    SitePoint Enthusiast
    Join Date
    Nov 2008
    Location
    New York
    Posts
    90
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Character Encoding Question

    This is more of an HTML question, but I could not find a home for HTML questions, so forgive me.

    I am puzzled why the following page:
    http://1a-reisekatalog.de/
    shows as encoded with utf-8 (I am using Firefox 3.0.5 and looking at the View->Character Encoding menu), even though the page source contains the following line:
    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />

    Another page:
    http://paris.fr/portail/accueil/Portal.lut?page_id=1
    shows as encoded with ISO-8859-1 in the browser (which I believe the first page should show too), and an almost identical line exists in the page source:
    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">


    So the question is: how does the browser know that the first page is encoded with utf-8 even though the source sets it to ISO-8859-1?

    Thank you.
    www.forkaya.com - Web Development, PHP Scripting

  2. #2
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Because the web server sends an HTTP header with the response that specifies a different charset. The browser only falls back to the meta tag if the HTTP header does not specify a charset. Usually it does, so meta tags are usually irrelevant.

    You can read more here and here.
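
    To make the header side of this concrete, here is a minimal sketch in Python of pulling the charset out of a Content-Type header value. The function name charset_from_content_type is made up for illustration; the parsing relies on the standard library's email.message.Message, which lowercases the result and returns None when no charset parameter is present.

    ```python
    from email.message import Message


    def charset_from_content_type(header_value):
        """Return the charset parameter of a Content-Type value, or None.

        Hypothetical helper for illustration only.
        """
        msg = Message()
        msg["Content-Type"] = header_value
        # get_content_charset() lowercases the value and returns None
        # when the header carries no charset parameter.
        return msg.get_content_charset()


    if __name__ == "__main__":
        print(charset_from_content_type("text/html; charset=utf-8"))
        print(charset_from_content_type("text/html"))
    ```

    If this returns a value, the browser uses it and ignores whatever the meta tag says.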

  3. #3
    SitePoint Enthusiast
    Join Date
    Nov 2008
    Location
    New York
    Posts
    90
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well, Firebug is giving me the following header info for the first page:
    Response Headers
    Date: Fri, 09 Jan 2009 18:49:48 GMT
    Server: Apache/2.0.52 (CentOS)
    X-Powered-By: PHP/5.2.5
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Connection: close
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=ISO-8859-1

    So that is not the explanation in this example...

  4. #4
    Twitter: @AnthonySterling silver trophy AnthonySterling's Avatar
    Join Date
    Apr 2008
    Location
    North-East, UK.
    Posts
    6,111
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    I get the following, using LiveHTTPHeaders in Firefox.

    http://1a-reisekatalog.de/
    Code:
    HTTP/1.x 200 OK
    Date: Fri, 09 Jan 2009 19:03:46 GMT
    Server: Apache
    X-Powered-By: PHP/4.4.9
    Keep-Alive: timeout=2, max=200
    Connection: Keep-Alive
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=utf-8
    http://paris.fr/portail/accueil/Portal.lut?page_id=1
    Code:
    HTTP/1.x 200 OK
    Date: Fri, 09 Jan 2009 19:04:52 GMT
    Server: Apache/2.0.52 (Red Hat)
    Vary: Accept-Encoding
    Content-Encoding: gzip
    Content-Type: text/html; charset=iso-8859-1
    Age: 172
    X-Cache: HIT from www.paris.fr
    Connection: close
    @AnthonySterling: I'm a PHP developer, a consultant for oopnorth.com and the organiser of @phpne, a PHP User Group covering the North-East of England.

  5. #5
    SitePoint Enthusiast
    Join Date
    Nov 2008
    Location
    New York
    Posts
    90
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    @SilverBulletUK
    You are right. I looked at the wrong page. Too many tabs open.

    @kyberfabrikken
    I apologize for misleading you. Thanks for your help; that will put me back on track. I can see that some HTTP headers do not contain a charset setting, so I assume I will still need to check the HTML if it is not there in the header...

  6. #6
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by forkaya View Post
    I apologize for misleading you.
    I wasn't misled.
    Quote Originally Posted by forkaya View Post
    I can see that some HTTP headers do not contain a charset setting, so I assume I will still need to check the HTML if it is not there in the header...
    Yes, the Content-Type header is optional, so the server can choose not to send it. The charset part of the Content-Type header is also optional, so sometimes you see just the MIME type without the charset. In these cases, the meta tag is used instead. If that isn't there either, the browser will make a guess or fall back on a default value. What that default is varies from client to client. In Firefox you can choose the default charset in the browser's settings, and you can probably do the same in other browsers too. The most common default values are iso-8859-1, windows-1252 (cp1252) or utf-8.
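
    The order described above (HTTP header, then meta tag, then a client default) can be sketched roughly like this in Python. This is a simplification under stated assumptions, not what any browser literally runs: resolve_charset and META_CHARSET_RE are made-up names, the regex only handles simple meta tags within the first 1024 bytes, the "guess" step is left out, and the windows-1252 default is just one of the common defaults.

    ```python
    import re
    from email.message import Message

    # Hypothetical, simplified pattern: finds charset=... inside a <meta> tag.
    META_CHARSET_RE = re.compile(
        rb"""<meta[^>]+charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
        re.IGNORECASE,
    )


    def resolve_charset(content_type, body, default="windows-1252"):
        """Return (charset, source) following header > meta tag > default."""
        # 1. A charset in the HTTP Content-Type header wins.
        if content_type:
            msg = Message()
            msg["Content-Type"] = content_type
            charset = msg.get_content_charset()
            if charset:
                return charset, "http-header"
        # 2. Otherwise look for a meta tag near the start of the document.
        match = META_CHARSET_RE.search(body[:1024])
        if match:
            return match.group(1).decode("ascii").lower(), "meta-tag"
        # 3. Otherwise fall back on the client's default (an assumption here).
        return default, "default"


    if __name__ == "__main__":
        page = b'<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />'
        print(resolve_charset("text/html; charset=utf-8", page))
        print(resolve_charset("text/html", page))
    ```

    With the headers posted earlier in the thread, the first call corresponds to 1a-reisekatalog.de: the header says utf-8, so the ISO-8859-1 meta tag never gets consulted.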

