  1. #1
    SitePoint Evangelist
    Join Date
    Jun 2010
    Location
    Israel
    Posts
    523
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    charset declaration html5 / php

    Hey everyone, I saw that HTML5 has a new way of declaring the charset. Instead of:
    Code:
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    I can put:
    Code:
    <meta charset="UTF-8">
    My question is: do old browsers support that shorthand? Should I expect any issues with it?

    And a PHP question: I'm trying to get some text from external links, but the text is returned as '???' even if I use
    Code:
    header('Content-Type: text/html; charset=utf-8');
    I tried another encoding (for my language, Hebrew) and it worked:
    Code:
    header('Content-Type: text/html; charset=windows-1255');
    But the question is whether the two will conflict with each other (utf-8 declared in the HTML and windows-1255 declared in PHP).

    Thanks for the help,
    ulthane.

  2. #2
    I solve practical problems. bronze trophy
    Michael Morris's Avatar
    Join Date
    Jan 2008
    Location
    Knoxville TN
    Posts
    2,023
    Mentioned
    63 Post(s)
    Tagged
    0 Thread(s)
    It is far more effective and far more efficient to declare HTTP headers in the actual header of the document rather than use the http-equiv tags. When applied to the content type in particular, those tags are a joke: by the time the browser reaches the tag it has already chosen a charset and language. At best you waste the client's time restarting the page render. At worst the client happily ignores your tag (and most browsers do).

    You already know how to set the headers in PHP. The http-equiv tags are redundant and unnecessary.

  3. #3
    SitePoint Evangelist
    Join Date
    Jun 2010
    Location
    Israel
    Posts
    523
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    So I didn't quite understand: are you saying I should delete the meta tag completely and only use the PHP header declaration? Then it won't be visible to the client (I don't know if that has any downsides...)

  4. #4
    I solve practical problems. bronze trophy
    Michael Morris's Avatar
    Join Date
    Jan 2008
    Location
    Knoxville TN
    Posts
    2,023
    Mentioned
    63 Post(s)
    Tagged
    0 Thread(s)
    ... HTTP 101 ...

    A document transmitted via the HTTP protocol will have two sections: a header and a body. The way modern browsers work, you never see the headers, but they are there. These are the response headers for Google.

    Code:
    Date: Fri, 24 Feb 2012 13:59:47 GMT
    Expires: -1
    Cache-Control: private, max-age=0
    Content-Type: text/html; charset=UTF-8
    Content-Encoding: gzip
    Server: gws
    Content-Length: 22147
    X-XSS-Protection: 1; mode=block
    X-Frame-Options: SAMEORIGIN
    
    200 OK
    After this information comes the MIME encoded body of the document. If it's text it will be relatively readable.

    PHP can control, through the header function, the contents of any of these lines. This allows you to modify the response code, caching and so on. Whatever you don't populate your webserver program populates for you according to its own settings.

    meta http-equiv tags will, in theory, override these properties. But it's more efficient to pass the correct desired value in the header in the first place. Also the content-type header cannot be changed after rendering of the document has started, so http-equiv="Content-Type" is useless and will be ignored. The same applies to the Content-Encoding and Content-Length properties. Meta http-equiv tags are primarily used for setting specific caching rules in otherwise static html documents, and they are quite effective in that role.
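
As a minimal sketch of the point above, the charset can be sent in the HTTP Content-Type header from PHP before any output is written. The helper name content_type_header() is hypothetical and only wraps the header string already used in this thread; headers_sent() and header() are standard PHP functions.

```php
<?php
// Hypothetical helper: builds the Content-Type header line for a given charset.
function content_type_header(string $charset = 'UTF-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Headers must go out before any body output, so check headers_sent() first
// to avoid the "headers already sent" warning.
if (!headers_sent()) {
    header(content_type_header());
}
```

Calling content_type_header('windows-1255') would produce the header from the first post instead. Whichever charset is declared has to match the bytes the script actually outputs.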

  5. #5
    SitePoint Evangelist
    Join Date
    Jun 2010
    Location
    Israel
    Posts
    523
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Hey, thanks for the information, I understand now. I've got another question though: is there any way to change the encoding for only part of the PHP script (like inside one function only)?
    I'm using UTF-8 for my website, but I must use windows-1255 to get page titles from external links, because UTF-8 always returns '???'

    Any clue? or a workaround?

  6. #6
    I solve practical problems. bronze trophy
    Michael Morris's Avatar
    Join Date
    Jan 2008
    Location
    Knoxville TN
    Posts
    2,023
    Mentioned
    63 Post(s)
    Tagged
    0 Thread(s)
    Character encoding must be uniform for the whole file.

  7. #7
    SitePoint Evangelist
    Join Date
    Jun 2010
    Location
    Israel
    Posts
    523
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Isn't there any way to get external page titles without depending on the encoding?

  8. #8
    I solve practical problems. bronze trophy
    Michael Morris's Avatar
    Join Date
    Jan 2008
    Location
    Knoxville TN
    Posts
    2,023
    Mentioned
    63 Post(s)
    Tagged
    0 Thread(s)
    $_SERVER['REQUEST_URI'] holds the URI (path and query string) the outside world is asking for.

  9. #9
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Join Date
    Jul 2009
    Posts
    1,246
    Mentioned
    16 Post(s)
    Tagged
    0 Thread(s)
    ulthane, if I understood your latest request correctly, after your script downloads content from some external source, you'll then need to detect and convert its encoding.
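
A rough sketch of the "convert" half of that suggestion, assuming the source charset is already known (for example from the remote page's Content-Type header or meta tag). The function name to_utf8() and its parameters are hypothetical; iconv() and preg_match() are standard PHP, and the charset names that iconv accepts depend on the platform's iconv library.

```php
<?php
// Hypothetical helper: returns $text as UTF-8, converting from $sourceCharset
// only when the bytes are not already valid UTF-8.
function to_utf8(string $text, string $sourceCharset): string
{
    // An empty pattern with the /u flag matches only if the subject is valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }

    // iconv() returns false when the conversion fails.
    $converted = iconv($sourceCharset, 'UTF-8', $text);

    return $converted === false ? '' : $converted;
}

// The Hebrew word "shalom" encoded as windows-1255 bytes.
$hebrew = to_utf8("\xF9\xEC\xE5\xED", 'WINDOWS-1255');
```

With this approach the site itself stays in UTF-8 throughout, and only the downloaded text is converted, so nothing has to switch encodings partway through the script.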

  10. #10
    SitePoint Evangelist
    Join Date
    Jun 2010
    Location
    Israel
    Posts
    523
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Hey Jeff, thanks for the answer. However, I noticed that two different pages with the same encoding give different results (one comes back as '???' and the other as normal), so I guess it wasn't an encoding issue, or at least it would be hard to detect and fix.
    So I just check the returned title with preg_match, and if it contains anything other than the characters I'm looking for, it gets named "link". If anyone is interested in the solution, here it is. It works fine but it's a little bit slow.

    Code:
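    function get_page_title($url)
    {
    	// Some sites refuse requests that lack a browser-like user agent.
    	ini_set('user_agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11');
    	$doc = new DOMDocument();
    	// loadHTMLFile() returns false on failure; @ hides warnings from malformed markup.
    	if (!@$doc->loadHTMLFile($url))
    		return 'link';
    	$xpath = new DOMXPath($doc);
    	// item(0) is null when the page has no <title> element.
    	$node = $xpath->query('//title')->item(0);
    	$title = $node ? trim($node->nodeValue) : '';
    	// Fall back to 'link' for empty titles or any character outside a-z, 0-9 and space.
    	if ($title == '' || preg_match('/[^a-z0-9 ]/i', $title))
    		return 'link';
    	return $title;
    }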

