  1. #1
    SitePoint Addict
    Join Date
    May 2006
    Location
    Ljubljana
    Posts
    241
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    regular expression - matching meta content-type

    Hello,

    I have the following code for matching meta content-type (charset):

    Code:
    preg_match( '@<meta\s+http-equiv="Content-Type"\s+content="([\w/]+)(;\s+charset=([^\s"]+))?@i', $html, $matches );
    	//if ( isset( $matches[1] ) ) $mime = $matches[1];
    	print_r($matches);
    	if ( isset( $matches[3] ) ) {
    		$charset = $matches[3];
    		return $charset;
    	}
    This won't match the following html:

    Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><title>
    	Det andet Andalusien - Solferie - Livsstil - Nordjyske.dk
    </title><link href="/css/nj_style.css" rel="stylesheet" type="text/css" media="all" />
    Why is this?

    What does the @ symbol mean in a regex? (I couldn't find the info on the net.)

    Many thanks for help!

  2. #2
    Unobtrusively zen silver trophybronze trophy
    paul_wilkins's Avatar
    Join Date
    Jan 2007
    Location
    Christchurch, New Zealand
    Posts
    14,729
    Mentioned
    104 Post(s)
    Tagged
    4 Thread(s)
    I don't think that the @ symbol has any special meaning in regular expressions.
    Here it is used as the pattern delimiter. Normally the forward slash is used at the start and end of regular expressions, instead of the at symbols that are seen there.
    See http://php.net/manual/en/function.preg-match.php
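
To illustrate the delimiter point, here is a minimal sketch (the variable names and the shortened sample string are only for illustration). It runs equivalent patterns with `@`, `#`, and `/` as delimiters. The only practical difference is that a literal `/` inside the pattern has to be escaped when `/` is also the delimiter.

```php
<?php
// Shortened sample input, assumed for illustration only.
$html = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />';

// The character that opens the pattern is the delimiter; it must also close it.
$withAt   = preg_match('@charset=([^\s"]+)@i', $html, $a);
$withHash = preg_match('#charset=([^\s"]+)#i', $html, $b);

// With "/" as the delimiter, the literal "/" in the character class is escaped.
$withSlash = preg_match('/content="([\w\/]+)/i', $html, $c);

var_dump($a[1], $b[1], $c[1]);
```

Choosing `@` or `#` is mostly a convenience when the pattern contains slashes, as patterns for HTML and URLs often do.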
    Programming Group Advisor
    Reference: JavaScript, Quirksmode Validate: HTML Validation, JSLint
    Car is to Carpet as Java is to JavaScript

  3. #3
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,615
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    I just tested the regex, and the code itself works correctly.

    I think your problem might be with the array indexes of your $matches variable.

    Array indexes start at 0, rather than 1.


    If you're trying to get the meta tag's attribute values, you may have to look at regex groups/grouping.
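
As a rough sketch of how the indexes and groups line up for the pattern in the first post (the one-line sample string is shortened for illustration): `$matches[0]` holds the whole matched text, and each pair of parentheses fills the next index, counted by opening parenthesis.

```php
<?php
// Shortened sample input, assumed for illustration only.
$html = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><title>';

$pattern = '@<meta\s+http-equiv="Content-Type"\s+content="([\w/]+)(;\s+charset=([^\s"]+))?@i';
preg_match($pattern, $html, $matches);

// $matches[0] : the whole text matched by the pattern
// $matches[1] : first group, the MIME type
// $matches[2] : second (optional) group, "; charset=..." as a whole
// $matches[3] : third group, nested inside the second, the charset itself
print_r($matches);
```

So reading the MIME type from index 1 and the charset from index 3, as the original code does, is consistent with this layout.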
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  4. #4
    @php.net Salathe's Avatar
    Join Date
    Dec 2004
    Location
    Edinburgh
    Posts
    1,397
    Mentioned
    65 Post(s)
    Tagged
    0 Thread(s)
    Are you sure that it will not match the sample HTML that you provided? Using your sample code:

    PHP Code:
    $html = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><title>
        Det andet Andalusien - Solferie - Livsstil - Nordjyske.dk
    </title><link href="/css/nj_style.css" rel="stylesheet" type="text/css" media="all" />'
    ;

    preg_match('@<meta\s+http-equiv="Content-Type"\s+content="([\w/]+)(;\s+charset=([^\s"]+))?@i', $html, $match);
    var_dump($match[1], $match[3]); 
    Outputs:
    string(9) "text/html"
    string(5) "utf-8"
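
Since the original snippet ends in a `return`, it presumably lives inside a function. A minimal sketch of such a wrapper follows. The name `detect_meta_charset` is hypothetical, and the second input is an assumed variation, not something taken from the page in question. It shows one case where this exact pattern finds the tag but no charset: the pattern requires at least one whitespace character after the semicolon.

```php
<?php
/**
 * Hypothetical helper: returns the charset declared in a Content-Type
 * meta tag, or null when the pattern finds none.
 */
function detect_meta_charset(string $html): ?string
{
    $pattern = '@<meta\s+http-equiv="Content-Type"\s+content="([\w/]+)(;\s+charset=([^\s"]+))?@i';
    preg_match($pattern, $html, $matches);

    // Index 3 is only set when the optional "; charset=..." group matched.
    return $matches[3] ?? null;
}

// Charset present, with a space after the semicolon: found.
var_dump(detect_meta_charset('<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'));

// No space after the semicolon: the optional group is skipped, so null is returned.
var_dump(detect_meta_charset('<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />'));
```

The pattern also assumes double-quoted attributes with `http-equiv` placed before `content`, so markup that differs in those respects would need a looser pattern.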
    Salathe
    Software Developer and PHP Manual Author.

