Regexp for tags with a certain class name

What is the
pre_match_all(“”, “”, $output);
for a table innerhtml where it has a class name?

How many times is this going to be asked?

Use DOMElement or SimpleXMLElement for XML-like languages. Don’t use regex.

Also, gilgal, you really need to put more effort into asking your question. Getting preg_match_all right for a start, plus showing a sample of code and actually being polite (hello, please, etc) rather than launching straight into a single question will get you a lot further.

Simple really.

Off Topic:

The “Choose your forum carefully” section in that link is my new hero. SitePoint’s PHP forum is the devil. :mad: Bookmarked for later use.

What I have is:

$file="1-1.htm";
$contents_of_page = file_get_contents($file);
//preg_match_all("#<th.*>(.+)</th#Ui", $contents_of_page, $thInnerHTML);

//looking for the tds
//preg_match_all("#<tr.*>(.+)</tr#Ui", $contents_of_page, $trInnerHTML);
preg_match_all("/(class=\"maintext\")/is", $contents_of_page, $tableInnerHTML);
print_r($tableInnerHTML[1]);
preg_match_all("#<tr.*>(.+)</tr#Ui", $tableInnerHTML[1][0], $trInnerHTML);
print_r($trInnerHTML[1][1]);

But I realize that preg_match_all("/(class=\"maintext\")/is", $contents_of_page, $tableInnerHTML); will only give me the matched class attribute in an array. I need the innerHTML of the tag that has this class name.

<table width="100%" border="0" cellspacing="1" cellpadding="5" class="maintext"><tr><td class="top" width="20%">...

$dom = new SimpleXMLElement(file_get_contents($file));
$tables = $dom->xpath('table[@class]');

I only want the table with class=maintext. There are many tables.

Then you could use something like the following to select that.


$table = $dom->xpath('//table[@class="maintext"]');

See SimpleXMLElement::xpath for info and examples.

Edit:

added // to the xpath to select nodes no matter where they are
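To illustrate what the leading // changes, here is a minimal sketch. The markup is a made-up, well-formed sample (not the poster's actual page), and the variable names $direct and $anywhere are only for illustration.

```php
<?php
// Hypothetical well-formed sample; the real page in this thread is not valid XML.
$xml = <<<XML
<html>
  <body>
    <table class="menu"><tr><td>nav</td></tr></table>
    <div>
      <table class="maintext"><tr><td>content</td></tr></table>
    </div>
  </body>
</html>
XML;

$dom = new SimpleXMLElement($xml);

// Relative path: only <table> elements that are direct children of the
// context node (the root <html> element) are considered.
$direct = $dom->xpath('table[@class]');

// Descendant path: any <table> at any depth whose class is exactly "maintext".
$anywhere = $dom->xpath('//table[@class="maintext"]');

echo count($direct), "\n";          // 0
echo count($anywhere), "\n";        // 1
echo $anywhere[0]->tr->td, "\n";    // content
```

Note that @class="maintext" is an exact string comparison, so a table with class="maintext wide" would not match.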

Yes try something like this:


$dom = new SimpleXMLElement(file_get_contents($xml_file)); 
$tables = $dom->xpath('//table[@class="maintext"]');

I don’t understand how xml is related to this? Because I’m looking for attributes?

HTML is just an implementation of XML.

(Some people will disagree with this… but it’s the truth)

$dom = new SimpleXMLElement(file_get_contents($file));//line 87 
$tables = $dom->xpath('//table[@class="maintext"]');  
print_r($tables);

I got a bunch of errors referring to line 87:

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: meta line 1 and head in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: title><meta http-equiv=“Content-Type” content=“text/html; charset=utf-8”></head> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: link line 1 and head in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: =utf-8"></head><link rel=“stylesheet” type=“text/css” href=“/style5.css”></head> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : AttValue: " or ’ expected in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: adding=“0” align=“center” class=“tablebk”><tr><td valign=“middle”><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : attributes construct error in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: adding=“0” align=“center” class=“tablebk”><tr><td valign=“middle”><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Couldn’t find end of Start Tag iframe line 1 in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: adding=“0” align=“center” class=“tablebk”><tr><td valign=“middle”><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: td line 1 and iframe in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: src=“/sections/genesis/1-1.htm” align=left frameborder=0 cellpadding=0></iframe> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: tr line 1 and td in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: /sections/genesis/1-1.htm" align=left frameborder=0 cellpadding=0></iframe></td> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: table line 1 and tr in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ions/genesis/1-1.htm" align=left frameborder=0 cellpadding=0></iframe></td></tr> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : AttValue: " or ’ expected in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: tr><tr valign=“middle” class=“topbk”><td height=“24” align=“center” cellpadding= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : attributes construct error in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: tr><tr valign=“middle” class=“topbk”><td height=“24” align=“center” cellpadding= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Couldn’t find end of Start Tag td line 1 in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: tr><tr valign=“middle” class=“topbk”><td height=“24” align=“center” cellpadding= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : AttValue: " or ’ expected in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: e width=“100%” border=“0” cellspacing=“0” cellpadding=“0”><tr><td><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : attributes construct error in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: e width=“100%” border=“0” cellspacing=“0” cellpadding=“0”><tr><td><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Couldn’t find end of Start Tag iframe line 1 in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: e width=“100%” border=“0” cellspacing=“0” cellpadding=“0”><tr><td><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: td line 1 and iframe in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: no src=“/menus/genesis/1-1.htm” align=left frameborder=0 cellpadding=0></iframe> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: tr line 1 and td in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: c=“/menus/genesis/1-1.htm” align=left frameborder=0 cellpadding=0></iframe></td> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: table line 1 and tr in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: enus/genesis/1-1.htm" align=left frameborder=0 cellpadding=0></iframe></td></tr> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : AttValue: " or ’ expected in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: align=“center”><tr align=“center” valign=“middle”><td height=“90”><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : attributes construct error in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: align=“center”><tr align=“center” valign=“middle”><td height=“90”><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Couldn’t find end of Start Tag iframe line 1 in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: align=“center”><tr align=“center” valign=“middle”><td height=“90”><iframe width= in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: td line 1 and iframe in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: eight=90 scrolling=no src=“/topmenu30.htm” frameborder=0 cellpadding=0></iframe> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: tr line 1 and td in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: =90 scrolling=no src=“/topmenu30.htm” frameborder=0 cellpadding=0></iframe></td> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: table line 1 and tr in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: crolling=no src=“/topmenu30.htm” frameborder=0 cellpadding=0></iframe></td></tr> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: td line 1 and table in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: =no src=“/topmenu30.htm” frameborder=0 cellpadding=0></iframe></td></tr></table> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: tr line 1 and td in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: rc=“/topmenu30.htm” frameborder=0 cellpadding=0></iframe></td></tr></table></td> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: body line 1 and table in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: 0.htm" frameborder=0 cellpadding=0></iframe></td></tr></table></td></tr></table> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: head line 1 and td in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: " frameborder=0 cellpadding=0></iframe></td></tr></table></td></tr></table></td> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in … \ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Opening and ending tag mismatch: html line 1 and tr in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: meborder=0 cellpadding=0></iframe></td></tr></table></td></tr></table></td></tr> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: Entity: line 1: parser error : Extra content at the end of the document in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: meborder=0 cellpadding=0></iframe></td></tr></table></td></tr></table></td></tr> in …\ins_conc_indb.php on line 87

Warning: SimpleXMLElement::__construct() [simplexmlelement.–construct]: ^ in …\ins_conc_indb.php on line 87

Fatal error: Uncaught exception ‘Exception’ with message ‘String could not be parsed as XML’ in …\ins_conc_indb.php:87 Stack trace: #0 …\ins_conc_indb.php(87): SimpleXMLElement->__construct(‘???<html><head>…’) #1 {main} thrown in …\ins_conc_indb.php on line 87

You kinda need to write better HTML then… Valid HTML makes the world spin round, k?

It seems to me that it’s going through all the tags instead of the table tag which has maintext as the class.

On the other hand if I use:
DOMElement

//SimpleXMLElement
$dom = new DOMElement(file_get_contents($file)); // line 88
$table = $dom->xpath('//table[@class="maintext"]');  
print_r($table);

I get:

Fatal error: Uncaught exception ‘DOMException’ with message ‘Invalid Character Error’ in …ins_conc_indb.php:88 Stack trace: #0 …ins_conc_indb.php(88): DOMElement->__construct(‘???<html><head>…’) #1 {main} thrown in …ins_conc_indb.php on line 88

It looks like you have some malformed HTML code on your hands.

Run it through a validator, and fix the problems that are within your code.

Not some people, everyone on the planet :wink:

@gilgal, if your code is HTML then you can’t use SimpleXML.
Instead use DOMDocument, which has the methods loadXML and loadHTML, because contrary to what AlienDev says, they are in no way the same thing.
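A rough sketch of the difference between the two loaders, assuming a made-up fragment with the same kind of faults reported in the warnings (an unclosed meta tag and an unquoted attribute value). The names $asXml, $asHtml, $xmlOk and $htmlOk are illustrative only.

```php
<?php
// Hypothetical fragment: <meta> is never closed and border=0 is unquoted,
// which is fine in HTML but not well-formed XML.
$html = '<html><head><meta charset="utf-8"></head>'
      . '<body><table class="maintext" border=0><tr><td>content</td></tr></table></body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$asXml = new DOMDocument();
$xmlOk = $asXml->loadXML($html);    // strict XML parser: rejects the markup
libxml_clear_errors();

$asHtml = new DOMDocument();
$htmlOk = $asHtml->loadHTML($html); // forgiving HTML parser: builds a tree anyway
libxml_clear_errors();

var_dump($xmlOk);   // bool(false)
var_dump($htmlOk);  // bool(true)
```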

Like I said above, when I use DOMElement instead of SimpleXMLElement (line 88), I get the ‘Invalid Character Error’ DOMException.

Do I need to substitute DOMElement with loadXML or loadHTML?

I don’t know how this works.

Maybe it’s time to refer to some documentation, so that you can find out what should be done.

DOMDocument::loadHTML

Yes, you need to create a new DOMDocument, and then load your file. Then you can use the getElementBy* methods to retrieve what you want.
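Something along these lines might work as a starting point. It is a sketch, not a tested solution for the page in question: the inline $html sample is made up so the snippet runs on its own, and because DOMDocument has no lookup by class name, it assumes DOMXPath is used for the class match, reusing the XPath expression from earlier in the thread.

```php
<?php
// Hypothetical sample standing in for the contents of $file.
$html = '<html><body>'
      . '<table class="tablebk"><tr><td>menu</td></tr></table>'
      . '<table width="100%" class="maintext"><tr><td class="top">content</td></tr></table>'
      . '</body></html>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);              // for a file on disk: $doc->loadHTMLFile($file);
libxml_clear_errors();

// Query by class with XPath, since there is no getElementsByClassName here.
$xpath  = new DOMXPath($doc);
$tables = $xpath->query('//table[@class="maintext"]');

$innerHtml = '';
if ($tables->length > 0) {
    // Serializing each child node gives the equivalent of innerHTML.
    foreach ($tables->item(0)->childNodes as $child) {
        $innerHtml .= $doc->saveHTML($child);
    }
}

echo $innerHtml, "\n";
```

Passing a node to saveHTML() needs PHP 5.3.6 or later. From there, the rows and cells can be walked with getElementsByTagName('tr') on the matched table rather than with a second regex.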