xPath html file

Hi Guys,

What I'm trying to do is read an HTML file into a DOMDocument, then use XPath to retrieve the relevant fields.

So far I have:

if (isset($_GET['searchDeep']))

  // Deep search code
	//$searchString = str_replace( " ","+",$searchString);
	$search_url   = "http://www.clickbank.com/mkplSearchResult.htm?dores=true&includeKeywords=$searchString&firstResult=1";
  print $search_url; print "<br />";
	// make the cURL request to $search_url
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_USERAGENT, 'Firefox (WindowsXP) - Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv: Gecko/20070725 Firefox/');
	curl_setopt($ch, CURLOPT_URL,$search_url);
	curl_setopt($ch, CURLOPT_FAILONERROR, true);
	curl_setopt($ch, CURLOPT_AUTOREFERER, true);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
	curl_setopt($ch, CURLOPT_TIMEOUT, 30);
	$html = curl_exec($ch);
	if (!$html) {
		echo "<p class=\"fcs-message-error\">cURL error:" . curl_error($ch) . " (Error Number " . curl_errno($ch).")</p>";
	}
	// parse the html into a DOMDocument
	$dom = new DOMDocument();
	$xpath = new DOMXPath($dom);
	// Loop
	foreach ($xpath as $item) {
		print $item->query("//div[@id='results']//tr/td[@class='details']/h4/a");
		// URLs
		$cbURL = $item->getAttribute('href');
		// Replace with my hoplink
		$cbURL = str_replace("zzzzz", "graham25s", $cbURL);
		print $cbURL;
		$xpath = new DOMXPath($dom);
		$paras = $xpath->query("//div[@id='results']//td[@class='details']//div[@class='description']");
		$para = $paras->item(0);
		$description = $para->textContent;
		$xpath = new DOMXPath($dom);
		$paras = $xpath->query("//div[@id='results']//td[@class='details']//h4/a");
		$para  = $paras->item(0);
		$title = $para->textContent;
		$link = '<a rel="nofollow" href="'.$cbURL.'">'.$title.'</a>';
		print "<br/><strong>".$link."</strong><br/>".$description;
		//print $link;
	}

I have pieced this together from my limited knowledge :) but I can't seem to loop over the results that come back.

Any help would be appreciated.

Thanks, guys.


The code could be tidied up somewhat, as a number of things are happening that don't really need to. A basic example would be to replace everything below your //print_r($dom) line with something like the following. (Notes: this will not output anything, since you already know how to print HTML. Also, no sanity checking is done; the HTML is assumed to have the right structure.)

$xpath   = new DOMXPath($dom);
$results = $xpath->query("//div[@id='results']//tr[@class='result']/td[1]");
foreach ($results as $result) {
	// First link in the cell holds the title and the hop URL
	$anode  = $result->getElementsByTagName("a")->item(0);
	$title  = $anode->textContent;
	$hopurl = str_replace("zzzzz", "graham25s", $anode->getAttribute('href'));
	// First div in the cell holds the description
	$desc   = $result->getElementsByTagName("div")->item(0)->textContent;

	// Write your HTML
}
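
For what it's worth, here is a rough sketch of the same loop with the missing sanity checks added, in case a result row turns up without a link or a description. It is only an illustration: the sample markup, the example.com URL and the $items array are made up for the demo, and the real page structure is assumed from the XPath expressions used above. It also loads the HTML string into the document with loadHTML() before querying, since a freshly created DOMDocument is empty.

```php
<?php
// Hypothetical sample markup standing in for the fetched page ($html from cURL).
$html = <<<HTML
<div id="results"><table>
  <tr class="result"><td class="details">
    <h4><a href="http://example.com/?hop=zzzzz">First product</a></h4>
    <div class="description">First description</div>
  </td></tr>
  <tr class="result"><td class="details">
    <h4>No link in this row</h4>
  </td></tr>
</table></div>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);              // without this call the document has no nodes
libxml_clear_errors();

$xpath   = new DOMXPath($dom);
$results = $xpath->query("//div[@id='results']//tr[@class='result']/td[1]");

$items = array();
foreach ($results as $result) {
    $anode = $result->getElementsByTagName("a")->item(0);
    $dnode = $result->getElementsByTagName("div")->item(0);
    if ($anode === null || $dnode === null) {
        continue;                   // skip rows that lack a link or a description
    }
    $items[] = array(
        'title' => trim($anode->textContent),
        'url'   => str_replace("zzzzz", "graham25s", $anode->getAttribute('href')),
        'desc'  => trim($dnode->textContent),
    );
}

foreach ($items as $item) {
    // Escape before printing, since the text comes from a remote page
    echo '<strong><a rel="nofollow" href="' . htmlspecialchars($item['url']) . '">'
       . htmlspecialchars($item['title']) . '</a></strong><br/>'
       . htmlspecialchars($item['desc']) . "<br/>\n";
}
```

Collecting the rows into an array first keeps the scraping separate from the output, so you can swap the echo for whatever HTML you want to write.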

Thanks very much, mate, worked great :)