crawler2.zip (7.3 KB)
I have this file for crawling sites. It works with the Simple HTML DOM parser (simple_html_dom.php).
I have a problem; see the examples below:
<?php
include_once('simple_html_dom.php');
$target_url = "http://php.net/";
$html = new simple_html_dom();
$html->load_file($target_url);
foreach($html->find('img') as $link){
echo $link->src."<br />";
}
?>
It crawls the php.net page and returns the src of every image on it. It is clear to me how to read the attributes of tags this way.
But what if I need to get some text from a page? See the markup below:
<section id="item-details">
<h1>Bike delivery</h1>
<p>
<time datetime="2016-05-19 15:22:55" class="small-text icon-clock">
25 minutes ago </time>
<span class="small-text">
Address: new york </span>
<span class="item-price"><strong>200000</strong> Dollar</span>
</p>
<p>hi its our service<br />
you can trust us<br />
we are the best<br />
follow us<br />
</section>
I want to extract
"hi its our service
you can trust us
we are the best
follow us" as the content,
then "200000 Dollar" as the price, "Address: new york" as the address, "25 minutes ago" as the time, and "Bike delivery" as the title, and save them all to my database.
How can I do this with Simple HTML DOM (the file is available for download above, with an example),
or with something else?
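For reference, here is a rough sketch of the kind of extraction I am after. It uses PHP's built-in DOMDocument and DOMXPath rather than simple_html_dom, so this is the "something else" route, not the library from my zip. The sample markup is the one from above, with the missing closing </p> added. The helper names text_at and lines_at and the keys of $item are made up for illustration, and the XPath queries assume the page always has exactly this structure.

```php
<?php
// Sample markup from the question (closing </p> added for the last paragraph).
$markup = <<<HTML
<section id="item-details">
<h1>Bike delivery</h1>
<p>
<time datetime="2016-05-19 15:22:55" class="small-text icon-clock">
25 minutes ago </time>
<span class="small-text">
Address: new york </span>
<span class="item-price"><strong>200000</strong> Dollar</span>
</p>
<p>hi its our service<br />
you can trust us<br />
we are the best<br />
follow us<br />
</p>
</section>
HTML;

// Silence libxml warnings about HTML5 tags such as <section> and <time>.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Hypothetical helper: text of the first match, whitespace collapsed to single spaces.
function text_at(DOMXPath $xpath, string $query): string {
    $node = $xpath->query($query)->item(0);
    return $node === null ? '' : trim(preg_replace('/\s+/', ' ', $node->textContent));
}

// Hypothetical helper: text of the first match, one trimmed entry per non-empty line.
function lines_at(DOMXPath $xpath, string $query): string {
    $node = $xpath->query($query)->item(0);
    if ($node === null) {
        return '';
    }
    $lines = array_map('trim', preg_split('/\R/', $node->textContent));
    return implode("\n", array_filter($lines, 'strlen'));
}

$root = '//*[@id="item-details"]';
$item = [
    'title'   => text_at($xpath, $root . '//h1'),
    'time'    => text_at($xpath, $root . '//time'),
    'address' => text_at($xpath, $root . '//span[@class="small-text"]'),
    'price'   => text_at($xpath, $root . '//span[@class="item-price"]'),
    // Second <p> inside the section holds the free-text description.
    'content' => lines_at($xpath, '(' . $root . '//p)[2]'),
];

print_r($item);
```

The resulting $item array could then be bound to a PDO prepared statement to save it to the database; the table and column names would depend on my schema.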