Hi.
How can I scrape the number of results found for http://delicious.com/search?p=mydomain.com?
For example, for http://delicious.com/search?p=ebay.com I want to scrape only the number 15235.
$s = file_get_contents('http://delicious.com/search?p=ebay.com');
preg_match('~<em>([\\d,]+)</em> results~', $s, $matches);
$result = $matсhes[1];
No results are shown for this code:
<?php
$s = file_get_contents('http://delicious.com/search?p=ebay.com');
preg_match('~<em>([\\d,]+)</em> results~', $s, $matches);
$result = $matсhes[1];
echo ($result)
?>
I wonder whether Delicious allows such use of their service.
It doesn’t say anything about scraping the site in the T&Cs, which is strange. Doesn’t make it legal though.
However, you should really be using the API for this.
"No results are shown for this code":
It works for me; it shows 15,276.
Oh, after copying the code I found the problem: in $result = $matсhes[1]; the "с" is not an English (Latin) character, sorry.
Here is the valid code:
$s = file_get_contents('http://delicious.com/search?p=ebay.com');
preg_match('~<em>([\\d,]+)</em> results~', $s, $matches);
$result = $matches[1];
echo $result;
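The typo above is hard to see because the look-alike character renders almost identically to a Latin "c". A minimal sketch for spotting such characters in pasted code follows. It assumes the snippet is UTF-8, the helper name findNonAscii is hypothetical (not part of PHP), and the sample uses the Cyrillic letter U+0441 as an assumed stand-in for the stray character:

```php
<?php
// Hypothetical helper: list every non-ASCII character in a snippet
// together with its byte offset, so look-alike letters stand out.
function findNonAscii(string $code): array
{
    $found = [];
    // The /u flag treats the subject as UTF-8;
    // PREG_OFFSET_CAPTURE adds the byte offset of each match.
    if (preg_match_all('/[^\x00-\x7F]/u', $code, $m, PREG_OFFSET_CAPTURE)) {
        foreach ($m[0] as [$char, $offset]) {
            $found[] = [
                'char'   => $char,
                'offset' => $offset,
                'hex'    => bin2hex($char), // raw UTF-8 bytes of the character
            ];
        }
    }
    return $found;
}

// The odd character is written as an escape here so that it is visible.
print_r(findNonAscii("\$result = \$mat\u{0441}hes[1];"));
```

An empty array means the snippet contains only ASCII characters; any entry points at a byte offset worth retyping by hand.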
If I want to replace ebay.com with $website, how do I make it work?
<?php
$s = file_get_contents('http://delicious.com/search?p=ebay.com');
preg_match('~<em>([\\d,]+)</em> results~', $s, $matches);
$result = $matches[1];
echo $result;
?>
The Delicious Terms of Service disagree with what you’re wanting to achieve.
- Feeds and API
Delicious provides access to portions of Delicious via RSS feeds and an API; for the purposes of these Delicious Terms, such access constitutes use of Delicious. Delicious asks that you use these features respectfully, as outlined in the documentation. You may not use these or any other features or Delicious itself to allow the display of a substantial portion of the Delicious database or reproduce, duplicate or copy Delicious. Delicious reserves the right to change these features at any time and to disable access to the feeds and the API at any time for any reason.
I suggest that you make use of their API / RSS tools instead.
I hardly think that scraping a number from a search results page constitutes copying the Delicious service (bookmarking).
karimian, replace:
$s = file_get_contents('http://delicious.com/search?p=ebay.com');
with
$s = file_get_contents('http://delicious.com/search?p='.urlencode($website));
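The replacement above can also be wrapped so that the parsing step is separate from the download, and so that a failure is visible instead of silently printing nothing. This is only a sketch: the function names extractResultCount and fetchResultCount are hypothetical, and the pattern assumes the page still marks the count as <em>N</em> results, as in the code earlier in this thread.

```php
<?php
// Hypothetical helper: pull the count out of already-downloaded HTML.
// Returns null when the expected "<em>N</em> results" markup is absent.
function extractResultCount(string $html): ?int
{
    if (preg_match('~<em>([\d,]+)</em> results~', $html, $matches) !== 1) {
        return null;
    }
    // Drop thousands separators: "15,276" becomes 15276.
    return (int) str_replace(',', '', $matches[1]);
}

// Hypothetical wrapper: build the search URL from $website and fetch it.
function fetchResultCount(string $website): ?int
{
    $url  = 'http://delicious.com/search?p=' . urlencode($website);
    $html = @file_get_contents($url); // false on network or HTTP failure
    return $html === false ? null : extractResultCount($html);
}

// Offline check of the parsing step with a hand-written fragment.
var_dump(extractResultCount('<em>15,276</em> results'));
```

Calling fetchResultCount($website) returns null both when the download fails and when the markup does not match, so the caller can print a message rather than an empty page.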
No results are shown for:
<?php
$s = file_get_contents('http://delicious.com/search?p='.urlencode($website));
preg_match('~<em>([\\d,]+)</em> results~', $s, $matches);
$result = $matches[1];
echo $result;
?>
$website is not defined in your code snippet; you’ll need to give it some value!
<?php
$website = 'example.org';
$s = ...
$website is defined in a PHP tag before this PHP code, but your code did not work.
The code that you posted did not contain the $website so my suggestion was to include it within your code. My code, if copied and pasted, is not expected to do anything (or even run at all!).
If you are going to ask questions about why code is not working, please post the code that you are using and not some vague approximation of it.
Hi.
How can I scrape this data, for example from http://google.com/safebrowsing/diagnostic?site=sitepoint.com?
I want only this text:
What is the current listing status for sitepoint.com?
This site is not currently listed as suspicious.
What happened when Google visited this site?
Of the 151 pages we tested on the site over the past 90 days, 0 page(s) resulted in malicious software being downloaded and installed without user consent. The last time Google visited this site was on 2010-04-15, and suspicious content was never found on this site within the past 90 days.
This site was hosted on 4 network(s) including AS14618 (AMAZON), AS27357 (RACKSPACE), AS25973 (MZIMA).
Has this site acted as an intermediary resulting in further distribution of malware?
Over the past 90 days, sitepoint.com did not appear to function as an intermediary for the infection of any sites.
Has this site hosted malware?
No, this site has not hosted malicious software over the past 90 days.
And, as another example, from http://google.com/safebrowsing/diagnostic?site=yahoo.com:
What is the current listing status for yahoo.com?
This site is not currently listed as suspicious.
Part of this site was listed for suspicious activity 4 time(s) over the past 90 days.
What happened when Google visited this site?
Of the 93691 pages we tested on the site over the past 90 days, 26 page(s) resulted in malicious software being downloaded and installed without user consent. The last time Google visited this site was on 2010-04-15, and the last time suspicious content was found on this site was on 2010-04-15.
Malicious software includes 123 scripting exploit(s), 18 trojan(s), 12 adware(s). Successful infection resulted in an average of 4 new process(es) on the target machine.
Malicious software is hosted on 48 domain(s), including searchspy.co.kr/, yimg.com/, incheongh.com/.
26 domain(s) appear to be functioning as intermediaries for distributing malware to visitors of this site, including puremystique.com/, desihoti.com/, mayatek.info/.
This site was hosted on 33 network(s) including AS36752 (YAHOO), AS14778 (INKTOMI), AS14777 (INKTOMI).
Has this site acted as an intermediary resulting in further distribution of malware?
Over the past 90 days, yahoo.com appeared to function as an intermediary for the infection of 20 site(s) including tamilcinema.com/, phimso9.com/, bollywoodheaven.com/.
Has this site hosted malware?
Yes, this site has hosted malicious software over the past 90 days. It infected 3 domain(s), including oocities.com/, ibkuzi.co.cc/, ibkuzi.freehostia.com/.