Thread: Scraping Script
-
Jun 18, 2007, 05:19 #1
- Join Date
- Feb 2004
- Location
- www.bulldog.name
- Posts
- 0
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Scraping Script
Are there any examples of scraping scripts on the net?
-
Jun 18, 2007, 05:40 #2
- Join Date
- Dec 2004
- Location
- London, UK
- Posts
- 1,376
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
A scraping script is just something which connects to a webserver, pretends to perform an action or just gets a page, and interprets the resulting HTML.
There's no generic "scraping script", as everything is different.
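As a rough illustration of those two steps (get a page, then interpret the resulting HTML), here is a minimal sketch using only Python's standard library. The names `fetch`, `extract_links`, and `LinkCollector`, the `User-Agent` string, and the choice to pull out link targets are all assumptions made for the example; what you actually extract depends entirely on the page you are scraping.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url, timeout=10):
    """Step 1: connect to the web server and get the page as text."""
    # The User-Agent value is a made-up placeholder for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Step 2: interpret the resulting HTML (here: list the link targets)."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Calling `extract_links(fetch(some_url))` ties the two steps together. Everything site-specific lives in the parsing step, which is why no single generic script fits every site.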
-
Jun 18, 2007, 05:43 #3
- Join Date
- Aug 2004
- Location
- Manchester UK
- Posts
- 13,807
- Mentioned
- 158 Post(s)
- Tagged
- 3 Thread(s)
lol, thought I might see you in here Bulldog!
Here are a few links for you to browse.....
http://www.daniweb.com/code/snippet293.html
http://www.devnewz.com/devnewz-3-200...eInternet.html
http://codingforums.com/archive/index.php?t-36563.html
Should set you on the right path.
Mike Swiffin - Community Team Advisor
Only a woman can read between the lines of a one word answer.....
-
Jun 18, 2007, 05:52 #4
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
i would hardly call any scraping script "the right path"
-
Jun 18, 2007, 05:57 #5
-
Jun 18, 2007, 08:37 #6
- Join Date
- Dec 2004
- Location
- London, UK
- Posts
- 1,376
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
-
Jun 18, 2007, 08:59 #7
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
-
Jun 18, 2007, 11:13 #8
- Join Date
- Sep 2006
- Location
- Fairbanks, AK
- Posts
- 1,621
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
There are countless legitimate uses of scrapers. I've written more than a dozen here at work in the last year, none of which steal copyrighted material, and all of which were the only option available for the purposes we needed. Most of them scrape content from our own servers (e.g. pulling in dashboard data from a myriad of different monitoring utilities that don't provide alternative means of access such as SOAP or RSS). The ones that reach out across the internet I took great pains to make as "friendly" as possible: they connect directly to what they need and nothing more, and all of them implement local caching so that at most I'm sucking down the remote page once per hour (most are cached for a full 24 hours).
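The local caching described above could look something like the following sketch. It is not the poster's actual code: `cached_fetch`, `fetch_remote`, the on-disk layout (one file per URL, named by a hash of the URL), and the one-hour default are assumptions chosen to match the "at most once per hour" behaviour.

```python
import hashlib
import os
import time
from urllib.request import urlopen


def fetch_remote(url):
    """Download the raw bytes of a page (the only function that hits the network)."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def cached_fetch(url, cache_dir, max_age_seconds=3600, fetcher=fetch_remote):
    """Return the page body, re-downloading only if the cached copy is too old."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the URL so any URL maps to a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key)

    if os.path.exists(path):
        age = time.time() - os.path.getmtime(path)
        if age < max_age_seconds:
            with open(path, "rb") as handle:
                return handle.read()

    # Cache miss or stale copy: fetch once and overwrite the cached file.
    body = fetcher(url)
    with open(path, "wb") as handle:
        handle.write(body)
    return body
```

Passing `max_age_seconds=24 * 3600` gives the 24-hour variant. The `fetcher` parameter exists only so the download step can be swapped out, for example in tests.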
-
Jun 18, 2007, 11:21 #9
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
i was very careful to insert the word "usually" in my statement
bulldog's previous thread was about large databases for sale on the web, for example lyric databases of 500,000+ and recipe databases and so on, and how people get the data for those databases... and i answered "they scrape them from other sites" ... and then he started this new thread
-
Jun 18, 2007, 11:25 #10
-
Jun 18, 2007, 11:28 #11
-
Jun 18, 2007, 14:05 #12
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
-
Jun 19, 2007, 01:17 #13
- Join Date
- Dec 2004
- Location
- London, UK
- Posts
- 1,376
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
-
Jun 19, 2007, 01:20 #14
- Join Date
- Aug 2004
- Location
- Manchester UK
- Posts
- 13,807
- Mentioned
- 158 Post(s)
- Tagged
- 3 Thread(s)
Last edited by spikeZ; Jun 19, 2007 at 03:25. Reason: curse you Rudy ;)
-
Jun 19, 2007, 03:04 #15
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
REST?
plagiarism
-
Jun 19, 2007, 04:33 #16
- Join Date
- Dec 2004
- Location
- London, UK
- Posts
- 1,376
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)