Results 1 to 6 of 6
  1. #1
    SitePoint Zealot
    Join Date
    Dec 2000
    Location
    Grosseto, Italy
    Posts
    189

    ASP site crawler. How to make one?

    Hi,
    I am interested in understanding how a crawler works. By crawler I mean something like what some search engines use to fetch sites on the net automatically. How can an ASP script like that look up the content of a site?

    Thanks

  2. #2
    big_al (SitePoint Wizard)
    Join Date
    May 2000
    Location
    Victoria, Australia
    Posts
    1,661
    Hi,

    What you are planning to do is not the easiest of projects.

    The basic principle behind it would be to have a script request the page and look for whatever you're after. If it finds a link, it stores it, goes to it and records any keywords it finds there.
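
    The loop described above (fetch a page, store its links, follow them, record keywords) can be sketched roughly as follows. This is only a minimal illustration in Python rather than ASP; the `crawl` function, the `LinkParser` class, the `PAGES` dictionary and the `example.test` URLs are all made up for the example, and the pages are served from memory so it runs without touching the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import re


class LinkParser(HTMLParser):
    """Collects href values from <a> tags and the text of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl; returns {url: set of lowercase words on the page}.

    `fetch` is any callable that takes a URL and returns HTML text, or None
    if the page could not be retrieved.
    """
    queue = deque([start_url])
    seen = {start_url}
    index = {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        # "Record any keywords": here, simply every word on the page.
        index[url] = set(re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()))
        # "If it finds a link, store it, go to it": queue unseen links.
        for href in parser.links:
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index


# Hypothetical in-memory "site" so the sketch runs without network access.
PAGES = {
    "http://example.test/": '<a href="/about.html">About</a> Welcome home',
    "http://example.test/about.html": '<a href="/">Home</a> About our crawler',
}

index = crawl("http://example.test/", PAGES.get)
```

    For real pages, `fetch` could be a small wrapper around `urllib.request.urlopen`; the `seen` set is what stops the crawler going round in circles when pages link back to each other.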

    The architecture behind this would be a script, a COM module and a high-performance database (SQL Server, PostgreSQL, Oracle).
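
    To give a feel for the database side, here is a rough sketch of a keyword index. It uses Python's built-in `sqlite3` module purely as a stand-in for one of the larger databases named above, and the table names and the `open_index`, `store_page` and `search` helpers are invented for the illustration.

```python
import sqlite3


def open_index(path=":memory:"):
    """Create (if needed) a small two-table schema for crawled pages."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS pages (
            id  INTEGER PRIMARY KEY,
            url TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS keywords (
            page_id INTEGER NOT NULL REFERENCES pages(id),
            word    TEXT NOT NULL,
            PRIMARY KEY (page_id, word)
        );
        CREATE INDEX IF NOT EXISTS idx_keywords_word ON keywords(word);
        """
    )
    return conn


def store_page(conn, url, words):
    """Record a page and the set of words found on it."""
    conn.execute("INSERT OR IGNORE INTO pages (url) VALUES (?)", (url,))
    (page_id,) = conn.execute(
        "SELECT id FROM pages WHERE url = ?", (url,)
    ).fetchone()
    conn.executemany(
        "INSERT OR IGNORE INTO keywords (page_id, word) VALUES (?, ?)",
        [(page_id, w) for w in words],
    )
    conn.commit()


def search(conn, word):
    """Return the URLs of pages containing `word`, sorted alphabetically."""
    rows = conn.execute(
        "SELECT p.url FROM pages p JOIN keywords k ON k.page_id = p.id "
        "WHERE k.word = ? ORDER BY p.url",
        (word.lower(),),
    )
    return [url for (url,) in rows]


# Hypothetical data standing in for what a crawler would collect.
conn = open_index()
store_page(conn, "http://example.test/", {"welcome", "home"})
store_page(conn, "http://example.test/about.html", {"about", "home"})
```

    A search is then a single indexed lookup on `keywords.word`, e.g. `search(conn, "home")`, which is the part a bigger database would be expected to keep fast as the number of pages grows.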

    The COM module would probably need to be written in C if you want a really powerful system, although VB could do the job.

    Not a very simple system to build; that's probably why a company like AltaVista licenses its systems at around the US$750,000 mark.

    Hope this has helped
    .NET Code Monkey

  3. #3
    SitePoint Zealot
    Join Date
    Dec 2000
    Location
    Grosseto, Italy
    Posts
    189

    errr...

    Ehm, well, I think it would have been better if you had never replied... but now you have.

    I was just trying to understand how all that works. Maybe to make my own little one in future. I thought ASP and of course a good database could do that alone.

    OK. Thanks a lot for the help. I still have a lot to learn.

  4. #4
    big_al (SitePoint Wizard)
    Join Date
    May 2000
    Location
    Victoria, Australia
    Posts
    1,661
    Hi,

    I didn't mean to come across as if it were only possible for the programming elite.

    There is a lot to a system like this. I do recall seeing a little article on a basic crawler on asp101.

    Another place you could go is one of the larger coding sites like pscode.com; they might have an example in either ASP or VB.

    This sort of project would probably be a lot easier to implement in ASP.NET, as you can use C#, VB (pure VB), C, PHP, Perl etc.
    .NET Code Monkey

  5. #5
    aspen (Serial Publisher)
    Join Date
    Aug 1999
    Location
    East Lansing, MI USA
    Posts
    12,937
    It's not that hard really. I've made the equivalent of a crawler in PHP.

    One key element though is regular expressions, so if you are unfamiliar with them, become familiar. (On a side note, I don't know if ASP includes regular expressions out of the box or if they are something you'd need to add in.)
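
    As an illustration of where regular expressions come in, the sketch below pulls the links out of a page with a single pattern. It is written in Python rather than ASP or PHP, and the `HREF_RE` pattern, the `extract_links` helper and the sample markup are all assumptions made for the example. The pattern only handles quoted `href` values, so it will miss unquoted attributes and can be fooled by unusual markup.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or href='...' inside an <a ...> tag, case-insensitively.
HREF_RE = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def extract_links(html, base_url):
    """Return absolute URLs for every quoted href found in `html`, in order."""
    return [urljoin(base_url, m.group(2)) for m in HREF_RE.finditer(html)]


# Hypothetical page fragment with absolute, root-relative and relative links.
sample = """
<A HREF="/news.asp">News</A>
<a class='nav' href='contact.asp?x=1'>Contact</a>
<a href="http://other.test/">Elsewhere</a>
"""
links = extract_links(sample, "http://example.test/dir/page.asp")
```

    Resolving each match against the page's own URL with `urljoin` matters because most links on a site are relative, and the crawler needs full addresses to follow them.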
    Chris Beasley - I publish content and ecommerce sites.

  6. #6
    pinkstar (SitePoint Enthusiast)
    Join Date
    May 2001
    Location
    in a gallaxy far, far away
    Posts
    27
    If you are looking for a search engine for your site that is easy to set up, I liked this one:

    http://www.surf-net.co.uk/asp/site_s...rch_script.asp

    It times out, however, with larger sites, because it is so simple.
    pb.

