  1. #1
    SitePoint Wizard
    Join Date
    Apr 2002
    Posts
    2,322
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    user agent bot/spider/crawler detection

    hiyer
    i need to detect search engine bots/spiders/crawlers, or whatever they're called.
    at the moment my (php) script checks the user agent for these three substrings -

    "bot", "spider" and "crawl"

    so if the user agent contains any of those strings (regardless of upper/lowercase) it gets flagged as a bot

    can anyone see any problems with this? are there search engine bots whose user agent contains none of those three words?
    and are there any normal browsers whose user agent does contain one of them (i doubt it, but you never know)?
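
    For reference, the check described above could look roughly like this. It is only a minimal sketch: the function name looksLikeBot is made up for illustration, and it assumes the user agent is read from $_SERVER['HTTP_USER_AGENT'].

    ```php
    <?php
    // Hypothetical helper: true if the user agent contains "bot", "spider" or "crawl".
    function looksLikeBot(string $userAgent): bool
    {
        foreach (['bot', 'spider', 'crawl'] as $needle) {
            // stripos() is case-insensitive and returns false when nothing matches,
            // so compare strictly against false (a match at position 0 is valid).
            if (stripos($userAgent, $needle) !== false) {
                return true;
            }
        }
        return false;
    }

    // The header can be missing entirely, so fall back to an empty string.
    $userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
    $isBot = looksLikeBot($userAgent);
    ```

    Note that the user agent header is supplied by the client, so a check like this only catches crawlers that identify themselves honestly.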

  2. #2
    ...
    Join Date
    Jan 2002
    Location
    London, UK
    Posts
    759
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    One obvious one is AltaVista - their bot's called Scooter

    there's a large list of bot names and IP addresses here
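
    One way to cover bots like Scooter is to keep a list of known names next to the generic substrings. The sketch below is illustrative only: the function name isKnownCrawler and its $knownNames parameter are hypothetical, the list holds just the one name mentioned here, and the sample user agent strings in the comments are made-up examples rather than strings taken from any list.

    ```php
    <?php
    // Hypothetical helper: generic substrings plus a caller-supplied list of bot names.
    function isKnownCrawler(string $userAgent, array $knownNames = ['scooter']): bool
    {
        $needles = array_merge(['bot', 'spider', 'crawl'], $knownNames);
        foreach ($needles as $needle) {
            // Case-insensitive substring match; strict comparison because
            // stripos() returns 0 for a match at the start of the string.
            if (stripos($userAgent, $needle) !== false) {
                return true;
            }
        }
        return false;
    }

    // Illustrative calls (user agent strings are assumed examples):
    // isKnownCrawler('Scooter/3.3')                       matches via the name list
    // isKnownCrawler('Mozilla/4.0 (compatible; MSIE 6.0)') matches nothing
    ```

    Short or common names in the list raise the chance of flagging a normal browser by accident, so the list needs to be chosen with some care and kept up to date.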

  3. #3
    Serial Publisher silver trophy aspen's Avatar
    Join Date
    Aug 1999
    Location
    East Lansing, MI USA
    Posts
    12,937
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Chris Beasley - I publish content and ecommerce sites.
    Featured Article: Free Comprehensive SEO Guide
    My Guide to Building a Successful Website
    My Blog|My Webmaster Forums

  4. #4
    SitePoint Wizard
    Join Date
    Apr 2002
    Posts
    2,322
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    yeah, i didn't realise there were quite so many of 'em :/

    this is the most comprehensive list i've come across that has their user agent strings -
    http://www.siteware.ch/webresources/useragents/spiders/

    makes it pretty hard to check for them.

    ok, thanks.
