  1. #1
    SitePoint Evangelist
    Join Date
    Nov 2000

    Spiders acting very strangely now....

    Hi. Until a few days ago, Googlebot was coming by all the time and indexing my HTML pages. Then I added the following to my .htaccess file:

    RewriteEngine On
    RewriteRule ^([^/]+)/?$ /cgi-bin/$1/index.html [R=301,L]
    RewriteRule ^([^/]+)?$ /cgi-bin/$1/index.html [R=301,L]

    This is supposed to redirect all requests for "" or "" to "", and it does that perfectly.

    I did this for the sake of PageRank and link popularity. Since then, Google hasn't hit a single HTML page. It has hit a bunch of GIFs and JPGs, and it requests robots.txt (which is empty) after just about every one of those file accesses. Yesterday I noticed that ia_archiver is also requesting robots.txt after almost every other file access.

    Can someone please tell me whether the changes I made to my .htaccess file could have caused this in some way? Everything was going swimmingly until this started. Thanks to anyone who can help!
    But what care I for praise? - Bob Dylan

  2. #2
    SitePoint Enthusiast
    Join Date
    Dec 2002
    Hmmmm. I would actually remove it from your .htaccess, because Apache should pick up the default document (the DirectoryIndex setting) from httpd.conf and send that to anyone requesting just the directory...

    An alternative would be to check whether Googlebot is the one making the request (look at the User-Agent header) and, if so, redirect it to your index page... but be careful, because Google can penalize you for cloaking like this...
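
    For what it's worth, here is a rough way to see which requests the two patterns from post #1 would catch. This is only a sketch in Python, not Apache itself: it assumes the .htaccess sits in the document root (so the leading slash is stripped before matching), the helper name redirect_target and the sample paths are made up for illustration, and Python's regex engine is standing in for Apache's. If those assumptions hold, a request for robots.txt would itself get a 301 into /cgi-bin/, which might be why the spiders keep asking for it.

```python
import re

# The two patterns from the .htaccess in post #1. In per-directory (.htaccess)
# context the leading slash is stripped before matching, so the sample paths
# below are written without it. (Assumes the file lives in the document root.)
RULES = [
    re.compile(r"^([^/]+)/?$"),
    re.compile(r"^([^/]+)?$"),
]


def redirect_target(path):
    """Return the 301 target the rules would build, or None if no rule matches.

    Hypothetical helper for illustration; it only mimics the pattern matching
    and substitution, not the rest of mod_rewrite.
    """
    for pattern in RULES:
        match = pattern.search(path)
        if match:
            # An unmatched optional group substitutes as an empty string.
            return "/cgi-bin/{}/index.html".format(match.group(1) or "")
    return None


# Made-up sample paths, not taken from the poster's logs.
samples = ["robots.txt", "page.html", "somedir/", "", "images/logo.gif"]
for path in samples:
    print(repr(path), "->", redirect_target(path))
```

    Under these assumptions, anything directly under the root matches, including robots.txt and top-level pages, while paths with a subdirectory in them (like images/logo.gif) fall through untouched. Limiting the rule to real directories, or dropping it as suggested above, would avoid that.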

