  1. #1
    SitePoint Evangelist
    Join Date
    Feb 2005
    Posts
    520
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    .htaccess to stop msnbot?

    Hi,

    For the last two days, our forum and our wiki have been taken offline for hours on end by what appears to be heavy bot traffic from msnbot. Some searching around suggests this may no longer be the actual msnbot, but third parties renting the servers?

    We've tried to restrict the msnbot via robots.txt, but it doesn't seem to respect that. We also tried an .htaccess rewrite, like this:

    RewriteCond %{HTTP_USER_AGENT} msnbot [NC]
    RewriteRule . - [F]

    This doesn't seem to work either. Perhaps the bot doesn't always send its user agent? Has anyone had any experience getting this particular bot to leave their site alone?

  2. #2
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    8,904
    Mentioned
    139 Post(s)
    Tagged
    2 Thread(s)
    You did put
    Code:
    RewriteEngine On
    in your .htaccess I assume?
    Rémon - Hosting Advisor

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy

  3. #3
    SitePoint Evangelist
    Join Date
    Feb 2005
    Posts
    520
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, sorry! Was just quoting the relevant bit, but good point.

  4. #4
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    8,904
    Mentioned
    139 Post(s)
    Tagged
    2 Thread(s)
    Okay
    So, are you sure the bot has "msnbot" in its name, spelled exactly as you have it? What does your access log say?
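    One way to answer that from the access log is to tally which user agents and client IPs match. The following is only a rough sketch: it assumes the log is in Apache's "combined" format, and the names LINE_RE, tally and report are made up for illustration. A custom LogFormat on the host would need a different pattern.

```python
import re
from collections import Counter

# Assumes Apache "combined" log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
# Lines with escaped quotes inside a field will not match and are skipped.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def tally(lines, needle="msnbot"):
    """Count requests per client IP and per user agent whose agent contains `needle`."""
    by_ip = Counter()
    by_agent = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not in the assumed format
        agent = match.group("agent")
        if needle.lower() in agent.lower():  # case-insensitive, like [NC]
            by_ip[match.group("ip")] += 1
            by_agent[agent] += 1
    return by_ip, by_agent


def report(path, needle="msnbot", top=10):
    """Print the busiest matching IPs and agents found in the log file at `path`."""
    with open(path, errors="replace") as handle:
        by_ip, by_agent = tally(handle, needle)
    for ip, count in by_ip.most_common(top):
        print(count, ip)
    for agent, count in by_agent.most_common(top):
        print(count, agent)
```

    Calling report() with the path to the log lists the heaviest matching clients. If nothing matches, the bot is probably not sending "msnbot" in its user agent at all, which would explain why the RewriteCond never fires.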

  5. #5
    Non-Member
    Join Date
    Dec 2011
    Posts
    25
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If you know the bot's IP address, you can block it in your .htaccess by adding this code:

    <Limit GET POST>
    # allow,deny: Allow is evaluated first, then Deny, so the deny line wins for x.x.x.x
    order allow,deny
    allow from all
    deny from x.x.x.x
    </Limit>

    x.x.x.x = the bot's IP address

  6. #6
    SitePoint Evangelist
    Join Date
    Feb 2005
    Posts
    520
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I need to get the access logs from my host. She said it was msnbot, but I guess we need to confirm that for sure. Will try the IP approach as well, thanks.

    Though it does seem like we have some other ongoing issues with Apache, so perhaps the bots were just the straw that broke the camel's back.
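    On confirming whether the traffic really is msnbot: a common check is a forward-confirmed reverse DNS lookup on the client IP. The sketch below assumes that genuine msnbot hosts reverse-resolve to names ending in .search.msn.com, which should be checked against Microsoft's current guidance. The function name and the injectable reverse/forward parameters are made up for illustration, and it only handles IPv4.

```python
import socket

# Assumption: genuine msnbot hosts have reverse DNS names under this domain.
TRUSTED_SUFFIX = ".search.msn.com"


def is_genuine_msnbot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if `ip` passes a forward-confirmed reverse DNS check."""
    try:
        # Step 1: reverse lookup, IP -> hostname.
        hostname = reverse(ip)[0]
        if not hostname.lower().endswith(TRUSTED_SUFFIX):
            return False
        # Step 2: forward lookup, hostname -> IPs; the original IP must be
        # among them, otherwise the reverse record could be spoofed.
        return ip in forward(hostname)[2]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses;
        # treat any lookup failure as "not verified".
        return False
```

    An IP that claims to be msnbot in its user agent but fails this check is a reasonable candidate for the deny rule suggested above.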

