  1. #1
    SitePoint Addict Corobori's Avatar
    Join Date
    Jun 2003
    Location
    Concepcion, Chile
    Posts
    388
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Question: Watch out for your bandwidth usage!

    Hi,

    On two unrelated websites I ran into the same bandwidth problem in June: Baiduspider and Googlebot "burned" a huge amount of bandwidth. On one site in particular, normal web traffic used 20 GB for June, whereas Googlebot used 96 GB and Baiduspider used 89 GB. That is 185 GB for just two spiders! I have just reduced Google's crawl rate from the Webmaster Tools panel and disallowed Baidu in robots.txt. Let's see what happens.
    Jean-Luc
    Corobori WebDesign
    Working in the Concepcion area, Chile, since 1999
    Follow us on Twitter
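
    A robots.txt rule like the Baidu block described in this post can be sanity-checked before it goes live. The sketch below uses Python's standard `urllib.robotparser`. The rule text and the `example.com` URL are assumptions for illustration, not the actual file from the site in question. Keep in mind that robots.txt is advisory, so only crawlers that choose to honor it will stay away.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: shut out Baidu's crawler, leave others alone.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# example.com is a placeholder host; only the path matters for the rule check.
baidu_allowed = parser.can_fetch("Baiduspider", "http://example.com/any-page")
google_allowed = parser.can_fetch("Googlebot", "http://example.com/any-page")

print("Baiduspider allowed:", baidu_allowed)
print("Googlebot allowed:", google_allowed)
```

    With these rules the parser reports Baiduspider as disallowed everywhere while Googlebot stays unaffected, which matches the intent of blocking one crawler and only throttling the other.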

  2. #2
    SitePoint Wizard
    Join Date
    Oct 2005
    Posts
    1,770
    Mentioned
    5 Post(s)
    Tagged
    1 Thread(s)
    Did you check the bot's IP address to verify that it was a genuine Googlebot? Some bots spoof the Googlebot user agent.

    I've had to use htaccess to block a number of undesirable bots that were sucking up a lot of data transfer. The Brandwatch bot hammered my site with up to 8 page requests per second. I emailed the folks over there and asked them to stop the bot from spidering my site. They said it would stop, but it didn't. Now all it gets is a denied error and zero bytes transferred.

    If you don't want Chinese traffic, blocking Baidu is a good idea. When I blocked Baidu, the amount of spam I got on my forum dropped. Blocking Baidu and Yandex, and denying referrers from the .cn, .ru, and .ua TLDs, dropped my forum spam to next to nothing. I have had only a few spam posts all year, fewer than I used to get in a single day.

    These bots don't just consume data transfer; they also use server resources, and there is no point in getting into hot water with your host for using too many server resources to serve data to bots that do nothing for you.
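
    The Googlebot check suggested in this post is usually done with a reverse DNS lookup on the requesting IP, followed by a forward lookup on the returned hostname to confirm it maps back to the same IP. Below is a minimal sketch of that idea in Python. The function names are made up for illustration, and the accepted hostname suffixes (`googlebot.com`, `google.com`) are an assumption based on what Google has documented for its crawler; check the current documentation before relying on them.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(hostname: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_genuine_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
) -> bool:
    """Reverse-then-forward DNS check; any lookup failure counts as 'not genuine'."""
    try:
        hostname = reverse(ip)
        # The leading dot in each suffix rejects look-alikes such as evilgooglebot.com.
        if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the IP that made the request.
        return ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False
```

    The resolver functions are passed in as parameters so the check can be exercised without network access. In real use, `is_genuine_googlebot(ip)` would be called with the client IP taken from the access log, and the result cached, since DNS lookups on every request would add their own load.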

  3. #3
    SitePoint Addict Corobori's Avatar
    Thanks for your input. It appears to be genuine Googlebot traffic, so I lowered the crawl rate. As for Baidu, I guess I'll have to find another approach, as I am not sure htaccess is available on that Windows server.
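
    Where htaccess is not an option, the same kind of user-agent block can in principle be applied in the web server's own request filtering or in application code. The site's actual stack is not stated in this thread, so the following is only an illustrative sketch, written as a Python WSGI middleware. `BlockBots`, `BLOCKED_AGENTS`, and `demo_app` are hypothetical names. Blocked crawlers get a 403 with an empty body, so almost no bandwidth is spent on them.

```python
# Hypothetical block list: lowercase substrings matched against the User-Agent header.
BLOCKED_AGENTS = ("baiduspider",)


class BlockBots:
    """WSGI middleware that answers blocked crawlers with an empty 403."""

    def __init__(self, app, blocked=BLOCKED_AGENTS):
        self.app = app
        self.blocked = tuple(token.lower() for token in blocked)

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in self.blocked):
            # Refuse the request without sending any page content.
            start_response("403 Forbidden", [("Content-Length", "0")])
            return [b""]
        # Everything else is passed through to the wrapped application.
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the real site, used only to demonstrate the middleware."""
    body = b"Hello, visitor"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


application = BlockBots(demo_app)
```

    User-agent strings are trivially spoofed, so a filter like this only stops bots that identify themselves honestly.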

  4. #4
    SitePoint Mentor bronze trophy
    John_Betong's Avatar
    Join Date
    Aug 2005
    Location
    City of Angels
    Posts
    1,578
    Mentioned
    62 Post(s)
    Tagged
    3 Thread(s)
    Quite some time ago I experienced similar bot bandwidth problems. I Googled and applied the following recommendation. Note that Googlebot itself ignores this directive.

    robots.txt
    Code:
    User-agent: *
    Crawl-delay: 10
    It may have been coincidental, but I am pleased to say the bot bandwidth dropped.
    Last edited by John_Betong; Jul 1, 2013 at 20:01. Reason: rephrased
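
    To see what a `Crawl-delay` of 10 means in practice, the directive can be read back with Python's standard `urllib.robotparser` (the `crawl_delay` method needs Python 3.6 or later). The sketch assumes the directive sits in a `User-agent: *` group and that a compliant crawler issues one request at a time, which gives a rough upper bound on its daily request count. Crawlers are free to ignore the directive altogether.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: the Crawl-delay directive inside a catch-all group.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The catch-all group applies to any crawler that has no group of its own.
delay = parser.crawl_delay("Baiduspider")

# One request every `delay` seconds, around the clock.
seconds_per_day = 24 * 60 * 60
max_requests_per_day = seconds_per_day // delay

print("Crawl delay (s):", delay)
print("Max requests per day for a compliant crawler:", max_requests_per_day)
```

    A 10-second delay caps a compliant single-connection crawler at 8,640 requests per day, compared with the hundreds of thousands that a bot making several requests per second can generate.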

  5. #5
    SitePoint Addict Corobori's Avatar
    BaiduSpider activity:
    picBaidu.png

  6. #6
    SitePoint Member
    Join Date
    Jul 2013
    Posts
    0
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hey Cheesedude, sorry to hear that our bot hasn't been obeying. Would you mind sending another email to joel@ brandwatch .com with the relevant information? I'll try to get that sorted for you.

    Thanks,

    Joel Windels
    LCM at Brandwatch

