  1. #1
    HAHA!
    Join Date
    Mar 2006
    Posts
    656
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Bad bot trap logic and implementation

    I understand the logic of this bad bot trap and need to implement it on one of my sites. I am a php noob and the site is currently .html

    I will convert all pages to php to start off. So the question I have is this: Logically, would I not have to include both the blacklist.php and the bot trap link on every single page to catch malicious bots? As bots could come in from external links it would seem logical to catch them as soon as they enter the page and/or have their id checked by the blacklist.php?
    Cheap web hosting directory listing the cheapest web hosting

    Submit articles to an article directory

  2. #2
    php_daemon
    Join Date
    Mar 2006
    Posts
    5,284
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Probably yes. Why not use the .htaccess way and put a protection on the entire domain in one place, though?
    Saul

  3. #3
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,580
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally Posted by Hostpitable
    I understand the logic of this bad bot trap and need to implement it on one of my sites. I am a php noob and the site is currently .html

    I will convert all pages to php to start off. So the question I have is this: Logically, would I not have to include both the blacklist.php and the bot trap link on every single page to catch malicious bots? As bots could come in from external links it would seem logical to catch them as soon as they enter the page and/or have their id checked by the blacklist.php?
    While you're changing all your files to PHP, maybe it's a good time to consider re-organizing your site. You have a common header and footer, and probably navigation, on every page. Rather than repeat the markup for those for as many pages as there are, and have to edit them all to change anything, make single header, navigation and footer files and include them in each page with PHP.

    Then you only have to make changes in one place to update your site in the future, and adding a bot trap, a web stats tracker, ad tracking code, a live chat link, or any number of other things becomes immensely easier.

    Quote: Originally Posted by php_daemon
    Probably yes. Why not use the .htaccess way and put a protection on the entire domain in one place, though?
    The PHP method updates the blacklist automatically whenever a new "bad bot" is found, before it's able to waste your bandwidth copying your entire site.
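
    To make the idea of a self-updating blacklist concrete, here is a minimal sketch. It is not the script linked in the first post; the function names, the one-IP-per-line file format, and the file path are all assumptions chosen for illustration.

```php
<?php
// Hypothetical helpers: names and file format are assumptions,
// not taken from the bot trap script discussed in this thread.

/** Return true if $ip is already recorded in the blacklist file. */
function is_blacklisted(string $ip, string $file): bool
{
    if (!is_file($file)) {
        return false;
    }
    // One IP per line; fall back to an empty list if the file cannot be read.
    $ips = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: [];
    return in_array($ip, $ips, true);
}

/** Append $ip to the blacklist file unless it is already listed. */
function add_to_blacklist(string $ip, string $file): void
{
    if (!is_blacklisted($ip, $file)) {
        // LOCK_EX avoids interleaved writes when two requests arrive together.
        file_put_contents($file, $ip . "\n", FILE_APPEND | LOCK_EX);
    }
}

// In the trap script (the target of the hidden link), one might call:
//     add_to_blacklist($_SERVER['REMOTE_ADDR'], $file);
// At the top of every page, e.g. in a shared header include:
//     if (is_blacklisted($_SERVER['REMOTE_ADDR'], $file)) {
//         http_response_code(403);
//         exit;
//     }
```

    Because the check runs on every page, a bot that hit the trap once is refused wherever it enters the site next, which is why the check (and ideally the trap link) belongs in a shared include rather than in a single page.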

  4. #4
    HAHA!
    Join Date
    Mar 2006
    Posts
    656
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the reply Dan. Well as I mentioned I'm still learning. Header, footer and sidebar via includes would be easy enough to accomplish I guess but to have custom description and meta tags I would already need to get up to a more advanced level and maybe pull stuff from a db.

    But you're right the bot trap serves as a wake up call with regards to site reorganisation. The big challenge is that it all needs to look like before only script generated (we don't want to upset google do we?)

  5. #5
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,580
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally Posted by Hostpitable
    Thanks for the reply Dan. Well as I mentioned I'm still learning. Header, footer and sidebar via includes would be easy enough to accomplish I guess but to have custom description and meta tags I would already need to get up to a more advanced level and maybe pull stuff from a db.
    Your header can start after the <body> tag if you want. It doesn't need to include the title and meta tags; they can be left in the individual files.

    Quote: Originally Posted by Hostpitable
    But you're right the bot trap serves as a wake up call with regards to site reorganisation. The big challenge is that it all needs to look like before only script generated (we don't want to upset google do we?)
    If you do nothing more than copy a section of HTML into a file, and replace that section of HTML with an include() statement pointing to the new file, then absolutely nothing about the output of your URL has changed.

    The whole process, at least for an initial pass where all you're doing is moving common elements to common files, takes only as long as you need to select, press delete, and paste the include() line into each file.

  6. #6
    HAHA!
    Join Date
    Mar 2006
    Posts
    656
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Great, that's the first step to a much more effective work process.

    By the way, can you maybe pm me your favourite book about PHP&MySQL that would help me to get started on more complex projects?

  7. #7
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,580
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally Posted by Hostpitable
    By the way, can you maybe pm me your favourite book about PHP&MySQL that would help me to get started on more complex projects?
    SitePoint's Build Your Own Database-Driven Website Using PHP & MySQL is a good book to start with, although I wish the example code were a little better in terms of best practices and security. About 7 years ago, this was a 10-part article series on the sitepoint.com site, and that's where I started learning PHP. The articles were so popular they expanded the series into a book, which is now in its 3rd updated edition.

  8. #8
    HAHA!
    Join Date
    Mar 2006
    Posts
    656
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks Dan! Looks like a good book to get me started.

  9. #9
    HAHA!
    Join Date
    Mar 2006
    Posts
    656
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Could it be that the frugal Scot inside of me has located these articles here ?

  10. #10
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,580
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally Posted by Hostpitable
    Could it be that the frugal Scot inside of me has located these articles here ?
    No, the original series is no longer on SitePoint, and what remains is essentially a teaser to get you to buy the book. The original articles wouldn't be that useful, anyway, since a lot has changed in 7 years.

  11. #11
    kromey
    Join Date
    Sep 2006
    Location
    Fairbanks, AK
    Posts
    1,621
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hostpitable: for per-page meta keywords and the like, you may be interested in a method that I've used with great success on many different sites. It goes something like this:

    Code php:
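    <?php
    // header.php: the shared header include.
    // Expects $title and $keywords to be set by the including page.
    ?>
    <head>
    <title><?php echo htmlspecialchars($title); ?></title>
    <meta name="keywords" content="<?php echo htmlspecialchars($keywords); ?>" />
    </head>
    <body>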

    Usage looks like this:
    Code php:
    <?php
    // Set the per-page values first, then pull in the shared header.
    $title = "Welcome to my page";
    $keywords = "welcome,page,blah,blah,blah";
    include("header.php");
    ?>
    This will let you include your bot-trap code on each page while still allowing each page to maintain its own identity. (By the way, I'm glad you linked to the bot trap here: more than a couple of my sites are getting hit by countless bots, many of them repeats, and I think I'll be adopting a version of it for my sites.) Plus, it requires neither advanced PHP knowledge nor a database!
    PHP questions? RTFM
    MySQL questions? RTFM

  12. #12
    kromey
    Join Date
    Sep 2006
    Location
    Fairbanks, AK
    Posts
    1,621
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Another thing you may want to look into, Hostpitable, is Spamhaus's DROP List. It's essentially a list of networks operated by known spammers who run bot nets of various types. It covers entire networks along with subnets, some of which are as big as /16; the largest is a /15 that comprises 122.8.0.0 - 122.9.255.255. Spamhaus does focus on e-mail spam and not bot nets, but I intend to employ the DROP List in an advisory manner shortly on one of my sites that gets hammered by bots. If you're interested in the results of my testing and the code itself, PM me and I'll keep you apprised as progress is made.
    PHP questions? RTFM
    MySQL questions? RTFM
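
    Checking a visitor against a list of /XX networks comes down to a bit-mask comparison. The following is a rough sketch only, assuming IPv4 addresses and a 64-bit PHP build; the function name ip_in_cidr is hypothetical, and parsing the DROP List file itself is left out.

```php
<?php
// Hypothetical helper, not part of any Spamhaus tooling.
// Assumes IPv4 and a 64-bit PHP build (so 0xFFFFFFFF is an integer).

/** Return true if IPv4 address $ip lies inside the CIDR block $cidr. */
function ip_in_cidr(string $ip, string $cidr): bool
{
    // "a.b.c.d/nn" -> network and prefix length; a bare address counts as /32.
    [$network, $bits] = explode('/', $cidr, 2) + [1 => '32'];

    $ipLong  = ip2long($ip);
    $netLong = ip2long($network);
    if ($ipLong === false || $netLong === false || !ctype_digit($bits)) {
        return false;
    }
    $bits = (int) $bits;
    if ($bits > 32) {
        return false;
    }

    // Mask with the top $bits bits set, e.g. /15 -> 0xFFFE0000.
    $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;

    return ($ipLong & $mask) === ($netLong & $mask);
}

// Example with the /15 mentioned above:
var_dump(ip_in_cidr('122.9.255.255', '122.8.0.0/15')); // inside the block
var_dump(ip_in_cidr('122.10.0.0', '122.8.0.0/15'));    // just outside it
```

    A full check would loop over every network on the list, so for a large list it may be worth caching the parsed networks rather than re-reading the file on each request.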

  13. #13
    HAHA!
    Join Date
    Mar 2006
    Posts
    656
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Kromey, thanks for the tutorial on including the keywords. That way, I'll also include my description. If I get more advanced that will probably be pulled from MySQL, but I have a lot of reading to do before my whole site is generated through an index.php.

    I read the ROKSO list on Spamhaus about a year ago (it's a bit like a crime novel). That DROP list is interesting and I will definitely ask my shared host what they are doing about this. I PM'ed you about your coding project.

    As for the bad bot trap, the logic is compelling and really manages to distinguish bots from humans (almost like an invisible captcha).

    @ Dan, I ordered the book.

  14. #14
    kromey
    Join Date
    Sep 2006
    Location
    Fairbanks, AK
    Posts
    1,621
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    So far, out of 4 spam registrations, my Spamhaus lookup has a capture rate of 100%. That is to say, Spamhaus listed the IP of 1 of the visitors and 3 of the e-mail domains; the 4th e-mail domain (which, sadly, was not the registration where Spamhaus had the visitor IP listed) just simply does not exist (no DNS records of any type for wrudoza.com) and thus would be tossed out along with the spammers in the final version of this code.

    Once I've got more data (and assuming my capture rate stays at least as high as it is now) I'll post my results along with code so that everyone can help stamp out spam.

    I should clarify that I am not in fact using Spamhaus' DROP List yet. It turned out to be easier to integrate a lookup against Spamhaus' Zen DNSBL than to handle subnets of the /XX variety.
    PHP questions? RTFM
    MySQL questions? RTFM
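
    For readers wondering what a DNSBL lookup looks like in PHP, here is a hedged sketch. It is not kromey's code, which was not posted in this thread; the function names are hypothetical, only IPv4 is handled, and treating any 127.0.0.x answer as a listing is an assumption that should be checked against Spamhaus's own documentation and usage terms.

```php
<?php
// Hypothetical helpers sketching a DNSBL check; not the code described above.

/** Build the DNSBL query name: reversed IPv4 octets followed by the zone. */
function dnsbl_query_name(string $ip, string $zone = 'zen.spamhaus.org'): ?string
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null; // not a valid IPv4 address
    }
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

/** True if the DNSBL answers with a 127.0.0.x code (needs network access). */
function is_listed(string $ip, string $zone = 'zen.spamhaus.org'): bool
{
    $name = dnsbl_query_name($ip, $zone);
    if ($name === null) {
        return false;
    }
    // gethostbynamel() returns false when the name does not resolve,
    // which for a DNSBL means "not listed".
    $answers = gethostbynamel($name) ?: [];
    foreach ($answers as $answer) {
        if (strpos($answer, '127.0.0.') === 0) {
            return true;
        }
    }
    return false;
}

/** True if an e-mail domain has at least an MX or A record (needs network access). */
function email_domain_resolves(string $domain): bool
{
    return checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A');
}
```

    A registration form could call is_listed() with the visitor's address and email_domain_resolves() with the part of the e-mail address after the @, rejecting or flagging the sign-up when either check fails.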

