  1. #1
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    robots.txt redirect if not spider

    How would I redirect a user if they try to go to my robots.txt file? Obviously spiders will still need to be able to access it. I'm guessing this would be done in .htaccess, but I'm unsure how.

    Thanks

  2. #2
    AlexDawson
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    Why would you want to? There are no security implications in readers being able to see the robots file (after all, the paths it lists are only hidden from search engines, not from users). Besides, most people don't know that robots.txt exists, and it's just a simple plain-text format.

  3. #3
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    True, but it would be added security if users had no idea that I had an admin section.

  4. #4
    AlexDawson
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    Does your admin section require a username and password to access it? If so, why are you even adding it to the robots file? Search engines can't index protected content. 90% of all websites have an admin section (especially CMS-powered ones), so the chances are your users know you have one. The problem is that if you try to filter the traffic to the robots file via .htaccess, you stand a very good chance of breaking its effectiveness with search engines. In summary: don't do it.
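
For context on what is at stake if crawlers cannot read the file, here is a minimal sketch of how a well-behaved crawler consults robots.txt before fetching a URL. It uses Python's standard `urllib.robotparser`; the bot name `ExampleBot`, the `example.com` URLs, and the `/crl_admin/` path are illustrative assumptions, not anything taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, as they might appear in a site's robots.txt.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /crl_admin/",
]

rules = RobotFileParser()
rules.parse(ROBOTS_LINES)  # parse the rules directly, no network access needed

# A compliant crawler asks before every fetch.
admin_allowed = rules.can_fetch("ExampleBot", "https://example.com/crl_admin/login")
home_allowed = rules.can_fetch("ExampleBot", "https://example.com/index.html")

print(admin_allowed)  # False
print(home_allowed)   # True
```

If the crawler is redirected away from robots.txt because its User-Agent was not recognised, it never sees these rules at all, and how it then treats the site depends on the crawler.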

  5. #5
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No, I don't believe it's redundant. If users don't know it's there, they won't mess with it. "Out of sight, out of mind."

  6. #6
    AlexDawson
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by SublimeSite
    No, I don't believe it's redundant. If users don't know it's there, they won't mess with it. "Out of sight, out of mind."
    Actually, that is redundant. If you really believe that your users won't assume you have an admin area, you're being naive. What you're claiming is that you're willing to place more weight on the potential ignorance of your audience than on the potential loss you might suffer if a search engine is mistaken (quite easily) for a user. Wanting to prevent people from accessing the robots file is just a case of being paranoid.

    PS: If your admin area requires authentication it shouldn't even be in the robots file.

  7. #7
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Let's think about this: if I were to make a section called "crl_admin", what type of user would guess that unless they read the robots file? I am not being ignorant by assuming that my users would not know the correct directory unless there was a robots file.

  8. #8
    AlexDawson
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    What type of user would actively seek out the robots file unless it was someone intent on finding pages you were trying to hide from search engines? The answer is advanced users, who are more than capable of getting around a redirection script. In fact, the very fact that you're trying to protect the content will probably be grounds enough for someone wanting to target your website to know that you're worried, making them more likely to try to attack. All you're doing is trying to justify the need for something that would only affect people who could get around your "protection" in a matter of seconds; if anything, you're inviting an attack. The average user doesn't even know what a robots.txt file is, so it's silly to assume that they will be looking inside it.

  9. #9
    SitePoint Member
    Join Date
    Jun 2007
    Posts
    16
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by SublimeSite
    Let's think about this: if I were to make a section called "crl_admin", what type of user would guess that unless they read the robots file? I am not being ignorant by assuming that my users would not know the correct directory unless there was a robots file.
    And correct me if I'm wrong, but a spider will only go to your admin folder/file if you have a public link to it somewhere (in which case users could find it regardless). Spiders don't just come up with folders from thin air; they follow links or try common names (which won't work here if you use something unique).
    I'm with Alex on this one: there is zero reason to have your admin links in the robots.txt file,
    which makes preventing user access to it quite redundant.

    And let's not forget, it's quite easy to trick the server into thinking you're a "spider" anyway and so get around most prevention methods. That means the file can be viewed anyway, which goes back to you actually adding a security issue by having the folder/file names there in the first place.

    So, make the folder name unique or hard to guess, and ensure there are zero links to it that the public can see.
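
To make the "trick the server" point concrete, here is a rough simulation of the kind of User-Agent check a redirect rule would have to rely on. This is only a sketch: `BOT_PATTERN`, `serve_robots_txt`, and the sample User-Agent strings are hypothetical, and a real setup would live in the server configuration rather than in Python.

```python
import re

# Hypothetical allow-list of crawler names the rule would look for.
BOT_PATTERN = re.compile(r"googlebot|bingbot|slurp", re.IGNORECASE)


def serve_robots_txt(user_agent: str) -> str:
    """Return what a naive User-Agent filter would send for /robots.txt."""
    if BOT_PATTERN.search(user_agent or ""):
        return "200 robots.txt"
    return "302 redirect"


# An ordinary browser, a visitor who simply claims to be Googlebot,
# and a legitimate crawler that is not on the allow-list.
browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0"
spoofed = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
unknown_crawler = "ExampleCrawler/1.0"

print(serve_robots_txt(browser))          # 302 redirect
print(serve_robots_txt(spoofed))          # 200 robots.txt
print(serve_robots_txt(unknown_crawler))  # 302 redirect
```

The filter fails in both directions: the visitor who lies about the User-Agent gets the file, while a genuine crawler the list does not know about is turned away.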

  10. #10
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You answered your own question: the type of user that would use the robots file is one intent on finding pages I am trying to hide. Either way, you're not understanding. What you are basically saying is that password protection is pointless too, because it tells your users you have something to hide.

  11. #11
    AlexDawson
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    OK then, how are you going to prevent people who make themselves appear as a bot from seeing the robots file? It's very easily done; in fact, you can do it quickly and easily by installing a Firefox add-on, and that will essentially cripple your script. Password protection is not pointless, because it gives people a way to authenticate for content that NEEDS to be protected. The robots.txt file does not need any protection. We understand exactly why you want to do it, but the fact of the matter is that it's impossible to achieve (in the sense of handshake/challenge-free partial identity authentication) and, by nature, it's actually quite a silly thing to ask for. What you're asking for is something "secure" that only spiders can see, yet all it takes is a very simple tweak to make yourself appear as a spider, and all that effort goes down the toilet. People don't suddenly see a link to an admin panel and feel compelled to hack it, so why do you think anyone who reads your robots file will want to hack you?

    PS: Doesn't it bother you that, to my knowledge, no one else in the history of web design has felt compelled to try to protect the file like this, from individuals to Fortune 500 multi-billion-dollar businesses and bank websites, yet you're claiming that this thing which no one else uses is suddenly needed?
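
A browser add-on is not even required to appear as a spider. As a sketch of how small the "tweak" is, the snippet below builds (but does not send) an HTTP request that claims to be Googlebot, using only Python's standard library. The `example.com` URL and the helper name `build_spoofed_request` are assumptions for illustration.

```python
from urllib.request import Request

# A User-Agent string of the kind Googlebot sends; anyone can copy it.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_spoofed_request(url: str) -> Request:
    """Build a request whose User-Agent header claims to be a crawler."""
    return Request(url, headers={"User-Agent": GOOGLEBOT_UA})


req = build_spoofed_request("https://example.com/robots.txt")

# urllib stores header names capitalised as "User-agent".
print(req.get_header("User-agent"))
```

Passing `req` to `urllib.request.urlopen` would send it; from the server's side, the header is indistinguishable from the one a real crawler presents.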

  12. #12
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    alright...

  13. #13
    logic_earth
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Mentioned
    8 Post(s)
    Tagged
    0 Thread(s)
    There is really zero need to have the administrative control panel in the robots.txt file. This overzealous urge to hide things is, in my opinion, ridiculous: you spend all your time on this fake security while neglecting the real security that will actually keep attackers out. So a user finds out the administrative control panel for your website is at "admin.example.com". Big deal. Did they get a copy of your password? Is your authentication and user identification weak? Are you on a shared server? There are so many other things that can weaken your admin section that worrying about users being aware of its location is pointless.
    Logic without the fatal effects.
    All code snippets are licensed under WTFPL.


  14. #14
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    What?!?! There's definitely a need for admin to be in robots.txt: it keeps it from being indexed. You guys act like I'm worried about it and spending all my time on it. I assure you I'm not; I was simply asking about it and got more than I needed. Besides, the convo was over. Alex did a good job ripping me a new one.

  15. #15
    AlexDawson
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    I was not intending to rip you anything; I was just pointing out the obvious problems with trying to accomplish what you were after. The last thing you want to do is waste any energy or processing cycles trying to protect something that doesn't need any kind of security measures added to it.

  16. #16
    SitePoint Member
    Join Date
    Jun 2007
    Posts
    16
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by SublimeSite
    What?!?! There's definitely a need for admin to be in robots.txt: it keeps it from being indexed. You guys act like I'm worried about it and spending all my time on it. I assure you I'm not; I was simply asking about it and got more than I needed. Besides, the convo was over. Alex did a good job ripping me a new one.
    It won't be indexed if there aren't any public links to it.
    Also, if you have an authentication page/setup for the admin page, a spider couldn't get in anyway, so the most it could do is maybe index a login page.

  17. #17
    felgall
    Join Date
    Sep 2005
    Location
    Sydney, NSW, Australia
    Posts
    16,870
    Mentioned
    25 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by SublimeSite
    Let's think about this: if I were to make a section called "crl_admin", what type of user would guess that unless they read the robots file? I am not being ignorant by assuming that my users would not know the correct directory unless there was a robots file.
    What search engine or spambot would find it either, unless they looked in the robots.txt file?

    If there is no link to a page and it is password protected, then the legitimate search engines are not going to find it even if it isn't in the robots.txt file. All that adding it to the file will do is make it easier for spambots to find it and so make it easier to launch attacks against it.
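
The point about spambots can be sketched in a few lines: anything that ignores the rules can still read them and treat every `Disallow` entry as a list of places to look. The function name `disallowed_paths` and the sample file contents below are hypothetical; `/crl_admin/` is the folder name used as an example earlier in this thread.

```python
from typing import List


def disallowed_paths(robots_txt: str) -> List[str]:
    """Collect every non-empty Disallow value from robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt that "hides" an admin folder from search engines.
SAMPLE = """\
User-agent: *
Disallow: /crl_admin/   # the admin folder
Disallow: /tmp/
"""

print(disallowed_paths(SAMPLE))  # ['/crl_admin/', '/tmp/']
```

An otherwise unguessable folder name becomes public the moment it is listed, which is why leaving it out of the file and relying on authentication is the safer choice.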
    Stephen J Chapman


