  1. #1
    Get my greedy down dotJoon's Avatar
    Join Date
    Apr 2003
    Location
    daejeon, South Korea
    Posts
    2,223
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Not to allow Robot Search

    Code:
    <html>
    
      <head>
        <title>my secret page</title>
      </head>
    
      <body>
    
           my secret sentence.
    
      </body>
    
    </html>
    I have an HTML page like the one above.

    I would like to keep the page from showing up in Google search, Yahoo search, and so on.


    What code do I have to put in the page above?

  2. #2
    Programming Team silver trophybronze trophy
    Mittineague's Avatar
    Join Date
    Jul 2005
    Location
    West Springfield, Massachusetts
    Posts
    17,290
    Mentioned
    198 Post(s)
    Tagged
    3 Thread(s)

    no bot content

    In the page itself, you could use the meta nocache tags.
    And for the honest bots, put rel="nofollow" on links to the page (noindex is a robots meta value, not a link attribute), and list the page in your robots.txt file as a Disallow.
    Keep in mind that this will only work for the honest bots.
    You could try using a bot-trap page to collect info on bad bots after the fact, but by then it will most likely be too late for the real page, unless they hit the trap first and you have some sort of script that adds the offender's info to a db/list that the real page uses to filter page visits.
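
A rough sketch of the bot-trap idea, in Python. Everything here is hypothetical: the `BotTrap` class, the `/trap.html` path and the in-memory set standing in for the db/list are made up for illustration, and a real setup would live inside whatever server or framework you use.

```python
# Hypothetical sketch of a bot trap; names and paths are illustrative only.
# Assumption: /trap.html is listed as Disallow in robots.txt and linked
# invisibly, so honest bots and human visitors never request it.
TRAP_PATH = "/trap.html"


class BotTrap:
    def __init__(self):
        # Stands in for the db/list the real page consults.
        self.blocked = set()

    def handle(self, client_ip, path):
        """Return the HTTP status code to send for this request."""
        if path == TRAP_PATH:
            # Only a bot that ignores robots.txt ends up here: remember it.
            self.blocked.add(client_ip)
            return 403
        if client_ip in self.blocked:
            # Known offender: refuse the real pages too.
            return 403
        return 200


trap = BotTrap()
print(trap.handle("203.0.113.5", "/secret.html"))  # not yet known to the trap
print(trap.handle("203.0.113.5", TRAP_PATH))       # walks into the trap
print(trap.handle("203.0.113.5", "/secret.html"))  # now filtered
```

Note that the first request still gets through, which is the "too late for the real page" problem: the trap only helps against bots that hit it before they hit the page.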

  3. #3
    Get my greedy down dotJoon's Avatar
    Quote Originally Posted by Mittineague View Post
    In the page itself, you could use the meta nocache tags.
    How to write it?

    Is the following correct?

    Code:
    
    
    <html>
    
      <head>
        <title>my secret page</title>
        <meta  content="noCache"> 
      </head>
    
      <body>
    
           my secret sentence.
    
      </body>
    
    </html>

  4. #4
    bronze trophy
    Join Date
    Dec 2004
    Location
    Sweden
    Posts
    2,670
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    HTML Code:
    <!doctype html>
    <html>
     <head>
      <title>my secret page</title>
      <meta name="robots" content="noindex">
     </head>
     <body>
      <p>my secret sentence.</p>
     </body>
    </html>
    Simon Pieters
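
To see what an honest bot does with that element, here is a small Python sketch that reads the robots meta value out of a page. It is only an illustration: `RobotsMetaParser` and `may_index` are made-up names, and real crawlers have their own parsing rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def may_index(html):
    """True unless the page asks not to be indexed ("noindex" or "none")."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)


page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(may_index(page))
print(may_index("<html><head><title>no meta at all</title></head></html>"))
```

A page with no robots meta at all is treated as indexable here, which matches the default behaviour of search engines.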

  5. #5
    Programming Team silver trophybronze trophy
    Mittineague's Avatar

    cache metas

    I was thinking more of
    HTML Code:
    <META HTTP-EQUIV="Expires" CONTENT="Sun, 01 Apr 2001 13:00:00 GMT">
    <META HTTP-EQUIV="Pragma" CONTENT="no-cache">
    but these, along with the higher-priority headers your server sends (e.g. via PHP's header()), are more for preventing the page from being cached in the user's browser and in proxy servers.
    So yes, zcorpan's example is the more appropriate one to use with regard to bots.
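
For the server-side version of those cache headers, a minimal Python sketch follows (not PHP, just to keep it self-contained). The `no_cache_headers` function is a made-up name, and the `Cache-Control` line is my addition as the HTTP/1.1 counterpart of `Pragma`; the sketch only builds the header values and does not send anything.

```python
from calendar import timegm
from email.utils import formatdate


def no_cache_headers():
    """Build anti-caching headers with an Expires date in the past."""
    # 1 April 2001, 13:00:00 UTC, the same date as in the meta example.
    past = timegm((2001, 4, 1, 13, 0, 0))
    return {
        # formatdate(..., usegmt=True) emits the HTTP-date form.
        "Expires": formatdate(past, usegmt=True),
        "Pragma": "no-cache",
        # Assumption: added here as the HTTP/1.1 equivalent of Pragma.
        "Cache-Control": "no-cache, no-store, must-revalidate",
    }


for name, value in no_cache_headers().items():
    print(f"{name}: {value}")
```

As with the meta versions, these affect browser and proxy caching only; they say nothing to search engine bots about indexing.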

  6. #6
    Get my greedy down dotJoon's Avatar
    Quote Originally Posted by zcorpan View Post

    <!doctype html>
    Should I use <!doctype html> in all my HTML pages? What does it do?

  7. #7
    bronze trophy
    Yes. To trigger standards mode in browsers. (Or any longer version of it if you prefer, so long as it triggers standards mode.)
    Simon Pieters

  8. #8
    Get my greedy down dotJoon's Avatar
    Quote Originally Posted by Mittineague View Post

    <META HTTP-EQUIV="Pragma" CONTENT="no-cache">
    What does "Pragma" mean?

  9. #9
    bronze trophy
    See RFC 2616 (section 14.32).
    Simon Pieters

  10. #10
    Programming Team silver trophybronze trophy
    Mittineague's Avatar

    pragma

    It's short for "pragmatic information". A fancy way of saying "useful info"

  11. #11
    Robert Wellock silver trophybronze trophy xhtmlcoder's Avatar
    Join Date
    Apr 2002
    Location
    A Maze of Twisty Little Passages
    Posts
    6,316
    Mentioned
    60 Post(s)
    Tagged
    0 Thread(s)
    I'd also create a "robots.txt" if you don't have one; generally only amateurs don't have such files.

    It also depends on what you mean by "secret", as opposed to simply not being 'indexed' or followed.

  12. #12
    SitePoint Enthusiast
    Join Date
    Dec 2006
    Posts
    94
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    I don't mean to hijack but...

    Is the following likely to keep robots out or lure them in?

    <meta name="robots" content="index,follow">
    This is the last thing placed in the Meta tag section.

  13. #13
    Robert Wellock silver trophybronze trophy xhtmlcoder's Avatar
    Lure them… Far better try a 'robots.txt' http://en.wikipedia.org/wiki/Robots.txt
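
If you want to check what a robots.txt rule actually tells a well-behaved bot, Python's standard `urllib.robotparser` can evaluate it. This is only a sketch: the `/secret.html` path and the "ExampleBot" user agent are placeholders, and the rules are parsed from a string rather than fetched from a site.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: keep every bot away from one page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /secret.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask the same question an honest bot asks before fetching a URL.
print(parser.can_fetch("ExampleBot", "/secret.html"))  # False: disallowed
print(parser.can_fetch("ExampleBot", "/index.html"))   # True: no rule matches
```

This only models bots that bother to ask; nothing in robots.txt stops a bot that ignores it.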

  14. #14
    SitePoint Enthusiast
    Quote Originally Posted by xhtmlcoder View Post
    Lure them… Far better try a 'robots.txt' http://en.wikipedia.org/wiki/Robots.txt
    Perfect. Thanks.

  15. #15
    bronze trophy
    Quote Originally Posted by Violetgems View Post
    <meta name="robots" content="index,follow">
    This is exactly equivalent to not having it present at all. Search engines will index and follow pages without this element just fine.
    Simon Pieters

  16. #16
    Robert Wellock silver trophybronze trophy xhtmlcoder's Avatar
    No Problem.

  17. #17
    Programming Team silver trophybronze trophy
    Mittineague's Avatar

    honest bots

    The problem is that using
    HTML Code:
    <meta name="robots" content="noindex,nofollow">
    only keeps away the honest bots.

