  1. #1
    SitePoint Enthusiast
    Join Date
    Aug 2010
    Posts
    44
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Hiding Sites from Search Engines

    Hiya,

    I have a development site where clients can view their sites in progress. I need to make sure that these sites aren't available to search engines until they are ready to go live. I've read up on robots.txt files but am confused. Is this a separate web page? If so, what exactly does the page consist of? I gather it's not just a piece of code that I put on my index page.

    If anyone has time, please could you do me a quick idiot's guide to hiding sites from search engines, and then removing the code when I want to publish?

    Many thanks!

    Badger.

  2. #2
    Robert Wellock
    xhtmlcoder
    Join Date
    Apr 2002
    Location
    A Maze of Twisty Little Passages
    Posts
    6,316
    Mentioned
    60 Post(s)
    Tagged
    0 Thread(s)
    You'd probably need more than a robots.txt (a plain text file that you place in the root of the site), which would most likely contain:

    User-agent: *
    Disallow: /

    That might help hide a linked site from well-behaved robots, though I assume you do not publicly link to this test site from anywhere on the internet anyway? If you do have links, then something more than a robots exclusion would be needed, in case wetware or liveware accidentally started linking to the pages from other sites.
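    To see what those two lines mean to a rule-following crawler, here is a minimal sketch using Python's standard `urllib.robotparser` module. The host `dev.example.com` and the path under it are made up for illustration, and the check only shows what a compliant robot would do, not what a badly behaved one might.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above: every robot, whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a robot that obeys robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # dev.example.com is a hypothetical development host.
    for url in (
        "http://dev.example.com/",
        "http://dev.example.com/client-a/index.html",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked"
        print(url, verdict)
```

    With an empty robots.txt the same function reports every URL as allowed, which is the state you want once a site goes live.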

  3. #3
    SitePoint Wizard
    Join Date
    Jul 2003
    Location
    Kent
    Posts
    1,921
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Although Google have said that they can find parts of a site that have no inbound links, it takes time for this to happen. So as long as there are NO other web pages linking to your new development section, search engines are unlikely to find the new pages during the time you are working on a site. I once made a test page to see if this was true, and it was something like six months before it appeared in Google with no inbound links to it. And that was possibly just luck, as a second test page never got into Google even after a year. Even with a link to a new site, it can be many months before Google finds all the site's pages. They seem to index just the home page on their first visit, then try just one or two of the links on it on their next visit, and so on.

    When developing a site, I usually just create a folder within my own web site for the new site, and when it's done either point the domain name at the folder (if only I will ever be maintaining the site) or move it to a new location. Nothing has ever appeared in Google before completion and going live.

  4. #4
    Life is not a malfunction
    TechnoBear
    Join Date
    Jun 2011
    Location
    Argyll, Scotland
    Posts
    5,389
    Mentioned
    218 Post(s)
    Tagged
    5 Thread(s)
    Quote: Originally posted by 13adger
    If anyone has time, please could you do me a quick idiot's guide to hiding sites from search engines, and then removing the code when I want to publish?
    I think this guide probably covers everything you need to know. If/when you decide you are ready for a particular directory to be indexed, you simply delete the appropriate line in the robots.txt file.
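    As a rough illustration of "delete the appropriate line", the sketch below drops one `Disallow` rule from a robots.txt and then uses Python's standard `urllib.robotparser` to confirm which folders a compliant robot may crawl afterwards. The folder names `/client-a/` and `/client-b/` and the helper names are hypothetical; editing the file by hand in a text editor does exactly the same job.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a development host with one folder per client.
ROBOTS_TXT = """\
User-agent: *
Disallow: /client-a/
Disallow: /client-b/
"""


def remove_disallow(robots_txt: str, directory: str) -> str:
    """Return robots_txt without the Disallow line for `directory`."""
    kept = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == directory:
            continue  # drop this rule so the directory can be crawled
        kept.append(line)
    return "\n".join(kept) + "\n"


def is_blocked(robots_txt: str, path: str) -> bool:
    """Return True if a rule-following robot must not fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("*", path)


if __name__ == "__main__":
    # Client A's site is ready to go live; client B's is still in progress.
    updated = remove_disallow(ROBOTS_TXT, "/client-a/")
    print(updated)
    print("client-a blocked:", is_blocked(updated, "/client-a/index.html"))
    print("client-b blocked:", is_blocked(updated, "/client-b/index.html"))
```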

  5. #5
    SitePoint Enthusiast
    Join Date
    Aug 2010
    Posts
    44
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks all, that's brilliant.

  6. #6
    SitePoint Mentor
    Mikl
    Join Date
    Dec 2011
    Location
    Edinburgh, Scotland
    Posts
    1,449
    Mentioned
    59 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally posted by Dr John
    So as long as there are NO other web pages linking to your new development section, search engines are unlikely to find the new pages during the time you are working on a site.
    That's what I always thought as well.

    But, a few weeks ago, I registered a new domain and applied it to a new site. I didn't give the URL to anyone or post it anywhere. But within 24 hours the site was visited by Googlebot. One day later, the site was in Google's index. And now, after three weeks, the site has received about a dozen visitors. And all this time I've been the only one who knows the URL (well, except for the hosting company and the domain registrar).

    Based on this experience, I wouldn't rely on the site not showing up just because you don't publicise the link. As Dr John rightly says, it's unlikely you'll get any significant traffic. But if you really want to hide the site, robots.txt is surely the way to go.

    Mike

  7. #7
    SitePoint Enthusiast
    Join Date
    Aug 2010
    Posts
    44
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks Mikl. It's not as if I'm doing anything I don't want people to know about! I just don't like the thought of people being able to see my unfinished work. It's looking like robots.txt is what I need.

  8. #8
    Non-Member
    Join Date
    Sep 2007
    Posts
    148
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally posted by TechnoBear
    I think this guide probably covers everything you need to know. If/when you decide you are ready for a particular directory to be indexed, you simply delete the appropriate line in the robots.txt file.
    Thanks for this, it helped me as well. I was curious about robots.txt.

