How to make Google not index a subdomain?

Datalife · November 25, 2007, 4:15pm

Hi forum,

I have a question that should be a piece of cake.

I just want to confirm with someone: how do I block search engines (all of them) from a testing subdomain?

Let's say my domain is mydomain.com

My subdomain is test.mydomain.com

So, in my robots.txt file…

Should it be like this:

Agent: *
Disallow:/test/

Is the above the right htaccess command and should this be placed in the main public_html folder?

My subdomain redirects to a subdomain not a subdirectory…it redirects to http://test.mydomain.com

Thanks for any clarifications

holmescreek · November 25, 2007, 4:30pm

Yes, your subdomain points to a folder in the main site's document root folder.

public_html/test/

So, in your public_html folder create a robots.txt to include

User-agent: *
Disallow: /test/

Or, to specifically disallow a particular agent:

User-Agent: Googlebot
Disallow: /test/

Here is a good reference on robots.txt that you can bookmark.

aaronjj · November 25, 2007, 4:45pm

Each subdomain should have its own robots.txt file in its own root directory. To block an entire subdomain you would use:
User-Agent: *
Disallow: /

No htaccess involved. Just a plain-text file named robots.txt
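
As a rough check of the rules above, the sketch below feeds them to Python's standard urllib.robotparser and asks whether a couple of URLs may be fetched. The URLs, the helper name `allowed`, and the agent name are illustrative assumptions built on the example domain in this thread; real crawlers may handle edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Contents of the robots.txt served from the subdomain's own root,
# e.g. http://test.mydomain.com/robots.txt (example domain from the thread).
SUBDOMAIN_ROBOTS = """\
User-agent: *
Disallow: /
"""

def allowed(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# "Disallow: /" matches every path, so both checks report False.
print(allowed(SUBDOMAIN_ROBOTS, "http://test.mydomain.com/"))
print(allowed(SUBDOMAIN_ROBOTS, "http://test.mydomain.com/any/page.html"))
```

Keep in mind that robots.txt only asks well-behaved crawlers to stay away; it does not password-protect anything.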

Datalife · November 25, 2007, 5:04pm

Thanks!

I wonder why I was away for so long…you Sitepoint guys are so helpful

akritic · November 25, 2007, 7:31pm

Welcome to Sitepoint, Datalife. Thanks for posting this thread, I learned something from it too. I should probably do the same for the multiple subdomains on my sites, though I’ve never bothered to create a robots.txt file as the instructions on Google didn’t appear very clear to me. Perhaps a second look is in order.

holmescreek · November 25, 2007, 7:37pm

See my post above, and check out the link. It is about the clearest set of instructions on robots.txt files that I have found.

aaronjj · November 25, 2007, 8:25pm

IMO there’s no need for a robots.txt if you don’t want to block anything. A lot of people throw up empty ones so they don’t get a bunch of 404s in their logs.

aaronjj · November 25, 2007, 8:29pm

Your post is inaccurate. You can’t block crawling of a subdomain by putting instructions in a robots.txt file in the top domain’s root.
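
To illustrate why, here is a minimal sketch that models how a crawler picks a robots.txt: it looks only at the file served by the same host as the URL it wants to fetch. The `ROBOTS_BY_HOST` dictionary and the `can_crawl` helper are hypothetical stand-ins for files that would really be fetched over HTTP, and treating a host with no robots.txt as unrestricted is an assumption of this sketch.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical setup: only the main domain serves a robots.txt.
ROBOTS_BY_HOST = {
    "mydomain.com": "User-agent: *\nDisallow: /test/\n",
}

def can_crawl(url, robots_by_host, agent="Googlebot"):
    """Apply only the robots.txt that belongs to the URL's own host."""
    host = urlsplit(url).hostname
    parser = RobotFileParser()
    # Assumption: a host with no robots.txt has no restrictions.
    parser.parse(robots_by_host.get(host, "").splitlines())
    return parser.can_fetch(agent, url)

# The main domain's rule covers its own /test/ path ...
print(can_crawl("http://mydomain.com/test/page.html", ROBOTS_BY_HOST))  # False
# ... but it says nothing about URLs on the subdomain.
print(can_crawl("http://test.mydomain.com/page.html", ROBOTS_BY_HOST))  # True

# Once the subdomain serves its own file, the subdomain is blocked.
with_own_file = dict(ROBOTS_BY_HOST)
with_own_file["test.mydomain.com"] = "User-agent: *\nDisallow: /\n"
print(can_crawl("http://test.mydomain.com/page.html", with_own_file))  # False
```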

akritic · November 25, 2007, 9:06pm

Ah! Missed that on first glance… thanks, it should take some of the stress out of creating one. At first glance it looks very straightforward.
