Help with robots.txt

I have only one PHP file that generates all of my content:

my_domain.com/index.php?menu=chapter_1
my_domain.com/index.php?menu=chapter_2
.
.
.
my_domain.com/index.php?menu=contact_us

There is only one ‘index.php’ that creates all of the content.

If I want search engine spiders not to crawl the ‘Contact Us’ page, I will write this into robots.txt:


User-Agent: *
Disallow: /index.php?menu=contact_us

Will that cause any problem with my index.php? I mean, will the spiders treat index.php itself as disallowed and not index it, so that my entire site is not indexed?

Hi leelong,

That should work. The spec says:
“this field specifies a partial URL that is not to be visited. This can be a full path, or a partial path; any URL that starts with this value will not be retrieved.”
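The prefix rule quoted above can be checked locally. This is a minimal sketch using Python's standard urllib.robotparser, which is only one implementation of the matching rule; real spiders may behave differently. The helper name build_parser is made up for this example.

```python
from urllib import robotparser

# The rules from the question, without the forum's bold markup.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /index.php?menu=contact_us
"""


def build_parser(robots_txt):
    """Parse robots.txt text (hypothetical helper for this example)."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Only URLs whose path and query start with the Disallow value are blocked.
for url in (
    "http://my_domain.com/index.php?menu=contact_us",
    "http://my_domain.com/index.php?menu=chapter_1",
    "http://my_domain.com/index.php",
):
    print(url, "allowed" if parser.can_fetch("*", url) else "disallowed")
```

Under this parser, only the contact_us URL is reported as disallowed; index.php itself and the chapter pages remain fetchable, because none of them start with the full Disallow value.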

Since you want to exclude just one page, you can add this meta tag to that page:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
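Because every page here comes from the same index.php, the tag has to be emitted only when menu is contact_us; otherwise every page would carry NOINDEX. Here is a rough sketch of that condition, written in Python purely for illustration. The names robots_meta and NOINDEX_MENUS are hypothetical, and the same check would need to be written in PHP inside index.php.

```python
# Hypothetical set of menu values that should be kept out of the index.
NOINDEX_MENUS = {"contact_us"}

ROBOTS_TAG = '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'


def robots_meta(menu):
    """Return the robots meta tag for excluded pages, or an empty string."""
    if menu in NOINDEX_MENUS:
        return ROBOTS_TAG
    return ""


# The contact page gets the tag; chapter pages get nothing extra.
for menu in ("chapter_1", "chapter_2", "contact_us"):
    print(menu, repr(robots_meta(menu)))
```

The returned string would go inside the page's head section, so only the Contact Us page tells spiders not to index it.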

People have run into confusion in the past because different spiders interpret robots.txt differently: some support pattern matching and some do not. In that case the ? symbol might be an issue.

Thank you.

I think adding <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> is the better solution for me.