Robots.txt and no-follow links

I have a site for a community group which includes a page with links to PDF forms for download. The site had been up for a while before I added a robots.txt file to it. In that file, I excluded the directory with the PDFs as there is no need for them to be indexed and I’d rather they weren’t. That was a couple of months ago. Today, Google Webmaster Tools is telling me “Severe health issues are found on your site”, which nearly caused me severe health issues, but turns out to mean simply that previously-indexed PDF files are now blocked by the robots.txt file. Should I have added rel=“nofollow” to the links to these files as well? I’m doubly confused, because I have another site in almost exactly the same position and Google isn’t posting any warnings about that at all.
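For reference, the kind of exclusion described would look something like this (the directory name here is just a hypothetical example, not the actual path):

```
User-agent: *
Disallow: /pdf-forms/
```

One thing worth noting: a robots.txt Disallow only blocks crawling; it doesn't remove URLs that are already in the index, which is why previously-indexed files can linger and trigger warnings like this.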

I have a second site showing the “severe health issues” warning. In this case, the excluded directories are for the files for two Flash slideshows. One is triggering the alert and the other isn’t.

I would really like to get rid of these warnings, simply so I don’t panic every time I log into Webmaster Tools. (I assumed from the warning that the sites had been hacked.) Can I do anything to fix the problem, or will it eventually go away by itself?

It sounds like Google is having a fit because the files are still indexed but the robots.txt is blocking its spider from accessing them. From Google's perspective, your site has issues because it has no reason to believe the indexed files shouldn't be available, given that they previously were.

I would submit a content removal request through Webmaster Tools and ask for the content in question to be removed. As long as it stays blocked by the robots.txt, you should be just fine.

Thank you - I’ll try that. Any guesses as to why the other site isn’t triggering the same warning?

Hmm… not the result I was hoping for! I still have the same warning message, but when I click for further information, in addition to the previous “Some important page is blocked by robots.txt”, I now have “Some important page has been removed by request”. In addition, it’s now started throwing the second error for the non-www. version of the domain, which wasn’t displaying any errors previously. (The www. version is set as the preferred domain; I’m not sure why Google is treating them differently.)

Don't worry; after some time, Google will de-index those files.

Thank you, but if you mean that Google will remove it from the search results, then they already have - and that’s what’s triggering the second error message. I still don’t understand why there is a problem with this site, and not with the other in almost the exact same situation. :confused: I guess I’ll just have to wait and hope it goes away by itself. :frowning:

You shouldn't worry about this. I had the same problem with my sites, and after a few weeks it resolved itself automatically.

Google is warning you that it thinks it should be able to access some pages, but it can’t. That’s fine – you don’t want it to be able to access those pages. It won’t be penalising the rest of your site because of that, it’s just making sure you know that. I really wouldn’t worry about it!

Thank you. It’s just the Vulcan in me - I like everything to be logical and make sense, and I get concerned when it isn’t and doesn’t!

Well, that makes you a million times better than some of the types we get on the SEO forum. :slight_smile:

I normally nofollow only my contact and privacy pages. If you want a page to be indexed, then build backlinks to it, but I don't see why you would build links to your privacy page.
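For anyone wondering what that looks like in practice, a nofollow link is just an ordinary anchor with a rel attribute (the URL here is a made-up example):

```html
<a href="/privacy/" rel="nofollow">Privacy policy</a>
```

Bear in mind that nofollow only tells crawlers not to follow that particular link or pass credit through it; it doesn't by itself stop the target page from being indexed if Google finds it another way.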