Disallow Widget PHP Directory in robots.txt

Hi everybody.
I have a weather forecast site that offers weather widgets to other sites.

The widget is an iframe that pulls data from, let’s say, mysite.com/widget/yourforecast.php

Search engines follow the link and land on the yourforecast.php page, which is not cached, so every crawl triggers an uncached request and puts extra load on the server.
Can I disallow /widget/ in robots.txt without being penalized for it?
Doesn’t Google dislike robots.txt files that block access to scripts?

Thank you

Hi himsseo, welcome to the forums!

I don’t see why they would. Do you have any reason to believe they would?

Google’s spider doesn’t care whether your robots.txt blocks access to scripts, images, or JavaScript. It cannot take into account content it can’t read, and keeping it from reading that content is exactly what your robots.txt does.
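To make that concrete, here is a minimal sketch of what blocking the widget directory could look like, checked with Python’s standard `urllib.robotparser`. The robots.txt contents and the `/forecast/rome` path are assumptions for illustration; only `/widget/yourforecast.php` comes from the question. Python’s parser only approximates how a real crawler reads the file, so treat it as a sanity check rather than a guarantee of Googlebot’s behaviour.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for mysite.com: block only the widget directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /widget/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)


# The widget endpoint from the question is blocked...
print(is_allowed("https://mysite.com/widget/yourforecast.php"))  # False
# ...while an ordinary page (hypothetical path) stays crawlable.
print(is_allowed("https://mysite.com/forecast/rome"))  # True
```

Well-behaved crawlers that honour the rule simply stop requesting anything under /widget/; the rest of the site is unaffected.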

Thanks rpkamp for your reply and thanks for the welcome!
I have no direct experience either way.
I just thought that, because the widget is code injected into other sites, crawlers would want to be able to check that it contains nothing suspicious or “malicious”.

Thanks m_hurtley for your reply.
I thought that crawlers prefer to be able to crawl certain kinds of resources:

Crawling CSS and JavaScript is absolutely critical as it allows Googlebot to properly render pages. (from SEJ, quoting Mueller’s #askawebmaster video of Jul 20 2020)

Maybe I misunderstood?

I think you’re perhaps seeing this as a binary condition when it isn’t one; it’s more of a trinary one.

Google’s bot doesn’t say “Well, I can’t see this content, so it must be bad, strike against the site.”
It says “I can’t read this content. Treat it as null, move on.”
It’s not a good-or-bad switch; it’s good, bad, or nothing.

Yes, a crawler wants to crawl everything; that’s its raison d’être. But there are some things it won’t be allowed to crawl.

If the page does not render in a way that properly reflects your site without the script, then it will need to crawl that script in order to get the ‘picture’ of your website that you want it to see.
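One rough way to check this for your own pages is to list the scripts and stylesheets a page references and see whether any of them fall under a Disallow rule. The sketch below does that with the Python standard library only. The sample page, the `/widget/render.js` and `/css/site.css` paths, and the helper names are all hypothetical, and it only looks at same-host resources, because robots.txt rules apply per host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of scripts and stylesheets a page references."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower()
        if tag == "script" and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href") and "stylesheet" in rel:
            self.resources.append(urljoin(self.base_url, attrs["href"]))


def blocked_resources(html, base_url, robots_txt, agent="Googlebot"):
    """Return same-host script/CSS URLs that robots_txt disallows for agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    collector = ResourceCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return [
        url for url in collector.resources
        if urlparse(url).netloc == host and not rules.can_fetch(agent, url)
    ]


# Hypothetical page and rules, for illustration only.
PAGE = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="/widget/render.js"></script></head><body></body></html>'
)
ROBOTS_TXT = "User-agent: *\nDisallow: /widget/\n"

print(blocked_resources(PAGE, "https://mysite.com/", ROBOTS_TXT))
# ['https://mysite.com/widget/render.js']
```

If that list is empty for your normal pages, blocking /widget/ should not change how those pages render for a crawler.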
