Hi everybody.
I have a weather forecast site that offers weather widgets to other sites.
The widget is an iframe that pulls data from, let’s say, mysite.com/widget/yourforecast.php
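To be concrete, the embed code a partner site pastes in is just a plain iframe along these lines (the dimensions here are only an example):

<iframe src="https://mysite.com/widget/yourforecast.php" width="300" height="250"></iframe>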
Search engines follow the link and land on the yourforecast.php page (which is not cached), causing uncached requests and putting extra load on the server.
Can I disallow /widget/ in robots.txt without being penalized for it?
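Concretely, I mean adding something like this to my robots.txt (just a sketch of the rule I have in mind):

User-agent: *
Disallow: /widget/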
Doesn't Google frown on robots.txt rules that block access to scripts?
Google's spider doesn't care if your robots.txt blocks access to scripts, or images, or JavaScript… it simply cannot take into account content it can't read, and making that content unreadable is exactly what your robots.txt does.
Thanks rpkamp for your reply and thanks for the welcome!
I have no direct experience that would let me judge whether that's true or not.
I just thought that, because the widgets are code embedded on other sites, crawlers would want the ability to check that there is nothing suspicious or "malicious" in them.
Thanks m_hurtley for your reply.
I thought that crawlers need access to certain kinds of resources:
"Crawling CSS and JavaScript is absolutely critical as it allows Googlebot to properly render pages." (from SEJ, quoting Mueller's #askawebmaster video of Jul 20, 2020)
I think you're perhaps seeing this as a binary condition when it isn't one; it's more of a ternary.
Google's bot doesn't say, "Well, I can't see this content, so it must be bad; strike against the site."
It says, "I can't read this content. Treat it as null and move on."
It's not a good-or-bad switch; it's good, bad, or nothing.
Yes, a crawler wants to crawl everything; that's its raison d'être. But there are some things it won't be allowed to crawl.
If the page does not render in a way that properly reflects your site without the script, then the crawler will need access to that script in order to get the "picture" of your website that you want it to see.
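For your widget case, that cuts both ways: if the widget's JavaScript or CSS is also loaded by pages you do want indexed, keep those files crawlable while still blocking the iframe page itself. In Google's robots.txt handling, the more specific (longer) rule wins, so an Allow can carve an exception out of a broader Disallow. A sketch like this would do it (the assets path is just an example; use wherever your shared files actually live):

User-agent: *
Disallow: /widget/
Allow: /widget/assets/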