rel="nofollow" in a pure PHP file

Google is complaining that a pure PHP file does not have title/meta tags and is not mobile friendly.

How do I go about putting a rel=nofollow on the file? Do I have to start with html/head/body/etc tags… and if so do I put that before the php code or after?

The file is in a subfolder that contains only js scripts. I could include the subfolder in the robots file, but I’m afraid that it might also block the main folder: /Main/scripts

Thank you and a Happy New Year

Hey qim. Assuming your pure php file is called purephp.php (I’m sure it’s not!) you could simply add purephp.php to your robots.txt file.

On the other hand, if this is a page that visitors will see, it ought to have the <html> etc. tags. Something like:

<!doctype html>
<html lang="en-gb">
<head>
<meta charset="utf-8">
<meta http-equiv="x-ua-compatible" content="ie=edge">
<title></title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<?php
...
?>
</body>
</html>


You can do it with a robots meta tag:

<meta name="robots" content="nofollow, noarchive, noindex">

Which is of course html, not php, and goes in the <head> of the html.

That depends on what the php is doing.
Though it sounds as if you should block the folder in robots.txt if there is nothing that needs to be indexed in there. The robots.txt does not forcibly block or deny access to a file or folder, it merely requests that crawlers don’t go in and index the things there, and “nice” crawlers will comply. This means that the files there will still be accessible if needed, they just won’t be indexed.
This also raises the question: how is Google finding this PHP file? There must be a link to it somewhere for Googlebot to land there. Wherever that link is, it should be nofollow.

Yes, most pages have a currency widget that needs this php file. The file has no html whatsoever: pure php

So maybe I should list just that file in robots.txt. But can I simply enter it there as /maindirectory/subdirectory/phpfile.php without upsetting (hiding) any other files in the main directory and its subfolder?

Thanks

That should target just that specific file. Though I have not used robots.txt before to disallow specific files. I generally put files I don’t want crawled all in one directory and Disallow the whole directory.
Remember, robots.txt does not stop the files from being accessed, it just stops them being crawled and indexed.
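As a rough sketch of what that single-file rule could look like, here is a robots.txt using the placeholder path from the question (the real path will differ). Rules match by URL path prefix and are case-sensitive, so this line should only affect URLs that begin with exactly that path and leave the other files in the same folders crawlable:

```text
# Applies to all crawlers that honour robots.txt
User-agent: *
# Placeholder path taken from the question; replace with the real one
Disallow: /maindirectory/subdirectory/phpfile.php
```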

You might find the examples and explanations here helpful:

http://www.robotstxt.org/robotstxt.html

Thank you, all

Happy New Year!

I guess robots.txt could be used as a “don’t bother with it”.
But if you are concerned about other files you do want the bots to find maybe you could use header() to indicate that the file is not text/html (the usual default content-type)?

Hi

How do I do that? I take it that is PHP code… Where and how do I put it in the file?

Thanks

That would be concerning me. If Googlebot has found it, so can others - and only “good” bots respect robots.txt. So in addition to the other steps, I’d be looking to see how this happened in the first place, and fix it.

Yes, I can’t imagine a scenario where you would have a hyperlink to a pure php script like this. I would expect a script of that type to be referenced as an include or cron job or similar.

Can we see a page with this widget?

http://pintotours.net/Americas/DomRepublic/ParaPalma.html

The link is at the bottom of the page in a script.

It looks as if you could solve your Google problems by excluding the /scripts/ directory in robots.txt.
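For illustration, a directory-level rule might look like the sketch below. The /Main/scripts/ path is assumed from the first post and may not match the live site. The trailing slash keeps the rule limited to that folder and its contents, so the rest of /Main/ stays crawlable:

```text
User-agent: *
# Assumed path from the first post; adjust to the real folder
Disallow: /Main/scripts/
```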

However, I can access that PHP file by direct URL, and my understanding is that that’s a security risk, so you might also want to look into fixing that.


…but what’s the best way of doing that?

Sorry - I was in a hurry when I looked at the file. I thought it was displaying the unparsed PHP, but now I look again, it appears to be JS. Which seems like an odd way to do things, but not my area of expertise, so if it works…

Hi

You brought up an interesting point. I thought that it was not possible to see PHP files (at least with “View Page Source”), but as you wrote, I can see the whole file through its URL…

But the confusing thing is, what I see looks like js, not php.
Or is the php code writing the js? That would explain it.


Hi there SamA74,

here is the php file in question…

converter.zip (1.3 KB)

It is my original file and may vary from qim’s.

This is the currency converter…

http://coothead.co.uk/currency-converter/

coothead

So the script is PHP and JS. What we are seeing is the JS output, which has been modified by the PHP.
I suppose it is not unusual for raw JS code to be visible.
So the solution would be:

I don’t think there is any need for crawlers to go in and index the things in there, or report on what tags are/aren’t present.

Just set the content type to match what the file is actually outputting (i.e. JSON, XML, plain text, etc.) using the appropriate header field. The header() call has to come before the script sends any output.

http://php.net/manual/en/function.header.php
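To answer the “where and how” question, here is a minimal sketch, assuming the script outputs JavaScript as the converter appears to. The function name, variable name and rate are made up for illustration and are not taken from the actual converter file. The X-Robots-Tag line is an optional extra: it is a real HTTP header that Google documents as the non-HTML equivalent of the robots meta tag, but a crawler can only see it if robots.txt does not already block the file.

```php
<?php
// Hypothetical helper: builds the JavaScript the widget script would emit.
function buildWidgetScript(float $rate): string
{
    // json_encode() turns the PHP value into a valid JavaScript literal.
    return 'var exchangeRate = ' . json_encode($rate) . ';';
}

// Declare what the response really is, so it is not treated as an HTML page.
// header() must be called before any output is sent.
header('Content-Type: application/javascript; charset=utf-8');

// Optional: ask crawlers that honour this header not to index the response.
header('X-Robots-Tag: noindex, nofollow');

// 1.25 is a placeholder rate for illustration only.
echo buildWidgetScript(1.25);
```

The two header() lines would go at the very top of the existing PHP file, before anything is echoed; no html/head/body tags are needed.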

But as others have said, if the file is in a “scripts” folder that doesn’t contain anything you do want the bots to bother with, adding that folder to robots.txt should be fine.