Referrer script logging strange websites

Hi.

I have a basic script on my website that logs the referring URLs of any visitors. It’s worked great for years, but I recently noticed that it has started logging a lot of websites that neither link to my site nor have anything in common with it.

I have no idea what could be causing this, though I’m not PHP/web savvy. Any ideas?

Here’s the script I’m using.

<?php
$file = "refers.txt";
$blacklist = array(
'pegaminidesign.blogspot.it');

//$referrer = filter_input(INPUT_SERVER, 'HTTP_REFERER');
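// The Referer header is optional, so fall back to an empty string when it is absent.
$referrer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';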

$url = ($referrer) ? parse_url($referrer) : false;
$refer_text = "{$referrer}";

if (isset($url['host']) && !in_array($url['host'], $blacklist))
{
    file_put_contents($file, "{$refer_text}\n", FILE_APPEND);
}
?>

Thanks.

I’m pretty sure that a link doesn’t have to be present on the site. Are they leaving the site and going directly to yours through a bookmark? Perhaps Google ads (that have now rotated out)?

My site isn’t on Google ads. These websites are all extremely varied and don’t seem to have anything in common with my site. This has also never happened before in the many years I’ve had the script running; it’s only in the last three weeks that it started logging this flood of strange sites.

I noticed that the text file holding the log has its permissions set to 777. Could an outside bot have started writing to the file? I don’t know what the permissions should be set to.

I tried changing the text file’s permissions to 744, but the sites still keep getting logged.

I noticed a lot of the sites are Russian, so I’m guessing this is some sort of spam bot. I have no idea what to do.

Could this be image hotlinking? That is, do the pages in question have images from your site on them?

Just simple referrer spam?

I get hundreds every month in my AWStats log, and the referring websites do not have a visible link to my site.

Yes, I think Rubble has it. Some sites make their “referrer” page visible and SEO kiddies will do anything for any backlink even when useless.

The referrer can be removed from the headers, so you cannot rely on it. For example, Internet Explorer (IE) does not pass an HTTP_REFERER value in some cases, such as navigation triggered by JavaScript. The header is set by the client, which means bots can remove the referer value, or send any value they like, when accessing your site.

Thanks guys, I didn’t realise referrer spam was so common.

I read some websites and it doesn’t seem like much can be done except maybe block individual sites. I don’t know if it’s better to block them via .htaccess or via the referrer script. I’ll go with the referrer script, I think.

Do you guys know how I can modify the referrer script shown in the first post to accept wildcards?

I tried adding '.ru' to the blacklist, since about 90% of the spam links point to .ru domain names, but the script seems to only work with full domains. I’d like it so that, for example, adding “.ru” would block all websites on that domain and adding “porn” would block all links containing that word.

Thanks.
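
The script only matches full domains because in_array() compares each blacklist entry against the whole host. One possible way to get the wildcard behaviour asked about above is to keep two lists: one matched against the end of the host (for things like “.ru”) and one matched anywhere in the URL (for words like “porn”). A plain substring test for “.ru” would also catch hosts such as www.rubble.com, which is why the sketch below treats it as a suffix. The function name referrer_is_blocked and both list names are made up for illustration, and this is only a rough sketch, not a tested drop-in replacement.

```php
<?php
// Hypothetical lists, for illustration only.
$blocked_suffixes = array('.ru', 'pegaminidesign.blogspot.it'); // matched against the end of the host
$blocked_words    = array('porn');                              // matched anywhere in the URL

// Hypothetical helper: returns true when the referrer should NOT be logged.
function referrer_is_blocked($referrer, array $suffixes, array $words)
{
    $host = parse_url($referrer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        // No usable host. The original script also skips these.
        return true;
    }
    $host = strtolower($host);

    foreach ($suffixes as $suffix) {
        $suffix = strtolower($suffix);
        // Compare the tail of the host with the suffix.
        if ($suffix !== '' && substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }

    foreach ($words as $word) {
        // stripos() is a case-insensitive substring search.
        if ($word !== '' && stripos($referrer, $word) !== false) {
            return true;
        }
    }

    return false;
}

// Usage, following the shape of the script in the first post.
$file = "refers.txt";
$referrer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';

if (!referrer_is_blocked($referrer, $blocked_suffixes, $blocked_words)) {
    file_put_contents($file, $referrer . "\n", FILE_APPEND);
}
```

Keep in mind that spammers can switch domains at any time, so a list like this will probably need topping up now and then.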