Being webmaster of a new site, I check the stats every few days. Of course, since I am still making a fair number of changes and testing, my own IP address is always at the top. I was very surprised to see one IP address (126.96.36.199) use over 16 MB in one day: 487 hits, 487 pages. The web server logs are filled with this IP address, and the client identifies itself with the user agent "RPT-HTTPClient/0.3-3".
I have contacted the ISP and will supply them with all the details soon. My concern is what this person was doing. Even if it was a 'legit' crawler or spider, why 16 MB? The whole site is less than 3 MB.
There are 635 "GET" requests in 24 minutes, and that is how it added up to over 16 MB. Can I tell from the web server logs whether they were attempting anything they shouldn't (apart from using WAY too much bandwidth for a spider session)?
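For anyone who wants to run the same check on their own logs, here is a rough Python sketch of how the totals could be tallied. It assumes the log is in the Apache/NCSA "combined" format; the names `LOG_LINE` and `summarise` and the file name `access.log` are just placeholders of mine, not anything from the server.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log format
# (host ident user [time] "request" status bytes "referer" "agent").
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarise(lines, agent_substring):
    """Tally requests, bytes and status codes for one user agent."""
    total_bytes = 0
    statuses = Counter()
    methods = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip lines that do not parse or belong to another client.
        if m is None or agent_substring not in m.group("agent"):
            continue
        methods[m.group("method")] += 1
        statuses[m.group("status")] += 1
        # A "-" in the size field means no body was sent.
        if m.group("size") != "-":
            total_bytes += int(m.group("size"))
    return {
        "requests": sum(methods.values()),
        "bytes": total_bytes,
        "statuses": dict(statuses),
        "methods": dict(methods),
    }


if __name__ == "__main__":
    # "access.log" is a hypothetical path; point it at the real log file.
    with open("access.log", encoding="utf-8", errors="replace") as fh:
        print(summarise(fh, "RPT-HTTPClient"))
```

Anything other than GET in `methods`, or status codes such as 401, 403 or 404 in `statuses`, would be the first things I'd look at more closely.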
Having a quick look through, the status codes are all either "200" or "302", but does an entry like this:
188.8.131.52 - - [07/Mar/2004:01:48:48 -0500] "GET /product_info.php?products_id=50&language=es HTTP/1.1" 200 31777 "-" "RPT-HTTPClient/0.3-3"
indicate that the user agent is attempting, or has been able, to 'read' the PHP file? The file in the above example is only 12,873 bytes, but no doubt any images, etc. would add up. I can't understand why they have spidered/crawled through, changed the language settings, and done everything possible, even changing the sort sequence on some pages. It seems a VERY deep crawl to me.
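To get a feel for how deep the crawl went, the requested URLs could be grouped by script and by query parameter. This is only a sketch: `crawl_shape` is a name I made up, and the second URL in the example (with `index.php` and a `sort` parameter) is hypothetical, not copied from my log.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def crawl_shape(paths):
    """Count hits per script and per query-parameter name."""
    scripts = Counter()
    params = Counter()
    for path in paths:
        parts = urlsplit(path)
        scripts[parts.path] += 1
        # Count each parameter name once per request.
        for name in parse_qs(parts.query, keep_blank_values=True):
            params[name] += 1
    return scripts, params


if __name__ == "__main__":
    requested = [
        "/product_info.php?products_id=50&language=es",  # from the log entry
        "/index.php?sort=2a&language=es",                # hypothetical example
    ]
    scripts, params = crawl_shape(requested)
    print(scripts.most_common())
    print(params.most_common())
```

If the same few scripts show up hundreds of times with different `language` or sort values, that would explain how a site of under 3 MB produced over 16 MB of traffic.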
If it is a single person (which seems likely, because the hostname is "dsl-184.108.40.206.dsl.comindico.com.au") and not a connection shared by many, then I'll simply ban the IP.
Is there any legal action that can be taken against this person or persons?