I’m a website developer looking at creating my own startup website. Without giving too much away, the functionality mainly relies on directory information (business names, addresses, telephone numbers, etc.).

Collecting my own data would be a massive task, so to save time I am considering using a screen scraper such as Mozenda. My intention would be to use this type of software on competitor sources and/or large directory services to kick-start my own project.

I have big concerns about doing this, as I don’t know the copyright/legal implications of using this type of software. As this data is freely available from many print and web sources, does copying it still infringe copyright law?

Also: specific information of the kind you would find in directories (such as detailed business or individual records) may well be covered by data protection laws. If you started reproducing the information outside of a licensed body (if one exists) and posted it without each individual’s or business’s express written permission, you could find yourself on the receiving end of a lawsuit (I’ve heard of severe fines for making confidential personal information available outside of the places where the individual granted permission). It’s perhaps not as likely as being snagged under copyright, but it’s still something you should probably be aware of. :)

Dictionaries and encyclopedias do it as well. It’s an easy way to detect copying of your compilation.

Big directories like Yell (and maybe some little directories) ‘seed’ their listings with a small number of fake businesses with working phone numbers and emails that belong to the directory company. They have done this for decades, way back before the internet, ever since business directories like Yellow Pages were first introduced. Rented Direct Marketing address lists are always ‘seeded’ as well, so the list broker can detect unauthorised usage.

Map makers do the same thing, adding non-existent roads or place names to make their maps unique and trap people who steal their work.
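
As a rough illustration of how a directory owner might check for seeded entries in a suspect dataset, here is a minimal Python sketch. The seed names and phone numbers, and the function names `normalise` and `find_seeds`, are made up for the example; how any real directory actually does this is not something I know.

```python
# Hypothetical seed records: fake businesses the directory owner planted
# in their own listings. These names and numbers are invented for this sketch.
SEED_LISTINGS = {
    ("Acme Phantom Plumbing", "01632 960001"),
    ("Nonesuch Florists", "01632 960002"),
}


def normalise(name, phone):
    """Lower-case the name, collapse whitespace, and keep only phone digits."""
    clean_name = " ".join(name.lower().split())
    clean_phone = "".join(ch for ch in phone if ch.isdigit())
    return (clean_name, clean_phone)


def find_seeds(suspect_listings, seeds=SEED_LISTINGS):
    """Return the seeded (name, phone) pairs that appear in a suspect dataset."""
    seed_keys = {normalise(name, phone) for name, phone in seeds}
    suspect_keys = {normalise(name, phone) for name, phone in suspect_listings}
    return sorted(seed_keys & suspect_keys)


if __name__ == "__main__":
    suspect = [
        ("ACME  Phantom Plumbing", "01632-960001"),  # a planted fake entry
        ("Real Bakery", "020 7946 0000"),            # an ordinary listing
    ]
    print(find_seeds(suspect))
```

Because the seeded businesses do not exist anywhere else, any match is hard to explain other than by copying from the seeded source.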


If I’m honest, I am interested both in where the practice stands with the law and, secondarily, in how it can be verified or enforced.

Thank you for everyone’s comments. Maybe the answer to this grey area is a moral one rather than a legal one.

If you’re interested in this topic further, I did find this resource on Out Law.

Sounds like you aren’t all that interested in the legality of these matters as much as you are looking for a way to jump in without risk.

In theory, say I did scrape a site. How would a telephone directory prove that I had scraped a business name, telephone number, etc. from their directory?

Especially when that content is widely available from countless sources (search engines, etc.).

I can understand a company having a legal case if it was, say, a blog post or an ebook, but company details?

Thanks again for your reply. I welcome all comments and input.

The telephone book is subject to copyright even though all of the address and phone number info in it is separately available elsewhere. The copyright protection covers the particular compilation of the information and is intended to compensate the copyright holder for the time it took them to assemble all of the information.

Basically the copyright applies for the exact reason why you want to copy it - because assembling all of that information takes time and that time would never get spent if just anyone could profit from the results.

Could you go into a little more detail, please? As the data is the same whatever the source, why does it fall under copyright law? Also, why can companies sell screen scraping software without any legal action being taken against them? Thanks for your reply.