I am just launching a local business directory, and am just waiting for Google to notice that my robots.txt now allows it in.
Most of the text on the website is coming from the businesses themselves - they can enter a few lines about themselves. Most of them seem to be pasting text straight from their own websites.
Is Google going to penalise me for having duplicate content? At the moment there are about 60 entries, and most of them are virtually identical to passages of text on the businesses' own websites.
Am I in trouble?
It depends. When you say a few lines, how many words per description are you talking about? And what is the approximate ratio of original to duplicated content?
Each listing gets about 200 characters. Currently have about 60 listings, probably 55(?) of these have copied & pasted from their website.
Also have about 12 advertisers, maybe 10 of these have also copied text from their websites - and that is much longer, maybe 800 characters(?) each.
There’s VERY little other text on the page, so duplicate content has got to be 90%+. But it comes from 60+ other websites - does that make a difference?
Will Google realise we’re a directory and forgive us?
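As a rough back-of-the-envelope check on those figures, here is a small Python sketch. It assumes every listing really is about 200 characters and every advert about 800, uses the approximate counts quoted above, and ignores any other text on the page (navigation, headings), so treat the result as an estimate only. The function name `copied_share` is made up for this illustration.

```python
def copied_share(groups):
    """Return the fraction of characters that are copied text.

    groups: iterable of (total_items, copied_items, chars_per_item).
    """
    total = sum(n * chars for n, _, chars in groups)
    copied = sum(c * chars for _, c, chars in groups)
    return copied / total


# Approximate figures from the post (all of them are guesses by the poster).
groups = [
    (60, 55, 200),  # listings: about 55 of 60 copied, about 200 characters each
    (12, 10, 800),  # adverts: about 10 of 12 copied, about 800 characters each
]

share = copied_share(groups)
print(f"{share:.0%}")  # prints 88%
```

So on these numbers roughly 88% of the listing and advert text is copied, which is in the same ballpark as the 90%+ guess.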
For only 200 characters, I doubt it would be a significant problem (I might be wrong; if someone disagrees, please correct me).
To be honest, we do not offer SEO services and I’m not an SEO expert, but that proportion of duplicated content is not good, because Google penalizes websites that are judged not to have relevant or original content. Something like 30% is usually more reasonable. I don’t know how directories are affected by the number of links they point to, though; the effect in Google’s algorithm might not be linear (adding links with duplicate content when you already have 1,000 might hurt less than when you only have 100).
If you have set up robots.txt to disallow crawling of the pages, you probably don’t need to worry about being penalized. It is very unlikely that your robots.txt would fail to work and you would get penalized anyway (one way it can fail: if you misspell the file name as robot.txt instead of robots.txt, it won’t work).
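If you want to double-check what a given robots.txt actually tells a crawler, Python's standard library has `urllib.robotparser`. The snippet below is only a sketch: the two robots.txt bodies and the `example.com` URL are made-up placeholders, and the helper name `is_allowed` is mine. It checks the rules locally and says nothing about how Google treats the pages afterwards.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents, for illustration only.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

url = "https://example.com/listings/"  # placeholder URL
print(is_allowed(BLOCK_ALL, "Googlebot", url))  # prints False
print(is_allowed(ALLOW_ALL, "Googlebot", url))  # prints True
```

To test a live site instead of a string, `RobotFileParser` also has `set_url()` and `read()`, pointed at the robots.txt file in the site root.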