I am taking care of a large “news” website (500k pages), which took a massive hit from Panda because of duplicated content (70% was syndicated). I recommended that all syndicated content be removed and that the website focus on original, high-quality content.
However, this was implemented only partially. All syndicated content is set to NOINDEX (they think it is good for users to see standard news alongside the original high-quality content). Of course it didn’t help at all; there has been no change after months. If I were Google, I would definitely penalize a website that has 80% of its content set to NOINDEX because it is duplicated. I would consider this site to be “cheating” and not worthy of the user.
What do you think about this “theory”? What would you do?
Thank you for your help!
Hi LucasTheCurious, welcome to the forum
How are you setting the syndicated content pages to noindex?
For example, if the editor of The Example Times wants to ensure that the article she is using with permission from The Example Gazette doesn’t get included in Google News, she would implement the following code in that article page’s HTML:
<meta name="Googlebot-News" content="noindex">
I’d probably follow your suggestion and delete the old syndicated content, then 404 the old pages. I’m assuming most of those pages are quite old so they’d have very little value anyway.
If you make content noindex, you are penalizing yourself by telling search engines not to index those pages. Search engines generally judge content on a page-by-page basis, not the whole site.
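For reference, the page-level directive being discussed here is the standard robots meta tag, which applies to all crawlers (unlike the Googlebot-News-specific variant shown earlier). A minimal sketch of what the syndicated pages are presumably carrying:

```html
<!-- Placed in the <head> of each syndicated article page. -->
<!-- "noindex" asks search engines not to include the page in results; -->
<!-- "follow" still lets crawlers follow the links on the page. -->
<meta name="robots" content="noindex, follow">
```

Pages tagged this way are dropped from the index, but they still consume crawl budget and still exist on the site, which is why noindexing alone may not move the needle after an algorithmic hit.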
I don’t suggest using the noindex tag for duplicate-content issues. I use the noindex tag for low-quality webpages.
Here are examples of low-quality webpages:
- Tag and category pages.
- Directory pages that contain many links and little text.
I prefer to use the canonical link tag for duplicate-content issues.
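A minimal sketch of that approach, with placeholder URLs: on each duplicate (e.g. a syndicated copy), point the canonical tag at the preferred version of the article.

```html
<!-- In the <head> of the duplicate/syndicated page. -->
<!-- Tells search engines which URL should be treated as the authoritative version, -->
<!-- consolidating ranking signals onto it. -->
<link rel="canonical" href="https://www.example.com/news/original-article">
```

Keep in mind that rel=canonical is a hint, not a directive; search engines can ignore it if the two pages aren’t actually near-duplicates.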
Good point, that’s probably the best way to handle multiple pages with similar content.