#1 |
|
Serial Publisher
Join Date: Aug 1999
Location: East Lansing, MI USA
Posts: 13,283
|
What is duplicate content?
Lately there seems to have been an increase in datafeed-driven affiliate content sites. I have made quite a few myself. I have also seen the question of what exactly counts as duplicate content discussed a few times recently.
We all know Google says that duplicate content is a "don't", and that you risk being banned or penalized for it. But what exactly is duplicate content? It isn't just affiliate datafeed sites, such as those using Amazon AWS, that have duplicate content. People often create sites using feeds from Wikipedia and DMOZ. Is this duplicate content? You could find a press release from Tivo on thousands of news, financial, or electronics websites. Is that duplicate content? What about game cheat sites that all list the same cheats?
I think we can all agree that when a single individual or business owns two websites with the exact same content, it is spam. But what about the thousands of websites owned by different people that all use the same content? Amazon AWS (Amazon Web Services) sites are not unique; they only offer affiliate content, so it would seem Google would like to get rid of them in favor of listings for Amazon.com. In this situation it is easy to figure out who should get listed, because there is a parent company everyone is affiliated with.
What about game cheat sites, though? If you wanted to get rid of all the duplicate content, how would you decide which one stays? DMOZ editors have faced this issue for a long time. You have two sites with the same content; which one is listed? My solution when I was an editor was to list them both. One site might be down when a user tries to visit it, so a certain amount of redundancy makes the directory more useful.
New datafeed-enabled affiliate programs show up every day, as do new datafeed-driven websites. Eventually there will be too many and search engines will have to do something, but what? There will be too many for manual review, and any automatic system could hurt other sites with duplicate content, such as news sites and game cheat sites.
You might be able to write an algorithm that detects most Amazon AWS sites, but what about the thousands of other affiliate programs out there? And even then you're still only catching most of the websites. People will find a way around any filters.
Last edited by aspen; Sep 18, 2004 at 08:53. |
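To make the "algorithm that detects duplicate sites" idea a bit more concrete, here is a rough sketch of one common textbook approach: break each page into overlapping word sequences ("shingles") and compare the sets with Jaccard similarity. Nothing here is known to be what Google or any other search engine actually does. The function names, the shingle size of 3 words, and the 0.8 threshold are all assumptions picked for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=3):
    """Similarity between two texts, from 0.0 (nothing shared) to 1.0 (same shingles)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


def is_near_duplicate(text_a, text_b, threshold=0.8, k=3):
    """Hypothetical rule: call two texts near-duplicates above the threshold."""
    return similarity(text_a, text_b, k) >= threshold
```

A check like this would flag a reprinted press release just as readily as an affiliate clone, which is exactly the problem raised above: the score says two pages match, not which one deserves the listing.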
|
|
|
|
|
#2 |
|
runat="server"
Join Date: Nov 2001
Location: Colorado
Posts: 2,113
|
Yeah, it's quite the quandary they have. With RSS feeds and blogs spreading like wildfire, you can see why they feel they have to do something, but what is a good question.
TemplateMonster will certainly take a hit... "People will find a way around any filters." Yep, you hit the nail on the head there... |
|
|
|
|
|
#3 |
|
runat="server"
Join Date: Nov 2001
Location: Colorado
Posts: 2,113
|
It's not like television sees this as a problem... You get home, flip on CNN and get the latest news scoop... Then later you're on MSNBC, only to get the same scoop... followed up by your local news giving the same scoop...
Maybe Google should start their own TV channel? Hey, if Mark Cuban can, why not Google? |
|
|
|
|
|
#4 | |
|
SitePoint Guru
Join Date: Apr 2002
Location: The Office
Posts: 675
|
Quote:
I think that major affiliate programs, when integrated as part of a larger topic, add value. When an affiliate site is just an affiliate site, it adds nothing. If Google can look at the bigger picture, then I think that well-integrated affiliate sites will be OK. It is the ones that rely entirely on AWS or feeds and have nothing of their own to offer that will suffer. Just my opinion.
Simon |
|
|
|
|
|
|
#5 |
|
What a twist!
Join Date: Jul 2002
Location: The Netherlands
Posts: 1,031
|
I run a lyrics website, with approx. 30,000 lyrics at the moment. I'm fairly sure that most of my lyrics are also available on other websites. So, does that mean I've got duplicate content or not? It's nothing from a datafeed, or affiliate, but nonetheless still 'duplicate'.
I have another website with a lot of Wikipedia articles. Duplicate content or not? Other articles on the website are unique. So how should Google handle this? Drop the whole website, or only drop the duplicate content? This isn't going to be easy for Google if they're seriously looking at cracking down on duplicate content. Who decides what duplicate content is, and better yet, why should certain types of duplicate content not be indexed? |
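The "drop the whole website, or only the duplicate content" question is easier to picture at page level. Below is a small sketch that splits one site's pages into those whose text already exists elsewhere and those that don't, using a hash of the normalized text. It only catches exact copies (ignoring case and whitespace), and the URLs, the function names, and the idea of a set of "known" fingerprints are all made up for illustration, not a description of how any search engine works.

```python
import hashlib


def fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def split_pages(site_pages, known_fingerprints):
    """Partition a {url: text} mapping into (unique_urls, duplicate_urls).

    known_fingerprints is a hypothetical set of fingerprints of text
    that has already been seen on other sites.
    """
    unique, duplicate = [], []
    for url, text in site_pages.items():
        if fingerprint(text) in known_fingerprints:
            duplicate.append(url)
        else:
            unique.append(url)
    return unique, duplicate


# Example with made-up pages: one copied article, one original.
known = {fingerprint("Some  Wikipedia article text")}
pages = {
    "/wiki/a": "some wikipedia ARTICLE text",
    "/mine/b": "my own article",
}
unique_urls, duplicate_urls = split_pages(pages, known)
```

With a split like this, a filter could in principle leave the original pages alone and only discount the copied ones, rather than judging the whole site.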
|
|
|
|
|
#6 |
|
SitePoint Wizard
Join Date: Dec 2002
Location: Nashville, TN USA
Posts: 2,039
|
Personally, I tend to find it quite frustrating when I'm searching the web for a product review or a product and all I get is Amazon and its clones... Maybe Amazon didn't have any reviews for the product, or maybe I'm looking for a different perspective from what I get there.
But it's not just AWS sites. It's SE spam in general. I was looking for financial/accounting software the other day, typed it into Google, and got tons of results, all of which looked like spam to me. I think this has resulted in part from webmasters looking at Google (or any search engine) as an advertising medium rather than a tool for users to find legitimate information. Yes, Google is great for advertising, and no, you shouldn't stop trying to get to the top of your search results. But SE spamming has made Google essentially worthless to me in many cases... |
|
|
|
|
|
#7 |
|
SitePoint Addict
Join Date: Dec 2001
Location: Wisconsin, USA
Posts: 329
|
I don't see a problem with affiliate sites being listed. As long as the main site is listed higher up, that's totally fair.
|
|
|
|
|
|
#8 | |
|
SitePoint Addict
Join Date: Nov 2003
Location: Southampton, UK
Posts: 371
|
Quote:
|
|
|
|
|
|
|
#9 |
|
SitePoint Wizard
Join Date: Oct 2000
Posts: 1,709
|
Good question, Chris.
To add to it: what about writers simply publishing their unique, original article on more than one website? I have professional writers contributing articles to TheCatSite.com on a non-exclusive basis. They can and do publish the very same article on their own website or elsewhere. Some of the articles have been printed in magazines and may appear in the online versions of those magazines as well. That certainly is duplicate content, but I don't think anyone should be punished for it. The question is: does Google in fact penalize anyone for duplicate content? |
|
|
|
|
|
#10 |
|
Non-Member
Join Date: Nov 2002
Location: Earth
Posts: 1,107
|
Seems like if you're an ecommerce site, setting up an affiliate program and datafeed would be vastly more worthwhile than making your own multiple (spam) sites.
|
|
|
|
|
|
#11 | |
|
Impregnator of women
Join Date: Apr 2004
Location: Manchester, UK
Posts: 749
|
Quote:
|
|
|
|
|
|
|
#12 |
|
SitePoint Guru
Join Date: Apr 2004
Location: Belgium
Posts: 919
|
Also, where does duplication begin?
I once got links to a search engine in Google... |
|
|
|
|
|
#13 | |
|
SitePoint Addict
Join Date: Apr 2004
Location: USA
Posts: 266
|
Makes you wonder if an unbiased, non-pay-per-submit directory that is scrutinized like DMOZ may be more useful someday than a large SE like Google.
Quote:
Last edited by StephenBauer; May 26, 2004 at 12:00. |
|
|
|
|
|
|
#14 |
|
SitePoint Addict
Join Date: Dec 2001
Location: Wisconsin, USA
Posts: 329
|
If you think about it though, aren't department stores basically affiliates for the products they distribute, in one way or another?
You will find the same CDs at WalMart, Kmart, FYE, Sam Goody, etc., and you'll find the same food items at most groceries, but does that make the stores any less valid? |
|
|
|
|
|
#15 |
|
SitePoint Enthusiast
Join Date: Feb 2002
Posts: 35
|
Recently it has been very difficult to find reviews or comments on products. All you get in Google is SE spam. And that's the real problem: finding relevant content. One way to do that is to avoid duplicates.
It's not a very good method, but what's the alternative? A much more complex discrimination algorithm that borders on artificial intelligence? Maybe search engines will be the force that drives the research for better and more intelligent algorithms. |
|
|
|
|
|
#16 |
|
Aussie Icon
Join Date: Jul 2002
Location: Australia
Posts: 1,079
|
I, too, am concerned about the growing number of datafeed-fed "clones" taking up search results.
I don't think that duplicate content is necessarily a bad thing (think of all the press releases and news articles that can be found on many sites). I think the issue is just spam search results in general. In reality, Amazon's product page should be listed before that of the "clone site". |
|
|
|
|
|
#17 |
|
SitePoint Zealot
Join Date: Mar 2003
Location: Dublin, Ireland
Posts: 133
|
This really is a very interesting issue. I am hoping to recreate certain sections of the Amazon.com web site on my own web site over the summer, all things being well, using their datafeed.
I have to say I appreciate Google's quandary. However, why should any original content on my site be penalised as a result? For example, say half my site is original and half is duplicate stuff. Is it fair that the whole site gets penalised by Google? I'm not sure on that one myself, and it will be interesting to see how it pans out.
Also, I am actually changing the domain name of my web site, which I am of course entitled to do. However, during the change-over there will effectively be two duplicate sites. The mind boggles! Once you start down this road it is hard to imagine where it will end!
I note that SitePoint also allows datafeeds of the forums. I can't see why web designers etc. who contribute to SitePoint can't claim some benefit without running the risk of being penalised in Google. Also, how is Google going to monitor this? I note that a larger and larger number of merchants at cj.com are offering datafeeds. So will all datafeeds be equal, but some datafeeds be more equal than others? |
|
|
|
|
|
#18 | |
|
SitePoint Guru
Join Date: Apr 2004
Location: Belgium
Posts: 919
|
Quote:
|
|
|
|
|
|
|
#19 | |
|
SitePoint Guru
Join Date: Apr 2004
Location: Belgium
Posts: 919
|
Quote:
|
|
|
|
|
|
|
#20 | |
|
Bananas contain Zinc
Join Date: Oct 2001
Location: Scotland
Posts: 1,175
|
Quote:
It wouldn't be in your best interests to link then, would it? |
|
|
|
|
|
|
#21 |
|
SitePoint Enthusiast
Join Date: Apr 2001
Location: Chambersburg, PA
Posts: 37
|
I think we give Google credit for being smarter than they actually are. I haven't seen any evidence that they can even determine what is duplicate content, much less penalize any sites for it. I see keyword spamming, hidden links and many other so-called 'forbidden' techniques used on a number of first-page Google results.
I've used Google since the days when only webmasters and college geeks knew about it and used it. In the last year and a half the search quality has decreased to the point of uselessness. On a typical search I see 4-5 other search engines come up on the first page. Where's the search quality in that? If I want to use another search engine I'll go there first. If Google really cared about search quality they could surely block these listings. |
|
|
|
|
|
#22 |
|
SitePoint Zealot
Join Date: Nov 2003
Location: Kentucky, USA
Posts: 191
|
If Google doesn't reduce its SE spam by whatever means it has, it will stop being the driving force behind web searches. If that happens, it won't matter where your Google search result ends up, because the masses will have moved on to Yahoo or whatever. I don't think it is good or bad, just a necessary thing for a company that did well as one of the big players but, now that it stands out so much, has much more to lose. They have to be "leaner" in search results or none of us will use them anymore. But I think that has been touched on before in this thread (Google's ineffectiveness).
|
|
|
|
|
|
#23 |
|
SitePoint Wizard
Join Date: May 2004
Location: santa rosa, ca
Posts: 1,055
|
When I want to ask a question or find information about a product, I usually add the word forum to the end of my search. This way I can post my question on a forum and get the information I need.
|
|
|
|
|
|
#24 |
|
Makin' It Happen
Join Date: Jan 2003
Location: Texas
Posts: 608
|
I've asked myself this question many times.
If you think about it, Google using DMOZ as their directory seems to be the essence of duplicate content. |
|
|
|
|
|
#25 | |
|
SitePoint Guru
Join Date: Apr 2002
Location: melbourne australia
Posts: 669
|
Quote:
|
|
|
|
|
All times are GMT -7.