Copyscape – Website Plagiarism Search

Chris recently raised the difficult issue of content theft and how to tackle it.

We’ve certainly had our share of rip-off artistes and generally we’ve found it to be a two-tiered problem.

  1. Becoming aware your content has been ripped in the first place
  2. Getting something done about it

Finding your ripped content fast is paramount. If the Google or Yahoo spiders stumble across the ripped copy first, the damage is done. If you’re not identified as the first known occurrence of that content, it may be difficult to recover trust.

Common sense tells you Google is your first port of call to locate your ripped-off content, but one great tool not many people seem to know about is Copyscape — a free, purpose-built, anti-plagiarism search tool. Just give Copyscape the URL of your content and it will do the grunt work and report back to you.

Copyscape.com

Probably the coolest thing about Copyscape is that it doesn’t require you to enter ‘keyword phrases’ (headings, paragraphs, etc.) from your content in the hope of matching the rip-off, as Google would. Copyscape analyses entire pages at once and seems to have little trouble detecting matched pages, passages, paragraphs and even individual sentences. Some of the copies I’ve tracked down have been deeply obscured with hidden CSS, but Copyscape wasn’t fooled.
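
How Copyscape does its matching internally isn’t covered here, so treat the following as a minimal sketch of the general idea rather than Copyscape’s method. It assumes a simple technique: break each text into overlapping runs of five words and count how many runs from the original turn up in the suspect page. The function names, the run length of five and the sample strings are all made up for illustration.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word runs ("shingles") of `size` words."""
    # Lowercase and keep only word-like tokens so punctuation changes don't matter.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    source = shingles(original, size)
    if not source:
        # Original is shorter than one shingle, so there is nothing to compare.
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)


# Hypothetical sample texts, invented for this example.
original = "Finding your ripped content fast is paramount to protecting your site"
copy = "Welcome! Finding your ripped content fast is paramount to protecting your site. Buy now."
unrelated = "Content about your site is fast to find if your pages rank well"

print(copied_fraction(original, copy))
print(copied_fraction(original, unrelated))
```

The second text shares plenty of individual words with the original but no five-word run, which is why comparing phrases separates a copy from a page that merely covers the same topic.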

OK, so you’ve located rip-offs. What now?

It’s easy to feel a bit forlorn and helpless, but it’s generally not as hopeless as you might think. Copyscape provides whois info as well as practical advice in responding to plagiarism.

It’s useful to remember that IP thieves very often rely on larger, more reputable organisations (their ISP, AdSense, Commission Junction) to operate, so they usually don’t want to risk their server or revenue in a fight. If you’re courteous but very clear, they’ll often back down quickly.

It’s also worth looking at Copyscape’s commercial service, CopySentry, which automatically monitors the web for rogue copies of your content and reports the results back to you via email. I can’t vouch directly for this paid service yet, but if your content is one of your key assets, this seems like a pretty cost-effective way to protect it.

Either way, the free service is very useful for anyone running a content site.

  • PvZ

    Excellent tip there, thank you very much. I wasn’t aware of this site but will certainly keep it bookmarked from now on.

  • realmsurfer

    A very useful service indeed. With the huge amount of content out there, the only way to keep track of it is with automated spidering. I imagine, though, that a lot of plagiarists are small operators, often in distant countries, which makes enforcement very difficult if they don’t comply.

    One solution if a site doesn’t remove copyrighted content may be simply to report the site to the search engines for abusive practices. That may get the site banned, which also keeps your own site from being penalized for duplicate content.

  • http://www.xhtmlcoder.com/ xhtmlcoder

    Well, more than four major sites have copied my text word for word, though I already knew that from my raw logs. My logs show over 100 sites ripping off my content on a regular basis.

  • http://www.dotcomwebdev.com chris ward

    http://www.pirated-sites.com has loads of great advice!

  • asprookie

    Copyscape’s homepage is a direct ripoff of Google. Worse, it’s misleading: it gives the impression that Copyscape is a Google service.

  • http://www.cpoliver.com CPOliver

    @ asprookie: maybe that’s the irony?

    Great to see something to help original content creators get something done about the theft of personal work.

  • http://www.pendleton-naz.org/blog EOBeav

    asprookie, how does Copyscape present itself as a service of Google? The layout is admittedly similar to Google’s, but that’s about where the similarities end, imo.

  • http://www.guidinglightproductions.com Parafly9

    I think it’s funny that Copyscape looks like Google.

  • Demex

    I just wanted to say thanks. I found a number of competing web design firms in my home country using the same copy that I had either paid a copywriter for or written myself.

  • http://www.sitepoint.com Matthew Magain

    I think it’s funny that Copyscape looks like Google.

    Me too. Although the layout is obviously not a direct copy, it is certainly inspired by the big G. The irony made me chuckle.

  • Jake

    You’d be surprised how many lazy, low-down scumbags there are on the net who think copying your content is fine. MAKE SURE you have harsh copyright statements listed next to each article on your site.

  • sjgault@gmail.co

    Has anyone ever heard of Constructive Plagiarism Prevention? Check their website. Detection is only partly effective, in that it may deter some plagiarism. PowerResearcher, however, guides students through the research and writing process, then protects them from inadvertent plagiarism by highlighting anything copied from the Internet, displaying the URL and prompting them for a proper citation. It also automates citations! You can test their application for free – ASP online, or download a fully functional 30-day trial copy.

    sjgault@gmail.com

  • Pingback: Percept » SitePoint maakt grove fout

  • Jonathan

    As great as Copyscape is, I still prefer to use Google Alerts (http://www.google.com/alerts) for detecting plagiarism. Yes, you have to input a keyword phrase, but it constantly searches for your work and emails you the results. To get that kind of service from Copyscape you would have to pay for Copysentry and, even then, you’re limited in how often the search takes place and how many pages you can monitor.

    As for how to respond to plagiarism, there is no one right way. I offer my techniques on my site, but really everyone has their own methods. I just encourage people to see the fight as a long haul, something that has to be dealt with over a long period of time.

    Still, if anyone needs any help dealing with plagiarism, just drop me a line and I’ll gladly do what I can.

    Thank you for bringing attention to this very serious problem.

    Jonathan – http://www.plagiarismtoday.com

  • Pingback: Sponsor Actual

  • Pingback: Sascha Goebels WebLog » Blog Archive » SitePoint Blogs » Copyscape—Website Plagiarism Search

  • Err

    Which is best?
    Google Alerts or Copyscape?

    Alerts is more convenient, I think.

    Err, www.toys.cba.pl

  • dhom

    Copyscape uses the Google API.

  • Carl
  • Pingback: Blogfera: Tecnología, Internet y Actualidad. » De donde son las fuentes de tu articulo.?

  • mollila

    This is a great tool. I have caught several copyright violations already.

  • http://www.eukhost.com eukhost

    I have used Copyscape before and it’s definitely a great resource for keeping track of your site’s original content, especially if you have a content-rich site that’s been optimized for search engines.

  • The text link man

    I think Copyscape is crap. There, I said it.
    So why do I say this, you ask?

    Because I have tried writing articles on some highly used keyword terms. I wrote all the content in the article myself and guess what? Copyscape says I copied it.

    The fact is, with millions of sites out there based on the same highly used keyword terms, there is no new content under the internet sun. So I call Copyscape crap. I have a friend who creates new content sites, and if Copyscape claims something is copied he deletes it and starts over.
    One look and I see it says I copied when only a few words are the same???

    fire-place.org

  • The text link man

    Quick question, by the way: is copyscape.com the only website out there offering this kind of service?

    http://www.fire-place.us

  • http://www.sitepoint.com AlexW

    Because I have tried writing articles on some highly used keyword terms. I wrote all the content in the article myself and guess what? Copyscape says I copied it.

    It works by finding repeated phrases, rather than keywords. Phrasing varies far more than individual words do. It’s like DNA. We all have the four bases making up our DNA (G, A, T & C) but it’s the way they’re put together that makes them special.

    http://preview.tinyurl.com/y8zzj4

    Looks to me like Copyscape has done a good job. Are you saying you had never seen the ‘woodstove.johnsspot.com/’ site before you wrote your content?

  • http://www.preferatele.com bumbescu

    Good tool to stay away from others who want to copy your content

  • Sanju

    Copyscape really does a good job.

  • James S

    I personally use the http://www.copygator.com website to find duplicated content. To me it has a number of benefits over Copyscape:

    1. It’s automated and brings me results instead of me searching for duplicated content. All I had to do was submit my feed and it started monitoring it, showing me who’s republished my articles on the web.

    2. I get notified by email, so it contacts me when it finds copies of my articles online.

    3. I use their image badge feature to alert me directly on my website when my content is being lifted.

    4. It’s a free service, as opposed to the “per page” cost of Copyscape/Copysentry.