  1. #1
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    How to deal with duplicate forum posts/content

    Hi,

    Sometimes members join a forum I own and just copy parts, snippets or even whole articles from places like ezinearticles, wikipedia, etc. They then post them directly on my forum as a new thread or as a reply to another member's thread.

    Is there a way to get a script to automatically check for duplicate post content before it gets submitted to the forums, so that I don't have to keep filtering through every post by typing parts of it into Google just to check whether it's duplicate content?

    Doing this for 100 posts a day takes about an hour, but if I had 10 times the number of posts, it would take over my life.

    With a more automated way of checking, I could get on with other tasks.

    Thanks!

  2. #2
    SitePoint Enthusiast Adam Lutz's Avatar
    Join Date
    Jun 2010
    Posts
    63
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Since you are the owner, you have the power to delete any threads and replies you think were just copy-pasted. Try using Copyscape to check whether a thread or reply has been copied.

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,263
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by john278 View Post
    ... if I had 10 times the amount of posts, it would take over my life.
    you need 10 times as many moderators

    thanks to the surfeit of "seo experts" this type of spam is not going to stop, and your only choices are (1) exercise constant vigilance and swift banning, or (2) throw your forum open and let it be inundated with crap
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    you need 10 times as many moderators

    thanks to the surfeit of "seo experts" this type of spam is not going to stop, and your only choices are (1) exercise constant vigilance and swift banning, or (2) throw your forum open and let it be inundated with crap
    So there's no automated way to stop the duplicate posts, leaving me to check for crap only once the duplicates have been removed by the software/script?

    I use a good bit of automation on my site. For example, if I have to add a game a day, I would normally upload and write unique descriptions for about 60 games, then use a cron job to get one of those games approved into the site each day rather than having to go to the site daily to approve it.

  5. #5
    King of Paralysis by Analysis bronze trophy
    Join Date
    Jul 2004
    Location
    Ottawa, Canada
    Posts
    5,840
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Why do you care would be my question...

    If the post in question adds value to the forum then leave it, if it doesn't then delete it (regardless of whether it's duplicate content or not).

    Address the DMCA issues as they arise.

  6. #6
    Non-Member
    Join Date
    Oct 2008
    Posts
    174
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Look for more moderators to moderate your forum. Another thing: if they give a credit link to where they got the article, I guess that would be fine, but if not, it is considered stealing. Give warnings.

  7. #7
    Follow: @AlexDawsonUK silver trophybronze trophy AlexDawson's Avatar
    Join Date
    Feb 2009
    Location
    England, UK
    Posts
    8,111
    Mentioned
    0 Post(s)
    Tagged
    1 Thread(s)
    I guess one way you could deal with the issue is to make the forum invite-only (for posting) or open only to customers with valid payment IDs. While fewer people will contribute, you can wipe out all traces of spam. Or perhaps have moderators validate the first 5 posts of any individual who wants to contribute. If there were a way to eliminate spam, fluff, reposting, theft or anything else that occurs on forums, the inventor would be a very rich person indeed.
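
    A rough sketch of the "validate the first 5 posts" idea, in Python for illustration only. The `Member` class, the `submit` and `approve` functions and the in-memory lists are all hypothetical; a real forum would do this inside its own software and database.

```python
from dataclasses import dataclass

PROBATION_POSTS = 5  # threshold taken from the suggestion above


@dataclass
class Member:
    name: str
    approved_posts: int = 0  # posts a moderator has already approved


def needs_review(member, probation=PROBATION_POSTS):
    """New members stay in the moderation queue until enough posts are approved."""
    return member.approved_posts < probation


def submit(member, text, queue, published):
    """Queue the post for a moderator, or publish it straight away."""
    if needs_review(member):
        queue.append((member, text))
        return "queued"
    published.append((member.name, text))
    return "published"


def approve(queue, index, published):
    """Moderator action: publish a queued post and credit its author."""
    member, text = queue.pop(index)
    member.approved_posts += 1
    published.append((member.name, text))
```

    Moderators then only read the queue, so the workload follows the number of new members rather than the total number of posts.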

  8. #8
    SitePoint Enthusiast
    Join Date
    Sep 2009
    Location
    In The Matrix
    Posts
    26
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If you find it useful to your forum, just leave it right there, and if it gives links to where the article came from, don't mind it.

  9. #9
    Non-Member
    Join Date
    May 2010
    Location
    Manila, Philippines
    Posts
    86
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If you think the post is meaningless and it sounds spammy, you have the right to delete it.

  10. #10
    SitePoint Member
    Join Date
    Jul 2010
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Use or create a mod to detect duplicate threads.
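
    As a minimal sketch of what such a mod might do, assuming it only needs to catch posts that repeat text already on the same forum: hash each post after normalising case, punctuation and spacing, and look the hash up before accepting a new post. `fingerprint` and `DuplicateThreadIndex` are made-up names, and this does not check against outside sites such as article directories.

```python
import hashlib
import re


def fingerprint(text):
    """SHA-256 of the post with case, punctuation and spacing normalised."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()


class DuplicateThreadIndex:
    """Remembers the fingerprint of every post seen so far (in memory only)."""

    def __init__(self):
        self._seen = {}  # fingerprint -> id of the first post with that text

    def check_and_add(self, post_id, text):
        """Return the id of an earlier identical post, or None if the text is new."""
        key = fingerprint(text)
        if key in self._seen:
            return self._seen[key]
        self._seen[key] = post_id
        return None
```

    This only catches word-for-word repeats; a post with a few words changed gets a different hash.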

  11. #11
    SitePoint Zealot
    Join Date
    Mar 2009
    Posts
    126
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Not much you can do

    1. Live with it
    2. Change your forum's code (PHP? ASP?) so that it checks for duplicate content.
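
    One possible shape for option 2, sketched in Python rather than PHP or ASP. It compares overlapping five-word runs ("shingles") of a new post against texts you already hold, and flags the post when most of its runs also appear in one of them. The function names, the shingle size and the 0.6 threshold are assumptions to tune, and `known_texts` has to be a collection you supply yourself (existing forum posts, or articles you have saved); nothing here searches the web.

```python
import re


def shingles(text, size=5):
    """Set of overlapping word runs of the given size."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(new_post, source, size=5):
    """Fraction of the new post's shingles that also occur in the source."""
    new = shingles(new_post, size)
    if not new:
        return 0.0
    return len(new & shingles(source, size)) / len(new)


def looks_copied(new_post, known_texts, threshold=0.6, size=5):
    """True if any known text contains most of the new post."""
    return any(containment(new_post, t, size) >= threshold for t in known_texts)
```

    Measuring how much of the new post is contained in the source, rather than how similar the two are overall, means a short snippet lifted from a long article still scores high.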

  12. #12
    SitePoint Zealot
    Join Date
    Jul 2010
    Posts
    100
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, the part about more moderators is good advice, if you can afford it, but you may also try a really simple method. Put all the threads and replies in a Word file and search for similar posts using some keywords: press Ctrl+F (Find) and select Find All with those keywords.
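
    The same keyword search can be scripted instead of done by hand. A small sketch, assuming the posts have been exported into a dictionary of post id to text (the `posts` structure and the `find_all` name are made up for illustration):

```python
def find_all(posts, keywords):
    """Return the ids of posts that contain every keyword, ignoring case."""
    wanted = [k.lower() for k in keywords]
    return [
        post_id
        for post_id, text in posts.items()
        if all(k in text.lower() for k in wanted)
    ]


# Hypothetical exported posts
posts = {
    1: "Buy cheap SEO services",
    2: "How do I write SQL joins?",
    3: "Cheap seo tips",
}
matches = find_all(posts, ["cheap", "seo"])
```

    Like Find All, this is a plain substring match, so "seo" would also match inside a longer word.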

