Thread: SiteSucker

  1. #1
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,764
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    SiteSucker

    Is there a way to stop an app like SiteSucker from "cloning" your entire website without your knowing or your permission?

    Sincerely,


    Debbie

  2. #2
    It's all Geek to me silver trophybronze trophy
    ralph.m's Avatar
    Join Date
    Mar 2009
    Location
    Melbourne, AU
    Posts
    24,165
    Mentioned
    453 Post(s)
    Tagged
    8 Thread(s)
    There certainly is! Don't put your site online.

  3. #3
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,764
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by ralph.m View Post
    There certainly is! Don't put your site online.
    Some of you may not know this, but several years ago Ralph failed miserably as a stand-up comedian.

    He was thrown into an alleyway, but fortunately Hawk came along and gave him a role here at SitePoint.

    And ever since, we have been "blessed" with his comedic ways...

    God Bless the misfits of this world...


    Debbie

  4. #4
    It's all Geek to me silver trophybronze trophy
    ralph.m's Avatar
    Join Date
    Mar 2009
    Location
    Melbourne, AU
    Posts
    24,165
    Mentioned
    453 Post(s)
    Tagged
    8 Thread(s)
    Who said I was being a comedian? Once a page has loaded in a user's browser, they have your HTML, your images, your CSS, your videos, your JavaScript. It's as simple as that. They can copy it all, download it all. Sure, a tool like the one you mentioned might make the process quicker, but there's no real difference.

    So just live with it. There are more important things to worry about, and really, to think that someone even wants to copy your site is a bit of a stretch; some would call it 'self flattery', shall we say. You are better off just getting on with running a business and taking reasonable steps to back up your site etc. There will always be criminals and scammers out there, and you have to accept that, just like you accept viruses and deal with them when you get sick. It's just the world we live in.

  5. #5
    SitePoint Mentor bronze trophy
    John_Betong's Avatar
    Join Date
    Aug 2005
    Location
    City of Angels
    Posts
    1,819
    Mentioned
    73 Post(s)
    Tagged
    6 Thread(s)
    Quote Originally Posted by DoubleDee View Post
    Is there a way to stop an app like SiteSucker from "cloning" your entire website without your knowing or your permission?

    Sincerely,


    Debbie
    How do you know you are being cloned? Is SiteSucker in your server logs?

    Is your site material all your own work? If so then with a bit of luck Google will recognise your original content and increase your "SEO Brownie Points" because of the links to your site.

    Copying is not theft: http://youtu.be/IeTybKL1pM4

  6. #6
    Mouse catcher silver trophy Stevie D's Avatar
    Join Date
    Mar 2006
    Location
    Yorkshire, UK
    Posts
    5,888
    Mentioned
    122 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by DoubleDee View Post
    Is there a way to stop an app like SiteSucker from "cloning" your entire website without your knowing or your permission?
    Quote Originally Posted by ralph.m View Post
    There certainly is! Don't put your site online.
    Exactly. Once you've put your site online, you've made it available for other people to copy. Sure, you can keep an eye on things and use various plagiarism checkers, then hunt them down and kill them with spears or lawyers (whichever is more effective), report them to Google and their hosting providers ... but that's all a bit "shutting the stable door".

    Over the years we've had a few problems with scraper sites stealing content off SPF, and it comes down to a decision each time how much effort you are prepared to put into pursuing each case and how much you are losing by it.

    Of course, there is a third option, which is to put it behind a paywall, or at least a restricted registration zone, but as that will play hell with your Google rankings and usability, it's really a last resort.

  7. #7
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,764
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by ralph.m View Post
    Who said I was being a comedian?
    You're so coy, Ralph!


    Quote Originally Posted by ralph.m View Post
    Once a page has loaded in a user's browser, they have your HTML, your images, your CSS, your videos, your JavaScript. It's as simple as that. They can copy it all, download it all. Sure, a tool like the one you mentioned might make the process quicker, but there's no real difference.
    Right. And what was *implied* by my OP is that it is scary to think that there are tools out there which can whiz through your site and make a carbon copy of it in minutes, if not seconds.

    So I was wondering if there was any easy way to detect that and stop it.

    Admittedly, if Jane User is bored - or nefarious - and wants to spend all month manually saving each page on my website, there is little I can do.

    But if someone fires off a script that automates that process, then maybe there is some way to detect it and "cut it off at the pass"?


    Quote Originally Posted by ralph.m View Post
    So just live with it. There are more important things to worry about, and really, to think that someone even wants to copy your site is a bit of a stretch—some would call it 'self flattery', shall we say.
    If a person's website is mostly content, and someone could make a carbon copy of it in minutes, and then publish your content under a new domain name (e.g. www.DebbiesKnockOffSite.com) then that would be a big issue.

    In fact, I JUST read a really fascinating article in Inc. magazine - old paper copy from my accountant, so sorry, no links - that talked about this guy in Germany that was basically stealing people's entire websites, changing the logos, and creaming his competition. (Dude is a multi-millionaire doing this.)

    Granted, I think what he was doing was paying people in some 3rd world country to code exact copies of the sites he was stealing, but similar concept...


    Quote Originally Posted by ralph.m View Post
    You are better off just getting on with running a business and taking reasonable steps to back up your site etc. There will always be criminals and scammers out there, and you have to accept that, just like you accept viruses and deal with them when you get sick. It's just the world we live in.
    Understood, but I'm just trying to think 10 steps ahead, and looking for ways to cut off the bad guys.

    (If a neighbor of yours tapped into your water line, you'd notice your water bill going up and investigate. Maybe there is some counter-tool out there that detects when someone is scanning your directories and making massive copies of things?! I dunno. Maybe that is wishful thinking?)

    Sincerely,


    Debbie

  8. #8
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,764
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by John_Betong View Post
    How do you know you are being cloned? Is SiteSucker in your server logs?
    No, right now my website is safe from everyone, because it is eternally stuck in "Development Mode" on my laptop!!!

    But s-o-m-e-d-a-y it will be on the Internet...


    Quote Originally Posted by John_Betong View Post
    Is your site material all your own work?
    110%


    Quote Originally Posted by John_Betong View Post
    If so then with a bit of luck Google will recognise your original content and increase your "SEO Brownie Points" because of the links to your site.
    That's my plan.


    Quote Originally Posted by John_Betong View Post
    Copying is not theft: http://youtu.be/IeTybKL1pM4
    Wow! That is one *funky* video?!

    Not sure who did that or what they were trying to say/prove, but I bet you the RIAA would disagree that "Copying Is Not Theft"...

    Sincerely,


    Debbie

  9. #9
    Life is not a malfunction gold trophysilver trophybronze trophy
    TechnoBear's Avatar
    Join Date
    Jun 2011
    Location
    Argyll, Scotland
    Posts
    6,151
    Mentioned
    262 Post(s)
    Tagged
    5 Thread(s)
    Quote Originally Posted by DoubleDee View Post
    So I was wondering if there was any easy way to detect that and stop it.
    As Ralph says, once the site's on-line, there is no way to absolutely guarantee that it won't be copied. However, you could make it more difficult by adding something like a black hole or Crawl Protect. (Don't be put off by the curious English on the Crawl Protect site; the author is not a native speaker.)
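
    To make the "black hole" idea a bit more concrete, here is a rough sketch of how such a trap is commonly described: a URL is disallowed in robots.txt and never shown to human visitors, so any client that requests it anyway is presumed to be a misbehaving crawler and is refused from then on. This is only an illustration of the general technique, not the code of any particular black hole script or of Crawl Protect. The class name `BlackHole`, the trap path `/blackhole/`, and the in-memory ban set are all assumptions made up for the example.

```python
# Hypothetical sketch of a "black hole" trap for crawlers that ignore robots.txt.
# All names here (BlackHole, TRAP_PATH) are illustrative assumptions.

TRAP_PATH = "/blackhole/"  # assumed trap URL; never linked visibly for humans


class BlackHole:
    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.banned = set()  # in-memory only; a real site would persist this

    def robots_txt(self):
        # Well-behaved crawlers read this and stay out of the trap.
        return f"User-agent: *\nDisallow: {self.trap_path}\n"

    def handle(self, client_ip, path):
        """Return the HTTP status code to send for this request."""
        if client_ip in self.banned:
            return 403  # already caught earlier: refuse everything
        if path.startswith(self.trap_path):
            self.banned.add(client_ip)  # walked into the trap: ban it
            return 403
        return 200  # ordinary request, serve as usual


if __name__ == "__main__":
    bh = BlackHole()
    print(bh.handle("203.0.113.7", "/index.html"))   # normal page
    print(bh.handle("203.0.113.7", "/blackhole/x"))  # hits the trap
    print(bh.handle("203.0.113.7", "/index.html"))   # now refused
```

    Note that this keys the ban on the client's IP address, so it will not stop a scraper that rotates addresses, and a tool configured to honour robots.txt will never trigger it at all.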

  10. #10
    Programming Team silver trophybronze trophy
    Mittineague's Avatar
    Join Date
    Jul 2005
    Location
    West Springfield, Massachusetts
    Posts
    17,141
    Mentioned
    190 Post(s)
    Tagged
    2 Thread(s)
    I know Flood Control works well for limiting form submits. I wonder if there's anything like that for GET requests?
    Crawl Protect looks like it relies on User-Agent to block crawlers, but User-Agent isn't reliable.
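
    For what it's worth, the same flood-control idea can be applied to GET requests by counting hits per client address instead of trusting User-Agent. Below is a minimal sketch of a sliding-window limiter, assuming the client IP is available to the application. The class name `SlidingWindowLimiter` and the default thresholds (30 requests per 10 seconds) are invented for illustration, not taken from Flood Control or any other existing tool.

```python
# Hypothetical sliding-window rate limiter keyed on client IP.
# Names and thresholds are illustrative assumptions, not an existing library.
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, max_requests=30, window_seconds=10.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent hits

    def allow(self, client_ip):
        """Return True if this request is within the limit, False if it should be refused."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Forget hits that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent requests from this address
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10.0)
    for i in range(5):
        print(i, limiter.allow("198.51.100.20"))
```

    A site-copying tool that fetches hundreds of pages in a few seconds would trip a limit like this quickly, while a human reader normally would not. The thresholds need care, though: set them too low and you also lock out search engine crawlers and visitors behind a shared address.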

