  1. #1
    SitePoint Evangelist John D's Avatar
    Join Date
    Jun 2003
    Location
    Derry, Ireland
    Posts
    426

    How does archive.org / waybackmachine save sites?

    Hey,

    Does anyone know how the sites are saved?

    Unfortunately, I lost one of my sites to a hack over a year ago. I finally got a backup restored from an old hard disk, but the information in it is about a year out of date.

    The design changed completely between the date of my backup and the last version archive.org captured before the hack.

    I can manually save each image and just replace them. I still have a lot of the content in the backup, but there is additional content on archive.org. Is there a quicker way to do this?
    The last archived page shows about 2000 additional members and a LOT more posts, comments, etc. (it is a forum and social network). Is there any way I can retrieve these, or do I have to use what I have in the backup and just copy the design from the archived page?
    CanaryHotspot.com - Canary Island forum and information
    InternationalChatForum.com - International Travel Chat Forum and information

  2. #2
    Mouse catcher silver trophy Stevie D's Avatar
    Join Date
    Mar 2006
    Location
    Yorkshire, UK
    Posts
    5,881
    The Wayback Machine only saves the HTML, CSS and image output; it doesn't save any back-end code. And because of the way it stores the various files, I don't think there is any way to retrieve them en masse. As far as I know, it is just a case of going through and getting each resource one at a time.
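
    That said, the Wayback Machine does publish a CDX index that lists the captures it holds for a site, which may cut down on the clicking. The following is only a rough sketch in Python: the function names and the example.com domain are made up for illustration, the query parameters are the ones I understand the CDX endpoint to accept, and the `id_` suffix is, as far as I know, how you ask for a capture without the archive's rewritten links. It only builds the query and turns a response into download links; it does not fetch anything itself.

```python
import json
from urllib.parse import urlencode

# Public CDX index endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, mime_regex="image/.*"):
    """Build a query URL listing archived image captures for a domain.

    `domain` is whatever site you are recovering (hypothetical here).
    """
    params = {
        "url": domain,
        "matchType": "domain",            # include every page under the domain
        "output": "json",                 # rows come back as a JSON array
        "fl": "timestamp,original",       # only the two fields we need
        "filter": f"mimetype:{mime_regex}",
        "collapse": "urlkey",             # one row per distinct URL
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def raw_capture_urls(cdx_json_text):
    """Turn a CDX JSON response into direct links to the archived files."""
    rows = json.loads(cdx_json_text)
    urls = []
    # With output=json the first row is a header, so skip it.
    for timestamp, original in rows[1:]:
        urls.append(f"https://web.archive.org/web/{timestamp}id_/{original}")
    return urls
```

    You would open the URL from `build_cdx_query("example.com")` in a browser or fetch it with any HTTP client, then pass the response text to `raw_capture_urls`. None of this brings back the database behind the site, only files that were actually crawled.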

  3. #3
    SitePoint Evangelist John D's Avatar
    Join Date
    Jun 2003
    Location
    Derry, Ireland
    Posts
    426
    Thanks Stevie.

    I was thinking that, but hoping there was some way to do it.

    I'm happy if I can get the image files. Even if my database is a lot older, it is better than nothing, and I can still use the images with the template files.

    If anyone does know of a way to get all the images, that would be great, rather than having to right-click and save each one individually.

    Thanks again Stevie!

  4. #4
    Mouse catcher silver trophy Stevie D's Avatar
    Join Date
    Mar 2006
    Location
    Yorkshire, UK
    Posts
    5,881
    If the images are all embedded in the page (i.e., the <img> tag calls up the full image rather than a thumbnail), most browsers have a "save complete page" option, which downloads the page and all linked CSS, scripts and images.
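
    If saving page by page gets tedious, the same idea can be scripted. Here is a minimal sketch in Python, using only the standard library, that reads the HTML of a page you have already saved and lists every image the <img> tags point at. The class and function names and the example.com address are my own placeholders, and it only looks at `src` attributes, so CSS background images would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the absolute URL of every <img src="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # address the page was saved from
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            absolute = urljoin(self.base_url, src)  # resolve relative paths
            if absolute not in self.images:         # skip repeats
                self.images.append(absolute)


def list_images(html_text, base_url):
    """Return the image URLs found in html_text, in page order."""
    collector = ImageCollector(base_url)
    collector.feed(html_text)
    return collector.images
```

    For example, `list_images(open("saved.html").read(), "https://example.com/forum/index.html")` would give you a list of addresses to hand to whatever download tool you prefer.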

  5. #5
    SitePoint Wizard webcosmo's Avatar
    Join Date
    Oct 2007
    Location
    Boston, MA
    Posts
    1,480
    I think you should protect yourself in the future from this, by getting a host that keeps backups for the whole site, like hawkhost does ...

  6. #6
    SitePoint Evangelist John D's Avatar
    Join Date
    Jun 2003
    Location
    Derry, Ireland
    Posts
    426
    Quote Originally Posted by webcosmo View Post
    I think you should protect yourself in the future from this, by getting a host that keeps backups for the whole site, like hawkhost does ...
    Believe me, it was a high priority when I chose a host. I was paying monthly for backups and the host wasn't doing them. I obviously wasn't happy.

    But I did get 1 month's worth of hosting credit.
    How generous!

    Thanks again Stevie, I will have a look at that now.
    I am using Chrome, but I'm happy to get another browser if it can save the page.

  7. #7
    Programming Since 1978 silver trophybronze trophy felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, NSW, Australia
    Posts
    16,788
    Quote Originally Posted by webcosmo View Post
    I think you should protect yourself in the future from this, by getting a host that keeps backups for the whole site, like hawkhost does ...
    You shouldn't rely on someone else for something so important. You should always keep backups of your site yourself.

    If you use a custom CMS, then you should take your own backup each time you change the layout and also back up the database regularly (how frequently depends on how often it is updated).

    If you use static pages, then the working copy on your own computer is the original and the copy on the web site is an offsite backup.

    The copies on your computer will then, of course, be included in the regular backups of that computer, so you should have at least the last two or three versions of everything.
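
    Keeping only the last few versions is easy to automate. As a rough sketch in Python, assuming each backup is a single archive whose file name carries a sortable date (for example site-20130105.tar.gz, a naming scheme I have made up for illustration), something like this would keep the newest three and delete the rest:

```python
from pathlib import Path


def stale_backups(names, keep=3):
    """Return the backup names that fall outside the newest `keep`.

    Relies on the names sorting chronologically (date in the file name).
    """
    ordered = sorted(names)
    return ordered[:-keep] if keep > 0 else ordered


def prune_backups(backup_dir, keep=3, pattern="site-*.tar.gz"):
    """Delete all but the newest `keep` backups in backup_dir."""
    folder = Path(backup_dir)
    names = [p.name for p in folder.glob(pattern)]
    for name in stale_backups(names, keep):
        (folder / name).unlink()  # permanently removes the old archive
```

    Run `prune_backups` only after the new backup has been written and checked, so a failed backup never pushes out a good one.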
    Stephen J Chapman

    javascriptexample.net, Book Reviews, follow me on Twitter
    HTML Help, CSS Help, JavaScript Help, PHP/mySQL Help, blog
    <input name="html5" type="text" required pattern="^$">

