How does archive.org / waybackmachine save sites?

Hey,

Does anyone know how the sites are saved?

Unfortunately I lost one of my sites to a hack over a year ago. I finally got a backup restored from an old hard disk, but the information is about a year out of date.

The design was totally changed between the time of my backup and the version archive.org shows from just before the hack.

I can manually save each image and just replace them. I still have a lot of the content in the backup, but there is additional content on archive.org. Is there a quicker way to do this?
The last page on archive.org has about 2000 additional members and a LOT more posts, comments etc. (it is a forum and social network). Is there any way I can retrieve these, or do I have to just use what I have in the backup and change the design to match the archive page?

The Wayback Machine only saves the HTML, CSS and image output; it doesn’t save any back-end code or database contents. And because of the way it stores the various files, I don’t think there is any way to retrieve them en masse. As far as I know, it is just a case of going through and getting each resource one at a time.

Thanks Stevie.

I was thinking that… I was hoping there was some way to do it.

I’m happy if I can get the image files. Even if my database is a lot older, it is better than nothing, and I can still use the images with the template files.

If anyone does know of a way to get all the images at once, that would be great, rather than having to right-click and save each one individually.

Thanks again Stevie!

If the images are all embedded in the page (i.e., the <img> tag calls up the full image rather than a thumbnail), most browsers have a “save complete page” option, which downloads the page and all linked CSS, scripts and images.
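
If the browser option misses anything, a small script can list every image a saved page refers to. This is only a rough sketch using Python's standard library; the names `ImageCollector` and `list_images` are made up for illustration, and it assumes you already have the page's HTML as a string and know the URL it was saved from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL the page was saved from (assumption)
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL.
            url = urljoin(self.base_url, src)
            # Skip duplicates while keeping the original order.
            if url not in self.image_urls:
                self.image_urls.append(url)


def list_images(html, base_url):
    """Return the absolute URLs of all images referenced in the HTML."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.image_urls
```

Each URL in the returned list could then be fetched with something like `urllib.request.urlretrieve(url, filename)`. Thumbnails that only link to a full-size image would still need handling separately, since this looks at <img> tags only.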

I think you should protect yourself in the future from this, by getting a host that keeps backups for the whole site, like hawkhost does …

Believe me… it was a high priority when choosing a host. I was paying monthly for backups and the host wasn’t doing them… I obviously wasn’t happy.

But I did get 1 month’s worth of my hosting credited :o
How generous!

Thanks again Stevie, I will have a look at that now.
I am using Chrome but am happy to get another browser if it can save the page :)

You shouldn’t rely on someone else for something so important. You should always keep backups of your site yourself.

If you use a custom CMS then you should take your own backup each time you change the layout, and also back up the database regularly (how frequently depends on how often it is updated).
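
As a rough illustration of taking your own backup, something like the following would pack a local copy of the site files into a dated archive. It is a minimal sketch, not a complete backup solution: `backup_site`, the `label` parameter and the file naming scheme are all made up for this example, and the database would still need to be exported separately with whatever dump tool your database provides.

```python
import tarfile
import time
from pathlib import Path


def backup_site(site_dir, backup_dir, label="site"):
    """Pack site_dir into a timestamped .tar.gz inside backup_dir."""
    site_dir = Path(site_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the name keeps each backup separate from earlier ones.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = backup_dir / f"{label}-{stamp}.tar.gz"

    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname stores paths relative to the site folder's own name.
        archive.add(str(site_dir), arcname=site_dir.name)
    return archive_path
```

Calling `backup_site("mysite", "backups")` after each layout change would leave one archive per change in the `backups` folder (both folder names are placeholders).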

If you use static pages then the copy you work on that is on your own computer will be the original and the copy on the web site will be an offsite backup.

All the copies on your computer will of course then be included in the regular backups that you do of your computer so that you should have at least the last two or three versions of everything.
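
Keeping only the last few versions can also be automated. The sketch below assumes the backups are `.tar.gz` files whose names contain a sortable timestamp, such as `site-20200101-120000.tar.gz` (a hypothetical naming scheme); `prune_backups` is an illustrative name, not part of any tool.

```python
from pathlib import Path


def prune_backups(backup_dir, keep=3, pattern="*.tar.gz"):
    """Delete all but the `keep` newest archives; return the deleted paths."""
    # Sorting by name works only because the timestamp in the name
    # is assumed to sort chronologically.
    archives = sorted(
        Path(backup_dir).glob(pattern),
        key=lambda p: p.name,
        reverse=True,
    )
    removed = []
    for old in archives[keep:]:
        old.unlink()
        removed.append(old)
    return removed
```

With `keep=3`, the three most recent archives stay and anything older is deleted, so it is worth testing on a copy of the backup folder first.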