Automating Amazon EC2 Instance Backup and Recovery, Part One

Part One: Scheduling Snapshots Using Cron

Skill Level: Intermediate
Operating System(s): Linux

Philosophy

99.999% Uptime. That is a wonderful goal, and in order to get there we must plan for the inevitable outages and problems that cause our servers to break.

The best we can do is to be prepared with the Three P’s: Planning, Process, and Practice.

So how do we plan for the worst? Make backups, of course! The best kind are those that:

  • Run automatically,
  • Are stored someplace other than the server that is being backed up,
  • Are fast and easy to restore,
  • Don’t cost a fortune to run or keep.

Since we are on AWS, that is very easy to accomplish.

The Plan

Configure a daily event to create a backup copy of our entire EC2 instance.

We will use cron to schedule an AWS Elastic Block Store (EBS) snapshot. A snapshot is a point-in-time image of an entire EBS volume, taken at the moment the snapshot is run, and it may be used later to recover the instance in the event of a failure. Note that only AWS instances that are EBS-backed may use this method.

Prerequisites:

  • access to your AWS Account » Security Credentials page
  • access to the AWS Management Console Web GUI
  • root access to a running AWS EC2 EBS-backed Linux instance – for example:
    ssh -i AWSKeyPair.pem ec2-user@{yourInstance_FQDN_or_IP}
    % sudo su
    root@yourInstance#
    

Install the Needed Tools

In order to do this we will need the following things:

AWS X.509 Certificate and Private Key Files

These two files allow your server to execute AWS EC2 instance commands securely.

They are obtained from the AWS Account » Security Credentials page. Towards the bottom of the page locate three tabs: “Access Keys”, “X.509 Certificates”, and “Key Pairs”. Once found, click the middle one, “X.509 Certificates”.

Next, click the link for “Create a new Certificate” and a new window will appear with two orange download buttons, “Download Private Key File” and “Download X.509 Certificate”. You get only one opportunity to download your Private Key File, so be careful to note where this file downloads to! Also download your certificate file, which you may download again at any time.

The private key file will be in the form of:
pk-{your_specific_32_character_random_string}.pem

and the certificate file will look like:
cert-{your_specific_32_character_random_string}.pem

Next, upload the two files to your AWS EC2 instance:

~/Downloads% scp -i AWSKeyPair.pem pk-*.pem ec2-user@{yourInstance_FQDN_or_IP}:
~/Downloads% scp -i AWSKeyPair.pem cert-*.pem ec2-user@{yourInstance_FQDN_or_IP}:

Finally, you will need to copy the two files to your root user’s home directory:

root@yourInstance# cd
root@yourInstance# mkdir .ec2/
root@yourInstance# chmod 700 .ec2/
root@yourInstance# cp ~ec2-user/*.pem .ec2/
root@yourInstance# ls -l .ec2/
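
Since the private key is a credential, it may be worth tightening the permissions on the copied files as well as on the directory. The following is a minimal sketch, assuming the /root/.ec2 layout used above; the secure_keys function name is hypothetical and not part of any AWS tool.

```shell
#!/bin/sh
# Hypothetical helper: restrict a key directory and every .pem file in it to the owner.
secure_keys() {
    dir="$1"
    chmod 700 "$dir" || return 1
    for f in "$dir"/*.pem; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        chmod 600 "$f"
    done
}

# Example (run as root): secure_keys /root/.ec2
```

Once the keys are confirmed to work from /root/.ec2, the copies left in the ec2-user home directory can be deleted.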

Amazon EC2 API Tools

These are the actual commands that run on your instance to allow you to create the snapshot (and restore it too).

root@yourInstance# yum install aws-apitools-ec2 ec2-utils

The above command installs the API Tools into /opt/aws/apitools/ec2-1.4.4.2 and creates a symlink to it:
lrwxrwxrwx 1 root root 13 Oct 20 01:57 /opt/aws/apitools/ec2 -> ./ec2-1.4.4.2

The yum command above also installs three EC2 utility files:
/etc/udev/rules.d/51-ec2-hvm-devices.rules
/opt/aws/bin/ec2-metadata
/sbin/ec2udev

If yum is not available, you may download and manually install both:
API Tools: http://aws.amazon.com/developertools/351
EC2 Utilities: http://aws.amazon.com/code/1825?_encoding=UTF8&jiveRedirect=1

Get the latest Sun/Oracle version of the Java JDK

Download the appropriate rpm from: http://www.oracle.com/technetwork/java/javase/downloads/jdk-7u1-download-513651.html

Intel x86 arch: http://download.oracle.com/otn-pub/java/jdk/7u1-b08/jdk-7u1-linux-i586.rpm
AMD x64 arch: http://download.oracle.com/otn-pub/java/jdk/7u1-b08/jdk-7u1-linux-x64.rpm

root@yourInstance# wget -Ojdk-7u1-linux-i586.rpm http://download.oracle.com/otn-pub/java/jdk/7u1-b08/jdk-7u1-linux-i586.rpm

To Install:

root@yourInstance# rpm -i jdk-7u1-linux-i586.rpm

or, to Upgrade:

root@yourInstance# rpm -U jdk-7u1-linux-i586.rpm

To set the $JAVA_HOME environment variable properly for this command-line session only:

root@yourInstance# export JAVA_HOME=/usr/java/latest
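
Before moving on, a quick sanity check of JAVA_HOME can save some debugging later, since the API tools will not run without a working Java. This is only a sketch; the check_java_home function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory holds an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks usable: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java" >&2
fi
```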

Scripts

These scripts help automate each step of the process, providing key bits of the workflow that allow us to quickly get the job done.

IMPORTANT: Please create each of these scripts in the /opt/bin directory (create it first with mkdir -p /opt/bin if it does not exist), making sure to insert your specific values wherever you see curly brackets {}.

Once all of the scripts are in place, be sure to set execute permissions on them:

root@yourInstance# chmod 750 /opt/bin/*

/opt/bin/instanceid

The instanceid script simply gets the specific ID of this AWS instance, which we will need for later steps.

#!/bin/sh
#
### /opt/bin/instanceid
#
/opt/aws/bin/ec2-metadata -i | /bin/awk '{print $2}'
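
To see what the awk step is doing, here is a small sketch that runs the same filter over a sample line. It assumes ec2-metadata -i prints a line of the form "instance-id: <id>"; the sample ID is borrowed from the example output later in this article.

```shell
#!/bin/sh
# Sample line in the "instance-id: <id>" shape that ec2-metadata -i is assumed to print.
sample='instance-id: i-2c765a02'

# awk splits on whitespace, so field 2 is the bare instance ID.
instance_id=$(printf '%s\n' "$sample" | awk '{print $2}')
echo "$instance_id"
```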

/opt/bin/ec2do

The core shell script is ec2do, which allows any other calling script to get working access to the AWS API tools without any special setup. We do this because the cron scheduling command has a notoriously limited environment, so any environment variable that you set in ~/.bash_profile will normally not be loaded by a script run from cron.

Please be sure to modify the example below to include the 32-character random string from your X.509 private key and certificate file names.

#!/bin/bash
#
### /opt/bin/ec2do
#
## EXAMPLE:
## ec2-describe-volumes
## Becomes:
## ec2do describe-volumes
#
export EC2_HOME='/opt/aws/apitools/ec2'  # Make sure you use the API tools, not the AMI tools
export EC2_BIN=$EC2_HOME/bin
export EC2_PRIVATE_KEY=/root/.ec2/pk-{your_32_char_rand}.pem
export EC2_CERT=/root/.ec2/cert-{your_32_char_rand}.pem
export PATH=$PATH:$EC2_BIN
export JAVA_HOME=/usr/java/latest
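# Take the first argument as the command suffix and pass the rest through intact,
# so that quoted arguments (such as a --description containing spaces) survive.
cmd="$1"
shift
"$EC2_BIN/ec2-$cmd" "$@"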
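
The limited cron environment can be approximated from an ordinary shell. The sketch below uses env -i, which starts a child process with an empty environment; real cron sets a handful of variables of its own, so treat this as a rough stand-in rather than an exact reproduction.

```shell
#!/bin/sh
# Set a variable the way a login profile might.
export EC2_HOME=/opt/aws/apitools/ec2

# A normal child process inherits it.
with_env=$(sh -c 'echo "${EC2_HOME:-unset}"')

# env -i starts the child with an empty environment, roughly what a cron job gets.
without_env=$(env -i /bin/sh -c 'echo "${EC2_HOME:-unset}"')

echo "inherited: $with_env"
echo "cron-like: $without_env"
```

This is why ec2do exports everything it needs itself instead of relying on the caller.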

/opt/bin/volumes

#!/bin/sh
#
### /opt/bin/volumes
#
/opt/bin/ec2do describe-volumes | /bin/grep ATTACHMENT | /bin/grep `/opt/bin/instanceid`

Outputs one or more lines like the following:
ATTACHMENT vol-424ebd4a i-2c765a02 /dev/sda1 attached 2011-08-29T19:08:20+0000
ATTACHMENT vol-88c73d2f i-2c765a02 /dev/sdb1 attached 2011-08-29T19:08:20+0000

The second column contains the volume IDs ({volumeID}) you will need to create the snapshots.
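
If you would rather not copy the IDs by hand, the second column can be pulled out with awk. This sketch works on the sample lines shown above; in practice the input would come from /opt/bin/volumes.

```shell
#!/bin/sh
# Sample output lines copied from the article; real output comes from /opt/bin/volumes.
sample='ATTACHMENT vol-424ebd4a i-2c765a02 /dev/sda1 attached 2011-08-29T19:08:20+0000
ATTACHMENT vol-88c73d2f i-2c765a02 /dev/sdb1 attached 2011-08-29T19:08:20+0000'

# Print only the second whitespace-separated field of each ATTACHMENT line.
volume_ids=$(printf '%s\n' "$sample" | awk '$1 == "ATTACHMENT" {print $2}')
echo "$volume_ids"
```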

If you have gotten this far, then you have successfully installed your X.509 keys and used the API Tools to get information about this instance. Congratulations!
If you do not get output from this command, please stop and recheck each of the previous steps. It is ESSENTIAL that the /opt/bin/ec2do describe-volumes command return information.

/opt/bin/volsnap

The /opt/bin/volsnap script is what actually runs the backup snapshots, and it is really quite simple: one line to get the current date, and one line for each volume to back up.

Please replace {volumeID_X} with the actual volume IDs returned from the /opt/bin/volumes command above. Also replace {yourInstance} with the hostname or any other identifying string you care to use.

#!/bin/sh
#
### /opt/bin/volsnap
#
DATE=`/bin/date '+%Y%m%d%H%M%S'`
/opt/bin/ec2do create-snapshot {volumeID_1} --description "{yourInstance}-{volumeID_1}-$DATE"
/opt/bin/ec2do create-snapshot {volumeID_2} --description "{yourInstance}-{volumeID_2}-$DATE"

Obviously, a much more complex script than this can be written (and has been), but that is a tad beyond the scope of this tutorial ;-}
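
As one small step in that direction, here is a sketch of a loop-based variant that snapshots every volume reported by /opt/bin/volumes instead of hard-coding the IDs. It is an illustration under assumptions, not a tested replacement: LABEL, VOLUMES_CMD, EC2DO_CMD and snapshot_all are names invented for this example, and it assumes the ATTACHMENT output format shown earlier.

```shell
#!/bin/sh
# Sketch of a loop-based volsnap. LABEL is a placeholder for your own instance name.
LABEL="${LABEL:-myinstance}"
VOLUMES_CMD="${VOLUMES_CMD:-/opt/bin/volumes}"   # prints ATTACHMENT lines
EC2DO_CMD="${EC2DO_CMD:-/opt/bin/ec2do}"         # wrapper around the API tools

snapshot_all() {
    date_stamp=$(date '+%Y%m%d%H%M%S')
    # Field 2 of each ATTACHMENT line is the volume ID.
    "$VOLUMES_CMD" | awk '$1 == "ATTACHMENT" {print $2}' | while read -r vol; do
        # Redirect stdin so the command cannot swallow the remaining volume list.
        "$EC2DO_CMD" create-snapshot "$vol" \
            --description "$LABEL-$vol-$date_stamp" </dev/null
    done
}

# Call snapshot_all at the end of the script when using this as /opt/bin/volsnap.
```

The advantage is that newly attached volumes are picked up without editing the script; the trade-off is that a failure in the volumes step silently produces no snapshots, so the log is worth checking.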

Run the script now manually to verify that it is working:

root@yourInstance# /opt/bin/volsnap

You should see something that looks similar to this:
SNAPSHOT snap-36380592 vol-424ebd4a pending 2011-10-20T03:46:57+0000 510579120428 8
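
If you want to pick out the snapshot ID and status from that output, for example to log them, the same awk approach applies. This sketch reads the sample line above and assumes the field order shown there.

```shell
#!/bin/sh
# Sample line copied from the article's example output.
sample='SNAPSHOT snap-36380592 vol-424ebd4a pending 2011-10-20T03:46:57+0000 510579120428 8'

# Fields, in the order shown in that sample: record type, snapshot ID, volume ID, status.
snapshot_id=$(printf '%s\n' "$sample" | awk '$1 == "SNAPSHOT" {print $2}')
status=$(printf '%s\n' "$sample" | awk '$1 == "SNAPSHOT" {print $4}')
echo "$snapshot_id is $status"
```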

Log into the AWS console and navigate to EC2 » Snapshots and you should be able to see the snapshots you just ran.

You can also run:

/opt/bin/ec2do describe-snapshots

Scheduling the Backup

Finally, we have made it to the last step – automating our backup script.

root@yourInstance# crontab -e

11 00 * * * /opt/bin/volsnap > /var/log/volsnap.log 2>&1

The above cron entry will run the backup script every night at 11 minutes after midnight and record any output in the /var/log/volsnap.log file. Because the entry uses >, the log is overwritten on each run.
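
For readers new to cron syntax, the sketch below splits the entry into its five schedule fields and the command, using nothing more than the shell's read builtin. The variable names are chosen for this example only.

```shell
#!/bin/sh
entry='11 00 * * * /opt/bin/volsnap > /var/log/volsnap.log 2>&1'

# read assigns one word per variable; the last variable receives the rest of the line.
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

echo "minute=$minute hour=$hour"
echo "day-of-month=$day_of_month month=$month day-of-week=$day_of_week"
echo "command=$command"
```

An asterisk means "every", so this entry fires at minute 11 of hour 0 on every day of every month. If you would rather keep a history of runs, >> in place of > appends to the log.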

Summary

Congratulations! Great job – you have automated your backups. In the next part of our series you will learn how to use a snapshot to recover a failed instance completely and easily.

Resource Links

http://aws.amazon.com/ec2/faqs/
http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/
From Zero to Cloud: Setting up an EC2 Sandbox, Part 2

  • aditya menon

    Thank you for the great piece of information! This is the best way for me to get started with the AWS api.

    I am more interested in the next article, however, because restoring a snapshot into a working volume AND creating a running instance out of that volume is not as easy as it sounds! The interface and the useless, jargon infested documentation only add to the confusion. Have a great day!

  • Giovanni Castillo

    Hi, thank you for this work.

    I have some problems. I followed these steps, but when I run ./volumes I get an error message:
    unexpected error: org.codehaus.xfire.fault.XFireFault: Signature creation failed, along with other messages containing Java errors.

    I am using this environment:

    amazon linux ami-3bc-9997e

    Please help me find a solution to this problem.

    Thanks a lot.

  • http://www.wyzaerd.com Eric Stone

    Most welcome! Part Two is almost done and will be posted shortly.

  • Giovanni Castillo

    I found my problem.

    I use the us-west-1 region, and your script needs this added:

    $EC2_BIN/ec2-$* --region us-west-1

    And it works very well!!

    Thanks a lot for this tutorial.

  • http://www.wyzaerd.com Eric Stone

    Thanks for the update, Giovanni. That is an excellent addition to the script – to take the region into account. I think that would be a good environment variable to add, too!

  • Bevan

    Apparently the environment variable to use for regions other than us-east-1 is EC2_URL.
    For example in the case of us-west-2, EC2_URL should be set to:
    “https://ec2.us-west-2.amazonaws.com”

  • Matt

    EBS Snapshots and RDS Snapshots can be automatically created using Skeddly. http://www.skeddly.com

  • Charles DiComo

    Great! Thx

  • Jayaraj K

    The link below contains a simple bash script that can be configured in the system cron.

    http://jayaraj.sosblogs.com/The-first-blog-b1/AWS-Automated-EBS-Snapshot-Script-b1-p3.htm