Syncing Backup FROM Amazon S3

There’s plenty of information available on the interwebs for backing up a local directory to Amazon S3. For home and light file sharing use, I’d highly recommend Jungle Disk. I use it at my corporate office, and I have to say that it has been better and faster than the dedicated file sharing server we phased out. You can use Jungle Disk without even really getting your hands dirty. Just sign up for an Amazon S3 account, then sign up at Jungle Disk, enter your S3 keys, and you’re off.

If you’re doing some more industrial things, you may need to roll your own S3 backup mechanism. Do a simple goog’ on it, and you’ll find a lot of helpful guides that will show you how to do that too. But things are not so clear when you already have an S3 bucket and you need to keep a local machine synced up to the bucket. That’s exactly what I needed to do, and I couldn’t find jack on the web. At least not easily.

Essentially, I need to back up S3. Files are continually uploaded to the bucket in question, so I need something that will recursively scan the bucket for changes and download any new or changed files to my backup Linux file server.

I finally found the tool for the job, a script called S3Sync. It seems to be well done and popular, but somehow I missed the boat and had trouble finding it, because I’m not on the Ruby train. Once I found the right tool for the job, putting it all together was relatively painless.

You can find the documentation on the S3Sync website, but there are several scripts in the library. The one I needed was s3sync.rb. It gives you the option to sync to or from an S3 bucket. Perfecto. Then I wrote a bash shell script that I could execute via cron, and I had it licked. Now my script runs via cron once an hour and downloads any new or changed content.
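For the hourly schedule, a crontab entry along these lines should do it. This is only a sketch: the script path and log file name are placeholders I made up for illustration, so adjust them to wherever you saved your script.

```text
# Run the sync script at the top of every hour.
# /home/user/bin/s3-backup.sh and /home/user/s3-backup.log are hypothetical paths.
0 * * * * /home/user/bin/s3-backup.sh >> /home/user/s3-backup.log 2>&1
```

Redirecting both stdout and stderr to a file means the --debug and --progress output lands somewhere you can read later, instead of being mailed or discarded by cron.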

Here’s an example script to give you the general idea:

 

#!/bin/bash

# Set up environment variables.
export AWS_ACCESS_KEY_ID="xx-your-key-xx"
export AWS_SECRET_ACCESS_KEY="xx-your-secret-key-xx"

# You will need to change this to the directory
# where you placed the s3sync files.
# Bail out if the directory is missing, so we never run from the wrong place.
cd /home/user/s3sync/ || exit 1

# Execute s3sync.rb. Notice that the first location is S3,
# the destination is a local folder.
# Using no-md5 speeds up search by using modified date.
# Using make-dirs creates local folder structure like S3.
# I use --debug to see what's happening.
# The trailing backslash continues the command onto the next line.
./s3sync.rb -r --make-dirs --no-md5 --progress --debug \
    mybucketname:path/to/sync /home/user/local/sync/folder

 

Of course, this could be made much better. Specifically, it needs some kind of error checking and improved output, so I know whether the script ran successfully or hit a problem. But it’s a start!
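As a starting point for that error checking, here is a minimal sketch of two helper functions that log a timestamped success or failure line for whatever command they wrap. The names LOG_FILE, log, and run_logged are my own inventions, not part of s3sync. It also assumes that s3sync.rb exits with a non-zero status when something goes wrong, which I haven't verified.

```shell
#!/bin/bash

# Hypothetical log location; override by setting LOG_FILE before calling.
LOG_FILE="${LOG_FILE:-/home/user/s3sync-backup.log}"

# Append a timestamped message to the log file.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$LOG_FILE"
}

# Run a command, send its output to the log, record whether it
# succeeded, and return the command's own exit status.
run_logged() {
    local status
    "$@" >> "$LOG_FILE" 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        log "OK: $1"
    else
        log "FAILED (exit $status): $1"
    fi
    return "$status"
}
```

In the script above, the last command would then become something like `run_logged ./s3sync.rb -r --make-dirs --no-md5 mybucketname:path/to/sync /home/user/local/sync/folder`, and a quick `grep FAILED` on the log file tells you whether any hourly run had trouble.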
