Why I switched from Amazon S3 to Crashplan

Tags: Backup Software Crashplan AWS S3


I am in the middle of moving my online backup system from Amazon S3 to Crashplan. In this post I will lay out why this switch made sense for me and might make sense for you as well. I have about 500GB of data (mostly pictures and some videos). So far I have used Amazon S3 (more precisely, Amazon Glacier) to back up my keeper shoots (about 35GB of data). I store my pictures and my main Lightroom catalog on external hard drives in a RAID configuration for added redundancy. In the long run I want to back up all my pictures online, not only the core of my work. The following sections compare Amazon S3 and Crashplan across several aspects.

Pricing

At the time of this writing Amazon S3 costs about $0.10 per GB per month and Glacier about $0.01 per GB per month. While this does not seem too bad at first sight, if you crunch the numbers you will see that storing 500GB of data will cost you about $50 per month in S3 and about $5 per month in Glacier. $5 per month for Glacier does not look too bad, but you can do better, especially since, in addition to the storage fees, Glacier charges extra for restoring files and for transferring data in and out. For instance, I recently restored 2 files totaling less than 2GB, and the bill I received listed a restore of ~253GB. I still do not understand why Amazon had to restore ~253GB for my 2GB request, especially as all my current data in S3 and Glacier combined does not even remotely add up to that amount.

In the case of Crashplan, pricing is straightforward: it is a flat rate. You pay a monthly fee and can back up as much data as you want. In my case I signed up for a one-year contract and pay $3.50 per month. With Amazon I currently pay about $2 per month just for storing the core of my work (and, as mentioned above, you pay a premium for restoring Glacier data). Being able to back up all my data for as little as $3.50 per month, without additional fees for retrieving it, is a no-brainer. This is especially true if you keep in mind that my storage requirements will only increase over time.
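
To make the arithmetic above easy to replay with your own numbers, here is a minimal Python sketch. It uses only the prices quoted in this post and covers storage fees only; Glacier's restore and transfer fees are deliberately left out because they depend on how you retrieve data. The function names are my own, purely for illustration.

```python
# Prices quoted in this post (at the time of writing); adjust as needed.
S3_PER_GB = 0.10        # USD per GB per month
GLACIER_PER_GB = 0.01   # USD per GB per month
CRASHPLAN_FLAT = 3.50   # USD per month, flat rate on a one-year contract


def monthly_storage_cost(size_gb: float, price_per_gb: float) -> float:
    """Storage-only cost per month (retrieval and transfer fees ignored)."""
    return round(size_gb * price_per_gb, 2)


def break_even_gb(flat_fee: float, price_per_gb: float) -> float:
    """Data size at which a flat fee equals the per-GB storage cost."""
    return round(flat_fee / price_per_gb, 2)


if __name__ == "__main__":
    for size_gb in (35, 500):
        print(
            f"{size_gb}GB: S3 ${monthly_storage_cost(size_gb, S3_PER_GB):.2f}, "
            f"Glacier ${monthly_storage_cost(size_gb, GLACIER_PER_GB):.2f}, "
            f"Crashplan ${CRASHPLAN_FLAT:.2f}"
        )
    print("Flat rate matches Glacier storage at",
          break_even_gb(CRASHPLAN_FLAT, GLACIER_PER_GB), "GB")
```

On storage fees alone, Glacier stays cheaper than the flat rate until you reach a few hundred GB; the flat rate pulls ahead sooner once restore fees enter the picture.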

Convenience

Amazon S3 and Glacier mostly consist of a web interface and a set of APIs for implementing your own clients. Various clients exist (e.g., s3cmd, CyberDuck) which allow you to upload and download data from Amazon S3. All of them are manual, though, meaning you have to remember to upload your files and you have to remember which files you need to upload. Crashplan, on the other hand, comes with a pretty nice client which automatically checks which files have been changed or added and uploads them to the cloud. The client works right out of the box and has plenty of settings to customize the system. Additionally, the client not only allows you to back up your data to the Crashplan cloud but also to other computers or folders (for free). You can even access your data through the Crashplan mobile apps.
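
To give a feel for the bookkeeping a manual S3 workflow leaves to you, here is a small Python sketch that finds new or changed files by comparing two folder snapshots. This is not how the Crashplan client works internally (I do not know its implementation); it is a simplified illustration, and `snapshot` and `files_to_upload` are names I made up.

```python
import os
from pathlib import Path


def snapshot(root: str) -> dict:
    """Map each file's path (relative to root) to its [size, mtime] pair."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            stat = path.stat()
            state[str(path.relative_to(root))] = [stat.st_size, stat.st_mtime_ns]
    return state


def files_to_upload(previous: dict, current: dict) -> list:
    """Return paths that are new, or whose size or mtime changed."""
    return sorted(p for p, meta in current.items() if previous.get(p) != meta)
```

You would take a snapshot after each upload, store it somewhere (for example as JSON), and compare it with a fresh snapshot before the next upload. A real backup client does this continuously in the background.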

Encryption

Both services provide encryption support. Amazon S3 uses AES while Crashplan uses Blowfish for encrypting your data. I am not a crypto expert, but both algorithms seem to be reasonable choices. It is worth mentioning that Amazon S3 only supports client-side encryption in its Ruby and Java APIs, while Crashplan always uses client-side encryption. Another nice touch with Crashplan is the ability to reset the password through a security question.

Versioning

I do not necessarily require versioning, but it is nice to have. Both Amazon S3 and Crashplan support versioning. While Crashplan has versioning enabled by default, you need to enable it explicitly in Amazon S3.

Efficiency

When it comes to efficiency, Crashplan beats Amazon S3 like a drum. Crashplan was designed to do one thing only, and that is backing up your data. Crashplan uses a sophisticated delta-update mechanism along with automatic compression. This means that Crashplan will split your files into small chunks. Then each chunk of data is compressed and encrypted. This approach allows Crashplan to use the bandwidth of your internet connection more efficiently, because it only needs to upload those parts of a file which have changed. To give an example, I updated a few keywords in my Lightroom catalog on my laptop (the catalog is about 130MB) and the update to the cloud took only a few seconds, as the majority of the Lightroom catalog did not change. Furthermore, Crashplan supports de-duplication, which means that duplicated files are uploaded only once. This also means that if you move or rename files, Crashplan does not need to upload them all over again.
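
The idea behind chunked delta updates and de-duplication can be sketched in a few lines of Python. This is only an illustration of the general technique under my own assumptions, not Crashplan's actual implementation: the chunk size is tiny on purpose, the "cloud" is a plain dictionary, chunks are identified by their SHA-256 hash, and encryption is left out.

```python
import hashlib
import zlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data: bytes, store: dict) -> tuple:
    """Upload only chunks the store has not seen yet.

    Returns (manifest, uploaded): the ordered list of chunk hashes needed
    to rebuild the data, and the number of chunks actually uploaded.
    """
    manifest = []
    uploaded = 0
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(chunk)  # compress before "upload"
            uploaded += 1
        manifest.append(digest)
    return manifest, uploaded


def restore(manifest: list, store: dict) -> bytes:
    """Rebuild the original data from its manifest."""
    return b"".join(zlib.decompress(store[d]) for d in manifest)


if __name__ == "__main__":
    cloud = {}
    _, first = backup(b"AAAABBBBCCCCDDDD", cloud)   # initial backup
    _, second = backup(b"AAAABBBBXXXXDDDD", cloud)  # one chunk edited
    print(first, second)
```

Because chunks are stored by content rather than by file name, backing up a renamed or duplicated file uploads nothing new. Fixed-size chunks are the simplest choice; they cope poorly with data inserted in the middle of a file, which is one reason real tools tend to be more sophisticated.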

In the case of Amazon S3 you always have to upload the whole file, regardless of whether the whole file changed or just a few bits. Furthermore, I am not sure if Amazon S3 supports compression for the transport at all. Obviously you can implement your own client and add some of those features (this is what Dropbox does, for instance), but this is not something I am interested in doing myself.

Summary

The main drawback of S3 is that it was designed as a backend solution for various services (e.g., web hosting, backup, etc.). This makes it unnecessarily bloated and expensive for personal backups. The best way of thinking of Crashplan is as a cloud-based time machine: it periodically and automatically backs up all the selected files to the cloud. Your requirements might be different, but considering the price and efficiency advantages of Crashplan, it is the obvious choice for me. I also looked into Backblaze, but for various reasons I decided against it (I will publish a post soon explaining why Backblaze did not work for me).