Backups, the everlasting topic. Years after several hard drive crashes, after manual backups and semi-automatic backups, I’m still thinking about the right solution.
Currently I’m using a USB/SATA adapter to connect various hard drives without enclosures to my computer to run semi-automatic Time Machine backups. With OS X 10.7 (Lion), Time Machine adds local snapshots while I’m not connected to my backup drive and writes those snapshots to the backup drive once I connect it again. That is all quite nice but not an optimal solution. Over the years I’ve collected quite a few drives, and with every new drive I get more annoyed by the pile they form on my desk. This is still better than no backup, but it requires me to connect my drives to my computer regularly. Sometimes I do – sometimes I don’t.
My setup could be vastly improved by using a NAS or Apple’s Time Capsule. This would allow me to back up over the air with no wires and no drives lying around on my desk.
I don’t like Time Capsule because it is just one drive in a plastic box with no direct access to the hard disk or the files. You can attach a recovery disk, but that only makes it a little better. Only one drive also means no data redundancy. If the disk crashes, the backup is gone. As far as I know there are also no checks for data integrity, so if some bits flip on the disk you will never know. If that’s not correct, please leave a comment.
If I buy something like a network-attached storage device, I also want to use it as a fileserver and for other things, and Time Capsule just doesn’t offer this kind of flexibility.
The other option I considered was buying a NAS. I really don’t like the consumer plastic boxes like the Drobo, Qnap and Synology products. On the one hand they offer lots of nice features like Time Machine compatibility, web interfaces, fileserver and file sharing features. On the other hand they are quite expensive, most of them are ugly, they use filesystems like Ext3/4 or HFS+, and I have also heard real horror stories of complete data loss, especially with Drobo. To be fair, these stories are one or two years old.
Then I thought about buying an »Acer Aspire easyStore H341« or an »HP ProLiant – MicroServer« and building my own custom NAS. These are basically small Atom-powered PCs with four HDD slots and no display connector. They usually come with Microsoft Windows Server and require some time investment to get going. I thought about using FreeNAS as I like FreeBSD and it uses ZFS as its filesystem. It comes with a nice interface and offers everything I would need: a modern filesystem with data integrity checks, fileserver and file sharing capabilities, and Time Machine compatibility.
But again, with hard drives included it wouldn’t be cheap. It would still be no real offsite backup, and it could still suffer from hardware failure or theft. Besides the costs for the hardware and the time spent setting everything up, there are also costs for power, as this machine would have to run 24/7/365.
One final disadvantage of all those »local« backup solutions is that I have to be home to run my backups, but I also want to back up when I’m at work or somewhere entirely different.
This is why I think that those NAS solutions are not right for me.
Online Backup Services
So what are the alternatives? After coming to the conclusion that I actually don’t want a NAS device I thought about online backup services.
Of course, the first question that comes to mind is privacy and security in general. I don’t want to hand over my precious data to some company without strong and secure encryption and by that I mean that nobody but me should ever be able to get my data.
The second thing to consider is storage. How much do I have to pay for how many gigabytes? How redundant is my data stored and is it checked for data integrity?
Luckily a quick Google search revealed that there are a couple of interesting options available:
All of those offer encrypted backups. The data is encrypted locally before it is sent to the storage servers on the internet. All of those services provide detailed information about their architecture and features, and all of them seem to have happy customers.
Personally, after researching for two hours, I think I will try Crashplan, and here is why:
Crashplan is cheap. It’s not the cheapest but it’s cheap enough. You get to back up one computer with unlimited online storage for $49/year, and they have offerings for multiple computers too.
Arq and Jungledisk store the data on Amazon S3 or Rackspace Cloud, which are a little more expensive than the other services that run their own data centers. The client software of Arq costs another $29.
Spideroak is charging $10 per month per 100 GB.
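To make the price difference concrete, here is a rough sketch of the yearly cost for a given backup size. It only uses the two prices quoted above; the assumption that Spideroak bills in whole 100 GB blocks is mine and not taken from their price list, and the function names are made up for illustration.

```python
import math

# Prices as quoted in this post (USD).
CRASHPLAN_PER_YEAR = 49.0          # one computer, unlimited storage
SPIDEROAK_PER_100GB_MONTH = 10.0   # per month, per 100 GB


def spideroak_yearly_cost(gigabytes: float) -> float:
    """Yearly cost, ASSUMING storage is billed in whole 100 GB blocks."""
    blocks = max(1, math.ceil(gigabytes / 100))
    return blocks * SPIDEROAK_PER_100GB_MONTH * 12


def crashplan_yearly_cost(gigabytes: float) -> float:
    """Flat rate: the backup size does not change the price."""
    return CRASHPLAN_PER_YEAR


if __name__ == "__main__":
    for size_gb in (50, 250, 1000):
        print(size_gb, "GB:",
              "Spideroak", spideroak_yearly_cost(size_gb),
              "Crashplan", crashplan_yearly_cost(size_gb))
```

Under these assumptions the flat rate wins as soon as you store anything at all, and the gap grows with every additional 100 GB.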
All the named services offer good encryption, and they seem to take similar approaches as well. The important thing is that they offer the option to use a self-generated private key, which is crucial for having completely private backups. Even if the police took away all the machines, they wouldn’t be able to get to the actual data. Spideroak and Crashplan explain the encryption process in great detail on their websites.
Consistency / Data Integrity
Arq and Jungledisk can use S3, which is considered to be quite safe from data corruption, although there are also stories of missing data floating around. Nobody is giving you a full guarantee. Spideroak is claiming a 0.0000% error margin. Crashplan claims daily data verification and automatic repair should the data ever get corrupted.
The named services seem to have a good reputation for not losing data.
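The basic idea behind such integrity checks can be sketched with plain checksums: record a hash for every file at backup time and compare again later. This is only an illustration of the principle, not how Crashplan or any other service mentioned here actually implements verification; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_corrupted(root, manifest):
    """Return the files that are missing or whose checksum has changed."""
    root = Path(root)
    damaged = []
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(relative)
    return damaged
```

A checksum can only tell you that something changed. Repairing the file additionally requires a second, intact copy or some form of parity data.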
On the Arq website there is also a section about metadata and how well the different services manage to preserve it. The systems are tested with a tool called Backup Bouncer. JungleDisk and Arq seem to be the only ones passing all tests, Crashplan fails one test, and Dropbox and Backblaze fail 19 of 20! The section might be outdated though, and since Backup Bouncer is a free tool you can verify it yourself.
Software / Integration
Each of these services comes with some kind of software. Arq and Backblaze have native OS X clients while the others have multi-platform tools that do not feel like native apps. This is the only real drawback I found with Crashplan.
Interestingly enough, you also get de-duplicated, compressed and encrypted backups on all these services. With Crashplan you can even choose not to use de-duplication, to reduce the potential CPU load on your computer while it checks for duplicate data. The backups are of course differential, which means that only changed data is transmitted, not entire snapshots (except for the first one). Crashplan allows unlimited file sizes while other services have file size limits of 4 GB! It can back up locked files, and if you decide to back up your OS X Unix directories, Crashplan will happily do so.
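For readers wondering what de-duplication means in practice, here is a minimal sketch of the general idea: split data into blocks, hash each block, and store every distinct block only once. This is a simplified illustration with fixed-size blocks, not Crashplan's actual algorithm; the tiny block size and the function names are assumptions chosen for the demo.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, so the demo data stays readable


def dedup_store(data: bytes, store: dict, block_size: int = BLOCK_SIZE) -> list:
    """Add data to the block store and return the list of block hashes
    (the "recipe") needed to rebuild it later."""
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        key = hashlib.sha256(block).hexdigest()
        store.setdefault(key, block)  # only new blocks take up space
        recipe.append(key)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(store[key] for key in recipe)


if __name__ == "__main__":
    store = {}
    first = dedup_store(b"AAAABBBBAAAA", store)   # the "AAAA" block repeats
    second = dedup_store(b"BBBBCCCC", store)      # "BBBB" is already stored
    print(len(store), "unique blocks stored")
    print(restore(first, store), restore(second, store))
```

The hashing is where the CPU load comes from, which is presumably why it can be worth switching de-duplication off on a slow machine.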
Overall, Crashplan seems to offer fine-grained control over various aspects of backups – which I like. Their support seems to be alright too. I asked how de-duplication actually works and got a reply within four hours on a Sunday, without having an account or anything.
As I said, I will try Crashplan, and in addition I will keep backing up irregularly to my external Time Machine disk – just to be sure.
I know that there are a lot of other tools out there, and I’m still interested in other suggestions, although I’ve probably checked them out already.
You might also want to check out Wikipedia’s »Comparison of online backup services«.
It’s worth checking out the FAQs and detailed feature lists of all those services, as they usually answer most of the questions you come up with.
Lastly, you can google for “Service A vs Service B” and you will find a lot more articles like this one on the web to help you make up your own mind.
Somebody on Twitter just pointed me to this post in the Crashplan support forum, where a native Mac menu bar app in beta status is available.
Thomas posted a link in the comments to a comparison matrix that he made.
Another interesting hint from the comments: Dolly Drive
Apparently they offer Time Machine backups in the “cloud”. Unfortunately their FAQ is a little short on details, especially on security and data integrity, so I guess I will send them an email and put the info into another post.
Several (European) readers pointed out that the upload to the Crashplan data center is really slow, maxing out at 1.3 Mbps. This is definitely a major drawback for European customers and something where Arq or other European providers could shine.
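To put that number in perspective, here is a back-of-the-envelope sketch of how long an initial upload would take at a given speed. The 100 GB backup size is a hypothetical example, and the calculation ignores protocol overhead, compression and de-duplication.

```python
def upload_days(size_gb: float, mbps: float) -> float:
    """Days needed to upload size_gb gigabytes at mbps megabits per second
    (decimal units: 1 GB = 10**9 bytes, 1 Mbps = 10**6 bits per second)."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (mbps * 1e6)
    return seconds / 86400


if __name__ == "__main__":
    # Hypothetical 100 GB initial backup at the reported 1.3 Mbps.
    print(round(upload_days(100, 1.3), 1), "days")
```

At 1.3 Mbps a 100 GB initial backup would take roughly a week of uninterrupted uploading.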