Evaluating Online Backup Services

Backups, the everlasting topic. Years after several hard drive crashes, and after both manual and semi-automatic backups, I’m still thinking about the right solution.

Current Setup

Pile of hard drives

Currently I’m using a USB/SATA adapter to connect various bare hard drives to my computer and run semi-automatic Time Machine backups. With OS X 10.7 (Lion), Time Machine adds local snapshots while I’m not connected to my backup drive and writes those snapshots to the backup drive once I connect it again. That is all quite nice but not an optimal solution. Over the years I’ve collected quite a few drives, and with every new drive I get more annoyed by the pile they form on my desk. This is still better than no backup, but it requires me to connect my drives to my computer regularly. Sometimes I do, sometimes I don’t.

NAS Options

My setup could be vastly improved by using a NAS or Apple’s Time Capsule. This would allow me to back up over the air, with no wires and no drives lying around on my desk.

I don’t like Time Capsule because it is just one drive in a plastic box with no direct access to the hard disk or the files. You can attach a recovery disk, but that only makes it a little better. Only one drive also means no data redundancy: if the disk crashes, the backup is gone. As far as I know there are also no checks for data integrity, so if some bits flip on the disk you will never know. If that’s not correct, please leave a comment.

If I buy something like network attached storage, I also want to use it as a fileserver and for other things, and Time Capsule just doesn’t offer this kind of flexibility.

The other option I considered was buying a NAS. I really don’t like the consumer plastic boxes like the Drobo, QNAP and Synology products. On the one hand they offer lots of nice features like Time Machine compatibility, web interfaces, fileserver and file sharing features. On the other hand they are quite expensive, most of them are ugly, they use filesystems like Ext3/4 or HFS+, and I have also heard real horror stories of complete data loss, especially with Drobo. To be fair, these stories are one or two years old.

Then I thought about buying an »Acer Aspire easyStore H341« or an »HP ProLiant – MicroServer« and building my own custom NAS. These are basically small Atom-powered PCs with four HDD slots and no display connector. They usually come with Microsoft Windows Server and require some time investment to get going. I thought about using FreeNAS, as I like FreeBSD and it uses ZFS as its filesystem. It comes with a nice interface and offers everything I would need: a modern filesystem with data integrity checks, fileserver and file sharing capabilities, and Time Machine compatibility.

But again, with hard drives included it wouldn’t be cheap. It would still be no real offsite backup, and it could still suffer from hardware failure or theft. Besides the cost of the hardware and the time spent setting everything up, there is also the cost of power, as this machine would have to run 24/7/365.

One final disadvantage of all those »local« backup solutions is that I have to be home to run my backups, but I also want to back up when I’m at work or somewhere entirely different.

This is why I think that those NAS solutions are not right for me.

Online Backup Services

So what are the alternatives? After coming to the conclusion that I actually don’t want a NAS device I thought about online backup services.

Of course, the first question that comes to mind is privacy and security in general. I don’t want to hand over my precious data to some company without strong and secure encryption, and by that I mean that nobody but me should ever be able to read my data.

The second thing to consider is storage. How much do I have to pay for how many gigabytes? How redundantly is my data stored, and is it checked for data integrity?

Luckily a quick google search revealed that there are a couple of interesting options available:

All of those offer encrypted backups. The data is encrypted locally before it is sent to the storage servers on the internet. All of those services publish detailed information about their architecture and features, and all of them seem to have happy customers.

Personally, after researching for two hours, I think I will try Crashplan, and here is why:

Pricing

Crashplan is cheap. It’s not the cheapest, but it’s cheap enough. You get to back up one computer with unlimited online storage for $49 per year, and they have offerings for multiple computers too.

Arq and Jungledisk store the data on Amazon S3 or Rackspace Cloud, which are a little more expensive than the other services with their own data centers. The Arq client software costs another $29.

Spideroak charges $10 per month per 100 GB.
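To put the two flat prices above side by side, here is a rough sketch of the yearly cost for a given amount of data. The prices are the ones quoted in this post; the billing rule for Spideroak (every started 100 GB block is charged, with a minimum of one block) is my assumption, and the function names are made up for illustration.

```python
import math

CRASHPLAN_USD_PER_YEAR = 49.0            # unlimited storage, one computer
SPIDEROAK_USD_PER_MONTH_PER_100GB = 10.0


def crashplan_yearly_cost(gigabytes: float) -> float:
    """Flat price, independent of the amount of data."""
    return CRASHPLAN_USD_PER_YEAR


def spideroak_yearly_cost(gigabytes: float) -> float:
    """Assumption: every started 100 GB block is billed, minimum one block."""
    blocks = max(1, math.ceil(gigabytes / 100))
    return blocks * SPIDEROAK_USD_PER_MONTH_PER_100GB * 12


if __name__ == "__main__":
    for size in (50, 250, 1000):
        print(size, "GB:", crashplan_yearly_cost(size), "vs", spideroak_yearly_cost(size))
```

Arq and Jungledisk are left out on purpose because their cost depends on the storage provider’s per-gigabyte rates, which are not quoted here.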

Security

All the named services offer good encryption, and they seem to take similar approaches as well. The important thing is that they offer the option to use a self-generated private key, which is crucial for having completely private backups. Even if the police took away all the machines, they wouldn’t be able to get to the actual data. Spideroak and Crashplan explain the encryption process in great detail on their websites.

Consistency / Data Integrity

Arq and Jungledisk can use S3, which is considered quite safe from data corruption, although there are also stories of missing data floating around. Nobody gives you a full guarantee. Spideroak claims a 0.0000% error margin. Crashplan claims daily data verification and automatic repair should the data ever get corrupted.

The named services seem to have a good reputation of not losing data.

On the Arq website there is also a section about metadata and how the different services manage to keep track of it. The systems are tested with a software called Backup Bouncer. JungleDisk and Arq seem to be the only ones passing all tests, Crashplan fails in one test, Dropbox and Backblaze fail in 19 of 20! The section might be outdated though and since Backup Bouncer is a free tool you can verify it yourself.

Software / Integration

Each of these services comes with some kind of software. Arq and Backblaze have native OS X clients, while the others have multi-platform tools that do not feel like native apps. This is the only real drawback I found with Crashplan.

Extras

Interestingly enough, you also get de-duplicated, compressed and encrypted backups on all these services. With Crashplan you can even choose not to use de-duplication, to reduce the potential CPU load on your computer while it checks for duplicate data. The backups are of course differential, which means that only changed data is transmitted, not entire snapshots (except the first). Crashplan allows unlimited file sizes, while other services have file size limits of 4 GB! It can back up locked files, and if you decide to back up your OS X Unix directories, Crashplan will happily do so.
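For readers wondering what de-duplication means in practice, here is a minimal sketch of the general idea, not of how Crashplan or any other service actually implements it. It assumes fixed-size blocks and SHA-256 fingerprints; real clients may use variable block sizes, different hashes, and encrypt the blocks before upload. The hashing step is also where the CPU load mentioned above comes from.

```python
import hashlib


def deduplicate(data: bytes, store: dict, block_size: int = 4096) -> list:
    """Split data into blocks and keep only blocks the store has not seen.

    Returns the list of block fingerprints needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # an already known block is not stored again
        recipe.append(fingerprint)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


if __name__ == "__main__":
    store = {}
    recipe = deduplicate(b"AAAABBBBAAAA", store, block_size=4)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
```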

Overall, Crashplan seems to offer fine-grained control over various aspects of backups, which I like. Their support seems to be alright too. I asked how de-duplication actually works and got a reply within four hours, on a Sunday, without having an account or anything.

As I said, I will try Crashplan, and in addition I will keep backing up irregularly to my external Time Machine disk, just to be sure.

I know that there are a lot of other tools out there, and I’m still interested in other suggestions, although I’ve probably checked them out already.

You might also want to check out Wikipedia’s »Comparison of online backup services«.

It’s worth checking out the FAQs and detailed features of all those services, as they usually answer most of the questions you come up with.

Lastly, you can google for “Service A vs Service B” and you will find many more articles like this one to help you make up your own mind.

UPDATE 1

Somebody on Twitter just pointed me to this post in the Crashplan support forum, where a native Mac menu bar app in beta status is available.

UPDATE 2

Thomas posted a link in the comments to a comparison matrix that he made.

UPDATE 3

Another interesting hint from the comments: Dolly Drive
Apparently they offer Time Machine backups in the “cloud”. Unfortunately their FAQ is a little short on details, especially on security and data integrity, so I guess I will write them a mail and put the info into another post.

UPDATE 4

Several (European) readers pointed out that the upload to the Crashplan data center is really slow, maxing out at 1.3 Mbps. This is definitely a major drawback for European customers and something where Arq or other European providers could shine.
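To get a feeling for what 1.3 Mbps means for an initial backup, here is a small back-of-the-envelope calculation. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second), a perfectly sustained rate, and no compression or de-duplication savings; the 100 GB figure is just an example size.

```python
def upload_days(gigabytes: float, megabits_per_second: float) -> float:
    """Days needed to upload the given amount of data at a constant rate."""
    bits = gigabytes * 8 * 1000**3
    seconds = bits / (megabits_per_second * 1000**2)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    # 100 GB at the reported maximum of 1.3 Mbps: roughly 7.1 days
    print(round(upload_days(100, 1.3), 1))
```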

Camel Case in MySQL Table Names is a Bad Idea

Today at work I encountered all kinds of “naming schemes” for MySQL tables and columns. Camel case table names in particular can cause serious pain because:

  1. Table names directly correspond to filenames on your hard drive
  2. There are tons of different filesystems and some of them are case insensitive. So if you develop on OS X (case insensitive) but deploy on Linux (case sensitive) things can get funny quickly
  3. There are several different SQL servers which handle camel case / case sensitivity differently. When you switch to PostgreSQL or Oracle you are likely to encounter problems
  4. Read this document to learn about possible implications in MySQL itself

If you use lowercase table names, separated by underscores, you can skip all those potential problems. Luckily renaming tables is not as expensive as altering them.
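As a rough sketch of how such a cleanup could be scripted, the snippet below converts camel case names to lowercase with underscores and prints the matching `RENAME TABLE` statements. The helper names and the example table names are made up for illustration, and the output should be reviewed before running it against a real database.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camel case identifier to lowercase separated by underscores."""
    step = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", step).lower()


def rename_statement(table: str) -> str:
    """Build a MySQL RENAME TABLE statement for one table."""
    return f"RENAME TABLE `{table}` TO `{to_snake_case(table)}`;"


if __name__ == "__main__":
    # Hypothetical table names, not taken from a real schema
    for table in ("UserAccount", "userAccountHistory", "HTTPLog"):
        print(rename_statement(table))
```

Application code and foreign key references that mention the old names still have to be updated separately.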

Cannot delete File / unmount disk because it is in use …

On OS X there are these moments when Finder tells you that the trash cannot be emptied or that a disk cannot be unmounted because some files in or on them are still being used. When emptying the trash, Finder even tells you about the files in question, but not about the app that is accessing them.

There are two ways to find out:

1. opensnoop

With opensnoop you can display what files are currently being accessed (as in live) including the process id and the name of the application. Either you can display all the files or just the one you are interested in.

For example I have an image on my desktop. I can attach to that file and when I open it via double click in Finder I get the following output:

sudo opensnoop -f /Users/hukl/Desktop/IMG_0434.JPG 
Password:
  UID    PID COMM          FD PATH                 
  501  10244 Finder         9 /Users/hukl/Desktop/IMG_0434.JPG 
  501     32 mds           15 /Users/hukl/Desktop/IMG_0434.JPG 
  501  10278 Preview        6 /Users/hukl/Desktop/IMG_0434.JPG 
  501  10278 Preview        6 /Users/hukl/Desktop/IMG_0434.JPG 
  501  10278 Preview        7 /Users/hukl/Desktop/IMG_0434.JPG 
  501  10278 Preview        8 /Users/hukl/Desktop/IMG_0434.JPG 
  501     32 mds           15 /Users/hukl/Desktop/IMG_0434.JPG 
  501  10278 Preview        6 /Users/hukl/Desktop/IMG_0434.JPG

This only helps if the file is being actively accessed, though. More often an application merely holds a reference to the file, preventing Finder from deleting it. In this case opensnoop is no good, but luckily there is another way:

2. lsof

lsof basically lists information about all files opened by applications. Therefore if I want to know why I can’t delete this image I just opened I can run:

lsof | grep /Users/hukl/Desktop/IMG_0434.JPG 
Preview   10278 hukl    8r     REG               14,5    1584476 483868 /Users/hukl/Desktop/IMG_0434.JPG

Now that I know that Preview.app is still accessing the file, I can kill the process and delete the file.

Many times it’s Finder itself that is still holding references to the files, even if all the applications are closed and there is no apparent reason for not deleting the file. In this case option-click on the Finder icon in the dock and relaunch Finder (you can also kill it in Terminal, of course). Afterwards the files should be deletable and the disks should be unmountable.
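If you need this more often, the lsof lookup can be wrapped in a few lines of Python. This is only a sketch: it assumes the default lsof column layout seen above (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME), skips rows where a column is missing, and the function names are my own. Passing the path directly to lsof avoids the grep.

```python
import subprocess


def parse_lsof(output: str) -> list:
    """Extract command, pid, user, fd and path from default lsof output."""
    holders = []
    for line in output.splitlines():
        fields = line.split(None, 8)  # NAME is last and may contain spaces
        if len(fields) < 9 or not fields[1].isdigit():
            continue  # header line or a row with a missing column
        command, pid, user, fd = fields[:4]
        holders.append({"command": command, "pid": int(pid),
                        "user": user, "fd": fd, "path": fields[8]})
    return holders


def who_holds(path: str) -> list:
    """Ask lsof which processes currently have the given file open."""
    result = subprocess.run(["lsof", path], capture_output=True, text=True)
    return parse_lsof(result.stdout)


if __name__ == "__main__":
    for holder in who_holds("/Users/hukl/Desktop/IMG_0434.JPG"):
        print(holder["command"], holder["pid"])
```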

Using the Intel 510 Series SSD in a 2011 MacBook Pro at full speed and with TRIM

I just got a new MacBook Pro from my current employer and since I got it without an SSD I bought the Intel 510 250GB and installed it. Everything worked smoothly after the first boot. However, as @denis2342 pointed out, there are a few extra steps to make it run at full speed and performance.

First of all, although this MacBook Pro has a SATA-III interface with up to 6 Gigabit, the System Profiler only showed a »Negotiated Link Speed« of 3 Gigabit. To make it negotiate 6 Gigabit, an SMC reset has to be performed. Basically you have to press the (left side) Shift-Control-Option keys and the power button at the same time, and after that you boot normally.

After that System Profiler showed a »Negotiated Link Speed« of 6 Gigabit.

Then, although OS X enables TRIM support for Apple’s own SSDs on the latest MacBook Pros, it doesn’t enable it for 3rd party SSDs. There were workarounds which involved patching a Core Framework library, but that was kind of messy and not something you’d recommend to a beginner. Luckily there is now a tool called »TRIM Enabler« which lets you back up and restore the Core Framework library and also patch it with the click of a button. This worked as expected, and after another reboot the System Profiler showed that TRIM was enabled for my 3rd party SSD.

After I ran an update, the TRIM support was disabled again and I had to run TRIM Enabler once more.

I really hope that Apple will enable TRIM for all SSDs with Lion to make this step unnecessary.
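Since both settings can silently change, it may be handy to check them from a script instead of clicking through System Profiler. The sketch below assumes that `system_profiler SPSerialATADataType` prints lines labelled »Negotiated Link Speed« and »TRIM Support«, as the graphical System Profiler does; it only reports the first occurrence of each label, and the function names are made up.

```python
import re
import subprocess


def parse_sata_report(report: str) -> dict:
    """Pick the first link speed and TRIM status out of a SATA report."""
    result = {}
    for key in ("Negotiated Link Speed", "TRIM Support"):
        match = re.search(rf"^\s*{re.escape(key)}:\s*(.+?)\s*$", report, re.MULTILINE)
        result[key] = match.group(1) if match else None
    return result


def read_sata_report() -> str:
    """Run system_profiler for the Serial-ATA section (OS X only)."""
    completed = subprocess.run(["system_profiler", "SPSerialATADataType"],
                               capture_output=True, text=True, check=True)
    return completed.stdout


if __name__ == "__main__":
    print(parse_sata_report(read_sata_report()))
```

On a machine with more than one SATA device the first match may belong to a different drive, so treat the result as a quick check rather than a full report.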

That is about it. This SSD is really blazing fast. If you’re interested, there is a nice in-depth review at anandtech.com

While SSDs from other vendors are still faster, the Intel SSDs offer higher reliability.

Mac OS X Keyboard Shortcut for locking the screen

For years I’ve been searching for a keyboard shortcut that would instantly lock my screen / desktop. I’ve seen this on Linux window managers and something like this probably exists on every major operating system.

My workaround so far was to open Keychain Access, enable the menu bar icon in its preferences and then click the menu bar item -> lock screen each time.

I tried AppleScript, I tried Automator, I even convinced a friend that it would be necessary to write a small app for this. Recently, however, somebody said that there is such a shortcut and that it has existed since Mac OS 8.

This holy shortcut is:

⌃ + ⇧ + ⏏ (Control + Shift + Eject)

On Macs without an Eject key you can use the power button in the shortcut instead.

You have to enable »Require Password immediately after sleep or screen saver begins« in the System Preferences -> Security to make this truly lock your screen.

There are probably many other shortcuts in Mac OS that I don’t know. If you have any hints on where to get a complete list, please let me know!

High Sierra Update

With macOS 10.13 there is now a dedicated screen lock feature which can be invoked with this new shortcut:

⌃ + ⌘ + Q (Control + Command + Q)

Other Updates

  • In Mac OS X terminology this shortcut puts the display immediately to sleep.
  • A website listing this and other shortcuts can be found here