This Sunday (March 31st, 2019) is World Backup Day! Bet you didn't know that was a thing. Completely independent of that, I thought I'd share (in the momentary absence of new art) some of my thoughts and strategies for backups. The ideas below are collected from things I have heard and read, combined with my own opinions and experiences, and (disclaimer) are not necessarily recommendations for a completely safe and secure system. No system is ever 100% guaranteed; that's unfortunately just an outcome of living in an entropic universe.
What are important features of a complete backup system?
1. Ability to recover from:
A) hardware failure (dead hard drive, fried motherboard, lightning strike)
B) OS/software failure (can't boot into Windows, other colossal mess)
C) malware (viruses)
D) ransomware (your files are locked until you pay me $₿$)
E) theft (I'm physically taking your computer out of your physical house)
F) natural disasters (fire, floods, earthquakes)
2. Frequent, consistent, verified backups
3. Fast recovery from common failures
4. Prevent access by others
5. Access historical versions (accidentally deleted files, version from last week)
I suppose now is as good a time as any to say, if you don't have any backup at all, please, please do get something. For a "set it and forget it" solution, I recommend Backblaze. It's not perfect or free (it's $6 a month) but it is probably one of the best single options for fulfilling the feature list above. You could instead simply plug in an external drive and copy files to that, but I feel that has three primary weaknesses: (1-CD) a bad ransomware/malware attack could lock up or wipe both the computer and the drive, (1-E) it’s easy for someone to steal both your computer and your drive, and (2) it’s easy to be inconsistent and infrequent with a backup schedule. Notice above I said "single option"; a great backup system should have at least two tiers to build redundancy and to fully meet the features listed. You can see how an external drive plus continuous cloud backup (a la Backblaze) starts to strengthen the system because the pros of each part help mitigate the cons of the other. That’s the short version of my thoughts. To dig into the gritty details, read on, brave reader.
What kind of backup system do I use? Before we get there, let's consider the kinds of data I have. First, I have my operating system and programs/software. Yes, I can rebuild from scratch, but that would be a royal pain and a waste of time. I have this data segregated on its own physical drive, the OS drive.
Then I have personal documents, which I don't care to lose, obviously. My photo library is large (30,000+ photos) and is very precious (family photos always are).
Animation projects produce lots of data, easily several gigabytes, while larger projects can produce dozens or hundreds of gigabytes. Some of that is temporary data which can be cleaned out after the project concludes, but while the project is ongoing, I do want to keep and protect it.
Finally, there's a lot of ripped media: music and movies (from discs I own, okay?). Fairly easy to recover, so I'm not going to go nuts with the backups, but I don't want to re-rip a ton of CDs and Blu-rays either.
Okay, now I can get to my backup system. My first line of defense is backing up my computer to a two-bay Synology NAS. It's like an external drive that's connected via ethernet instead of USB. And there are two hard drives, so I can fit twice as much stuff, right? Nope: the two hard drives are mirrored (that's called RAID 1), i.e. exact duplicates. This is so if a hard drive fails (it's happened at least twice), all the data is preserved and I just need to stick a new hard drive in to replace the failed one. Side note: RAID is not a backup solution on its own. It doesn't cover scenarios B through F, nor all of A. What do I back up to the NAS, and how? I use Macrium Reflect backup software to create a daily image of my OS drive. I have a second daily backup scheduled for all my important data. The NAS isn't huge and I am keeping a few historical backup copies so I can go back in time if needed; this means I have to be a bit selective about the data that's backed up, so some folders (e.g. ripped video) are excluded.
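To make the "few historical copies on a small NAS" trade-off concrete, here is a minimal sketch of a keep-the-newest-N retention rule. This is not how Macrium Reflect decides what to delete (it has its own retention settings); the function name `select_backups_to_prune` and the `keep` parameter are made up for illustration.

```python
from datetime import date


def select_backups_to_prune(backup_dates, keep=3):
    """Return the backup dates that fall outside the `keep` most recent.

    Hypothetical helper: it only decides which dates are too old to keep;
    it does not touch any files.
    """
    # Newest first, with duplicate dates collapsed.
    newest_first = sorted(set(backup_dates), reverse=True)
    # Everything after the first `keep` entries is a pruning candidate.
    return sorted(newest_first[keep:])


if __name__ == "__main__":
    history = [date(2019, 3, day) for day in range(25, 32)]
    for old in select_backups_to_prune(history, keep=3):
        print("would delete backup from", old.isoformat())
```

The smaller `keep` is, the less space the history eats, but the shorter the window for noticing a deleted or corrupted file.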
This NAS component does meet several of the features listed. For 1 (recovery), it hits A, B, most of C/D, some of E/F. Number 3 (fast recovery) is a big one. I once had my computer simply not boot. I spent the morning troubleshooting the issue. At noon I said forget it and restored an OS drive backup. I was up and running that afternoon. Unfortunately my OS backup was a few months old, which caused more problems than one would think. So now I back up my OS daily. Still, the method works. And for 4 (outside access), no one should be able to gain access to my NAS who wouldn't also be able to gain access to my computer. I do wonder whether and how I should strengthen that. I don't know quite enough about network security.
Okay, not bad, but we're still screwed in a few scenarios (e.g. fire, massive theft, really bad virus). Also, features 2 (frequent/consistent/verified) and 5 (historical backups) are good here but not great. It's not a perfect "set and forget". Right now it's behaving and verifies all the data, but I've had some hardware/software problems in the past which meant out-of-date or inconsistent backups. Some hands-on maintenance is necessary. And as I said, there's not a ton of space to maintain history. I should get a larger NAS (link to non-existent GoFundMe).
So I also use Backblaze, which backs up nearly everything. This covers my ripped media and adds redundancy for the rarer (#1) scenarios I just mentioned. For 2 (frequent/consistent/verified), it's good*, and for 5 (historical backups), you can roll back anywhere from today to 30 days prior. By default it doesn't back up OS/software, which would likely be hard to restore anyway with this method. If I had to recover from one of the very bad scenarios, a couple days re-installing wouldn't be my worst problem. For 3, recovery time would be slower since they'd have to ship physical drives to me with the amount of data I have. And 4, do I trust them to protect against outside access? It's a good question, and one that prevented me from using cloud backup systems for a long time. The data is encrypted at rest, and I have a strong password and two-factor authentication, but software vulnerabilities and leaks do occur. I suppose in this case I come down on the side that the benefits outweigh the risks. Which would be worse: losing access to my own data, or having someone else access it? I lean toward the former.
Are we done? Well… we could be. But there's that niggle of doubt. What if I have a huge fire and lose my computer and local backups, and then I realize that Backblaze only has two-thirds of my data? Apparently some people have had issues with Backblaze; I can't say for sure one way or the other, but the fact remains that a third-party service can't be trusted 100%*. So for ultimate backup security, I also periodically transfer all my important data to an encrypted hard drive and store it in a safety deposit box at the bank. And… of course you can't just use one hard drive, because the bank vault is empty while you're backing up to the drive. Gotta have two and swap them out. Yep. That should be pretty secure, and meet the criteria for all of 1 (data recovery) and 4 (outside access), and a reasonable rate for 3 (recovery speed). 5 (historical backups) is also good, depending on what I can fit on a single hard drive. The problem is 2 (frequent/consistent): I aim for once a month, but the reality is more like a few times a year. So it's a terrible primary method, but a great tertiary method.
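Since the weak spot of the bank-vault tier is remembering to do it, a tiny reminder check can help. The sketch below assumes I jot down the date of the last drive swap somewhere; `days_overdue` and the 31-day default are hypothetical stand-ins for my "once a month" target.

```python
from datetime import date


def days_overdue(last_backup, today, target_interval_days=31):
    """Return how many days past the target interval the offsite backup is.

    Returns 0 while the backup is still within the target interval.
    """
    age_in_days = (today - last_backup).days
    return max(0, age_in_days - target_interval_days)


if __name__ == "__main__":
    overdue = days_overdue(date(2019, 1, 1), date.today())
    if overdue:
        print(f"Offsite drive swap is {overdue} days overdue")
    else:
        print("Offsite backup is still fresh enough")
```

Run from a scheduled task, something like this at least turns "a few times a year" into a visible nag.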
And that's all! Did I miss anything? Do I have a critical vulnerability? Do you have a good system? Let me know in the comments here or on Twitter or Facebook. Happy World Backup Day!
Thanks for reading,
* Backblaze verification: Backblaze does back up continuously, which means there's typically not much time between making a change to a file and that file being backed up to the cloud. However, verification is something that requires a bit of faith. Apparently all files are checksummed before upload, but there's no great way to verify that Backblaze has all my data intact. I can check online which files are there, and yes, I can download a random selection of files from time to time, but that's still a random selection of < 0.001% of my files. On the other hand, Backblaze is a successful business employing backup experts, so I'm sure they're doing their due diligence (you had ONE job, right?). Still, at the end of the day, I personally don't like to have one company 100% responsible for my backups, so I don't.
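The "download a random selection and check it" idea can be sketched as below. This is a rough illustration and does not use any Backblaze API: it assumes I have already restored some files by hand into a local folder that mirrors the original folder layout. The names `sha256_of` and `spot_check` are my own inventions.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(original_dir, restored_dir, sample_size=20, seed=None):
    """Compare a random sample of originals against their restored copies.

    Returns the relative paths that are missing or differ in the restore.
    Assumption: restored_dir mirrors the layout of original_dir.
    """
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    files = sorted(p for p in original_dir.rglob("*") if p.is_file())
    sample = random.Random(seed).sample(files, min(sample_size, len(files)))
    problems = []
    for source in sample:
        relative = source.relative_to(original_dir)
        copy = restored_dir / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(source):
            problems.append(str(relative))
    return sorted(problems)
```

A clean result only vouches for the sampled files, and a file I edited after it was backed up will show up as a mismatch, so this is a sanity check rather than proof.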