A comprehensive backup strategy

2014-01-04

Today I want to stray from the usual path of papercrafting and talk about something entirely different: backup strategies. Recent reports of the Cryptolocker malware prompted me to reconsider my own strategy, and I think I came up with quite a nifty scheme for backups. Just as I did, you may think that your data is safe because, well, of course, you back up. But do you really cover all the cases in which your data is at risk?

A good backup strategy is based on risk assessment and management. Now, what risks does your data face, and what can (and should!) you do about them? Well, very coarsely, the risks are:

  • Hardware failure
  • Viruses
  • Hacking
  • Inconsistencies in the file system
  • Disasters, e.g. lightning, fire and water
  • Theft

It’s helpful to see how the risks affect different types of data. Your data generally falls into the following categories:

  • ‘Working’ data, i.e. things you are currently working on, such as tax reports, a PowerPoint presentation for work, or a letter to a family member
  • The operating system
  • Application data, i.e. data stored implicitly by various applications, such as the list of previously edited files, customized settings, or thumbnail images from photo management software
  • Personal media: videos and images you have taken

There is a wide range of backup strategies, each with its own advantages, disadvantages and limitations. In order to see how they fit the risks and data types, we can use this matrix:

| Risk | Working data | Operating system | Application data | Personal media |
|---|---|---|---|---|
| Hardware failure | Cloud | System backup | System backup | Offline copy |
| Viruses | Cloud | Offline copy | Offline copy | Offline copy |
| Hacking | Cloud | Offline copy | Offline copy | Offline copy |
| File system | Cloud | System backup | System backup | Offline copy |
| Disaster | Cloud | System backup | System backup | Offline copy |
| Theft | Cloud | Offline copy | Offline copy | Offline copy |
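For readers who prefer code to tables, the matrix can be written down as a small lookup structure. This is only a sketch in Python: the names `DATA_TYPES`, `ROWS`, `method_for` and `methods_needed` are made up for illustration, and the entries are copied straight from the table.

```python
# Column order of the matrix: the four data types discussed in this post.
DATA_TYPES = ("Working data", "Operating system", "Application data", "Personal media")

# One row per risk, one entry per data type, copied from the table.
ROWS = {
    "Hardware failure": ("Cloud", "System backup", "System backup", "Offline copy"),
    "Viruses":          ("Cloud", "Offline copy", "Offline copy", "Offline copy"),
    "Hacking":          ("Cloud", "Offline copy", "Offline copy", "Offline copy"),
    "File system":      ("Cloud", "System backup", "System backup", "Offline copy"),
    "Disaster":         ("Cloud", "System backup", "System backup", "Offline copy"),
    "Theft":            ("Cloud", "Offline copy", "Offline copy", "Offline copy"),
}


def method_for(risk, data_type):
    """Look up the backup method the matrix assigns to one risk and data type."""
    return ROWS[risk][DATA_TYPES.index(data_type)]


def methods_needed(data_type):
    """List every backup method that appears in the column of one data type."""
    column = DATA_TYPES.index(data_type)
    return sorted({row[column] for row in ROWS.values()})
```

Reading a column this way shows at a glance that the operating system needs both a system backup and an offline copy, whereas working data is covered by the cloud alone.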

Why do we need this table? Well, some data must be treated differently than others. Personal media files, for example, are quite large, which makes them difficult to back up over the internet. ‘Working data’, on the other hand, tends not to be that big, but needs to be backed up very frequently. Losing even an hour’s worth of work can be costly in these cases.

And some risks need to be treated differently than others: Having an exact 1:1 copy of your hard drive, for instance, is nice when you want to recover the exact state of your computer after, say, a hard drive failure. But it’s very inconvenient, to say the least, if you accidentally deleted just a single file. You would not want to wait through restoring the whole hard drive just to get to that one file.

Let’s go through the table step by step:

For data you are constantly working on, use a cloud storage system such as Dropbox or SugarSync. They allow you to set up a directory on your hard drive that will automatically be mirrored “in the cloud”, i.e. all files that are created or modified in this directory will be copied in a safe and secure manner to a server. This covers all cases of data loss nicely, but has two disadvantages. First, it is not convenient for large files or large numbers of files, as uploading changed files takes too long. Second, in order to be protected against viruses and hacking, where files may be deliberately and covertly manipulated, you need to be able to go back to a previous version. Some cloud services do this better than others. I have found that SugarSync keeps the last five versions of a file, for instance, whereas with Dropbox I have had cases where I could go back to previous versions and cases where I couldn’t. Therefore, I highly recommend trying SugarSync.
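To make the idea of “going back to a previous version” concrete, here is a minimal local sketch in Python. It is not how SugarSync, Dropbox or any other cloud service works internally; the function name `save_version`, the `.v<number>` naming scheme and the default of five kept versions (chosen to mirror the number mentioned above) are all assumptions for illustration.

```python
import shutil
from pathlib import Path


def save_version(source, versions_dir, keep=5):
    """Copy `source` into `versions_dir` as a new numbered version, then prune old ones."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source = Path(source)
    versions_dir = Path(versions_dir)
    versions_dir.mkdir(parents=True, exist_ok=True)

    # Existing versions are named "<file name>.v<number>"; collect the numbers.
    prefix = source.name + ".v"
    existing = sorted(
        int(p.name[len(prefix):])
        for p in versions_dir.iterdir()
        if p.name.startswith(prefix) and p.name[len(prefix):].isdigit()
    )

    # Store the current state of the file under the next free number.
    next_number = existing[-1] + 1 if existing else 1
    target = versions_dir / f"{prefix}{next_number}"
    shutil.copy2(source, target)

    # Delete the oldest versions so that at most `keep` remain.
    for number in (existing + [next_number])[:-keep]:
        (versions_dir / f"{prefix}{number}").unlink()
    return target
```

If a file is silently corrupted or encrypted by malware, only the newest version is affected, and the older numbered copies are still there to go back to.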

With cloud services, you can replicate your “cloud archive” on two or more computers. I highly recommend doing so if you have access to two computers (such as a desktop and a laptop). In this way, you automatically have a second ‘local’ backup on the other computer, in addition to the data being stored in the cloud.

What about your operating system? Well, assuming you have the necessary installation disk (and if you don’t, please create an installation disk right away), you can always reinstall. However, reinstalling is very time-consuming, especially if you then have to reinstall all your applications and adjust all the settings. With most modern systems, including Windows 7 and 8, you can use built-in software to back up the system in its current state. For Windows, it’s called ‘Backup and Restore’, and it can be accessed as follows:

Open ‘Backup and Restore’ by clicking the Start button, clicking Control Panel, clicking System and Maintenance, and then clicking Backup and Restore. Then click ‘Set up backup’. If you ‘let Windows decide’ which files to choose, it will automatically back up your personal files AND a system image if the target drive is large enough and is NTFS formatted. It will not back up program files. That means that when you restore the system after a failure, third-party programs will be missing (see below on how to tackle that problem).

This must be done on a separate hard drive, of course. If you don’t have two hard drives in your computer, you can buy one for internal installation (if you have the space). Or you can use an external drive that plugs into a USB port. Edit: I originally recommended Seagate HDDs, but see the comments for a discussion about the reliability of various HDD brands.

Very importantly, please use a hard drive without internal encryption. This may sound strange, as encryption = security, right? Well, not really. A hard drive with internal encryption only protects against theft of the hard drive itself, without the controller (as otherwise, with the controller, the hard drive can be read again). With an external hard drive, the probability of it being stolen without the inbuilt controller is practically zero. So you don’t gain any security. However, you lose the ability to restore the drive if, for any reason, the inbuilt controller breaks. So hard drive encryption primarily puts you at risk of data loss.

So what happens if your system is compromised by a virus, such as Cryptolocker? A system image on a second hard drive will be affected as well, if it was connected to the computer at the time of infection. And if your computer is stolen, it is highly likely that an inbuilt, attached or nearby external hard drive will be gone as well. For these cases, you need an offline copy, i.e. a copy of your data on a hard drive that is not permanently connected to your computer.

What I did was buy the above-mentioned Seagate Backup Plus 1 USB hard drive (which may not be the most reliable one, see the comments for more details). I connect it to my computer only once every 6 months, perform a complete system image copy, unplug it and store it in a different location, about 100 miles from my home. This makes my data pretty safe even in the case of severe disaster. The location should be far enough away not to be affected by any risk that could reasonably hit your home, but close enough that you can get the drive in a reasonable amount of time when you need it.

Now, how do you make a perfect offline copy? Buy a hard drive large enough that all your data plus your operating system fit onto it at least twice. Then, back up everything twice: First, create a directory and copy all files on your hard drive individually into this directory. This will be very useful when you find that a small to medium number of files have been compromised, and you need to get to these individual files quickly.
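The sizing rule and the file-by-file copy can be sketched in a few lines of Python. This is just one possible way to do it, using only the standard library; the function names `tree_size`, `fits_twice` and `copy_files` are hypothetical, and any paths you pass in are your own.

```python
import shutil
from pathlib import Path


def tree_size(root):
    """Total size in bytes of all regular files below `root`."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def fits_twice(data_bytes, drive_capacity_bytes):
    """Check the rule of thumb: the data must fit onto the drive at least twice."""
    return 2 * data_bytes <= drive_capacity_bytes


def copy_files(source_dir, backup_dir):
    """Copy a directory tree file by file, keeping file timestamps.

    shutil.copytree raises an error if `backup_dir` already exists,
    so an older copy is never silently overwritten.
    """
    return shutil.copytree(source_dir, backup_dir)
```

The capacity of a connected drive can be read with `shutil.disk_usage(path).total`, which can then be compared against `tree_size` of the data you want to back up.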

Then, make a system image. A system image is a 1:1 copy of the exact layout and content of your hard drive. A system image is a monolithic block, i.e. you cannot access individual files from it (which is why you also copy all files individually, in addition to the system image). However, you can restore your system exactly as it was at the time of the backup. It’s sufficient to create a new system image every couple of months, or whenever your computer setup changes significantly. Most parts of your system don’t change much over time, and if you have saved your frequently changing files in the cloud, it’s easy to get from the system image to the current state by copying them back.

Very importantly, a system image as an offline copy should be created when the operating system is shut down. That is, you need to boot a special backup program which will copy your system while it’s not active. One highly recommended program of this kind is Clonezilla. You can download it for free from its website, which also contains detailed instructions on how to use it. Essentially, you will create a so-called ‘live CD’, from which you can boot your system and make an exact copy of your hard drive onto a second hard drive.

The bottom line: By combining a cloud archiving service, a built-in or permanently attached hard drive for automated system backups and an external hard drive for offline copies, you will cover all risks listed above. The whole backup strategy is very manageable, as it is mostly automated (cloud and system backups). The only manual step, the offline copy, is required only every couple of months. Be sure to make a fresh offline copy at least every six months, as this will ensure a reasonably up-to-date copy AND it will ensure that your external hard drive is actually still working.
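Since the offline copy is the one manual step, a small reminder check can help you stick to the schedule. The following Python sketch assumes you note down the date of your last offline copy somewhere; the names `MAX_AGE` and `offline_copy_due` are made up, and 183 days is used as a rough stand-in for six months.

```python
from datetime import date, timedelta

# Roughly six months, the longest interval suggested between offline copies.
MAX_AGE = timedelta(days=183)


def offline_copy_due(last_copy, today):
    """Return True if the last offline copy is older than about six months."""
    return today - last_copy > MAX_AGE
```

For everyday use you would call it as `offline_copy_due(last_copy, date.today())`.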

One last and very important remark: If you are not computer savvy or are unsure about the exact steps required, please ask someone for help, preferably someone whom you know well and trust and who can help with your specific system. It’s easy to make a mistake with your backup strategy, and such a thing can be very costly. In the worst case, by operating a backup program wrongly, you could actually destroy the very data you wanted to back up. So be careful and take it slowly!

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. For more details on what you can and cannot do with my work, see here.

Comments

  1. SteveBrown,

    How could you recommend Seagate for a backup after “Blaze” reported the other week on its reliability experience with the major HDD brands, based on its database of 25,000 installed drives?

    1. admin,

      Thanks for raising this and pointing out the Blaze findings! You are referring to this, right? Well, first of all, when I posted I hadn’t seen those results. So it might in fact be a good idea to reconsider the choice.

      I work in an IT-heavy environment, and we have used HDDs from Seagate, Western Digital and Hitachi. From my personal experience, Hitachi was not as reliable as Seagate, and in fact we never had problems with Seagate. However, the number of hard drives we went through was very, very low compared to Blaze, so I am more than willing to trust the Blaze statistics over my personal experience.

      As for Western Digital, the external WD hard drives that I found all had internal encryption, which I don’t think is a good idea for a personal backup strategy, for the reasons explained in the post.

      So, the bottom line is this: The extensive statistics shown on the Backblaze blog make a compelling case for preferring Hitachi over Seagate. I have made a note in the blog post and referred to your comment, and I will revise the article in more detail in the coming days.
