We all talk about backups. We all know we are supposed to have backups. But what are backups? Here’s my definition, from experience and talking to lots of y’all:
Backups: Copying the data created and captured by applications and devices to a different location, so that it can be restored in case something happens to the original copy of the data.
So backups are pretty easy, right? All you need to do is make a copy of the data! No probs. Hell, you can copy and paste and you have a second copy of the data.
Sysadmin flashback: I was a Linux admin at a place that was too cheap to buy backup software for Linux (this was when Linux first started being used in the enterprise) because I could just write an SCP script. When I balked because we needed some sort of checking they told me to figure out how to write it into the script. I did. Sigh. More on this in a bit — but right now back to the topic at hand.
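To give a feel for what "some sort of checking" can mean, here is a minimal sketch in Python. It is not the script from that job: it copies a file locally rather than over SCP, and the function names `sha256_of` and `copy_and_verify` are made up for illustration. The idea is simply to hash the original and the copy and refuse to call it a backup if they differ.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination, then confirm both files hash the same.

    Returns the shared SHA-256 digest, or raises IOError on a mismatch.
    (Hypothetical helper for illustration, local copies only.)
    """
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also tries to preserve timestamps
    source_hash = sha256_of(source)
    copy_hash = sha256_of(destination)
    if source_hash != copy_hash:
        raise IOError(f"checksum mismatch for {source}")
    return source_hash
```

A remote copy would need the hash computed on the far side as well, which is exactly the kind of detail that makes a "simple" script grow.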
To get a backup, all you need to do is make a copy of the original data. If that’s all it is, why are there so many backup products? Well, there are lots of considerations. Let’s start with where to put a backup: Where should the second copy of the data go?
- It should go someplace other than the system where the primary copy of the data lives.
Yeah, that should be obvious, but you’d be surprised.
- You may have a legal requirement or legacy expectation that the backup data needs to go to tape.
If it goes to tape you need software (or you need to write a script) that uses tar or some other utility that can stream your data to tape.
- Maybe it needs to be in a couple of places, local data center, remote data center, tape….???
- Maybe it needs to go…. “to the cloud”…
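For the tape option in the list above, here is a rough idea of what "write a script that streams your data" could look like, using Python's standard `tarfile` module. This is a sketch under assumptions: `stream_tar` is a hypothetical helper, and the sink can be any writable binary file object. Pointing it at a real tape device (for example a path such as `/dev/nst0` on Linux) is an assumption about your hardware, not something tested here.

```python
import tarfile
from pathlib import Path
from typing import BinaryIO


def stream_tar(source_dir: Path, sink: BinaryIO) -> int:
    """Write source_dir as a tar stream to sink; return the number of entries added."""
    count = 0
    # Mode "w|" writes an uncompressed stream that never seeks backwards,
    # which suits sequential targets such as pipes or tape devices.
    with tarfile.open(fileobj=sink, mode="w|") as archive:
        for path in sorted(source_dir.rglob("*")):
            # Store paths relative to source_dir; recursive=False because
            # rglob already visits every file and directory once.
            archive.add(
                path,
                arcname=path.relative_to(source_dir).as_posix(),
                recursive=False,
            )
            count += 1
    return count


# Hypothetical usage (device path is an assumption):
# with open("/dev/nst0", "wb") as tape:
#     stream_tar(Path("/srv/data"), tape)
```

Note what is still missing: no verification pass, no catalog of what is on which tape, no rotation. That gap is a big part of why backup products exist.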
So hold up. We are just talking about the basic definition of backups (making a second copy of your original data) and we’ve already got so many options. Maybe this is a good time to take a step back and think about who needs to weigh in on the decision about where this data should go. Let’s keep this main principle in mind….
Why do we back data up?
The ONLY reason we back data up is so that we can RESTORE it.
That’s it. Before you pick any sort of strategy for where you are going to put that second copy of the data, you need to know what your business stakeholders will want to happen if something goes wrong with the original copy of the data.
- Will the business be ok if it takes you a couple of days to get the data back?
If so, you may want to consider your tape and cloud backup options – how long will it take to restore data from those sources? This is the info you need to get so you can agree with your business stakeholders on the RTO – Recovery Time Objective. This is how long the business can tolerate being without their data.
- Will they expect you to instantly restore a file? All the files? A subset of files?
The answer to this will help you decide what type of backup to use, how long to keep your data someplace that you can restore it rapidly, etc. This is tied to the RTO: how long it will take you to get the data back for business operations. What you need to do to make it happen is also tied to the RPO, the Recovery Point Objective: how much recent data, measured in time, the business can afford to lose.
If your business is constantly revising files, or depends on emails to get things done, they may need you to restore a file and bring back the file with the latest changes. So if you are backing up every night, but your business is making changes to important files hourly, you may have software that lets you instantly restore a file — but can you instantly restore the version of the file that the business needs?
- How much will it cost your business to be without their data for a few minutes? Hours? Days?
This helps you to have a business-driven discussion with your stakeholders on the costs of backing up the data in a way that supports the expectations of the business. If it is really important to the business, they need to understand that a sysadmin writing a script to scp data back and forth may not support their needs (even if the sysadmin has wicked awesome shell scripting skills). They need to fork over the money to protect and restore data in the way that keeps business operations going.
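The questions above boil down to some simple arithmetic you can bring to that conversation. Here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `BackupPlan` fields, the `evaluate` function, and all of the numbers. It treats the backup interval as the worst-case data loss, and data size divided by restore throughput as the restore time.

```python
from dataclasses import dataclass


@dataclass
class BackupPlan:
    name: str
    backup_interval_hours: float     # time between backups
    data_size_gb: float              # how much data a full restore moves
    restore_rate_gb_per_hour: float  # restore throughput you have measured


def evaluate(plan: BackupPlan, rpo_hours: float, rto_hours: float,
             downtime_cost_per_hour: float) -> dict:
    """Compare a plan against agreed RPO/RTO and price the outage."""
    # Worst case, the failure hits just before the next backup runs,
    # so everything since the last backup is lost.
    worst_case_loss_hours = plan.backup_interval_hours
    restore_hours = plan.data_size_gb / plan.restore_rate_gb_per_hour
    return {
        "meets_rpo": worst_case_loss_hours <= rpo_hours,
        "meets_rto": restore_hours <= rto_hours,
        "restore_hours": restore_hours,
        "downtime_cost": restore_hours * downtime_cost_per_hour,
    }


# Made-up example: nightly backups, 500 GB, restores at 100 GB per hour.
nightly = BackupPlan("nightly", backup_interval_hours=24,
                     data_size_gb=500, restore_rate_gb_per_hour=100)
print(evaluate(nightly, rpo_hours=1, rto_hours=8, downtime_cost_per_hour=2000))
```

With these invented numbers the nightly plan restores in 5 hours, inside an 8-hour RTO, but misses a 1-hour RPO by a wide margin. That is the "instant restore of the wrong version" problem from earlier, expressed as a number your stakeholders can argue with.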
The only way to know what your business needs when it comes to restoring data is to ask them before they need you to do a restore. Then you can investigate where the data needs to go, how it needs to get there, and the tools you’ll need to do it. You can bring a proposal to your stakeholders with the costs as well as any tradeoffs involved in meeting their expectations. Once you come to an agreement with them on RPOs and RTOs, you can set SLAs. Then no one will be surprised when an event forces a recovery, and you make ops look like the rockstars we are!
So scratching the surface of backups took longer than I thought. I’ll write more on the process of “doing backups” in a couple of days — but what do you think, did I leave anything important out of this discussion?