Hello, and welcome back to CS615 System Administration! This is week 09, segment 2. In our last video, we talked about core concepts in the larger topic of backups -- things like full versus incremental backups, storage media and their properties, and the difference between long-term archival backups and routine backups for data recovery. As discussed there, recovery from a disaster or systemic failure may have specific practical implications, but even for individual file recovery, we need to pay attention to a few details that will then define our overall backup strategy.

---

Specifically, we want to give our users certain guarantees, such as that deleted files can be restored within a given time window. From a pure usability point of view, users may expect the ability to - undo changes, and the normal behavior of the Unix tools -- doing exactly what is asked of them -- may at times confuse or frustrate users: rm(1), well, removes a file, and doesn't, for example, move it into a special directory, so there is no inherent "undo" on the filesystem level here.

So instead, we have to find a way to agree with our users on what a realistic - Recovery Point Objective is. That is, we define the time window of data loss, or, seen the other way around, the granularity of our backups. We might, for example, say that we perform nightly backups, meaning we can only restore data from, on average, around 12 hours ago, or 24 hours ago in the worst case.

But once data is lost, getting it back is not usually instantaneous, even if it falls within the Recovery Point Objective, which is why - we have to define a Recovery _Time_ Objective -- the time it takes to get back your files. This will include - staff availability and their overhead. For example, if you have to fill out a support ticket to request a file restore, then somebody has to get around to reading that ticket and then have time to act on it.

- It's also possible that your tape library is currently busy writing the backup of the current data set to tape and so can't restore your data at the same time. This will be less of a concern in large-scale enterprises, where you hopefully have multiple means of accessing the backup, but in smaller environments this may be an actual concern.

- And of course retrieving data will actually take some time, too. If you lost one file, but it's on a magnetic tape holding a 20-terabyte backup, you may have to seek through several hundred gigabytes of data before you get your file back; likewise, if you lost 20 terabytes of data, restoring those will necessarily take time, limited by the tape performance and the I/O speed of the device you're restoring the data _to_.

Now all of this makes restoring data a lengthy and cumbersome process, which is why it'd be _even_ better if you could set up your backup system such that it allows - for self-service restores, where any user can access the backups of their own data themselves. But this isn't easy to implement, particularly at scale. We'll see some methods that help in this regard in our next video, though.

Finally, it's worth mentioning that backups also have a surprising side effect: data that - was deleted is now still available somewhere. And in some cases, that can be a problem -- sometimes, when you delete data, you _really_ want to make sure that it's gone and can _not_ be restored.

---

But let us look at a few practical examples. One of the tools we'll use here is one that you've already used on a regular basis when you've submitted your homework assignments: the tar(1) command.
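Just as a quick refresher, the basic usage looks something like this -- the file names here are generic placeholders, not the exact commands from any assignment:

    $ tar cf hw.tar hw/      # create an archive of the hw/ directory
    $ tar tf hw.tar          # list the contents of the archive
    $ tar xf hw.tar          # extract the files from the archive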
So let's have a look... - tar(1) is a _tape_ archiver, and one of the Unix tools supporting a rather surprising number of command-line options, some of which... don't even need a dash. [pause]

This is in part why - tar(1) is often cited as an example of arcane Unix wizardry, but it really is not all that complicated. tar(1) is an old tool, and back in the early days of Unix, many tools did not require a dash to specify options. The overall use of tar(1) really boils down to just a handful of commands, so I'm sure you'd be able to disarm this bomb without a problem.

But anyway, - tar(1) is a tool to create an _archive_ of a file system hierarchy and was generally used to write it to magnetic tape, with a default device of /dev/rst0.

---

So let's see how we can use tar(1) for backup purposes. Here's an EC2 instance with a second volume attached, so let's create a new filesystem on it and mount it under /backup. Now, let's create a directory timestamped after the current date... there. Now we can back up our /usr/local directory using tar(1) by creating an archive and writing it into the filesystem. There we go. So now we have our copy here on the extra disk, which is all well and good, but...

---

...our backup disk is still local to this system. That is, we really have only created a copy on another disk on the same system. But since tar(1) can write data to stdout, we can pipe it into any command, and so we can simply write the data to a remote system instead. [pause]

But of course there's nothing that requires us to write the data back to a filesystem -- recall that tar(1) is designed to create an _archive_, a file in a specific format. So we can write this data directly to a block device instead of extracting it into a filesystem as we did in the previous example. For that, we can use our old friend, the dd(1) command. [continue]

So now, checking in over here on the remote backup destination, we can read the data back from the block device and... find that yep, we did get the archive written to disk here. So now... back on our original server... if we "accidentally" delete the data from this directory, we can restore it from our backup on the remote system using this command. And there we are.

By using tar(1) and writing the data into a pipe, we gain additional features, too: for example, we can compress the data before we write it to the remote side. But we can also add other data transformations: we can, for example, encrypt the data before we send it over -- illustrated here using the 'openssl enc' command. Now to restore the data, we simply reverse the steps: first decrypt, then decompress, then write to the filesystem. There we go. So that's pretty useful, and a nice illustration of the flexibility of tar(1) as more than just a tool to submit your homework assignments.

---

But to create backups, we have a few other tools. One of the oldest tools here is the dump(8) command. The dump(8) command is used to perform filesystem backups, and unlike tar(1), it has some logic to determine which things it should back up. dump(8) also distinguishes between what we discussed earlier: full -- or level 0 -- backups and incremental backups, meaning it can be used to only back up data that has changed since the last backup. This time, we're writing the data to a file on the filesystem on the remote host by piping the output of the dump(8) command into ssh and cat(1).
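Before we watch dump(8) do its thing, here's a rough sketch of the tar(1) commands we just ran -- the device names and the hostname 'backuphost' are placeholders for illustration, not the exact commands on screen:

    # create a filesystem on the second volume, mount it, create a dated directory
    $ newfs /dev/xbd2a
    $ mount /dev/xbd2a /backup
    $ mkdir /backup/$(date +%Y-%m-%d)

    # create an archive of /usr/local on the extra disk
    $ tar cf /backup/$(date +%Y-%m-%d)/usr-local.tar /usr/local

    # ... or stream the archive to a remote system instead
    $ tar cf - /usr/local | ssh backuphost 'cat > /backup/usr-local.tar'

    # ... or write it directly to a remote block device via dd(1)
    $ tar cf - /usr/local | ssh backuphost 'dd of=/dev/xbd2d'

    # ... or compress and encrypt the stream on the way out
    $ tar cf - /usr/local | gzip | openssl enc -aes-256-cbc | ssh backuphost 'cat > /backup/usr-local.tar.gz.enc'

    # restoring reverses the steps: decrypt, decompress, extract
    $ ssh backuphost 'cat /backup/usr-local.tar.gz.enc' | openssl enc -d -aes-256-cbc | gunzip | tar xpf -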
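And the dump(8) pipeline itself is along these lines -- the filesystem and file names are again illustrative assumptions:

    # level 0 (full) dump; -u updates /etc/dumpdates, -f - writes to stdout
    $ dump -0u -f - / | ssh backuphost 'cat > /backup/full.dump.0'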
The dump(8) command will take a few minutes as it determines which files need to be backed up -- all of them in this iteration, since we're performing a level 0, or full, backup -- and then writes the data over the network to the remote file. dump(8) then keeps a log of which level backup it performed by writing to /etc/dumpdates, which... specifies the disk, the backup level, and the date, as shown here. On the remote system, we find the complete level 0 backup file, which tells us when it was written, when the last full backup was performed -- here, the epoch, since we had never performed a backup before -- as well as some other information.

So now that we've done a full backup, let's see what an _incremental_ looks like. For that, we create a bit of new data by extracting our group project's git repository into /usr/local. Now, let's run dump(8) again, this time as an incremental, writing the data to a second file. Note how our backup completed much faster than before, since it only had to copy the files that have changed since the last full backup. And on the remote destination, the backup file is of course much smaller than the level 0 dump file. On our server, /etc/dumpdates was updated to reflect the incremental backup.

So now, let's simulate some data loss by removing a few files. Oh no! Yep, those files are gone. Now what? Well, let's restore the data from our last backup! We do that using the restore(8) command, asking it to extract the files for /usr/local from the last incremental backup. Here we go. The command tells us when the data is from and then writes it out into our filesystem. There we go -- data back.

But... we had also removed the /etc/rc.d directory. Let's try to restore that. Hmm, that didn't work out: /etc/rc.d was not found on the backup in question. Which is no surprise, since the incremental only copied files that had changed since the last level 0, but /etc/rc.d hadn't changed. So let's look at the full backup. For this, we copy the file over here so we can interactively inspect it. Now we can run restore(8) in interactive mode, which drops us into a shell-like prompt. The information about the backup can be displayed like this, as the restore(8) command provides a few simple commands for us to interact with the backup. We can list the contents of the backup using the 'ls' command, change into a directory and inspect its contents, and then select individual files or directories to restore. Then we can extract the data, set the permissions, ... ...and... there we go, our lost data has been restored. Yay.

So the dump(8) command supports incremental backups natively, and restore(8) lets us select what data to recover. This is a bit different from our earlier example using tar(1), but is there a way to use other tools in a similar manner?

---

Let's try again using tar(1). As we showed before, backing up the data from a directory is easy enough... But if we create new data here... ...and then run the backup command again, we're again copying _all_ the data. No incremental here. But what if, instead of using tar(1), we used a different tool? The rsync(1) command allows us to sync a directory hierarchy to another location, so when we run that... you'll note that even though it lists all the directories, it does not actually copy all the files again, but instead only the newly modified files.
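By the way, in rough strokes, the incremental dump and the restore(8) invocations we used in this segment look like this -- the file names and paths are again just placeholders:

    # level 1 (incremental) dump: only files changed since the last lower-level dump
    $ dump -1u -f - / | ssh backuphost 'cat > /backup/incr.dump.1'

    # extract (-x) a specific directory from a dump file (-f)
    $ restore -x -f incr.dump.1 ./usr/local

    # interactively browse a dump and pick what to restore
    $ restore -i -f full.dump.0
    restore > what           # show the dump header information
    restore > ls             # list the contents of the backup
    restore > cd etc
    restore > add rc.d       # mark /etc/rc.d for extraction
    restore > extract        # extract the selection; restore asks about owner/modes
    restore > quit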
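The rsync(1) invocation here is along these lines -- 'backuphost' is once more a placeholder, and note that the trailing slash on the source means "the contents of" /usr/local:

    # -a: archive mode (preserve permissions, times, etc.); -v: list what's transferred
    $ rsync -av /usr/local/ backuphost:/backup/usr-local/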
So we can effectively perform incremental backups, although in this case the incremental is directly applied to the remote destination and not kept as a separate delta, as in the case of dump(8). [pause]

But now note that we have yet another use case: not only do we want to keep track of what files we have, but we may also sometimes wish to ensure that a file we have deleted on our filesystem is removed from the other side. And while dump(8) is great at creating incremental backups, the incrementals really only contain the data of files that have changed, but no record of which files have been removed. So [continue] when we remove some files here and then run the rsync(1) backup, it behaves just like dump(8), in that the locally deleted files remain present on the remote side.

But rsync(1) has another option to ensure that we also sync file deletions -- the "--delete" flag. With that, we remove the files on the remote side when they have disappeared locally. And this is useful, as it provides functionality that our dump(8) tool did not offer. But now we have another problem -- how do we roll back changes made locally if we later add the files again? We won't be able to get back to the earlier state where they were gone. Wouldn't it be useful, instead of only focusing on files added or removed, to have a better way -- a method to say "show me what the filesystem looked like at a given time"? Well, we do have ways to do that, but I'm afraid you'll have to wait until the next video to discover how those work. For today, we'll

---

take a break and quickly review what we've covered.

- We've seen how tar(1) can be used to create an archive of a file system hierarchy, and - how we can do a number of things with the data created in this way beyond writing it to a file: we can write it to a raw disk device using dd(1), copy it to a remote system via ssh(1), and transform the data in the process to, for example, compress and encrypt it. But, as we just showed in the last example, - using tar(1) implies a full backup, as there is no support for incremental backups in tar(1).

Now - dump(8), on the other hand, does support incremental backups, and - integrates more tightly with the system: /etc/fstab contains a field to help dump(8) decide which filesystems need to be backed up, and /etc/dumpdates keeps track of the last time of backup, and so on. To get back the data backed up in this fashion, you can - use restore(8), either to restore _all_ the data or _some_ of the data, or to interactively browse the backup.

And then we looked at - rsync(1) to show how it can incrementally back up data and, unlike dump(8), how it can - even delete data from the dataset. But we also mentioned that we're still missing some functionality: we'd like to be able to roll back time and browse the filesystem effectively as it was at a specific point in time. How to do that will then be the topic of our next video, so until the next time -- thanks for watching! Cheers!