Hello, and welcome back to CS631 "Advanced Programming in the UNIX Environment". This is week 13, where we will take a look at the various ways in which processes can be restricted from (negatively) impacting one another. This allows us to simultaneously take a look back at what we've covered throughout the semester as well as dive deeper into some topics and hint at additional, related topics that we do not have the time to cover in full in this class. The materials of this week will be divided into the following topics: Permissions and POSIX Access Control Lists, covered in this video; Changing eUIDs and Restricting even root's capabilities; Restricted Shells; Chroots and Jails; Process Limits once more; CPU Affinity and cpusets; and finally Control Groups, Namespaces, Capabilities, and how all those ultimately lead to containers, which have seen a dramatic increase in popularity in recent years. --- But let us go back, all the way to Week 01. The Unix family of operating systems has been, by nature and from its first conception, a multitasking, multiuser system. This implies the need for a number of concepts, such as separate accounts, user privileges, file permissions, process ownership, etc. In addition, as we've discussed repeatedly, all resources available to the system are finite in nature, so there is an inherent competition over these resources: CPU time and memory are limited, disks fill up, processes run out of open file descriptors, and so on. Some of these resources are managed by the system: for example, the scheduler places different processes on the available CPUs using an algorithm that aims to ensure that no process is starved or overuses the resources. Disk space is finite, but the system may enforce, e.g., user quotas to ensure that no single user can fill up all available disk space or use more than their share. 
The filesystem itself reserves a certain amount of space for the superuser / system usage, so that even a completely filled up system can still run, at least until the administrator can clean things up. Some of these approaches are operating system specific, while others utilize common system calls to reach their goal. In addition, standardized approaches and de-facto standards also come into play. - By the way, this lecture has an accompanying long-form blog post, which I recommend you go through as well. Obviously there will be some repetition, but in these videos I'll try to illustrate a few of the capabilities in practical examples. --- As so often, it's useful to start by reviewing what we already know. You should find that many of the things we've discussed throughout the semester are directly relevant to this topic. For example, in our video for Week 02, segment 1, we looked at how the system may limit the number of file descriptors a process may have open: the openmax.c program illustrated that there may be - a per-process resource limitation (retrieved via getrlimit(2)), - a system-wide defined value hard-coded into the kernel or derived from a fixed header (i.e., OPEN_MAX from sys/syslimits.h), - as well as a system tunable configuration option, possibly changing at runtime from invocation to invocation (i.e., _SC_OPEN_MAX from sysconf(3)). - Then, as we discussed the Unix file system, we identified the basic Unix access semantics for file access: user, group, other. --- Together with the access logic for directory access outlined in Week 03, segment 3, this simple model allows us to restrict what resources in the file system a process may access. We saw that access is determined in a specific order, from most-specific access to least-specific access, and by combining group membership and carefully selecting the right permissions, we can provide access broadly as needed. 
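To make the "most specific first" ordering concrete, here is a minimal shell sketch. It is only an illustration, not something taken from the lecture's materials: the file name is hypothetical and everything happens in a temporary directory. It creates a file whose owner bits grant nothing while group and other may read, and then tries to read it as the owner.

```shell
#!/bin/sh
# Sketch: the owner class is checked before "group" and "other".
# The file name is hypothetical; everything lives in a temp directory.
dir=$(mktemp -d)
file="$dir/order-demo.txt"
echo "hello" > "$file"

# Mode 0044: nothing for the owner, read for group and everybody else.
chmod 0044 "$file"
mode=$(ls -l "$file" | cut -c1-10)
echo "mode: $mode"

if [ "$(id -u)" -eq 0 ]; then
    # The superuser bypasses these permission checks entirely.
    echo "running as root; the check order cannot be observed"
elif cat "$file" >/dev/null 2>&1; then
    echo "unexpected: owner could read the file"
else
    # We own the file, so only the owner bits apply, and they grant
    # nothing, even though "other" would have been allowed to read.
    echo "owner denied, although group and other have read access"
fi

# The owner may always change the mode again, so cleanup still works.
chmod 0600 "$file"
rm -rf "$dir"
```

When run as a regular user, the read attempt should fail because the first matching class (owner) decides the outcome; the group and other bits are never consulted.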
--- However, the granularity of this model is somewhat limited, as it only allows distinguishing amongst these three sets of users: owner, group, and everybody else. Even though a user may be a member of multiple groups, different systems impose different limits on the number of groups you may be a member of. On a system that uses e.g., NFS, you may be restricted to only 16 groups! What's more, access control via group membership is particularly cumbersome: for starters, unlike a user, who may be a member of several groups, a file can only belong to a single group. So if you want to share your file with members of group A and B, but not with everybody else, you're out of luck. In addition, any changes to group membership require action by the administrator of the system; users can't self-control their group membership or create new groups ad-hoc. --- Several Unix filesystems overcome this restriction through the use of so-called Access Control Lists, or ACLs, most notably via a POSIX draft extension (POSIX.1e). These ACLs allow the user to specify with finer granularity the access they wish to grant. - These Access Control Lists are implemented as so-called "Extended Attributes" - in the file system, meaning they require support in the file system in addition to the kernel. - In the default 'ls(1)' output, the presence of extended filesystem attributes is usually indicated by the '+' sign after the usual permissions string, - and interaction from a user's perspective is via the setfacl(1) and getfacl(1) tools. Let's look at a few practical examples on different Unix versions, since the implementation of POSIX ACLs is not quite as uniform as one might like. --- Ok, in this terminal, we'll demonstrate the use of ACLs on a Linux system, an Ubuntu 16.04 system in particular. My user is a member of a few groups, meaning that I can easily share files and grant access to all users who are members of one of these groups at a time. Let's consider the file 'simple-cat.c'. 
Right now, it has read/write permissions only for the owner - i.e., myself. If I want to allow other members of the group 'professor' access, I can change the group permissions, and if I want to allow members of the group 'sigsegv' access, I can 'chown' it to that group. But this means that now members of the group 'professor' no longer have read access. So how do I get _both_ groups access? Looking at the manual page for 'setfacl(1)', I can find the syntax to grant group access. Let's give that a try. Hmm, no luck. Let's see what type of file system we're on. Oh, NFS. NFS, it turns out, doesn't support POSIX ACLs. Or rather, NFS has its own implementation of ACLs, which are only supported in NFSv4 and use different tools, which we do not have available on this system. Let's move the file to a local filesystem. Ah, ok, so this is an ext3 file system. That should support POSIX ACLs. Let's try again. Good, no error. Let's look at the file again: Note that now the 'ls' output shows a '+' after the permissions string, indicating extended attributes on the file. Let's inspect the ACLs using the getfacl tool: Ok, here we see that now we have group permissions for 'sigsegv'. But wait... we _already_ had group permissions for 'sigsegv', but we wanted to allow the group 'professor' to retain read access. Here, let's give that group read access. Now let's look at what getfacl and ls tell us. The group ownership of the file is 'professor', and standard Unix group permissions allow read access, but the extended attributes also show that now the group 'sigsegv' has retained the read permissions we granted earlier. Now let's suppose we want to give access to this file to some of the students in this class. Let's say... Edward and Mingyao. Hmm, both are only members of the 'student' group, and I don't want to give _all_ students access to my file, so I don't want to use this group. 
But with POSIX ACLs, I can also specify individual users, so let's give that a try: Ok, that seems to have worked. Let's check: Here, in table format it's perhaps a bit easier to read the permissions, and it clearly shows that we were now able to provide fine-grained access. What happens if we copy this file? Hmm, looks like we have to do all the work of applying our ACLs again. That's lame. Maybe there's a way to copy the ACLs from the other file? There we go -- setfacl can read the ACL description from stdin and apply it. Nice. But even better: tools like mv(1) or cp(1) are able to copy the ACLs right away when asked to preserve attributes (e.g., 'cp -p'): There, all three have extended attributes, and getfacl shows the correct permissions. If I want to clear all of the ACLs, I can do that via 'setfacl -b', and the file will no longer have any extended attributes. --- Ok, we've seen that we can grant granular access using Access Control Lists on Linux. Now let's compare this to macOS. Here we have our simple-cat file once more, but now I want to grant access to the "wheel" group without changing group ownership. To do that, we use chmod(1) with the "+a" flag. As before, if we run 'ls -l', we get a '+' symbol, here indicating that there is an ACL associated with the file. But on this system, we don't use getfacl(1). Instead, we use ls(1) with the "-e" flag. I know, you didn't think there was still a letter in the alphabet available to use as a flag for ls(1), but look at that, we found one! Here, ls(1) now tells us that the extended ACL rule number 0 is to grant read access to the group 'wheel'. If I then want to make sure that the user 'daemon' will _not_ have access, regardless of their group membership, then I can use this command here. Now our rule set looks like so. The chmod(1) manual page includes a lot of information about how ACLs can be manipulated as well as which grants are possible. 
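The macOS commands narrated above are not captured in this transcript, so here is a rough sketch of what they might look like. The group 'wheel' and the user 'daemon' are the names from the lecture; the temporary directory and the platform guard are assumptions added so that the sketch does nothing harmful on other systems.

```shell
#!/bin/sh
# Sketch of the macOS ACL commands described above. The file lives in a
# temporary directory; 'wheel' and 'daemon' are the names from the lecture.
if [ "$(uname -s)" = "Darwin" ]; then
    dir=$(mktemp -d)
    file="$dir/simple-cat.c"
    : > "$file"

    # Grant read access to the group 'wheel' without changing ownership.
    chmod +a "group:wheel allow read" "$file"
    # Deny read access to the user 'daemon', whatever groups they are in.
    chmod +a "user:daemon deny read" "$file"

    # 'ls -le' lists the numbered ACL entries below the usual long listing.
    ls -le "$file"
    rm -rf "$dir"
    platform=macos
else
    echo "chmod +a and ls -e are specific to macOS; skipping" >&2
    platform=other
fi
```

The numbering that 'ls -le' prints is what the delete-by-number form of chmod(1) refers to.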
In that manual page, you'll note that we distinguish between delete operations, operations on the file's meta data, operations on a directory, operations on a regular file and so on. It also includes a number of examples, which you can play around with on your own. Now if you want to delete a specific ACL, you specify its number using the number sign. So here we delete the first rule, for example, and so the next rule logically now becomes the first rule, rule number 0. Note, by the way, that you must not have a space between the "-a" and the number sign. Remember: in shell, the hash marks the beginning of a comment, and you can use comments on the command-line as well. That is, there's a difference between this and this. Anyway, to remove all ACLs on macOS, you use "chmod -N", and as before, the file no longer shows the '+' symbol. --- Finally, let's take a look at NetBSD. Even though this is our reference platform, we're looking at it last. This is because support for POSIX ACLs is not yet in the latest stable version of NetBSD, but it will be part of NetBSD 10.0. Here we created a separate VM running NetBSD-current. Let's see... Ok, so here's our file, owned by myself and with group permissions for the group "wheel". The group "staff" comprises a few users who I'd like to share access with, but since I'm not a member of the group, I can't chown the file to this group. Let's try setfacl(1)! Hmm, no luck. The root file system here doesn't support ACLs. As explained earlier, the filesystem needs to support POSIX ACLs. Let's use a separate file system on a second disk like we've done before. Here, we create a new file system on the second disk, wd1, and then call "tunefs(8)" to enable POSIX ACLs in the filesystem super block. We mount the file system and chown it to our regular user. Now we can copy our file to /mnt and then call setfacl(1) to share access with the 'staff' group, using the same syntax as we did on Linux. 
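As a rough sketch of that shared syntax: the lecture grants access to the group 'staff' on a file under /mnt, whereas this version, as an assumption for portability, works in a temporary directory and defaults to your own primary group (by numeric ID). Set the GROUP variable to pick another group. It also checks whether setfacl(1) exists and whether the file system accepts the ACL, since neither is guaranteed.

```shell
#!/bin/sh
# Sketch: grant a group read access via a POSIX ACL entry.
# The lecture uses the group 'staff'; this defaults to your own primary
# group (by numeric ID) so that it runs on any system. Override via GROUP.
GROUP="${GROUP:-$(id -g)}"
dir=$(mktemp -d)
file="$dir/simple-cat.c"
: > "$file"
chmod 600 "$file"

if command -v setfacl >/dev/null 2>&1 &&
   setfacl -m "g:${GROUP}:r" "$file" 2>/dev/null; then
    acl_status=applied
    ls -l "$file"      # a '+' follows the permission string
    getfacl "$file"    # lists the extra 'group:...:r--' entry
else
    acl_status=unsupported
    echo "no setfacl(1), or no POSIX ACL support on this file system" >&2
fi
```

If the ACL cannot be applied, the sketch only reports that fact, which mirrors the "no luck" moments in the demonstration where the file system lacked ACL support.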
As before, ls(1) shows the presence of extended attributes via the '+' symbol, and getfacl(1) shows the results. So now, fred, being a member of the 'staff' group, can display the file, but of course still can't write to it. Ok, now let's try to give 'molly' write access. Oh, look at that, molly, who is in the group 'staff', cannot read the file! This is because her per-user ACL entry, which grants only write access, takes precedence and so denies read access, even though her group membership should allow it. This reflects how our regular Unix permissions are applied in order as well. But her write access works as expected. Now suppose you want to give everybody in the 'staff' group access, but not 'jenny'. To do that, we use setfacl(1) like so. As expected, 'jenny' can't read the file despite being in the 'staff' group, again because per-user permissions take precedence over per-group permissions. So we see that while ACLs allow for fine-grained access control, they still follow some of the same semantics as we had previously learned. --- Ok, time for a break. Let's quickly recap what we learned about POSIX ACLs: - Access Control Lists are stored as extended attributes in the file system and thus require the support for this in the file system. Not all file systems necessarily support them, or they may be disabled as a mount option, for example. - If you begin to play around with ACLs, you'll quickly note that you may run into conflicting descriptions -- what if you want to grant access to all members of a group but one -- do user ACLs take precedence over group ACLs? The order in which they are applied may play a role here. Give it a try and verify what the different outcomes are. - The implementation of POSIX ACLs may differ from operating system to operating system. We saw examples - from Linux, where we use the setfacl(1) and getfacl(1) utilities; see the acl(5) manual page for more detail on that platform; - and from macOS, where the ACLs are applied and manipulated using the chmod(1) utility and inspected via ls(1). 
- Different BSD systems implement ACLs as well, although our reference platform -- NetBSD -- will only have POSIX ACLs as of the upcoming release, NetBSD 10.0, and will use the getfacl(1)/setfacl(1) semantics it imported from FreeBSD. In our next video, we'll revisit effective and real user IDs, and how you can elevate privilege using the sudo(1) and su(1) commands, but we'll also look at how we can restrict your privileges in the same manner. Until then - thanks for watching. Cheers!