Hello, and welcome back to CS615 System Administration! This is week 3, segment 4, and we're going to wrap up our discussion of partitions and filesystems -- to the extent that we're covering them at this high level. When we left off in our previous video, we had shown that we can mount the filesystem we created on our partition and then create files there, so perhaps now would be a good time to look a bit more closely into what types of files the Unix filesystem supports and how they behave, before we then consider what partitions and filesystems are typically mounted on the systems we consider in this class. Here, let's have a look:

---

Here's an empty directory, containing -- as usual -- the two entries 'dot' and 'dot dot', referencing the current and parent directory respectively. We create a regular file, which 'ls' then shows as such by leaving the first character of the permissions string as a dash. Next, we create a directory, which 'ls' then identifies by a leading 'd'.

[pause] Each of these is now an entry in our current directory. As we mentioned before, each file is referenced by an inode, which contains all the metadata associated with it -- the permissions, ownership, timestamps etc. -- and it is the file name existing in this directory here that creates the reference to the inode. And while we generally do not call it that, every single filename-to-inode mapping is a "hard link". But we can have multiple names for the same file. So if a file already exists, we can _link_ another directory entry to it. [continue] To do that, we use the 'ln' command. There. Now these two files are indistinguishable other than by name. More accurately, they are not _two_ files, but they are _two names_ for one and the same file. We can verify this by looking at the inode number of the files, which 'ls' shows us when we pass the '-i' flag. As we see here, our directory has one inode number, and the two regular files share the same inode.
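If you'd like to reproduce the hard link demonstration outside of the shell, here is a minimal sketch in Python. It is not what the video runs; the names 'file', 'file2', and 'dir' and the helper functions are made up for illustration, and it assumes a POSIX system whose temporary directory supports hard links.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (type character, inode number, link count) for path,
    roughly the columns that 'ls -li' prints. lstat() is used so that
    a symbolic link would be described itself, not its target."""
    st = os.lstat(path)
    return stat.filemode(st.st_mode)[0], st.st_ino, st.st_nlink


def hard_link_demo():
    """Create a regular file, a directory, and a second name for the
    file in a scratch directory, then describe all three entries."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "file")    # hypothetical names
        second = os.path.join(d, "file2")
        subdir = os.path.join(d, "dir")

        open(first, "w").close()   # like: touch file
        os.mkdir(subdir)           # like: mkdir dir
        os.link(first, second)     # like: ln file file2

        return {
            name: describe(os.path.join(d, name))
            for name in ("file", "file2", "dir")
        }


if __name__ == "__main__":
    for name, (kind, inode, nlink) in hard_link_demo().items():
        print(inode, kind, nlink, name)
```

The two names for the regular file should report the same inode number and a link count of two, while the directory has an inode number of its own.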
That is, as far as the system is concerned, they _are_ the same file, and 'ls' also tells us over here that there _are_ two names for this file. This number here is the "link count", the number of names that exist for this file on this filesystem. But there's also another type of link, a so-called "symbolic link", which is created using the '-s' flag to 'ln'. The difference here is that the symbolic link is indeed a separate file: it has its own inode number, and the file type is shown as 'ell'.

[pause] A symbolic link, then, is a special type of file that simply says "hey, don't look at me - look at that guy!" for most operations. The 'ls' command also shows us the target of the symbolic link in the output. So when we read the file, what we get back [continue] is the contents of the file it points to.

[pause] Ok, so we've seen files of type 'directory', 'regular' file -- which is really only a different way of saying "hard link" -- and symbolic links. But there are other types of files. One of them is [continue] a FIFO. A FIFO is effectively the manifestation of the concept of a pipe in the filesystem, and behaves just like one, too: anything written to the FIFO can be read from it in that order. 'ls' therefore uses the 'p' character to indicate a FIFO, also known as a "named pipe". The behavior of a FIFO can be a bit surprising on first use, but to illustrate, let's start a process that reads from the FIFO in the background. We can then cat the symlink, which redirects to the regular file, into the FIFO... ...at which time the backgrounded cat process reads the data and prints it to stdout. When we hit return again, the shell tells us that the backgrounded process has completed.
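The same sequence can be sketched in Python, with a background thread standing in for the backgrounded 'cat'. This is only an approximation of the shell session under a few assumptions: the names 'file', 'symlink', and 'fifo' and the function 'fifo_demo' are invented for this example, and it needs a POSIX system where 'os.mkfifo' is available.

```python
import os
import stat
import tempfile
import threading


def fifo_demo(message):
    """Send message through a FIFO, reading the source via a symlink.

    Returns the text the background reader received and the type
    character 'ls -l' would show for each directory entry."""
    with tempfile.TemporaryDirectory() as d:
        regular = os.path.join(d, "file")      # hypothetical names
        symlink = os.path.join(d, "symlink")
        fifo = os.path.join(d, "fifo")

        with open(regular, "w") as f:
            f.write(message)
        os.symlink("file", symlink)   # like: ln -s file symlink
        os.mkfifo(fifo)               # like: mkfifo fifo

        # lstat() looks at the link itself instead of following it.
        types = {
            name: stat.filemode(os.lstat(os.path.join(d, name)).st_mode)[0]
            for name in ("file", "symlink", "fifo")
        }

        received = []

        def reader():
            # Opening a FIFO for reading blocks until a writer opens it;
            # read() returns once the writer has closed its end.
            with open(fifo) as f:
                received.append(f.read())

        t = threading.Thread(target=reader, daemon=True)
        t.start()                     # like: cat fifo &

        # Opening the symlink follows it to the regular file.
        with open(symlink) as src, open(fifo, "w") as dst:
            dst.write(src.read())     # like: cat symlink > fifo

        t.join()
        return received[0], types


if __name__ == "__main__":
    text, types = fifo_demo("hello from the regular file\n")
    print(types)
    print(text, end="")
```

Note that neither side gets past its 'open' call until the other side has opened the FIFO as well, which is the part that tends to surprise people on first use.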
[pause] We also note that apparently not all messages generated by the shell correctly support UTF-8 characters, but that also serves as a good reminder that while filenames can contain characters other than just ASCII -- as we are showing here -- that may not always be a good idea. Anyway, let's take a look at what other file types we can find. [continue] Let's create a few hard links to existing pathnames in the current directory... ...and then ask the 'file' command to report on all of them. There we go. Now we see:
- a FIFO
- a character special device -- such as a TTY, for example
- a block special device -- such as a hard disk
- a directory
- regular files containing UTF-8 Unicode
- a symbolic link
- and a socket, an interprocess-communications rendezvous point in the filesystem that provides access to the sockets API for communications between processes on the same host

Here, 'ls -li' shows us more information, where again the first letter of the permissions string gives us the type of the file. So those are the different types of files you will likely encounter in a normal Unix system. There _are_ a few other types of files, but some of them are specific to the operating system in question. Either way, let's now take a look at what filesystems we find mounted by default on the different Unix versions we consider:

---

First, FreeBSD. Here, we see that the root filesystem is mounted from /dev/gpt/rootfs, suggesting the use of a GUID Partition Table. The 'mount' command shows what filesystems are currently mounted, and its output doesn't differ from that of 'df' in this case. Ok, so far, so good - simple enough.

---

Now let's look at OmniOS. We've done this before, but just to recap: when we run 'mount', we see a rather different view, with several other... "things" showing up as being mounted. That is, we have our root filesystem, which we know to be on a ZFS pool, as well as several special purpose pseudo-filesystems.
Let's look at the 'mnttab' manual page: This is one of those weird things where we project information from the running system into the filesystem, making it appear as a regular file. So looking at /etc/mnttab, we find no surprises: the same filesystems mounted as shown before. How does the OS know which things to mount? Let's look at the 'vfstab' manual page. Now this is an actual file -- a configuration file describing the defaults for mounting filesystems at boot time. Let's take a look at that file. There, this describes which things to mount, as well as what type of pseudo-filesystem to use for the given special mount point.

Alright, now let's compare --- to Linux, and Fedora Linux in particular. Again, 'df' gives us some information here, showing the use of some special filesystems like tmpfs as well as the disk-backed root filesystem. Now if we run 'mount', then... whoa, that's a lot of things showing up here that we didn't see when we ran 'df'. Look at all that. Fedora nowadays uses systemd to bootstrap and manage the running system, so we get a whole bunch of extra weirdnesses stashed in here: We see tmpfs, devfs, control groups, etc. etc., all with their respective mount flags shown here on the right.

Let's see what 'man fstab' says on this system: The /etc/fstab file contains information about which filesystems can be mounted. When we look at the file, we note that it contains only a single entry, for the root filesystem, of type ext4 and mapped to the device with the given UUID. This is quite different from what we've seen currently being mounted, so where is that information tracked? /etc/mtab points to /proc/self/mounts, using the procfs pseudo-filesystem to reflect information about the running system into the filesystem hierarchy, as we mentioned before. /proc/self/mounts looks like so. Note that it _looks_ like a regular file, but has a size of zero bytes, yet when we 'cat' it... ...it shows us all the things just like the 'mount' command did.
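To make the format of that table a bit more concrete, here is a small sketch that parses it in Python. The 'Mount' tuple, the function names, and the 'SAMPLE' text are made up for this example (the sample lines are illustrative, not copied from the instance shown in the video), and the parser assumes the usual whitespace-separated layout of device, mount point, filesystem type, and comma-separated options, followed by two numeric fields that it ignores.

```python
from collections import namedtuple

# Hypothetical record type for one line of the mount table.
Mount = namedtuple("Mount", "device mount_point fstype options")

# Illustrative sample in the layout of /proc/self/mounts; made-up data.
SAMPLE = """\
/dev/xvda1 / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
"""


def parse_mounts(text):
    """Parse mount table text into a list of Mount records.

    Lines with fewer than four fields are skipped. Escapes such as
    \\040 (a space in a mount point) are left as they are."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mount_point, fstype, options = fields[:4]
        mounts.append(Mount(device, mount_point, fstype, options.split(",")))
    return mounts


def read_mounts(path="/proc/self/mounts"):
    """Read the live table; only meaningful where procfs provides it."""
    with open(path) as f:
        return parse_mounts(f.read())


if __name__ == "__main__":
    for m in parse_mounts(SAMPLE):
        print(m.fstype, m.mount_point, ",".join(m.options))
```

On a Linux system, 'read_mounts()' should return data even though the file reports a size of zero, since procfs generates the contents at the time we read it.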
So again, we notice a discrepancy between what filesystems are available when running for example 'df' to report on free space, and what other pseudo-filesystems are active. For many of the pseudo-filesystems it simply doesn't make much sense to report file usage, since by definition they only abstract system properties into a file API. Now all this is a bit strange: we've talked at length about disks and partitions, but most of the things we've seen here are not really filesystems, nor are they backed by disks. This is to illustrate how the filesystem API and concepts have been so successful that we've increasingly used them for other purposes. It's also worth noting that the default layout of the AWS instances we spin up does not necessarily represent that of actual production systems serving specific purposes. So let's instead show a different example:

---

What you see here is the 'df' output on the server on which the course website is hosted. This is a NetBSD virtual private server, but as you can tell, here we use a more "normal" layout: We have a root disk that provides the operating system and is, no surprise, mounted under slash. But we also have additional disks for different purposes. We see
- one disk dedicated to the /home directories of all the users on this system, and
- another disk mounted for miscellaneous data files.

The OS also has a ptyfs, a kernfs, and a tmpfs mount, but nothing quite as crazy as the Fedora systemd instance.

---

Ok, so let's visualize how we mount our disks. Even though nowadays we frequently have one monolithic partition containing absolutely everything, we still may see different disks mounted in different places.
- And this really is the only stipulation we have when mounting partitions: a partition can be mounted _anywhere_, with the exception of the root partition, which must be mounted under slash.
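Since a partition can be mounted anywhere, it is not always obvious which mounted filesystem a given path lives on. As a rough sketch, and not something shown in the video, the following Python walks up from a path until it reaches a mount point. The function names are made up for this example; it relies on 'os.path.ismount', which on POSIX systems compares device numbers of a directory and its parent and so may not recognize every kind of mount.

```python
import os


def mount_point_of(path):
    """Walk up from path until we reach a directory that is a mount
    point. The walk always ends, at the latest at the root directory."""
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


def same_filesystem(a, b):
    """Two paths live on the same mounted filesystem if stat() reports
    the same device number for both."""
    return os.stat(a).st_dev == os.stat(b).st_dev


if __name__ == "__main__":
    here = os.getcwd()
    print(here, "is on the filesystem mounted at", mount_point_of(here))
    print("same filesystem as /:", same_filesystem(here, "/"))
```

On a system with one monolithic partition, this will report slash for just about every path on disk; on a layout with a separate /home disk like the one shown above, paths under /home would report /home instead.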
Now way back in the olden days, when disk space was still expensive and scarce, it was rather common to have one disk that contained the bare minimum of files needed to boot the system, and that partition would be the root partition, while another disk might contain all the additional files that would show up under 'usr', for example, with end users' private files living on yet another partition. This, by the way, is one of the reasons why we have a '/usr' directory instead of just providing all files under slash. We'll see in a second that there is actually some logic to this. Now as we also just saw, there are other things we can mount - but those pseudo-filesystems don't get represented here in this graphic, since we want to draw a distinction between what data lives where. And for that, we probably want to also think about some sort of standardization across different operating systems and consider where we want to install software -- a topic we'll go into in much more detail in next week's videos.

---

And what do we do when we want to look up some information, such as about which files go where, or what the filesystem hierarchy should look like? Nope, we don't type "which files go where" into Google and then follow the first Stack Overflow link to some random person opining that the kernel should go under /all-my-kernels. Instead, we pull up the manual page. 'man hier', provided by the OS, provides a description of the filesystem hierarchy.
It notes, for example, that under '/bin' we keep the utilities used in both single- and multi-user environments; that system configuration files go under /etc; tells us where users' data goes; where libraries go -- and notes that we don't want to rely on /usr being available, since it might reside on a separate disk not yet available at boot time; tells us about the location for system utilities, thereby answering the question as to why we have both a "bin" and an "sbin"; describes the 'usr' hierarchy itself, which somewhat mirrors the root hierarchy; and tells us where various other files go, such as log files and pid files, for example. So you see, all this has some logic and reason behind it, and having this hierarchy defined and then adhered to by software providers and system administrators makes it easier for you to both use and maintain the system: by careful separation of these files, you can use different partitions with different mount flags and security settings, for example, or you may be better able to budget your disk space, or keep runaway processes that eat up disk space or inodes on one partition from interfering with operations of the system as a whole.

---

Ok, this brings us to the end of our discussion of filesystems and mount points. Here are a few things for you to think about and to experiment with:
- Take a look at the different mounted filesystems in the different operating systems as we've shown in this video. Make sure you understand the differences and identify what each is used for.
- Also take a look at the mount options -- that is, the _way_ in which the filesystem is mounted. Many filesystems allow for different behavior based on the mount flags: you may, for example, be able to mount a disk read-only, or with certain properties enabled or disabled to boost performance. Make sure you understand what all the different flags mean and why they might be employed.
- We also saw that sometimes there's a discrepancy between what /etc/fstab says should be mounted and what _is_ mounted. Try to explain under what circumstances such discrepancies arise and whether or not that's a problem.

---

Next, think a bit about the filesystem.
- Remember at the beginning of this topic we ran some experiments, trying to fill up disk space and using up inodes? Re-run those exercises and see if things make more sense now, if you've gained a better understanding.

Then, think about specific limitations in the filesystem. For example:
- How large can a single file be? What limits its size?
- We saw that we can use for example UTF-8 characters in file names -- what other characters can you use? What characters _can't_ you use in a filename? Why not?
- How long can a filename be? Try it out: create a filename with 100, 200, or 2000 characters. What, if anything, is the limit? If there is a limit -- why is that?
- Now take that thinking beyond the filename: if each component in a pathname is a filename separated by a '/', and if there is a limit on the filename length, what's the limit on the pathname?
- If a file name is nothing but a hard link, and if you can have multiple hard links for the same file, can you have... 100? 2000? What, if any, is the limit?
- And finally, try to create a hard link across disks or mount points. You should find that that fails - but why is that?

If you're able to answer all these questions, then you'll be in good shape, and I think you'll have gained a better understanding of the topics we've covered so far. But we're not quite done yet with filesystem hierarchies, I'm afraid. As we've seen when looking at the 'hier' manual page, there's a reason why we put things into one place or another, but... just how do the files get there?
- Well, we'll cover that, directly and indirectly by way of software installation concepts, in the coming videos.

Until the next time, and thanks for watching - cheers!