Hello, and welcome back to CS631 "Advanced Programming in the UNIX Environment". With this video segment, we are starting the material for week 4, diving deeper into filesystems and directories as covered in chapters 4 and 6 of W. Richard Stevens' APUE book. As we've seen in our previous discussions, certain filesystem properties and concepts are encountered on a regular basis, and to better understand them, it will be useful to visualize the file system and some of the corresponding data structures. Let's take a look...

---

In the beginning, there was a disk... Well, any storage medium, really, but let's consider a hard disk. A hard disk can be divided into smaller partitions. While there are different types of partitions -- BIOS partitions, say, or operating-system specific partitions -- for our purposes the distinction doesn't matter much. On our reference VM running NetBSD, we can inspect the OS-specific partition table via the disklabel(8) command.

---

Here, we see a description of the physical disk. Well, it's a virtual disk, but the OS doesn't know that; it thinks there's an actual physical device there, so it tries its best to describe what that looks like:
- One of the properties of the physical disk that we have seen before is the physical block size, shown here to be an unsurprising 512 bytes.
- Further down, we see how the disk is divided into partitions. There's one partition describing the entire disk, starting at offset 0 and ending at the very last sector.
- Then there's a partition describing the portion of the disk set aside for use by the NetBSD operating system, starting at offset 64 -- leaving the first sector available to the BIOS partition table.
- The root partition, on which we create our entire filesystem, is partition 'a',
- and we carved out a small, second partition for use as swap.

---

So once we have created our partition table, we can then create a file system on each partition.
Or not -- as in the case of our swap partition, where we use the raw disk as a place to stash memory when needed. The filesystem organizes the partition into cylinder groups and records how to address the structures in each in a so-called "superblock".

---

Now each cylinder group in turn contains the actual data blocks -- the parts where we store the actual bytes that make up our files -- as well as a list of inodes and blocks set aside for the meta data associated with the inodes, the inode blocks. Since the entire structure of the filesystem is written in the superblock, it'd be pretty disastrous if you lost this one block, so the file system replicates the superblock across the cylinder groups, thereby allowing recovery of the filesystem from a corrupted superblock.

---

Finally, the actual data that we discuss when we talk about files and directories are stored in different groups of data blocks: the inode data blocks and the file data blocks. Here, we see how the meta data about a file -- all the information that we are now so familiar with from our discussion of the 'struct stat' -- is stored separately from the actual bytes of the file.

---

Here's another view of this: consider a regular file identified by inode number 123 pointing to a number of data blocks somewhere on the disk. Let's recall from our previous lecture that the data stored in the inode does not include the filename. Filenames are stored as directory entries only, remember?

---

Such a directory entry mapping a filename to an inode is known as a "hard link", and we can visualize it as shown here. Note that a "hard link" is not _a different_ name for an existing file: the name of the file is itself the hard link.

---

And so it's entirely possible to have multiple such links to a given inode. Such links may exist in the same directory -- where they would then have to have a different filename -- or in another directory elsewhere on this filesystem, as shown here.
In this case, the name of the link in that second directory could be the same as the name of the link in the first directory; the pathname to each would be different, however.

---

Ok, so let's take a look at directories. Here we see two directories, one at inode 1267, the second one at inode 2549. The Unix Filesystem may store the contents of a directory on reserved directory data blocks, for efficiency reasons possibly kept separate from the data blocks used for regular files, but for our purposes that is an implementation detail we need not care about.

---

Now if you create an empty directory, and you look at the output of 'ls -a', you will note that it contains two entries already: dot and dot dot. Dot always refers to the current directory. That is, inside of every directory there is a mapping referencing the directory itself, named "dot". At the same time, there is another entry, called "dot dot", which refers to the parent directory. These two entries are present in every directory, and allow you to maneuver the file system hierarchy via relative paths.

---

Now any directory must also have another name, what we would usually call its "real" name. That name is the mapping found in a directory's parent directory, since every directory has to exist somewhere. This illustration here may help make this a little less confusing:
- On the right-hand side, we see a directory at inode number 1267. We don't know what this directory is called. The entry "dot" in this directory points to inode number 1267, the directory data blocks of which point to the disk where we find the entries.
- The entry "dot dot" here has an inode number not shown in this image, which would be this directory's parent directory. Now this directory at inode number 1267 has a third entry in addition to "dot" and "dot dot": a file of type directory named "testdir", pointing at inode number 2549.
- The directory blocks for this directory are shown here.
- This directory -- known as "testdir" in the directory with inode number 1267 -- has two entries: dot -- inode number 2549 -- and dot dot -- inode number 1267... which... points to the parent directory at inode number 1267, which contains the "testdir" link.

---

Seeing how these hard links are directory entries mapping filenames to inode numbers, and looking back at our illustration of the filesystem on the given partition, we understand that such hard links can only exist within the same file system. If this doesn't seem obvious, consider that you might have more than one partition in use; in that case, you'd have a second filesystem on the second partition. This second filesystem would also have an inode map, and logically, you might have, for example, an inode with number 1234 on the first filesystem, just like you have an inode with number 1234 on the second filesystem. As noted in our previous discussions, this is why the st_dev field is required, in combination with the st_ino field, to uniquely identify a file.
- Ok, we already know that the inode contains all the information from the struct stat.
- And with the illustration at hand, we can also understand what the meaning of the st_nlink field is: the number of hard links that exist for this particular file. This count can be used to determine when the data blocks associated with a file can be freed: only if the number of directory entries pointing to this inode is 0 -- and no process still has an open file handle for this inode -- can you mark the blocks as available. Note also that based on this illustration you can see that the st_nlink count for any directory must be at least two, since every directory has two names -- dot, and whatever the name is that exists in its parent directory.

---

The next thing that these illustrations help us understand is that moving a file within the same file system is a really fast operation, since none of the file's data blocks need to be accessed.
In fact, "moving" a file within the same file system really doesn't "move" anything: instead, you simply create a new link -- temporarily yielding two simultaneous names for the same file -- and then remove the old directory entry. And that's it! Note how at no time did we go out to the disk blocks and copy those around! Of course this only works when moving a file within the same filesystem -- if you were to move a file from one filesystem or partition to another, you'd have to actually first _copy_ the file to the other partition, then remove the old entry. You can verify this for yourself by creating a large file, renaming it on a single filesystem, and then comparing how long that takes with how long it takes to move the file to another file system.

---

Alright, let's take a break here. Having the Unix Filesystem visualized as we did should help us understand the concept of "hard links" and directories and some of the operations involved. We'll show more practical examples in our next video segment, but perhaps start playing around in your terminal with the 'mkdir', 'touch', and, of course, 'ls' commands to inspect the inode numbers and link counts of various files and directories you create. Think about edge cases, too -- for example, if every directory must have a parent directory, what's the parent directory of the root of the filesystem, "/"? What are the inodes of "dot" and "dot dot" there? Well, more on that next time, when we also look at the system calls used to create, remove, and rename directories and links, including symbolic links. Until then -- thanks for watching! Cheers!
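
---

If you'd rather script the terminal exercise suggested above than type it by hand, here is a minimal sketch in Python, whose os module is a thin wrapper around the same system calls (stat(2), link(2), unlink(2), mkdir(2)). The function name 'explore_links' and the file names used are made up for illustration; nothing here comes from the course materials. It creates a file, gives it a second hard link, removes the first name to complete a "move", and then checks what "dot" and "dot dot" refer to in a fresh directory.

```python
import os
import tempfile


def explore_links(base):
    """Create links under 'base' and report what stat(2) says about them.

    Hypothetical helper for illustration only.
    """
    report = {}
    first = os.path.join(base, "file")
    second = os.path.join(base, "file2")
    subdir = os.path.join(base, "testdir")

    # A new regular file has exactly one directory entry pointing at it.
    with open(first, "w"):
        pass
    report["nlink_one_name"] = os.stat(first).st_nlink

    # Add a second directory entry (hard link) for the same inode.
    os.link(first, second)
    a, b = os.stat(first), os.stat(second)
    report["same_file"] = (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
    report["nlink_two_names"] = a.st_nlink

    # Complete the "move": drop the old name; the inode stays where it is.
    os.unlink(first)
    after = os.stat(second)
    report["inode_unchanged"] = after.st_ino == a.st_ino
    report["nlink_after_unlink"] = after.st_nlink

    # In a fresh directory, "." is the directory itself, ".." its parent.
    os.mkdir(subdir)
    d = os.stat(subdir)
    dot = os.stat(os.path.join(subdir, "."))
    dotdot = os.stat(os.path.join(subdir, ".."))
    parent = os.stat(base)
    report["dot_is_self"] = (dot.st_dev, dot.st_ino) == (d.st_dev, d.st_ino)
    report["dotdot_is_parent"] = (
        (dotdot.st_dev, dotdot.st_ino) == (parent.st_dev, parent.st_ino)
    )
    # On the traditional Unix filesystem this is at least 2; some other
    # filesystems report directory link counts differently.
    report["dir_nlink"] = d.st_nlink
    return report


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in explore_links(tmp).items():
            print(f"{key}: {value}")
```

Comparing the (st_dev, st_ino) pair rather than st_ino alone follows the point made earlier: inode numbers are only unique within a single filesystem. The directory link count is printed rather than checked, since what you see there depends on the filesystem you run this on.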