Hello, and welcome back to CS615 System Administration! This is Week 2, Segment 4, and in this video, we'll take a quick look at what a hard disk drive actually looks like. Besides giving us an appreciation for the impressive engineering involved and tying something abstract such as data storage to mechanical devices, this is also useful for us to better understand partitions as well as filesystems more generally. So let's get started. --- As we mentioned earlier, the simplest storage model and the most common storage unit is a simple hard disk drive or HDD in a Direct Attached Storage or DAS model. And even though Solid State Drives offer many advantages and are increasingly popular these days, a traditional IDE drive such as the one shown here is what we'll use for illustration purposes, since many of the considerations that interest us are in fact derived from the physical properties of exactly such drives. So let's open up one of these, and... --- ...oh, right, chances are you actually ruined the hard drive, because honestly, they are just impossible to open up, often requiring some sort of "security screw", which really just means you have to make an extra trip to the hardware store, but that's ok, SysAdmins like going to the hardware store, and then you come back and try again on another drive and... --- ...there, that's much better. So this is what a boring old IDE hard drive looks like on the inside. We have a - magnetic platter on a spindle, zipping around at a speed of over 7000 rotations per minute, with high-performance drives reaching 15,000 RPM. And often you have more than just one of these, that is, multiple platters stacked on the same spindle. Our ones and zeros are stored on the platters as changes in the magnetic polarization of individual sections on the platter, detected or changed by the - read-write head which sits at the end of the - actuator arm, which allows it to move radially across the disk.
This disk is then divided into - individual tracks, each at a specific distance from the center. Now each platter can usually hold data on both sides, and if you have multiple platters, then congruent tracks across multiple platters form a - cylinder, with multiple concentric cylinders forming a _cylinder group_. Each track is then further subdivided into individual - sectors, the smallest addressable unit of the hard disk. These sectors generally are able to store 512 bytes and form the standard _disk block_. That is, this 512-byte _block size_ is an actual hardware restriction, and even though nowadays there exist hard drives with a physical blocksize of, for example, 4096 bytes, a lot of computer hardware and software still assumes 512-byte blocks, which means that those drives often have to emulate a smaller block size than they physically have. We'll see the 512-byte physical blocksize appear again in future videos when we talk in more detail about filesystems, but for now make note that due to physical limitations the hard drive cannot read or write anything smaller than one 512-byte block at a time. --- Here's another view of a hard drive, where you can better see the multiple platters, which then logically require multiple read-write heads and actuator arms. Since each platter can be read or written on both sides, the read-write heads actually look --- like this. Here, you can clearly see the - dual read-write head. You should also note that the actuator arm is the same for all read-write heads, meaning they always move as one. We'll use that property later when we talk about partitions in our next video. --- Now one thing you may have observed when looking at the different sectors is that necessarily the sectors - on the outer tracks must be physically _larger_ than the sectors - on the inner tracks. But if each block is defined to contain 512 bytes, then we're really wasting space here!
Similarly, if the disk spins with a constant angular velocity, then the blocks on the outer tracks are moving _faster_ than those on the inner tracks. So folks decided to take advantage of these properties and came up with --- Zone Bit Recording, whereby the disk is structured such that _more_ sectors are placed on the outer - zones. With constant angular velocity, we then - increase our data transfer speed on the outer tracks, as more sectors pass under the read-write head per rotation than on the inner tracks; and - overall increase storage capacity, as we now have more sectors, and thus more 512-byte physical blocks. As you can tell, within System Administration, there's no rabbit hole too shallow for you to tumble into and discover an endless list of additional things to follow up on. And as so often, these things lead to interesting physical properties of the "real world" around us, even though we're talking about bits and bytes, ones and zeros. --- And some of these physical things affect our performance. Even setting Zone Bit Recording aside, our physical hard drive is limited in its performance by a few factors. For starters -- although mostly negligible compared to the others -- there's the - transfer rate or "bit rate" with which the data is read from the disk; the resulting transfer time is a direct function of how many consecutive blocks of data you want to read or write. More directly, though, we understand that we'll be limited by - seek time, that is, the time it takes the actuator arm to move the read-write head to a specific track of the disk. As you visualize this movement, take note that in general, if we want to read one block of data, we are quite likely to also want to read the next block of data that is related, such as the next byte in a specific file. So in order to minimize arm movement -- and thus seek time -- we want to store data on contiguous blocks within the same cylinder.
Next, in order to get the data under your read-write head, you might have to spin the disk, since you might just have passed the block you wanted to access. So your next performance limiting factor is - rotational latency, which, on average, is one half the rotational period. Finally, there's - a little bit of overhead once you've read the data from the disk and have to get it to the OS, as well as for the integrated electronics as they initiate the various physical movements etc. But alright, there's still one more critical aspect that limits the usefulness of our hard drive, and that's... --- capacity. So let's think about what capacity means. The capacity of a hard drive really is just how many individual blocks we can store on the disk. Which, as we've seen with Zone Bit Recording, can be gamed a bit, but ultimately, we need a way to _address_ each individual block, right? So let's consider this disk here with - its various tracks and - divided into sectors. I know, I know, these illustrations here are pretty janky, but at least you don't have to deal with me trying to draw things on a whiteboard. You get the idea here, I hope: we have a fixed number of blocks that we need to be able to address. So how do we do this? One way of doing this was the so-called - Cylinder-Heads-Sectors addressing scheme, whereby you'd be limited -- logically, as we understand from our visualization here -- by the number of cylinders (or tracks), how many heads you have, and how many sectors per track you can create. - Now the early ATA specification defined that you could have - 64K (65,536) cylinders - 16 heads -- with each platter being used on both sides, that'd be 8 platters max per drive - and 255 sectors per track. So you do the math, multiply by - 512 bytes per sector, and end up with roughly 137 Gigabytes of maximum storage for an ATA hard drive. Not too bad, but of course the problem was that the BIOS had a different limit, because reasons.
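If you'd like to check the arithmetic so far yourself, here is a minimal Python sketch. The function names `avg_rotational_latency_ms` and `chs_capacity_bytes` are made up for illustration, and the 7,200 RPM figure is an assumed typical spindle speed (the video only says "over 7000"). The ATA limits of 64K cylinders, 16 heads, 255 sectors per track and 512 bytes per sector are the ones quoted above.

```python
SECTOR_SIZE = 512  # bytes per sector, as quoted above


def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one full rotation, in milliseconds."""
    rotation_period_ms = 60_000 / rpm  # one minute is 60,000 ms
    return rotation_period_ms / 2


def chs_capacity_bytes(cylinders, heads, sectors_per_track, sector_size=SECTOR_SIZE):
    """Bytes addressable under a Cylinder-Heads-Sectors scheme."""
    return cylinders * heads * sectors_per_track * sector_size


if __name__ == "__main__":
    # 7,200 RPM is an assumption; 15,000 RPM is the high-end figure from the video.
    for rpm in (7_200, 15_000):
        latency = avg_rotational_latency_ms(rpm)
        print(f"{rpm} RPM: {latency:.2f} ms average rotational latency")

    ata_limit = chs_capacity_bytes(64 * 1024, 16, 255)
    print(f"ATA CHS limit: {ata_limit} bytes, about {ata_limit / 10**9:.1f} GB")
```

Under these assumptions, the latency works out to a bit over 4 milliseconds at 7,200 RPM and 2 milliseconds at 15,000 RPM, and the ATA limit comes to 136,902,082,560 bytes, which is where the "roughly 137 Gigabytes" comes from.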
--- That is, early BIOSes could address - 1024 cylinders - 256 heads - and 63 sectors per track. Which, again at - 512 bytes per sector gave you a maximum disk size of - approximately 8.5 Gigabytes. So that's pretty bad, but guess what, if you look at these two ways of limiting disk size, you actually have to --- use the lowest common denominator, that is, the smaller of the two limits for each value, and thus have only - 1024 cylinders from the BIOS limit - 16 heads from the ATA definition - and 63 sectors per track from the BIOS limit again, which... - gives you - only 528 _Mega_bytes as the maximum disk size that could possibly be addressed! Now back in the day, 528 Megs certainly seemed like more space than anybody could possibly use, but soon enough people wanted to have larger disks. So with time, the different limitations were raised by using different data types to store each of the CHS factors, but the whole approach was eventually replaced by using --- logical block addressing instead, whereby the blocks would be addressed, well, logically, or sequentially, via an index. "Cool beans", you say, "now we don't have any limitations any longer, right?" Well... not so fast. You still have to store the _total_ number of blocks _somewhere_. So in ATA-1, the data type to represent this total number was 28 bits, meaning it could only address 2^28 blocks, or roughly 137 Gigabytes of data. This was changed in 2003 with the - ATA-6 standard, which now uses 48-bit LBA, which yields a somewhat more comfortable 144 Petabytes. So far, we haven't managed to build hard drives with that capacity. But disk capacity is not just a concern for the disk drive itself; we also require the components involved in the boot process to be able to handle the disk! To this end, the Master Boot Record partition table needs to be able to address the disk blocks, but that uses a - 32-bit data type, so can only address 2^32 blocks, or roughly 2.2 Terabytes.
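All of the limits we just walked through fall out of simple multiplication, and a short Python sketch makes that easy to verify. The names `chs_capacity_bytes`, `combined_chs` and `lba_capacity_bytes` are hypothetical helpers, not anything from a real library, and the sketch assumes 512-byte sectors throughout.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed throughout)

# (cylinders, heads, sectors per track) limits as quoted in the discussion
ATA_CHS = (64 * 1024, 16, 255)
BIOS_CHS = (1024, 256, 63)


def chs_capacity_bytes(chs, sector_size=SECTOR_SIZE):
    """Bytes addressable for a (cylinders, heads, sectors) triple."""
    cylinders, heads, sectors = chs
    return cylinders * heads * sectors * sector_size


def combined_chs(a, b):
    """Field-by-field minimum: a disk has to fit within both limits at once."""
    return tuple(min(x, y) for x, y in zip(a, b))


def lba_capacity_bytes(address_bits, sector_size=SECTOR_SIZE):
    """Bytes addressable with a block index of the given bit width."""
    return 2 ** address_bits * sector_size


if __name__ == "__main__":
    both = combined_chs(ATA_CHS, BIOS_CHS)
    print("BIOS CHS limit:", chs_capacity_bytes(BIOS_CHS), "bytes")
    print("Combined CHS limit:", both, chs_capacity_bytes(both), "bytes")
    for label, bits in (("28-bit LBA", 28), ("32-bit MBR", 32), ("48-bit LBA", 48)):
        print(f"{label}: {lba_capacity_bytes(bits)} bytes")
```

The combined triple comes out as 1024 cylinders, 16 heads and 63 sectors per track, for 528,482,304 bytes. The block-index limits are 2^37, 2^41 and 2^57 bytes respectively, since each 512-byte sector adds another nine bits on top of the address width.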
Fortunately other approaches exist: - using the GUID Partition Table, you get a 64-bit data type, which... gives you a nice 9.4 Zettabyte limit. --- Alright, let's take a break here. We'll be picking up from here in our next video, when we talk about partitions, which we already hinted at when talking about cylinder groups or the MBR and GPT. Understanding the physical aspects of the hard drive thus helps us better understand those concepts as well, which is why we covered this topic here in this fashion. Our key take-aways then are: - There's a physical block size, most commonly 512 bytes, which is dictated by the drive. We'll see how that impacts filesystem performance and some other considerations in future video segments. - We saw that some of the limitations on disk drive capacity were imposed by physical factors, such as when using the Cylinder-Heads-Sectors addressing scheme. - But even when we moved to Logical Block Addressing, we're still limited -- this time, however, by the choice of data type. This serves as a useful reminder that no matter what we do, we never have infinite resources. We will see similar problems throughout the semester and in discussions of other storage types or when talking about network protocols etc. - We also noted that other physical factors influence the _performance_ of the hard drives. These will no longer apply when we move to Solid State Drives, but remain an important factor in deciding which storage solutions to purchase for your needs. - Finally, and as we will see in our next video, the physical arrangement of a hard drive, with its magnetic platters spinning on a spindle, influences how we partition our disks. Even if the storage space now may actually be virtual, as in the case of an EBS volume, or abstracted across multiple physical drives as in the case of, say, RAID, we still pretend that there's a physical drive with such a rotating platter underneath.
But more on that in our next video. Until then - thanks for watching! Cheers!