Hello, and welcome back to CS615 System Administration! This is week 5, segment 6, and after we discovered in our last video that the internet has a physical component with real world impact, we'll now look at exactly how the internet is composed of independent networks. We've already seen that we connect to the internet across the globe by using, for example, submarine communications cables, which physically connect one continent to another. And we intuitively understand that there are different legal entities controlling the endpoints of these cables and thus the connectivity that is made possible by them. And so this is one direct implication of the nature of the internet, of it being not a single network, but a network of networks. And this network of networks has really taken off since its humble beginnings back in the 1960s and 70s, hasn't it? As early as --- 1984, John Gage from Sun Microsystems coined the phrase "The network is the computer", meaning that all the computers are merely connected interfaces to the larger network. You see, the idea of putting all your compute resources onto the network is not all that new. But, admittedly, --- this led to some confusion, so it's useful to clear things up: the network is indeed the network and the computer remains the computer. But now about that network... it's not really one network, and --- even though we all think of computing resources that are reachable via the network as being in the magical "cloud", system administrators know all too well that there is no cloud, only other people's computers. After all, at some point, somebody, somewhere has to maintain these resources. But ok, let's focus more on the network. Or rather, the network_s_ that make up this network. --- Networks have a tendency to grow. If we connect two endpoints, it's trivial; --- if we connect multiple devices, we need a network layer. 
As we've already discussed, if the devices are within the same broadcast domain, such as being connected via a switch, then we have a layer 2 network, and we can --- connect multiple such layer 2 networks with one another using a layer 3 device: a router. In that context, we've also already discussed the need to subnet our IP blocks, as you will recall. But of course we will eventually --- want to connect multiple such networks, and when we do that, we begin to talk about _autonomous systems_ -- networks that are grouped together. The various networks within such an autonomous system may be made up of independent networks using separate IP space; IP space that has been allocated to the controlling entity down from IANA to the RIR and the LIR, as we discussed in an earlier video. And so, as we connect these autonomous systems, --- we're building the internet. What you see here is a visualization of the routing paths of the internet circa 1997, color-coded by which RIR assigned the space and grouped via connection paths: the central stars in this graph represent the systems that are connected to the most other networks. This network looks --- a bit different now in 2021, with obviously more networks and more connections between the networks in more geographical regions, but still representing the routing paths as before. And these routing paths are based primarily on knowledge about exactly these autonomous systems, since the Border Gateway Protocol (or BGP) shares its routes, i.e., the reachability of the different networks, in terms of the AS numbers through which they can be reached. But... where do we get our AS numbers from? --- Who would be in a better position to assign AS numbers than the same entity that also allocates IP space? So yes, we once again go back to IANA, which allocates AS number blocks to the RIRs, which then allocate the specific AS numbers, similar to how IP blocks are managed. --- Ok, so how do we find out what our AS number is? 
One way to look at the information about a given network block allocation is via the "whois" command, which queries the various Network Information Centers via the WHOIS protocol. As you can see here, the query usually starts at IANA and then may query the relevant RIR for the information, and in this way the protocol is similar to the DNS protocol, which we'll discuss in some detail in a future video. --- So if we want to find out about our Stevens netblock, then we pass in our web server's IP address, and we then see that IANA tells us that it had allocated the 155/8 block to ARIN, which then tells us that it allocated the 155.246/16 block to Stevens back in 1991. ARIN then also provides us with a URL that provides additional information about this netblock, so let's see what they show us there. Hey, look at that, it's JSON! Neat! Looks like there's a fair bit of information about the netblock and its owner here. Let's see if we can extract the AS number from this. There we go: Stevens' AS number is 16889. Now if we look at the information from ARIN in more detail, we find that Stevens has been allocated multiple IP blocks, which we can then extract here, too. Did I mention that I like JSON and jq(1)? Really useful! Anyway, so we see that Stevens has multiple netblocks, not just the 155.246/16 netblock we see so often. In fact, it looks like Stevens also has an IPv6 netblock allocated; it's just unfortunate that it doesn't seem to be used much. --- But alright, so now we can start to visualize how our network connects to others: Let's say that the bottom network here represents Stevens, i.e., AS16889 with its netblocks used in whatever way Stevens deems fit. But now what is the AS number of the network that Stevens connects to? --- For that, let's just find out what the path is that our packets take when we talk to some web server on the internet. 
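To make the whois walk-through a bit more concrete, here is a minimal sketch in Python of the two ideas involved: following the referral from IANA to the RIR, and checking which allocated blocks cover an address. It does not talk to any server. The sample reply is hand-written and abbreviated, the "refer:" line format is an assumption about what the IANA server returns, and the address 155.246.1.1 is a made-up host inside the Stevens block, not the actual web server.

```python
import ipaddress

# Hand-written, abbreviated stand-in for an IANA whois reply (assumed format).
SAMPLE_IANA_REPLY = """\
% hypothetical, shortened reply for illustration only
refer:        whois.arin.net
inetnum:      155.0.0.0 - 155.255.255.255
"""


def find_referral(whois_text):
    """Return the server named on a 'refer:' line, or None if there is none."""
    for line in whois_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "refer" and value.strip():
            return value.strip()
    return None


def covering_blocks(address, blocks):
    """Return the blocks that contain the address, widest block first."""
    ip = ipaddress.ip_address(address)
    nets = [ipaddress.ip_network(block) for block in blocks]
    hits = [net for net in nets if ip in net]
    return [str(net) for net in sorted(hits, key=lambda net: net.prefixlen)]


if __name__ == "__main__":
    print(find_referral(SAMPLE_IANA_REPLY))
    # 192.0.2.0/24 is a documentation block, included only as a non-match.
    print(covering_blocks("155.246.1.1",
                          ["155.246.0.0/16", "155.0.0.0/8", "192.0.2.0/24"]))
```

The ordering by prefix length mirrors the delegation chain described above: the /8 that IANA handed to ARIN comes first, then the /16 that ARIN allocated to Stevens.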
We'll be looking in much more detail into how the traceroute(1) utility works in one of our next videos, but I trust you're familiar with it. When we trace our packets to, say, the Yahoo website, then we get back output that shows us the different hops along the way. Here we see that we're jumping from where we started to another address in 155.246/16, then to an RFC 1918 address -- that is, a private network address -- then some other 155.246/16 addresses before we then leave the Stevens network and eventually find our way to the Yahoo networks. So now let's look at that first non-Stevens address here, 130.156.251.105... Ok, so this address is in a netblock that's assigned to the New Jersey Higher Education Network, and to get its AS number, we'll use a different whois server here that's set up to provide just that information. There, that's pretty convenient, huh? Ok, so now we've identified the AS number of the next network -- AS21976 in this case, of NJEDGE. And we could do this manually for each hop along the way, but fortunately the traceroute(1) utility has an option to do this for us: There we go! So we have Stevens AS16889, then NJEDGE AS21976, then AS4637 and AS10310, which appears to be Yahoo's AS number, but there's also AS26101, showing that of course an organization can have multiple AS numbers. --- So with that information, our image of how the packets go through the different networks can now be filled in: --- Stevens AS16889 is connected to NJEDGE AS21976, which connects to AS4637, and from there to Yahoo's AS10310 and AS26101. So we see that the connections between the networks are made in specific locations, it seems. This process of connecting the different networks is called "peering", and it takes place in the form of an actual, physical connection between the systems of these entities, allowing them to exchange routing information via BGP. 
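As a rough sketch of what we just did by eye, the following Python snippet finds the first hop that is neither private nor inside the Stevens block, and collapses a per-hop list of AS numbers into an AS-level path. Only 130.156.251.105, the 155.246/16 block, and the AS numbers come from the trace above; the other hop addresses and the exact per-hop sequence are made up for illustration, and the function names are hypothetical.

```python
import ipaddress
from itertools import groupby

STEVENS = ipaddress.ip_network("155.246.0.0/16")


def first_external_hop(hops, home=STEVENS):
    """Return the first hop that is neither private nor in the home block."""
    for hop in hops:
        ip = ipaddress.ip_address(hop)
        if ip.is_private or ip in home:
            continue
        return hop
    return None


def as_path(hop_asns):
    """Collapse per-hop AS numbers into the sequence of networks traversed.

    Hops without a known AS number (e.g. private addresses) are given as None
    and skipped; consecutive hops in the same AS are merged into one entry.
    """
    known = [asn for asn in hop_asns if asn is not None]
    return [asn for asn, _ in groupby(known)]


if __name__ == "__main__":
    # Illustrative hop list: all but the last address are invented.
    hops = ["155.246.89.1", "10.1.1.1", "155.246.2.1", "130.156.251.105"]
    print(first_external_hop(hops))
    print(as_path([16889, None, 16889, 21976, 4637, 10310, 10310, 26101]))
```

Merging consecutive hops is what turns a list of routers into a list of networks: several hops inside AS10310 still count as a single step on the AS-level path.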
These connections are made at so-called "peering points" or "internet exchange points" or IXPs, in strategically located data centers around the globe, and --- one of the great things about this, like so much on the internet, is that it's primarily public information. That is, we can look up who is peering with whom in what location, via, for example, the "peeringdb" website. So if we, for example, take the NJEDGE AS number 21976 and plug it in here... then we see a fair bit of information about this organization, including which IXPs it is peering at. One of them is the New York International Internet eXchange point, or NYIIX, which we also see reflected over here in the DNS name of the next hop in AS4637, which belongs to Telstra, which then connects to AS10310, which is one of Yahoo's AS numbers, and the entry here shows all the IXPs Yahoo peers at, which turns out to be quite a few, all over the globe. Now the packets hopped around AS10310 for a bit, before going into AS26101, but we don't find that AS in the PeeringDB. And neither do we find Stevens's AS16889 in the PeeringDB, because neither of those networks peers with others at a public exchange. That is, not every network directly peers with others at a public IXP; in addition to the public peering locations, there are also of course private connections that are made, such as between Stevens and NJEDGE, which provides the network connectivity to the larger internet here. The larger networks, like Telstra's AS4637, then connect to many more networks, and thus become a central point in the network visualization we saw near the beginning of this video. Now as you can imagine, these peering points, the internet exchanges where these connections are made between the large carriers, need to have some big pipes --- and as that is a large selling point, you can often find some interesting stats on their websites. 
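To picture the "network of networks" as data, here is a small sketch in Python that treats each AS as a node and each connection we observed as a link, then searches for the fewest-networks path between two of them. The link list contains only the connections seen in our trace; the function names are hypothetical. Note the simplifying assumption: real BGP route selection depends on each operator's policy, not just on the shortest path, so this is an illustration of the graph view and not of how BGP chooses routes.

```python
from collections import deque

# AS-level links observed in the trace (not a complete picture of any network).
LINKS = [(16889, 21976), (21976, 4637), (4637, 10310), (10310, 26101)]


def build_graph(links):
    """Build an undirected adjacency map from (AS, AS) pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_as_path(graph, src, dst):
    """Breadth-first search for a path crossing the fewest networks."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def degrees(graph):
    """Number of directly connected networks per AS."""
    return {asn: len(neighbors) for asn, neighbors in graph.items()}


if __name__ == "__main__":
    graph = build_graph(LINKS)
    print(shortest_as_path(graph, 16889, 26101))
    print(degrees(graph))
```

With only one trace as input, every transit network has just two neighbors; fed with many more observed links, the degree count is what would make the large carriers stand out as the central stars of the visualization.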
The Amsterdam Internet Exchange is one of the largest IXPs in the world, and it shows a daily throughput of 9.4 Terabit/second. The daily stats nicely show how traffic drops down a bit at night, ramps up during the day, peaks after dinner and then drops off again. Monthly and yearly stats then show an increase in overall throughput over time. The German DE-CIX, another one of the world's largest IXPs, also has some stats with similar patterns, as does the Moscow exchange, although here the throughput is a bit lower than in Amsterdam or Frankfurt, with even lower rates for IPv6, it appears. The NYIIX stats look like this, the Toronto Exchange stats like this, and the Netnod exchange in Sweden like this. As I'm sure you've noticed, the graphs all look similar, not only in the patterns they show, but also eerily identical in style. This is no surprise, as they are generated using one of the standard, open source network traffic graphing tools around: MRTG. Even JPNAP seems to be using MRTG, and if you get involved with any network architecture and administration, you are quite likely to become very familiar with graphs that all look like this. --- Alright, I think we can break here and conclude our journey through this network of networks. To ensure you internalize the lessons we illustrated here, I recommend that you - try to identify AS numbers of other organizations and companies and find out where they peer, using the tools we showed in this video. - Try to find out what happens when two organizations develop a dispute and one wants to de-peer, to remove the connection with the other. There have been several cases where two competing companies used de-peering, or the threat of it, as a means to harm their competitors. - On Wikipedia, you can find a list of the largest internet exchange points and you can browse their websites to find some stats like we showed a second ago. 
But there are also some locations that are less open -- the Network Access Point of the Americas, for example, is housed by Equinix, a huge colocation provider. Try to find out more about these exchanges. Finally, if all of this interests you enough that you want to be part of the community of network operators that literally and physically build the internet, you can join the North American Network Operators' Group or NANOG. The public mailing list is a really interesting place to hang out and learn about the structure of the internet and the various aspects most people higher up the stack never think about. Ok, and with that, we're concluding our first week of networking, but as threatened - I mean, promised - we're not done yet with this topic. In our next couple of videos, we'll again go down to the packet level to trace traffic of different applications and protocols, but we also will need to take a look at just how our host knows how and where to send packets. You'll be getting intimately familiar with tcpdump(1) and the various strace(1) tools. Hope you're looking forward to it! Until then - thanks for watching. Cheers!