Hello, and welcome back to CS631 "Advanced Programming in the UNIX Environment". We've previously discussed the different UIDs associated with an account; in this video segment, we'll try to answer the awkward question inevitably asked by Unix users: "Mommy, where do UIDs come from?"

---

As we know, the Unix system uses numeric UIDs to make all access decisions. Computers like numbers. But humans tend to like strings better, so we also have usernames. The "password database" maps these usernames to userids and thus provides the local accounts on a system. Traditionally, this user database is found in the file /etc/passwd, and it contains the following fields, which map into a 'struct passwd' data structure used by the functions performing the lookups.

- the username, a simple string
- a hashed password, although we will later see that this data nowadays resides in a separate file
- the numerical UID
- the numerical GID
- a comment, stored as "pw_gecos". This field is also known as the GECOS field, named after the "General Electric Comprehensive Operating System", a remnant of the early Unix days when compatibility with GECOS machines was desirable.
- an initial working directory
- and an initial shell

---

The user database -- in good old Unix tradition -- is a plain text file found under /etc/passwd, with line-delimited entries containing colon-separated fields. This means that none of the fields can contain a colon. Your typical /etc/passwd file will contain entries for the super user -- root -- a number of service accounts used specifically and entirely for the purposes of privilege separation, and a number of human user accounts.

- The first field is the username.
- The second field, we said, is the hashed password, but as we can see here, we primarily find an asterisk in this place. This is because nowadays the hashed password is stored in a separate file; we'll get back to this in our next video segment.
- Then we have the UID
- and the GID, followed by
- the GECOS. In the early days of Unix, the GECOS field was commonly used to include not just the full name of the user, but also their office location and work and home phone numbers, separated by commas within this field. Nowadays, it is mostly used to hold a full name that some services or tools on the Unix system may use. For example, when sending mail, the user's full name in the 'From' field is filled in from here.
- Next, we have the home directory. This is the directory that the login process sets as the current working directory of the user when they log in.
- And finally, we have the shell, the program that the login program executes on behalf of the user when they log in.

---

Ok, now looking at this password database, we should find a few odd things that we'll have to go over to better understand the format. We said that this file is the user database, defining the local accounts on the system. So we should expect there to always be exactly one username mapped to one UID, but - right here at the beginning we see that we have two accounts with the same UID. And one of them is root, no less! This is no mistake: having multiple accounts with the same UID is rarely desired, but it's not an error in the system. What this means practically is that there are two usernames that, when authenticated, will have the effective and real UID of 0. Once logged in, the two accounts are completely indistinguishable from one another since -- remember -- the system only looks at the UID, not at the username, to make access decisions. We'll see in a second why we might want to have two accounts with the same UID. Let's first look at a few more oddities in this file:

- We mentioned that we have a password field with a placeholder for the hashed password.
- But this field can also be empty! Which means that this account doesn't have a password -- anybody can log in as this user.
While this is probably not what you want, again the system allows for this to happen, because who's the system to tell you how to grant access?
- Next, the login shell can be set to any program. Since we have a number of service accounts that we only use for privilege separation -- that is, to allow a process to have a dedicated UID and not have any other privileges -- but that we do not ever want to allow to log in interactively, we can set the login shell to /sbin/nologin. /sbin/nologin simply returns false when executed, meaning a user managing to log in as this user will immediately be logged out again.
- But you can set the login shell to _any_ program. Our timelord over here, for example, will simply execute the 'date' command when she logs in. Privilege escalation here would require the use of her sonic screwdriver, I suppose.
- But you can also leave the login shell blank, as shown here. In this case, the system will default to /bin/sh.
- Your initial working directory is normally set to your home directory...
- but you can leave that blank, too, in which case the initial current working directory becomes '/'.
- The GECOS field allows for the expansion of the ampersand to the capitalized username -- we'll see that in action in a minute.
- ...as well as to provide for the additional information as previously mentioned.
- but of course you can also leave that field blank, if you like.
- Multiple identical GIDs simply mean that these users are in the same primary group, which is rather normal.
- But having multiple entries for the same username is most decidedly _not_ normal. We'll demonstrate why this is a bad idea in a minute as well.

---

Ok, let's illustrate all the different weirdnesses we may encounter in /etc/passwd. First, let's illustrate the use of a second account with the same UID as root: the toor account, the 'bourne again superuser'. Suppose you are working as root, and accidentally manage to corrupt your login shell in some way...
Now you have a problem: you can't log in anymore! Now what? Well, with the 'toor' account, you can still log in, since the 'toor' account has a different login shell -- a statically linked shell from the /rescue set of tools. Once you log in as toor, you are root. Wait, how does that work? Don't you remember? The system only cares about your UID, and 'toor' has UID 0, so as far as the system is concerned, you are the super user. So you can go ahead and fix your broken shell, after which you can log out and logging in as 'root' will work again. Neat, huh? Note, though, that the 'toor' account is inherited from BSD; most non-BSD systems do not have this account present or enabled.
- Ok, next, let's take a look at 'fred': Fred doesn't have a password hash, meaning anybody can become 'fred' without having to provide a password! Probably not what you want, but ok.
- Next, drwho. Remember, the doctor has the 'date' command as her login shell, so when we log in as the doctor, that command is executed and upon termination of the command we are logged out again and are back to being our usual boring self.
- Ok, next up: alice. Alice has no login shell, but we can still log in as her. Here, we are using the 'login' command merely to illustrate a different method of logging in as a user. This is what the system does when you log in -- we'll come back to this program in a future lecture, though. Anyway, as promised, Alice gets /bin/sh as her shell, because that's the default if no shell is specified in /etc/passwd. But remember there was something funky going on with Alice? We have two accounts for alice, both with the same home directory! Here are our files, all owned by 'alice', but 'alice' is not allowed to create a new file in her own home directory! But we said the Unix system only cares about numeric IDs, so let's take a look at those. Ok, so the directory is owned by UID 1002, but we are... UID 1004. Which explains why we can't create a file here.
_We_ can't tell the two accounts apart because we are looking at the usernames, but the system checks the UID only. So having two accounts with the same username is probably a pretty bad idea...
- Ok, finally, let's check out the gecos field in action. The 'finger' command can be used to find out information about a user account. As we see here, the ampersand in the gecos field for 'root' was translated into the capitalized username, so we end up with "Charlie Root". Why "Charlie Root"? Good question. I wasn't able to find an authoritative answer, but rumor has it that the account was indeed named after the baseball player Charlie Root. Unix history folklore is weird.
- Now let's take a look at the information for user 'jschauma': Note how the 'finger' command was able to parse out the gecos information from the traditional comma-separated values, and you see my office location and phone numbers displayed here. By the way, the finger command can also be used over the network to query another system, if that system offers the 'fingerd' service. That might look like so: Ha, look at that - I apparently still have my old .plan and .project files on my system, the ones that I used here at Stevens when I was a System Administrator almost 20 years ago... wild! This information is no longer accurate, I'm afraid.

---

Ok, that's enough for this short video. Let's recap: the user database, a text-based file named /etc/passwd, contains colon-separated fields. Many of those fields may be empty:

- an empty password hash field means no password; this is probably a bad idea
- an empty home directory means you'll be dropped into / when you log in
- an empty shell means you get /bin/sh

Some of the fields may be duplicated in the file.
This is not always an error:

- multiple users sharing the same primary group is completely normal
- multiple usernames for the same UID is generally quite rare, but we've seen how it can be quite useful
- multiple UIDs for the same username, however, is virtually guaranteed to be a mistake and lead to unexpected errors

Alright - next time, we'll take a look at various functions we use to handle the userid lookups, as well as how to get information about groups. Thanks for watching - cheers!
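
As a companion to the recap, here is a minimal Python sketch of how a tool might read entries in the seven-field /etc/passwd format described in this video. Everything in it is an assumption made for illustration: the sample entries (including the account 'bob' and all numeric IDs other than those mentioned in the video), the PasswdEntry class, and the helper names parse_line, full_name, and duplicates are hypothetical. It only mimics the defaults and the ampersand expansion we discussed; it is not how the system's own lookup functions are implemented.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical sample data, loosely modelled on the accounts in this video.
SAMPLE = """\
root:*:0:0:Charlie &:/root:/bin/sh
toor:*:0:0:Bourne-again Superuser:/root:/rescue/sh
fred::1001:100:Fred:/home/fred:/bin/sh
alice:*:1002:100::/home/alice:
alice:*:1004:100::/home/alice:
bob:*:1005:100:Bob,Room 1,555-0100,555-0101::/bin/sh
"""


@dataclass
class PasswdEntry:
    name: str
    passwd: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_line(line: str) -> PasswdEntry:
    """Split one colon-separated entry into its seven fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, passwd, uid, gid, gecos, home, shell = fields
    return PasswdEntry(
        name, passwd, int(uid), int(gid), gecos,
        home or "/",          # empty home directory: start in /
        shell or "/bin/sh",   # empty shell: default to /bin/sh
    )


def full_name(entry: PasswdEntry) -> str:
    """Return the first comma-separated GECOS subfield, expanding '&'."""
    first = entry.gecos.split(",")[0]
    capitalized = entry.name[:1].upper() + entry.name[1:]
    return first.replace("&", capitalized)


def duplicates(entries):
    """Find UIDs shared by several usernames, and usernames with several UIDs."""
    names_by_uid = defaultdict(set)
    uids_by_name = defaultdict(set)
    for e in entries:
        names_by_uid[e.uid].add(e.name)
        uids_by_name[e.name].add(e.uid)
    shared_uids = {u: sorted(n) for u, n in names_by_uid.items() if len(n) > 1}
    clashing_names = {n: sorted(u) for n, u in uids_by_name.items() if len(u) > 1}
    return shared_uids, clashing_names


entries = [parse_line(line) for line in SAMPLE.splitlines()]
shared_uids, clashing_names = duplicates(entries)

for e in entries:
    if e.passwd == "":
        print(f"warning: {e.name} has no password")
print("UIDs shared by several usernames:", shared_uids)
print("usernames with several UIDs:", clashing_names)
```

In this sketch, a shared UID (root and toor) is only reported, while a username that maps to several UIDs (alice) is the case you would most likely want to treat as an error. In real programs, you would normally rely on the system's lookup functions rather than parse the file yourself.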