UNIX related things... (tr, cut, awk, and permissions)

simX · Dec 18, 2001

So a couple of UNIX related things that people were requesting I post about:

First off, the commands "tr", "cut", and "awk". Of course, it's really easy to learn about them via the "man" command, but I'll explain them here. GadgetLover also wanted me to post something about groups and permissions, so skip down to after the command explanations if you want to learn about those.

The tr command translates characters. There are 3 options: -c, -s, and -d. A regular tr command like the following:

tr ab cd

... will translate all occurrences of "a" in the input to "c", and all occurrences of "b" to "d". If you enter this command all by itself, it will wait for you to type something in the terminal and press return before it spits out output. However, you can have tr process a file by using redirection, like this:

tr -d aoeu < input.txt

This uses input.txt as the input and spits out the output immediately after you press return, without waiting for further input. In this example, the -d option deletes characters, so every "a", "o", "e", and "u" in input.txt is deleted and the rest becomes the output.

The -s option squeezes repeats: any run of the same character, for the characters in the last set you list, is collapsed to a single character. For example, if the output would otherwise be "aabbcc" and you used -s with the set "abc", the output would actually read "abc". Note that this happens AFTER the translation, so you will never see one of the listed characters doubled in the output (by doubled I mean a character that is immediately followed by the exact same character, as in "aa" or "bb").

The -c option looks flawed at first, because when you use it without specifying a file, the terminal doesn't start a new line for your next input; the cursor ends up directly after the output. That is because -c complements the first set: the command then applies to every character that is NOT listed, and that includes the line break at the end of your input. Try something like:

tr -c a z

in the terminal, give it some input, and you'll see what I mean.
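Here is a small sketch of the translate, delete, squeeze, and complement modes of tr side by side. The input strings are made up for illustration, and printf stands in for typing at the terminal or redirecting a file.

```shell
# Translate: a -> c, b -> d
translated=$(printf 'abba cab\n' | tr ab cd)
echo "$translated"

# Delete: drop every a, o, e, and u
deleted=$(printf 'education\n' | tr -d aoeu)
echo "$deleted"

# Squeeze: collapse runs of the listed characters only ("dd" is left alone)
squeezed=$(printf 'aabbccdd\n' | tr -s abc)
echo "$squeezed"

# Complement: every character that is NOT "a" becomes "z",
# including the line break at the end of the input
complemented=$(printf 'banana\n' | tr -c a z)
echo "$complemented"
```

The last command is the reason the prompt appears glued to the output when you run tr -c interactively: the line break itself has been translated.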

Now for the "cut" command. In my previous example, I used the command
cut -d" " -f2. The first option I will explain, -d, specifies the delimiter character. For example, if you have a text file with contents
378192 37192 94919. 38391 48393.t6504, and you use -d" ", there will be 5 fields in the above text file. If, however, you use -d".", there will only be 3 fields, and if you use -d"t", there will be 2 fields. See? Note that cut treats every single delimiter as a separator, so if a text file has 2 spaces next to one another and the space is the delimiter, there is an empty field between them, and a line that begins with a space has an empty first field.

Now with my script, the -f option selects fields. So when I specify -f2, I want field 2 in every line. That's how the script that found the process IDs of SETI@home works: the lines that show process IDs and memory usage in "top" begin with spaces, and once those are squeezed down to a single space, the empty string in front of it is field 1 and the process ID is field 2 (I struggled with this for a while until I noticed it). You can also have it cut certain character positions by using the -c option, so if you use -c5, you will get the fifth character of each line. I should say that cut doesn't remove what you specify, it removes everything BUT what you specify.
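To see the field counting in action, here is a quick sketch that runs cut over the sample line with each of the three delimiters. The variable names are just for illustration, and the 'a  b' input at the end is made up to show what two adjacent delimiters do.

```shell
line='378192 37192 94919. 38391 48393.t6504'

# Field 2 with a space, a period, and "t" as the delimiter
by_space=$(printf '%s\n' "$line" | cut -d" " -f2)
by_dot=$(printf '%s\n' "$line" | cut -d"." -f2)
by_t=$(printf '%s\n' "$line" | cut -d"t" -f2)
echo "[$by_space] [$by_dot] [$by_t]"

# Character position 5 of the line
fifth_char=$(printf '%s\n' "$line" | cut -c5)
echo "$fifth_char"

# Two adjacent spaces enclose an empty field, so "b" is field 3, not field 2
after_gap=$(printf 'a  b\n' | cut -d" " -f3)
echo "$after_gap"
```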

OK, awk. It's very similar to "cut", except it is much more powerful, and it has a different syntax. The command awk '{print $1}' prints the first field of every line. Note that this differs from cut in how fields are split: by default the delimiter is any run of spaces or tabs (you can change it with the -F option), and leading spaces are ignored, so two spaces next to each other never produce an empty field. That is why you have to use field 2 with cut but only field 1 with awk to get the process IDs from top. It should be noted that the stuff between the '{ and }' is a program, so anything that program prints (given the correct syntax) goes to the output. This is a VERY brief explanation of a very powerful tool, so I highly suggest you look into it more by issuing the man awk command. There are even some examples near the end.
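A short sketch of how awk splits fields. Both input lines are invented for illustration (the first only loosely resembles a line from top, the second is a made-up colon-separated record).

```shell
# Hypothetical line with leading and repeated spaces
sample='  123   SETI@home  98.7%'

# Leading and repeated spaces are ignored, so the first field is 123
first_field=$(printf '%s\n' "$sample" | awk '{print $1}')
echo "$first_field"

# NF is awk's built-in count of fields on the current line
field_count=$(printf '%s\n' "$sample" | awk '{print NF}')
echo "$field_count"

# -F sets a different field separator, here a colon
login=$(printf 'simmy:x:501:20\n' | awk -F: '{print $1}')
echo "$login"
```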

So now you should be able to understand basically how this script works:

top -l1 | grep SETI@home | tr -s " " | cut -d" " -f2 | tr "\n" " " > output.txt
sudo renice -20 `cat output.txt`
rm output.txt

First, I get one snapshot of memory usage from top. Then, I pipe that to grep (the | symbol is the pipe symbol, which means the output of the command before the symbol is used as the input for the command after the symbol), which removes all lines that do not contain the string "SETI@home" (note you can use the -i option for grep to make it case insensitive). That output is then piped to the tr command, which squeezes every run of spaces in the lines selected by grep down to a single space (remember that cut treats every delimiter as a separator, so repeated spaces would throw off the field numbers). Then I pipe THAT output to the cut command, which gets field 2 (the process IDs for my SETI@home instances). Then I pipe it to another tr command, which translates all line breaks (\n is treated as a line break in tr) to spaces. The result is written to the file output.txt.

Then, I get temporary root access via the sudo command and change the priority to -20 (highest priority). Note the two ` symbols around the command cat output.txt. Since renice takes the process IDs as arguments on its command line rather than reading them from a file, the shell replaces the command cat output.txt with its output, so renice sees the process IDs as if I had typed them in myself. Then, I delete the output.txt file.

Note that blb's shell script:

sudo renice -20 `top -l1 | grep SETI@home | awk '{print $1}'`

is much more elegant. First, it uses the awk command instead of all the cut and tr commands. It also uses only one line and eliminates the need for the intermediary output.txt file, because the shell replaces the whole last part of the command (inside the two ` symbols) with its output, so the renice command can do its job immediately.
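If you want to try both pipelines without touching any real processes, here is a sketch that swaps top for a stand-in. The fake_top function and its three lines are invented for illustration (real top output has more columns), and the renice command is only printed, not run.

```shell
# Stand-in for "top -l1"; these lines are made up, not real top output
fake_top() {
  printf '  301 SETI@home   95.0%%\n'
  printf '   87 Finder       1.2%%\n'
  printf '  412 SETI@home   93.5%%\n'
}

# The tr/cut approach: squeeze spaces, take field 2, join the lines
pids_cut=$(fake_top | grep SETI@home | tr -s " " | cut -d" " -f2 | tr "\n" " ")

# The awk approach (the trailing tr is added here only so that
# the two results can be compared directly)
pids_awk=$(fake_top | grep SETI@home | awk '{print $1}' | tr "\n" " ")

# Print the command instead of running it
echo "would run: sudo renice -20 $pids_cut"
```

Both variables end up holding the same list of process IDs, which is the point of blb's shorter version.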

Now for permissions. If you get a long listing of a directory in the terminal, as in the command ls -l, you'll see most lines of the listing for each file and directory look something like:
drwxr-xr-x 4 simmy staff 264 Dec 18 01:33 Public. The first 10 characters are the permissions. The first character, in this case a d, tells you what kind of file it is. If it is a d, that line is telling you about a directory. If it is a dash, it is telling you about a regular file. There are other letters that can appear here (l for a symbolic link, for example), but most of them are rare.

The next 3 characters are the permissions for the owner (the first name that appears -- in this case, "simmy"). r means read permission, w means write permission, and x means execute permission. They are always in this order. If one of the 3 kinds of permission is denied, you will see a dash in place of the letter. So in this case, the owner, simmy, is allowed to read, write, and execute this directory (for a directory, execute means permission to enter it and reach the files inside). The next 3 characters are for the group (the second name that appears -- in this case, "staff"). You read the permissions in the same way. In this case, the group "staff" has permission to read and execute, but not to write. The last 3 characters are permissions for anyone except the owner and people who are in the listed group. In this case, everyone else is allowed to read and execute as well.

NOTE: Read and execute are, in fact, different in the UNIX operating system. Suppose you want somebody to be able to run a program you made but not look inside it (for example, if you're a UNIX class teacher and you want to demonstrate that a certain exercise is possible, but you don't want the students to snoop around in it). One caveat: execute-only works for compiled programs, but a shell script has to be readable as well, because the shell must read the file in order to run it. You might also want to allow someone to read a script, but not execute it (for example, if you were showing a malicious shell script that would delete some files on your hard drive -- people who only have read permission but no execute permission can look at the script, but they can't run it directly).
This is a big difference from the Classic MacOS permissions, which were simply read and write (read included execute).
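As a quick sketch of how the letters line up with what you set, the following creates a throwaway script in a temporary directory, changes its permissions with chmod, and prints the first 10 characters of the ls -l line each time. The file name hello.sh and the variable names are made up for the example.

```shell
workdir=$(mktemp -d)
script="$workdir/hello.sh"
printf '#!/bin/sh\necho hello\n' > "$script"

# rwx for the owner, r-x for the group, r-x for everyone else
chmod u=rwx,g=rx,o=rx "$script"
mode_all=$(ls -l "$script" | cut -c1-10)
echo "$mode_all"

# Read and write for the owner only, nothing for group or others
chmod u=rw,g=,o= "$script"
mode_private=$(ls -l "$script" | cut -c1-10)
echo "$mode_private"

# Clean up the temporary directory
rm -r "$workdir"
```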

(More in the next post, because I am reaching the maximum limit for characters in one post.)

simX · Dec 18, 2001

Also, about the groups "staff", "admin", and "wheel". The "admin" group includes everybody who is set to be an admin in the Users pane of the System Preferences -- this includes the root user. "wheel" is similar, in that it includes all admins, but it does not include the root user. The "staff" group only includes root. This information can easily be obtained with the NetInfo Manager application. I would like to point out, however, that you can also sort of get this information via the "groups" command in the Terminal -- and it presents a discrepancy. If I put in the command groups simmy, the terminal claims I am part of all 3 groups "staff", "wheel", and "admin". Furthermore, it claims that a non-admin user (I made one called "test2") is also in the group "staff". However, I tend to agree with NetInfo Manager, because if you create a new user, his directories are all owned by the group "staff". However, in the Finder, even an ADMINISTRATOR cannot access the files. This shows that, in fact, only the root user is in the group "staff", as NetInfo Manager correctly reports. I have no idea why the "groups" command would report otherwise. Obviously, though, this is a flaw in Mac OS X, because an admin should be able to access another user's files via the Finder, as I had posted in a previous thread before, but no one really took notice. A simple solution would be to have all folders created for new users owned by the group "admin" instead. This way the owner of the files can access her files, and any admin can, but everyone else cannot.

Would anyone else care to comment on this flaw? I would like to know if anybody is experiencing the same problem: log in as an administrator, and create a new user that is not an administrator. Then, if you go to the "Users" folder inside your Mac OS X partition and open up the home folder for the user you just created, you should be denied access to the "Movies", "Music", "Documents" and similar folders.
I would REALLY like someone to confirm this.

That's enough UNIX for one day. I'll be happy to answer other questions about UNIX though. Also, I believe I am correct for most of this stuff, but if I am not, anybody is free to correct me (but I won't admit I'm wrong without proof!).

kilowatt · Dec 18, 2001

Thank you, very much!

That was very helpful, especially the tr and cut commands. This opens up tons of scripts I couldn't complete before. Also, the ` is interesting as well.

Thank you, thank you, thank you.

simX · Dec 18, 2001

I took a UNIX class last year because I knew I would want to tinker around with the command line, and I don't regret taking it at all. Even though the class was boring, it really pushed me to learn the basic commands, so I could learn by myself from then on. I highly suggest taking a UNIX class to anyone who REALLY wants to tinker with OS X's underpinnings, even if it's a beginning UNIX class -- preferably one that deals with BSD UNIX.

I really would like people to verify my observations on my computer and my mom's iBook that all new users have folders that are owned by "staff", and so even administrators cannot access them. Can someone PLEASE verify this for me?

And, as always, if you want any more UNIX stuff, I'll be happy to try and explain.

blb · Dec 18, 2001

On tr: a nice thing about tr is it also accepts character ranges, so if you want to lowercase everything:

Code:

tr A-Z a-z

changes all uppercase letters to lowercase.
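A quick sketch of the range form, next to the POSIX character classes, which do the same job without depending on how the locale orders letters. The input string is made up.

```shell
# Lowercase with explicit ranges
lowered=$(printf 'Mac OS X\n' | tr A-Z a-z)
echo "$lowered"

# Same result using the [:upper:] and [:lower:] character classes
lowered_class=$(printf 'Mac OS X\n' | tr '[:upper:]' '[:lower:]')
echo "$lowered_class"
```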

On cut: the place where I've seen cut be most useful is parsing fixed-length fields. If you have something like:

Code:

FIELD001FIELD002FIELD003
VALUE1  VALUE2  VALUE3
VALUE244VALUE245VALUE246

you can get the value of field one with

Code:

cut -b 1-8

field two with 9-16 and three with 17-24; not that this is a regularly needed function these days, but every now and then you may run across something like that.
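Here is that fixed-length example as something you can paste in and run. The records variable just holds the three sample lines from above (the second line is padded with spaces so each column is 8 bytes wide).

```shell
records='FIELD001FIELD002FIELD003
VALUE1  VALUE2  VALUE3
VALUE244VALUE245VALUE246'

# Bytes 1-8 of every line are field one, 9-16 field two, 17-24 field three
field_one=$(printf '%s\n' "$records" | cut -b 1-8)
field_two=$(printf '%s\n' "$records" | cut -b 9-16)
field_three=$(printf '%s\n' "$records" | cut -b 17-24)

echo "$field_two"
```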

Finally, on the issue of group ownership and access for administrators: if a new user's folders are group-owned by admin, then they are owned by a group of which the user is not a member, which seems somewhat counterintuitive. If non-admin userA wants to share something with other staff members (i.e., non-admins) but not the world, he/she would have to chgrp it first, then set permissions. Besides, as an admin, you have sudo access, and therefore access to the entire system. As for Finder access, I can only speculate, but (since it is the classic Mac side of Mac OS X) my guess is that it is a form of new-user protection: you can't easily do accidental stuff to another user's directory through the Finder, which makes it harder to do something you wish you hadn't.

ladavacm · Dec 19, 2001

Originally posted by simX
If I put in the command groups simmy, the terminal claims I am part of all 3 groups "staff", "wheel", and "admin".

However, in the Finder, even an ADMINISTRATOR cannot access the files. This shows that, in fact, only the root user is in the group "staff", as NetInfo Manager correctly reports. I have no idea why the "groups" command would report otherwise. Obviously, though, this is a flaw in Mac OS X, because an admin should be able to access another user's files via the Finder, as I had posted in a previous thread before, but no one really took notice.

Better than groups is the id command. groups gives you the list of all groups you are a member of, but the /etc/group file (or the equivalent NetInfo directory) contains only the secondary group memberships: the primary group ID is in the fourth field of /etc/passwd (or its NetInfo equivalent).
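A minimal sketch of what id reports for the current user. The numeric forms are used here so it runs anywhere; -gn and -Gn print group names instead. Strictly speaking id -g shows the effective group of the current process, which for a normal login is the primary group.

```shell
# Primary (effective) group ID of the current user
primary_gid=$(id -g)

# Every group ID the user belongs to, primary included
all_gids=$(id -G)

echo "primary: $primary_gid"
echo "all: $all_gids"
```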

As far as the default GID assigned to files is concerned: you are in BSD here, and files (and directories) inherit the GID of the directory they are created in (SysV will give you the same behavior if you set the setgid bit on the parent directory). Also, unlike SysV, in BSD one is a member of all of one's primary and secondary groups at the same time -- there ain't no such thing as newgrp.

As far as permissions are concerned, only root bypasses the permissions; admin is not root, and this is good (it keeps some semblance of privacy, since any admin can sudo, but not from the Finder).

Needless to say, home directories are:
localhost% ls -ld .
drwxr-xr-x 33 lada staff 1078 Dec 18 22:11 .

whereas adduser created ones are:
localhost% ls -ld Movies
drwx------ 2 lada staff 264 Dec 10 08:47 Movies

which pretty much explains why these are unreadable to anyone but me.
