Practicum - week 11

Introductory remarks:

(Have you read through the written version of this week's lecture? It contains details that were not mentioned in class.)

When you have a task to solve, first read it through very carefully (many mistakes are made because people have not read the text of the assignment carefully enough). Then decompose the problem, and don't try to solve the whole task at once. Don't be upset if you do not know how to solve it immediately. Play around with the problem, and try to solve a simplified version first: this may well give you ideas. Finding a useful way to decompose a task is usually already halfway to the solution.

Therefore, first try to solve one part of the assignment, and check whether it works. Do not go further until you are sure that the partial solution does exactly what you want. So, when you build a pipeline, for instance, first check the outcome of the first element by adding a `more' or a `less' at the end of the pipeline. If something goes wrong, go back, check each element separately, check each combination separately, and try to understand what caused the problem.
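As a small illustration of this step-by-step approach, here is a sketch on a made-up file (fruits.txt is hypothetical and is created only for this example). Each stage is checked on its own before the next one is added; at the terminal you would simply append `| less' to look at the intermediate result.

```shell
# Hypothetical sample data, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'pear\napple\npear\nfig\n' > "$tmpdir/fruits.txt"

# Step 1: check the first element on its own.
# (At the terminal: sort fruits.txt | less)
step1=$(sort "$tmpdir/fruits.txt")
echo "$step1"

# Step 2: only once step 1 looks right, add the next element.
step2=$(sort "$tmpdir/fruits.txt" | uniq)
echo "$step2"

rm -r "$tmpdir"
```

If the second output looks wrong, you already know that the first element is fine, so the problem must be in the element you have just added.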

Another tip: before trying to solve the assignments, read through the examples on the web site of the lecture and your own lecture notes. The web site is also meant to be a sort of "short Unix manual", written especially for you. Furthermore, it is not enough to read the examples through: you need to think them over, and to try them out on a computer. If you do not fully understand an example, decompose it: try out what the first part of the pipeline does, and what the second part does. What happens if you do not use a given option, or if you leave out one element of a pipeline? Try to really understand what you do!

The assignments, obviously, are related to what we have learnt this week and in the previous weeks: wildcards, pipelines, different commands, etc. Try to keep all of these in mind.

Unless otherwise specified, you need to email me one pipeline as the solution to each assignment. If you have only partial results and have not solved the whole problem, still send me what you have so far.

List the whole directory tree of the Unix system you are working on into one (huge) file file_structure, "without getting" any error message.

Hints: Firstly, you have to know what it means to list something recursively: it means listing the content of a directory, then the content of its subdirectories, then the content of the subdirectories of the subdirectories, etc. You can list the whole directory tree if you start from the right point of the tree, and then descend recursively (by using an option of `ls').

Secondly, you cannot avoid error messages, because you are not allowed to enter some directories. Nevertheless, you can suppress them by sending them into some file. Why not use the file /dev/null, which is a sort of "black hole" (anything redirected to that file gets lost)?
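To see the "black hole" at work on something harmless, here is a minimal sketch. It assumes a Bourne-style shell (sh, bash), where `2>' redirects error messages; csh and tcsh use a different syntax. The directory name is hypothetical and is chosen so that it does not exist.

```shell
# Assumes a Bourne-style shell (sh, bash).
# Hypothetical path, expected NOT to exist, so that ls complains.
missing=/no/such/directory_for_this_example

# Normal output goes into a file, error messages go into /dev/null.
out=$(mktemp)
status=0
ls "$missing" > "$out" 2> /dev/null || status=$?

# The error message is lost, but the exit status still reports the failure.
echo "exit status: $status"
rm "$out"
```

Nothing is printed about the missing directory: the message has been thrown away, while the ordinary output would still have been saved in the file.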

Lastly, don't be afraid if the command runs for a very long time. As you do not have to send me that file, you can just kill the process by typing ^C, without waiting for it to terminate.

(2 points)

Let us get back to the Federalist papers (cf. the assignment last week).

Count the number of characters written by either John Jay or Madison, alone and indisputably (that is, the number of characters appearing in the articles attributed to one of them alone).

How can you do that within one command line? First, you may want to concatenate (join together) all these texts, before you count the characters. The command line should return just one number.
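The difference between counting files separately and counting their concatenation can be sketched on two tiny files (one.txt and two.txt are hypothetical and are created only for this example). Note that `wc -c' counts bytes, which is the same as the number of characters for plain ASCII text.

```shell
# Hypothetical sample files, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'abc\n' > "$tmpdir/one.txt"   # 4 characters, newline included
printf 'de\n'  > "$tmpdir/two.txt"   # 3 characters, newline included

# File names as arguments: one line per file, plus a line with the total.
wc -c "$tmpdir/one.txt" "$tmpdir/two.txt"

# Concatenating first: wc sees one single stream and prints one number.
total=$(cat "$tmpdir/one.txt" "$tmpdir/two.txt" | wc -c)
echo "$total"

rm -r "$tmpdir"
```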

(2 points)

Write a command line that will return lines 21-30 of the file joint20.txt. You may want to use two commands learnt this week.
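One possible way of combining two such commands, shown here on a made-up six-line file rather than on joint20.txt: keep everything up to the last line you want, then keep only the tail of that. The file name and the line numbers below are hypothetical.

```shell
# Hypothetical six-line file, created only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' one two three four five six > "$tmpdir/numbers.txt"

# Lines 2-4: take the first 4 lines, then the last 3 of those.
middle=$(head -n 4 "$tmpdir/numbers.txt" | tail -n 3)
echo "$middle"

rm -r "$tmpdir"
```

The number given to `tail' is the length of the range (here 4 - 2 + 1 = 3), not the number of the first line.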

(2 points)

Give a command line that will return as many lines from the beginning of mad40.txt as there are in total in mad48.txt (without you knowing that number). In other words, the shell first has to count the number of lines in mad48.txt, and immediately use this information to collect that many lines from the beginning of mad40.txt.

An important remark. In the case of commands that somehow transform a file or their standard input (cat, wc, head, tail, rev, etc.), you usually have two ways of sending the content of a file to the command. Either you give the file name as an argument of the command, or you redirect the input of that command (< filename). Try out both ways, because the format of the output (which you possibly want to turn into the argument of another command: do you know how to do that?) may be different.

Actually, if the file names are given as arguments, then several files can be processed at once, and the file names appear in the output. However, if the standard input is used (even when you use `< filename'), the command itself does not know what file has been attached to its standard input by the shell; therefore, no file name appears in the output. (The shell sends the content of that file to the standard input of the command, and the command has no way of knowing where its input comes from.)
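The two ways can be compared directly with `wc -l' on a small made-up file (sample.txt is hypothetical and is created only for this example):

```shell
# Hypothetical three-line file, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/sample.txt"

# File name as an argument: the count is followed by the file name.
with_arg=$(wc -l "$tmpdir/sample.txt")

# File attached to the standard input: only the count is printed.
with_stdin=$(wc -l < "$tmpdir/sample.txt")

echo "argument: $with_arg"
echo "stdin:    $with_stdin"

rm -r "$tmpdir"
```

Only the second form gives a bare number that can safely be handed on to another command.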

In the present case, you should make sure that when you count the number of lines in mad48.txt, the output is just a number, and does not contain the file name. Otherwise, the file name is passed on as an argument too, and you will get back the file mad48.txt, and not the beginning of mad40.txt. Before handing in your solution, check which one is returned by your command line.

(2 points)

Create a reverse alphabetical list of the words in the file dis49.txt. This list should meet the following criteria:

each word is on a new line
each word appears only once
the words appear in an "anti-lexicographical order"

An "anti-lexicographical ordering" means ordering the words according to their end: we first compare their last letters, and if these are the same, we compare the penultimate ones, etc. Thus "zebra" (ending in an 'a') would appear near the beginning of the list, after "honda", whereas "almighty" (which ends in a 'y') shows up only towards the end. The end of the list will look like this:

necessary
February
very
every
inquiry
theory
carry
jealousy
society
stability
tranquillity
community
authority
security
diversity
ingenuity
party
pretty

Hints: first try to get an ordinary word list (cf. your lecture notes, and read the web site). Then, you have to find out how to order a list in an anti-lexicographical way: this means sorting the mirror images of the strings in the traditional way, and then mirroring (reversing) everything back again!
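The mirroring idea can be tried out on a handful of words before applying it to a real word list. This is only a sketch: the four words are typed in by hand, and LC_ALL=C is an assumption added here so that `sort' compares plain character codes regardless of the local language settings.

```shell
# Four hand-picked words, one per line (not taken from dis49.txt).
words=$(printf '%s\n' party zebra almighty honda)

# Mirror each word, sort the mirror images, then mirror everything back.
sorted=$(printf '%s\n' "$words" | rev | LC_ALL=C sort | rev)
echo "$sorted"
# prints: honda, zebra, almighty, party (one word per line)
```

To understand why this works, look at the intermediate result by cutting the pipeline after `sort': the mirrored words "adnoh" and "arbez" come before "ythgimla" and "ytrap".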

Surprising as it may seem, such anti-lexicographical lists of words are actually very useful. Quite a few similar lists survive from the Middle Ages, when they were used by poets to find good rhymes. In the 20th century, reverse-order dictionaries were printed for linguists interested in phenomena occurring at the end of words.

(2 points)