Practicum - week 14

The assignments this week include writing shell scripts. A good habit is to add comments to your programs (not only shell scripts), so that other people, and also you yourself in the future, will be able to reconstruct your original train of thought. In a shell script, you add a comment with a #: whatever appears on a line after the # is not processed (the #! on the first line is an exception).
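As a small illustration of commenting, here is a sketch of a commented script. The function name greet and its greeting text are made up for this example only.

```shell
#!/bin/bash
# The line above (#!) is not an ordinary comment: it names the
# interpreter that should run this script.

# greet: print a greeting for the name given as its first argument.
# (greet is a hypothetical name, used only for illustration.)
greet() {
    name="$1"               # a comment may also follow a command
    echo "Hello, $name!"    # the shell ignores everything after the #
}

greet World
```

Running it prints a single greeting line; the comments have no effect on the output.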

It seems that the $PATH variable on Hagen (or on Hilde{01..14}) does not include '.' (that is, your working directory). Check it by typing 'echo $PATH'. This means that even if you have made your shell script executable (chmod +x), you cannot run it simply by typing its name, because the shell cannot locate it. So you need to launch each of your scripts by giving its path (absolute or relative), e.g.: ./myscript arg1 arg2 . Alternatively, you can type: bash myscript arg1 arg2 (in this latter case chmod is not even necessary).
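The two ways of launching a script can be tried out with the following sketch. It works in a scratch directory created by mktemp, so nothing of yours is overwritten; the content of myscript here is invented for the demonstration.

```shell
# Work in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create a tiny script called myscript (its content is just an example).
printf '%s\n' '#!/bin/bash' 'echo "arguments: $1 $2"' > myscript

# Way 1: hand the file to bash; no chmod is needed for this.
bash myscript arg1 arg2

# Way 2: make the file executable and give its relative path.
chmod +x myscript
./myscript arg1 arg2
```

Both commands should print the same line, because in both cases bash ends up interpreting the same file.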

Good luck with the last set of assignments! Enjoy!

The following is the man page of the command "loveletter"

NAME: loveletter - creates a love letter, and prints it on the standard output

SYNOPSIS: loveletter [ addressed [ sender [ FILE ] ] ]

DESCRIPTION: creates a love letter, beginning with the phrase "Dear addressed,", followed by some standard message, and ending with the name (signature) of sender. If FILE is also specified, then the content of FILE is read, and inserted somewhere into the body of the message.

For example, the command line:

loveletter Judy Joe message-to-Judy

will print something like this:

Dear Judy,

I love you so much! I miss you so much! I can't live without you!

[the following lines are read from the file called 'message-to-Judy':] Yesterday, when we were in the cinema, and I could hold your hand, you know, it was the most beautiful moment of my life. I would like to meet you again. [end of the file 'message-to-Judy']

I hope to see you again, very soon!

Love,

Joe

Can you write this command as a shell script?

(3 points)
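One possible shape for such a script is sketched below; it is not the only solution. The man page does not say what should happen when arguments are omitted, so the fallback names used here are assumptions, as is the decision to skip FILE silently when it cannot be read.

```shell
#!/bin/bash
# loveletter [ addressed [ sender [ FILE ] ] ]
loveletter() {
    # Assumption: default names when an argument is missing
    # (the man page does not specify any).
    addressed="${1:-darling}"
    sender="${2:-your secret admirer}"
    file="$3"

    echo "Dear $addressed,"
    echo
    echo "I love you so much! I miss you so much! I can't live without you!"
    echo
    # Insert the content of FILE only if a name was given and it is readable.
    if [ -n "$file" ] && [ -r "$file" ]; then
        cat "$file"
        echo
    fi
    echo "I hope to see you again, very soon!"
    echo
    echo "Love,"
    echo
    echo "$sender"
}

# Pass the arguments of the script on to the function.
loveletter "$@"
```

Wrapping the letter in a function is a design choice made for readability; the same commands would work equally well written directly in the script.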

Write a script that lists the N most frequent words of a given file, including their frequency. Argument 1 of the script should specify the number N, whereas argument 2 of the script contains the name of the file.

(2 points)
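A minimal sketch of one common pipeline for this task follows. It assumes that a "word" is a run of letters and that upper and lower case should be counted together; the lecture notes may define words differently. The function name topwords is hypothetical.

```shell
#!/bin/bash
# topwords N FILE: print the N most frequent words of FILE with their counts.
topwords() {
    n="$1"
    file="$2"
    tr -cs '[:alpha:]' '\n' < "$file" |   # one word per line
        tr '[:upper:]' '[:lower:]' |      # fold case (an assumption)
        grep -v '^$' |                    # drop empty lines
        sort |                            # bring equal words together
        uniq -c |                         # count each distinct word
        sort -rn |                        # highest count first
        head -n "$n"                      # keep the first N lines
}

# In a script file, you would finish with:  topwords "$1" "$2"
```

Each line of the output has the count first and the word second.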

Now, you most probably have the frequencies before the words themselves. Try to get the frequencies after the words. A hint: in each line, you most probably have a tab between the frequency and the word. Furthermore, tab is the default delimiter between fields.

(1 point)
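The hint above points towards a field-based solution. As an alternative sketch, awk can swap the two fields; this is not necessarily the approach intended by the hint. By default awk splits fields on any run of spaces or tabs, so it works whichever separator your lines contain. The function name swap_fields is made up for this example.

```shell
#!/bin/bash
# swap_fields: read lines of the form "count word" from standard input
# and print them as "word<TAB>count".
swap_fields() {
    awk '{ print $2 "\t" $1 }'
}

# Example with two hand-written lines:
printf '3 the\n2 cat\n' | swap_fields
```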

Read the section in the lecture notes about type-token ratio, if you have not done so yet. Then, write a script that will print the type-token ratio of some text file to its standard output.

The text can be either the standard input of the script, or a file whose name is given as the (first) argument of the script. In order to deal with both in a uniform way, you may use the trick shown by Lonneke: look at the end of the lecture notes to see this example, as well as further examples that may be useful. (These examples were added after the print-out of the lecture notes was posted on the web site.)

It is up to you whether you give the result as a number between 0 and 1, or as a percentage (0 % to 100 %). The second may be useful if you want to avoid floating point arithmetic, and use a command that is able to perform only integer arithmetic: in that case, first multiply by 100, and only then perform the division.
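Why the order of the operations matters can be seen in the sketch below. It uses the integer arithmetic built into bash, $(( ... )), which may differ from the command used in the lecture notes; the two counts are invented numbers.

```shell
#!/bin/bash
types=37      # hypothetical number of different words
tokens=120    # hypothetical total number of words

# Dividing first throws the fraction away: 37 / 120 is 0 in integer arithmetic.
wrong=$(( types / tokens * 100 ))

# Multiplying first keeps the information: 3700 / 120 is 30 (rounded down).
right=$(( types * 100 / tokens ))

echo "divide first: $wrong, multiply first: $right"
```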

You can solve this assignment in (at least) two ways. One way is to use a single, fairly complex command line. This is a command line that you would have understood even two weeks ago: remember how you can turn the output of a command line counting the number of words into the argument of a command performing an arithmetical calculation.
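As a reminder of that mechanism (and not as the full solution), the sketch below uses command substitution, $( ... ), to pass a word count to expr. The text being counted is a made-up example.

```shell
#!/bin/bash
# The inner $( ... ) runs first and is replaced by the output of wc -w (here 3);
# expr then receives that number as its first argument and multiplies it by 100.
# The * must be written as \* so that the shell does not treat it as a wildcard.
result=$(expr $(echo "one two three" | wc -w) \* 100)
echo "$result"
```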

Alternatively, you can use variables in order to break the computation down into several less complex steps.
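A sketch of the variable-based approach is given below. It assumes that a word is a run of letters, that case is ignored, and that the text is not empty; the lecture notes may count tokens differently. It uses cat "$@", which copies the named file or, when no name is given, standard input; this may or may not be the same trick as the one in the lecture notes. The function name ttr_percent is hypothetical.

```shell
#!/bin/bash
# ttr_percent [FILE]: print the type-token ratio as a whole percentage.
ttr_percent() {
    # Step 1: one lower-case word per line, from FILE or from standard input.
    words=$(cat "$@" | tr -cs '[:alpha:]' '\n' | tr '[:upper:]' '[:lower:]' | grep -v '^$')

    # Step 2: tokens = all words, types = different words.
    tokens=$(printf '%s\n' "$words" | wc -l)
    types=$(printf '%s\n' "$words" | sort -u | wc -l)

    # Step 3: multiply before dividing, since this is integer arithmetic.
    echo $(( types * 100 / tokens ))
}

# Example: 3 types among 4 tokens.
printf 'the cat the dog\n' | ttr_percent
```

Keeping each step in its own variable makes it easy to print intermediate values with echo while you are testing.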

Give a solution in either of these two ways.

(3 points)

Solve the problem in the other way, too.

(1 point)