Practicum - week 6

Finish going through the XEmacs manual.

There is a Hungarian children's game whose title in Dutch would be something like: "Kavan jeve avals divit sprevekeven?" It is not Latin; the point is to "break" every vowel with a [v] sound. This is what the sentence "Kan je als dit spreken?" turns into.

Can you write a command line that does this job?

Cavan youvu wrivite ava covommavand livine thavat doevos thivis jovob?

(3 points)
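One possible starting point, not the only solution: a sed substitution that replaces every vowel by the vowel, a "v", and the vowel again. The function name vowel_break is made up for this sketch, and only lowercase vowels are handled.

```shell
# vowel_break: hypothetical helper name, reads text on standard input.
# Assumption: only lowercase a, e, i, o, u count as vowels.
vowel_break() {
    # "&" stands for the matched vowel, so "a" becomes "ava", and so on.
    sed 's/[aeiou]/&v&/g'
}

echo "Kan je als dit spreken?" | vowel_break
# prints: Kavan jeve avals divit sprevekeven?
```

Note that this treats every vowel letter on its own, so "you" comes out as "yovouvu" rather than the "youvu" of the English example above, which follows pronunciation instead of spelling.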

Imagine you have a reverse cross-word puzzle: you have letters in a matrix, vertically and horizontally, and you are looking for meaningful words in it. One of the tasks you need to do is to read a vertical line, i.e. you want to transpose a vertical line into a horizontal one.

Write a shell script that takes two parameters: the first gives the number of the column to be transposed, and the second gives the name of the file containing the cross-word puzzle.

(That is, the shell script has to cut out one column and transform it into a row.)

For instance, if the script is called cr-w and I enter the following command line: cr-w 2 Federalist/fed1.txt

then the result is:

E eo A oFsasnorrcqetcrrwthpgsupsaiffmCoaaewtectnitamthabpjwmfpfrtAowqnhppcnhiactateshofpojeafojdfzttwhcnmamowsnhisaaatipojw iHHTEOPIaAGnsttpootwgrwpaaoUadTP hC

(4 points)
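A minimal sketch of the idea, assuming that a "column" means a single character position on each line. It is written as a function named cr_w (a made-up name) so that it can be tried out directly; in a script file called cr-w, the two lines of the function body would be the whole script. The use of mktemp in the demo is also an assumption about your system.

```shell
# cr_w COLUMN FILE: print character column COLUMN of FILE as one row.
# Assumption: one puzzle letter per character position.
cr_w() {
    cut -c "$1" "$2" | tr -d '\n'   # cut the column out, then glue the lines together
    echo                            # finish the row with a single newline
}

# Demo on a tiny made-up puzzle.
puzzle=$(mktemp)
printf 'cat\nore\nwed\n' > "$puzzle"
cr_w 2 "$puzzle"    # prints: are
rm -f "$puzzle"
```

Lines that are shorter than the requested column contribute nothing to the row in this version, so the exact spacing of the output may differ from the result shown above.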

Write a shell script that helps you compare the 50 most frequent words of three files (e.g. fed1.txt, dis50.txt and mad14.txt out of the Federalist papers). The file names should be given as arguments of the script (cf. $1, $2, ...). I would like to see three columns in the output, each beginning with the file's name, followed by its 50 most frequent words. You can use temporary files. You have to make use of your solutions to previous exercises, and have a look at Henny Klein's tips on how to make a word list (Excursus in week 3's lecture notes). Further tip: delete the punctuation marks before doing the word counts (cf. the end of today's lecture notes).

(3 points)

Here is the beginning of the output I got:

Federalist/fed1.txt   Federalist/dis50.txt   Federalist/mad14.txt

the                   the                    the
of                    of                     of
to                    be                     to
and                   and                    and
be                    to                     be
that                  in                     which
in                    a                      in
will                  their                  that
a                     been                   a
which                 that                   will
it                    have                   it
their                 would                  is
not                   on                     as
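One way to arrive at output of this shape is sketched below; it is not necessarily how the listing above was produced. The function names top_words and compare3 are made up, and the sketch assumes that words consist of the letters A-Z only, that upper and lower case should be counted together, and that mktemp is available for the temporary files.

```shell
# top_words FILE [N]: print the N (default 50) most frequent words of FILE.
# Assumptions: words are runs of A-Z/a-z, and case is folded to lowercase.
top_words() {
    tr -cs 'A-Za-z' '\n' < "$1" |      # one word per line, punctuation dropped
        tr '[:upper:]' '[:lower:]' |   # count "The" and "the" together
        sed '/^$/d' |                  # drop an empty line left by leading non-letters
        sort | uniq -c | sort -rn |    # count the words, most frequent first
        head -n "${2:-50}" |
        awk '{ print $2 }'             # keep the word, drop the count
}

# compare3 FILE1 FILE2 FILE3: three columns, each headed by its file name.
compare3() {
    tmp=$(mktemp -d)                   # temporary directory for the three lists
    { echo "$1"; top_words "$1"; } > "$tmp/a"
    { echo "$2"; top_words "$2"; } > "$tmp/b"
    { echo "$3"; top_words "$3"; } > "$tmp/c"
    paste "$tmp/a" "$tmp/b" "$tmp/c"   # glue the lists side by side, tab-separated
    rm -rf "$tmp"
}
```

In a script, the call would be compare3 "$1" "$2" "$3". The columns are separated by tabs, so long file names can push them out of alignment; words with equal counts may appear in a different order than in the listing above.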

- - - - - - - - - -

+1.

A question to think about; any possible solutions are welcome.

There are different standards for writing numerals. In the English tradition you put a period between the units and the tenths, while in the continental tradition you put a comma in this same place. On the other hand, English people put commas to separate the hundreds from the thousands, the thousands from the millions, etc. In this place European people put a point or a space. Can you write a command line that does this conversion in both directions? Imagine you want to transform a longer scientific or economics article in which such numbers occur.

Could you have done this job 3 weeks ago? What would you have done then? What would the consequences have been in a longer text?
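As a hedged starting point for the discussion: a plain tr ',.' '.,' swaps every comma and period, including the ones that end clauses and sentences. The sed sketch below only touches a comma or period that stands directly between two digits. The function name swap_separators and the placeholder @@@ are made up for this sketch, and it assumes that @@@ does not occur in the text.

```shell
# swap_separators: hypothetical helper, reads text on standard input.
# Swaps "," and "." wherever they stand directly between two digits.
swap_separators() {
    sed -e 's/\([0-9]\),\([0-9]\)/\1@@@\2/g' \
        -e 's/\([0-9]\)\.\([0-9]\)/\1,\2/g' \
        -e 's/@@@/./g'
    # 1st expression: park digit-comma-digit as a placeholder
    # 2nd expression: turn digit-period-digit into digit-comma-digit
    # 3rd expression: turn the parked placeholders into periods
}

echo 'It cost 1,234,567.89 dollars, or so.' | swap_separators
# prints: It cost 1.234.567,89 dollars, or so.
```

Because the operation is a swap, the same command converts in both directions. It does not handle a space as thousands separator, and it will also rewrite things that merely look like numbers, such as dates written 12.05.2004 or lists like 1,2,3.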