Practicum - week 8
1.
Write a shell script that creates a frequency list of N-grams based on a text document. The name of the document is given as the argument of your script. Don't do it the way Cavnar & Trenkle do (i.e. by inserting additional spaces before and after each word). Just move a window that is N characters long along your text, and calculate the frequencies. E.g. the following sentence
This is the way to build N-grams
should produce the following bi-grams: _t, th, hi, is, s_, _i, is, s_, _t, th, he, e_, _w, wa,... And the following tri-grams: __t, _th, thi, his, is_, s_i, _is, is_, s_t, _th, etc.
Look at the previous lecture notes and previous exercises for some tips on how to build N-grams on the character level.
Now, I would like you to get a list of these N-grams, combining N = 1, 2, 3, and 4, in decreasing order of frequency. (In a next step this list could serve as the profile used by the algorithm of Cavnar & Trenkle.)
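One possible way to realise the moving window is sketched below. It is only an illustration under a few assumptions, not the reference solution: the function names ngrams and profile are made up here, the text is lower-cased, every space, tab and newline is turned into an underscore, and the text is padded at the start with N-1 underscores so that the first N-grams agree with the example above (_t for bi-grams, __t for tri-grams).

```shell
# ngrams N: read text on standard input, print one N-gram per line.
# Assumption: whitespace becomes "_" and the text is lower-cased.
ngrams() {
    { tr '[:upper:]' '[:lower:]' | tr ' \t\n' '___'; echo; } |
    awk -v n="$1" '{
        # pad the start with N-1 underscores, as in the example
        pad = ""
        for (i = 1; i < n; i++) pad = pad "_"
        s = pad $0
        # slide a window of n characters along the text
        for (i = 1; i + n - 1 <= length(s); i++) print substr(s, i, n)
    }'
}

# profile FILE: N-grams for N = 1..4 together, most frequent first.
profile() {
    for size in 1 2 3 4; do
        ngrams "$size" < "$1"
    done | sort | uniq -c | sort -rn
}
```

In a script that takes the document name as its argument, the last line could then be profile "$1". The order of N-grams that share the same frequency is not fixed by this sketch.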
To check whether you have done a good job, here is the beginning of my results for fed1.txt (you can try it yourself). (Remark: if you do this for serious research, it would be helpful to transform all series of more than one space into a single space.)
birot@hagen:~> ngrams ~/Federalist/fed1.txt
5342 _
3576 __
3417 ___
2936 ____
946 e
752 t
618 o
527 i
522 n
506 a
445 s
416 r
363 h
302 e_
297 _t
278 l
246 d
229 th
218 c
214 _th
209 u
203 f
191 _a
175 he
169 m
165 p
165
164 _o
156 the
146 _the
142 s_
135 y
133 t_
133 er
129 n_
125 on
122 he_
118 the_
117 b
113 w
110 _of
108 g
106 d_
106 _i
105 of
105 in
105 ,_
105 ,
103 an
102 f_
100 y_
99 re
98 of_
98 _of_
97 ti
95 en
90 _w
89 v
88 es
86 at
83 r_
83 o_
79 T
75 io
74 te
73 it
(3 points)
2.
Create a model of a desk calculator by writing a Perl program. It should read one argument, then the arithmetic operator (+, -, * or /), then the second argument, and return the result. Then it should get back to the beginning and wait for a new argument. This should go on until one enters an argument that makes the program halt (e.g. stop, exit, quit, halt, bye). When you have to divide, check beforehand that the second argument is not 0 (in the case of an error you can use the bell of the computer!).
Look at the file /users1/birot/calcul for an example of how my program runs, or come to me during the practicum to try out my program.
(4 points)
3.
Write a program in Perl that is a first approach to implementing the Master Mind game. It should do the following:
- You are asked for a number to be found out (colours are replaced with digits 0 to 9; e.g. 4321 would stand for blue-red-green-yellow, if you had colours).
- Then you are asked to guess a number (suppose the other player doesn't see the previously given number).
- For each guess, the number of good digits is given: this is the number of digits that occur both in the guess and in the first given number (to be found out).
- If the guess is exactly the same as the first given number, then the game stops.
Hints:
1. Use a scalar variable (a string) for both the first given number and the guesses.
2. Don't worry about the number of digits (suppose that both players know the number of digits in the number to be found out).
3. Suppose each digit occurs only once in both the number to be found out and in the guesses. (So 3692 is okay, but don't worry about cases like 3355.)
4. Use regular expressions to check if a given digit occurs in a string.
5. Don't worry about telling how many colours (digits) are in the correct place.
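The counting of good digits can be illustrated independently of Perl. The sketch below is a shell function, given only to show the logic; it is not the expected solution, and the name count_good_digits is made up for this illustration. It takes the digits of the guess one by one and uses a regular expression match (here via grep) to check whether the digit occurs in the number to be found out. It relies on hint 3, i.e. that no digit is repeated.

```shell
# count_good_digits SECRET GUESS: print how many digits of GUESS occur in SECRET.
# Assumption: each digit occurs at most once in either string.
count_good_digits() {
    secret=$1
    rest=$2
    good=0
    while [ -n "$rest" ]; do
        digit=${rest%"${rest#?}"}   # first character of what is left
        rest=${rest#?}              # drop that character
        # the digit itself serves as the regular expression
        if printf '%s\n' "$secret" | grep -q -- "$digit"; then
            good=$((good + 1))
        fi
    done
    echo "$good"
}
```

For example, count_good_digits 4321 5629 considers the digits 5, 6, 2 and 9 in turn; only 2 occurs in 4321. In Perl the same check would typically be a pattern match of the digit against the string holding the first given number.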
If you want to, you can go on, and write a real Master Mind program....
Look at the file /users1/birot/mastermind for an example of how my program runs, or come to me during the practicum to try out my program.
(3 points)