Syllabus

The topics of the lectures will be:
 

1st week: Introduction, basics of Unix
2nd week: the file system, basic commands (man, pwd, cd, ls, cat,...); regular expressions for 'ls'; basic text editors (10 minutes of vi; pico), emailing (pine)
3rd week: 'chmod'; 'more', 'less'; pipes and redirection (|, <, >), standard input and output, more commands (tr, wc, sort,...)
4th week: metacharacters, escapes, quotes, coding different alphabets; regular expressions with 'grep'
5th week: emacs, xemacs and emailing with them; shell scripts, 'sed', N-grams, 'cut', 'paste'
6th week: Article ("William B. Cavnar and John M. Trenkle: N-Gram-Based Text Categorization"), making a concordance (KWIC), Zipf's law
7th week: expressions, variables, conditions, loops in shell scripts, 'set'
8th week: Perl: introduction, data types, operators
9th week: Perl (cont'd): regular expressions, arrays, files
10th week: Perl (cont'd): an example; understanding more about UNIX; ftp, ps, kill, compression, tar, find, at, nohup, &,...
11th week: Article ("Gregory Grefenstette, Pasi Tapanainen: What is a word, What is a sentence? Problems of Tokenization"). Time for questions.

There will be a practicum in weeks 2 through 9. The practicum time of the 10th week is reserved for the final assignment. The practicum time of the 11th week will be reserved either for the final assignment as well or for the final exam.

The goal of the practica is to give students the opportunity to work on their assignments under better conditions: the computers are reserved for them, and they can get help from the instructors. The only other requirement is that students hand in their solutions no later than the beginning of the next practicum, whether or not they worked on them during practicum time. Attendance at the practicum is compulsory only in the first few weeks, when solutions must be presented in person rather than sent by email.

Requirements:

The final grade is the sum of the points, divided by ten. Note: one gets the higher grade only from .51, i.e. an average of .50 still results in the lower grade.
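
The rounding rule can be illustrated with a short Python sketch. It is only an illustration, not the official calculation. The function name final_grade is hypothetical, and the sketch assumes that grades are whole numbers and that "from .51" means any fraction strictly above .50.

```python
from decimal import Decimal, ROUND_HALF_DOWN


def final_grade(points):
    """Sum the points, divide by ten, and round so that .50 goes down."""
    # Convert via str() so that values such as 75.1 are represented exactly.
    total = sum(Decimal(str(p)) for p in points)
    average = total / Decimal(10)
    # ROUND_HALF_DOWN rounds to the nearest whole number, with ties
    # (a fraction of exactly .50) going toward zero, i.e. to the lower grade.
    return int(average.quantize(Decimal("1"), rounding=ROUND_HALF_DOWN))


if __name__ == "__main__":
    print(final_grade([30, 25, 20]))    # 75 points: average 7.50
    print(final_grade([30, 25, 20.1]))  # 75.1 points: average 7.51
```

With 75 points the average is exactly 7.50, which stays at the lower grade; with 75.1 points the average is 7.51, which moves up to the higher one.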