Practicum - week 9

Write a PERL program that reads a string and decides the type of it: is it an e-mail address? is it an URL (the address of a web-site)? is it the name of somebody (initial letters, surname, family name, etc.)?

If the task is too hard to formulate with regexps, try to approximate the solution as much as you can. E.g.: in a first approximation, an e-mail address contains a @ symbol, while a person's name is composed of an Uppercase letter, followed by lowercase letters.

Return the type of the input string. Do it as long as the input string is not "exit", "halt" or "stop".

You have this type of checking for instance when you use a newer version of Word: it will automatically detect and e-mail addresses and any URLs, and change them into underlined blue text.

(3 points)

Bubble sort:

Sorting is a basic task in computer science. So far we have used the 'sort' command to do that, and we haven't dealt we the question how this is done.

In fact there exist a number of algorithms for sorting the elements of a list or of an array. If the list or the array has N elements, the complexity of the algorithm is the function of N that gives you in how many steps the sorting can be done. If you would compare each and every elements to all possible other elements, than you would have N(N-1) steps. This is called to be an algorithm of quadratic complexity. If N is small, this is not a big number either, and you don't care about computing time. But if N grows, this will grow much quicker, so you may want to avoid to wait for ages before you get your result. Therefore it is very important to look for algorithms as quick as possible.

One of the quickest algorithm for sorting is called "quick-sort". It is a recursive algorithm, not long, but difficult to understand. (Probably you will soon meet it within the frame of another course.) Its complexity is proportional to N.log(N), which means that it is a very good (i.e. relatively slow) algorithm. Compare the rate how this grows in the function of N to the "explosion" of the steps (time) needed by a quadratic algorithm:

N        N(N-1)      N.logN
10       90           10
100      9900         200
1000     999 000      3000
10 000   99 990 000   40 000

There are a number of algorithms, having an "intermediate" complexity. For instance, an easy one is the following, having a complexity of N(N+1)/2: you go through your array, and you look for the highest one (this needs N comparisions). You pick up the biggest one, and put it into the last position of your new array. Then you go through the remaining N-1 elements (this is N-1 comparision, if you are good enough), and the highest value (not taking into account the one already put into the new array) goes into the last-but-one element of your new array, etc.
This means the number of comparisions (steps) you need is:

N + (N-1) + (N-2) + ... + 2 + 1 = N (N+1)/2

This is a quadratic complexity, due to the double cycle (one cycle inside the other).

The algorithm would be (this way you need to have N^2 steps):

# Your array to be sorted: @old(0..$N-1)
# Suppose it has only positive values
# The result goes to: @new(0..$N-1)
for ( $k = $N ; $k > 0 ; $k--) # you are looking for the k-th smallest element
   {
     $highest = 0;
     $rank_of_highest = 0;
     for ( $i = 0 ; $i < $N ; $i++)   # Notice the double cycle
        {
         if ($old[$i] > $highest)
             {
               $highest = $old[$i];
               $rank_of_highest = $i;
             }                    # end of if
        }                         # end of for (i)
     $new[$k-1] = $highest;
     $old[$rank_of_highest] = 0;   # not to be considered again
}                               # end of for(k)

Now, there is another algorithm, that is called "bubble sort". It has three advantages to the previous one:

1. It is a little bit quicker, needing only N(N-1)/2 steps.
2. It needs less memory, because it sorts an array into itself, and not into another array.
3. It is much more fun.

The idea is the following: the same way as light bubbles come up gradually in a liquid and heavy objects go down, the same way small values will get up, and high values will get down into your array. The idea is the following. First you compare the first and the second element: if the first is higher, they change place (don't forget an intermediary variable!), otherwise you leave them. Then you compare the second and the third element: if the second is higher, they change place, otherwise you leave them, etc. At the end you compare th N-1-th element to the N-th element, and make them change place if the N-1th element is higher. This means N-1 comparisions, and at the end you can be sure that the highest element reaches the last position of the array (why?). In the second cycle you restart comparing the first and the second element, etc., but you don't need to go therefore until the last element, any more. This is only N-2 comparisions only. The third cycle needs N-3 comparisions, etc. Therefore the complexity of this algorithm is:

(N-1) + (N-2) + ... + 2 + 1 = N(N-1) / 2

Your task now is to write a Perl program that reads reads 20 numbers from the keyboard into an array (a number in each line, e.g.), and sorts them, using the "bubble sort" algorithm.

(4 points)

Write a Perl program that simulates the machines used at the cash (the cash desk) in supermarkets. It does the following:

- Prints the name and the address of the supermarket at the top of the bill.
- Waits for the cashier to give in the name of the given products (nowdays this is done by bare code readers, but don't care about this).
- There is a possibility to erase the last entry, if an error has occured. Or, alternatively: enter the entry once more, and substract it from the sum.
- Prints the name and the price of each article on the bill.
- Prints the sum to be payed at the end of the bill.

Since you don't want to rewite your program whenever you install this program to a new supermarket, or whenever you change the price of your products, you should read the name and the address from a file (so that it will be enough to replace the file), and similarely, you should read the name and price of the products from another file into (I suggest) a hash-table.

I suggest to communicate with the cashier through the standard error, and to send the bill print to the standard output. So if you run the program by redirecting your standard output, you would get your bill in a file.

When sending the solution to Mariette, attach also a sample of these two files, in order to show what format these files should have.

(3 points)