Introduction to Programming with MPI
------------------------------------

Practical Exercise 09 (Miscellaneous Guidelines)
------------------------------------------------

For general instructions, see the introduction to the collective
practicals.


Except for the first, these examples are about how to handle I/O, and
they rely on you understanding much of what has gone before.  The first
is a trivial question on transferring structures.

It is tricky to test system-dependent examples (like these) properly.
They really need a more user-hostile system, and to be embedded in
complicated code, because they will usually work in simple cases even if
there are quite serious mistakes.  So don't rely on them actually
working, even when they seem to.

There is generally no point in sending data from a process to
itself, and an I/O process will normally read and write its own data
directly.  If you are using blocking, unbuffered transfers, you will
have to do that.  If you want practice with non-blocking transfers, you
can repeat these exercises using them and treating the I/O process
symmetrically, but there are no worked examples of that.

For now, regard the root process as process zero.


Question 1
----------

1.1 Take a copy of the program you wrote in question 1.1 of practical
exercise 3, which was called 'Ahab'.  Change the file name from
reals.input to structs.input; each line of the new file contains an
integer and a space-free string of up to 4 characters.  Make the
appropriate changes and run the program.

C and C++ programmers:

Change the array from 'double' to 'struct{int index; char name[5];}',
and use MPI_BYTE and sizeof().
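
As a rough illustration of the sizeof() approach (a sketch only, not the
worked answer), the byte count to use with MPI_BYTE could be computed as
shown here.  The names 'struct record' and 'record_bytes' are invented
for this example; the transfer call is whatever 'Ahab' already uses,
with the count replaced by the byte count and the datatype by MPI_BYTE.

```c
#include <stddef.h>

/* Hypothetical name for the structure described in the question. */
struct record {
    int index;
    char name[5];   /* up to 4 characters plus the terminating '\0' */
};

/* Number of bytes occupied by n records.  sizeof() includes any padding
   the compiler inserts, so this is the count to pass with MPI_BYTE.
   Transferring raw bytes assumes all processes use the same layout. */
size_t record_bytes(size_t n)
{
    return n * sizeof(struct record);
}
```

The result would be converted to 'int' for the count argument, which is
safe for the small arrays used in these practicals.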

Fortran programmers:

Change the array from 'REAL(KIND=DP)' to 'TYPE(Struct)', where the
latter is defined by:

    TYPE :: Struct
        INTEGER :: index
        CHARACTER(LEN=4) :: name
    END TYPE Struct

and use MPI_BYTE.  Finding the size of a derived type cannot be done
cleanly until the forthcoming Fortran 2008 standard, so don't bother,
and just define a PARAMETER for the size with the value 8.  It is
possible without that, but this course doesn't teach disgusting hacks,
except where critical.


Question 2
----------

2.1 Write a program to read one line (WITHOUT any terminating newline)
from the file 'Programs/textfile.<n>', where <n> is the process number
(i.e. a separate file for each process).  Each line is no longer than 80
characters, but all are of different length.  Use process 0 for all
output.

Write some collective-like code where the root process collects each
line from each process, and prints them in rank order.  Pad all lines to
80 characters to do this.


2.2 Change the program to transfer only the significant text (i.e.
excluding any trailing spaces, padding and terminating characters).  You
need not reallocate the receiving buffer each time, though you should do
that if the messages were of genuinely arbitrary length.
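
One possible way of finding the significant length before sending is
sketched here.  The function name 'significant_length' is invented for
this example, and treating spaces, newlines, carriage returns and null
characters as insignificant is an assumption about the input files.  On
the receiving side, MPI_Get_count can report how many characters
actually arrived.

```c
#include <stddef.h>

/* Return the length of 'line' after dropping trailing spaces, newlines,
   carriage returns and null characters.  'length' is the number of
   characters currently held in the buffer, including any padding. */
size_t significant_length(const char *line, size_t length)
{
    while (length > 0) {
        char c = line[length - 1];
        if (c != ' ' && c != '\n' && c != '\r' && c != '\0')
            break;
        --length;
    }
    return length;
}
```

Embedded spaces are left alone; only the tail of the buffer is trimmed.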


2.3 Change the program to read in the whole of each file, and send each
line separately to the root process, which should print them in rank
order, with a blank line after each.  You need to use point-to-point and
buffered sends, and should terminate the file by sending a special line,
which you don't display, such as:

    +++ End Of Transmission +++
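
Because the received text need not be null-terminated, the root process
has to compare it with the special line by length as well as content.
A minimal sketch follows; the names 'END_MARKER' and 'is_end_marker' are
invented for this example, and 'count' is assumed to be the number of
characters received (for instance, as reported by MPI_Get_count).

```c
#include <string.h>

#define END_MARKER "+++ End Of Transmission +++"

/* Return 1 if the 'count' characters at 'text' are exactly the end
   marker, and 0 otherwise.  'text' need not be null-terminated. */
int is_end_marker(const char *text, size_t count)
{
    size_t n = strlen(END_MARKER);
    return count == n && memcmp(text, END_MARKER, n) == 0;
}
```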


2.4 Change the program so that the root process calls a procedure that
will print all lines it has received, in whatever order they have come
in.  Use MPI_Iprobe to determine whether there is nothing to collect
and, if so, return immediately.  You should use a distinctive tag for
this purpose, and probe and collect only the messages with that tag.

Do not send the end of transmission line, and loop indefinitely calling
that procedure.  You will have to interrupt the program by hand when it
has finished.
Obviously, in a real program, some separate criterion needs to be used
to determine when all processes have finished and I/O has been
collected.

This method will work even if messages are sent to the root process
while it is involved in other work, provided that the buffer doesn't
fill up.

People who know how to do it in their language should also make the
procedure capable of handling strings of any length; this is not hard in any
of the languages, but needs skills that this course does not assume.

Make a safe copy of this program and call it Tashtego.


Question 3
----------

3.1 Write another program that reads in an integer <count> in the range
1 to 1,000,000 into the root process, broadcasts it to all processes,
and writes out lines of the form "Process <rank> line <number>" to
standard output on each process, with <rank> being the process number
and <number> going from 1 to <count>.

C and C++ programmers should add the following statements immediately
before the call to MPI_Init:

    C:         setvbuf(stdout,NULL,_IOFBF,100);
    C++:       unitbuf(cout);

Test this by specifying <count> to be about 100, and see the mess it
makes of the output.  The effects are very system-dependent, and you
may need to use a much larger value to see them on some systems.


3.2 Change this program to write out lines to the standard output
only from the root process, using the same technique as in the program
you wrote in question 2.4, which was called 'Tashtego'.  You will
probably find it easier to start with 'Tashtego'.  Check that it no
longer produces the same messy output.

Note that this is NOT a safe way of doing this - if the root process is
busy with its own activities and any of the buffers fill up, the program
may fail.  It is safer if the root process does nothing except I/O, but
even that is not entirely safe.  Also note that the changes from the
previous answer are considerable, and it is more similar to 'Tashtego'.


3.3 Repeat the previous question, but use synchronous sends in all
processes except the root process, and pick up the messages in the I/O
process using non-blocking receives and MPI_Waitany.

This will work reliably, provided that the root process uses no other
MPI communication.  It is possible to make it fully reliable when it
uses other MPI communication, but describing the constraints is
complicated.  You have learnt enough to deduce them, but they aren't
obvious.

This is generally the best method for use in real programs.  Obviously,
when doing it for real, you should not send from the root process to
itself, and should use ordinary sends, not synchronous ones.