Introduction to Programming with OpenMP
---------------------------------------

Practical Exercise 7 (Critical Guidelines)
------------------------------------------

Use similar commands to compile and run the program as in practical
exercise 2.

This exercise is provided mainly to give you a test harness for some hard
tuning issues.  The example code is in Fortran only, unfortunately, so C
programmers would have to transliterate it.  You will not find it easy to
get good performance, though the 2-D FFT using transposition is by far the
easiest case.

No answers are given for this exercise, as it really is something intended
only for people who want to investigate the tuning of difficult codes.
Translation: I ran out of time, and my preliminary tuning really wasn't
worth providing as an example.


Question 1
----------

1.1 Starting with Programs/FFT.f90, improve its performance.  You should
run it with a single argument, Programs/matrices_f_2900 or
Programs/matrices_c_2900, according to which language you use.

Put OpenMP parallel loop directives around any loops you feel are
appropriate, with no other code changes, taking great care to parallelise
only loops where doing so will not introduce race conditions.

Note that this seriously degrades the 1-D FFT performance, and improves the
2-D performance less than you would hope.

1.2 Assuming that the loops you parallelised were in the Transpose and
Pass... procedures, you can try any or all of the following:

    Using a temporary array or a blocking algorithm in Transpose (done
    right, the blocking algorithm is best).

    Changing the order of the loops in the Pass... procedures, and which
    ones are parallelised.

    Using the if clause on the parallel loop directives.

1.3 Really determined users can try radical restructuring of the code, but
this is a notoriously hard tuning problem.
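
As a rough illustration of the blocking option suggested for Transpose in
1.2, combined with the if clause, here is a minimal sketch in C (for those
transliterating).  It is not an answer to the exercise and has not been
tuned.  The names transpose_blocked, BLOCK and PAR_THRESHOLD are
hypothetical, as are their values, and the sketch assumes a square,
row-major matrix of doubles written to a separate output array, which may
not match how Transpose in FFT.f90 is actually organised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values, not taken from FFT.f90; both need tuning. */
#define BLOCK 32            /* edge length of one tile */
#define PAR_THRESHOLD 256   /* below this size, run the loop serially */

/*
 * Blocked out-of-place transpose: dst[j][i] = src[i][j] for an n x n
 * row-major matrix.  Working tile by tile keeps both the reads and the
 * writes within a small region of memory at any one time.
 *
 * Each iteration of the parallel loop handles one band of BLOCK rows of
 * src, which becomes a band of BLOCK columns of dst, so no two
 * iterations ever write the same element of dst.
 */
void transpose_blocked(size_t n, const double *src, double *dst)
{
    long nb = (long)((n + BLOCK - 1) / BLOCK);  /* number of row bands */
    long ib;

    assert(src != dst);  /* this sketch does not work in place */

    /* The if clause avoids the threading overhead for small matrices. */
    #pragma omp parallel for schedule(static) if(n >= PAR_THRESHOLD)
    for (ib = 0; ib < nb; ++ib) {
        size_t i0 = (size_t)ib * BLOCK;
        size_t i1 = (i0 + BLOCK < n) ? i0 + BLOCK : n;

        for (size_t j0 = 0; j0 < n; j0 += BLOCK) {
            size_t j1 = (j0 + BLOCK < n) ? j0 + BLOCK : n;

            /* Transpose one tile, clipped at the matrix edge. */
            for (size_t i = i0; i < i1; ++i) {
                for (size_t j = j0; j < j1; ++j) {
                    dst[j * n + i] = src[i * n + j];
                }
            }
        }
    }
}
```

If the compiler's OpenMP option is not enabled, the pragma is ignored and
the routine simply runs serially, which is a convenient way to check that
the blocking alone gives the same results before looking at the timing.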