Using ex2f to determine server's MPI performance (resend)

Mon Jul 7 19:31:09 CDT 2008

Hi,

I sent this question a while ago, but it was not answered. If the 
question below is not that relevant, maybe I can ask just one simple question:

Should the ex2f code with about 1000000 unknowns scale properly, maybe 
up to 8 or 16 processors, if the server distributes the MPI load properly?

Thank you very much.

Here's the previous email:

I'm thinking of using the example code ex2f to measure my school 
server's MPI performance. Since my own code uses a 1200x1080 grid, I set 
m and n in ex2f.F to 1200 and 1080 respectively. I ran the code 
on 4 processors and found that the job takes a very long time to 
complete (> 0.5 hr).

I then used Hypre as the preconditioner and the run completed in a few 
minutes. I decided to use this to gauge my server's scaling performance. 
My own code also uses Hypre. However, in order to increase the runtime 
and make the timing less prone to random timing errors, I added

do k=1,100
   call KSPSolve(ksp,b,x,ierr)
end do

to repeat the solve 100 times. Is this fine (will there be any 
cache-related problems)? If not, is there a better way? I need to keep 
the grid size fixed so that I can compare with my own code, which uses 
the same number of grid points.

Anyway, I ran the code and I got this result:

processors   KSPSolve time in s (from -log_summary)   wall time (min:sec)

 1           4.6e2                                    7:50
 2           2.4e2                                    4:09
 4           1.3e2                                    2:20
 8           1.1e2                                    2:23
12           7.7e1                                    1:24

Does this mean that my school server scales poorly beyond 4 processors? 
I reran the code on 8 processors and the new timing was even longer.
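
To make the question more concrete, here is a small Python sketch (not part 
of ex2f; the names ksp_times and scaling are just my own for illustration) 
that turns the KSPSolve times in the table above into speedup and parallel 
efficiency, taking the 1-processor run as the baseline:

```python
# KSPSolve times in seconds, copied from the table above,
# keyed by number of processors.
ksp_times = {1: 4.6e2, 2: 2.4e2, 4: 1.3e2, 8: 1.1e2, 12: 7.7e1}


def scaling(times):
    """Return {p: (speedup, efficiency)} relative to the smallest run.

    speedup    = T(base) / T(p)
    efficiency = speedup * base / p   (1.0 would be ideal scaling)
    """
    base_p = min(times)
    base_t = times[base_p]
    result = {}
    for p, t in sorted(times.items()):
        speedup = base_t / t
        efficiency = speedup * base_p / p
        result[p] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for p, (s, e) in scaling(ksp_times).items():
        print(f"{p:>3d} procs: speedup {s:5.2f}, efficiency {e:6.1%}")
```

If I have done the arithmetic right, the efficiency comes out at roughly 
96% on 2 processors, 88% on 4, 52% on 8 and 50% on 12.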

Thank you very much.