Slow speed after changing from serial to parallel (with ex2f.F)

Satish Balay balay at mcs.anl.gov
Sat Apr 19 08:52:51 CDT 2008


On Sat, 19 Apr 2008, Ben Tay wrote:

> Btw, I'm not able to try the latest mpich2 because I do not have the
> administrator rights. I was told that some special configuration is
> required.

You don't need admin rights to install/use MPICH with the options I
mentioned. I was suggesting just running in SMP mode on a single
machine [from 1-8 procs on Quad-Core Intel Xeon X5355, to compare with
my SMP runs] with:

./configure --with-device=ch3:nemesis:newtcp --with-pm=gforker

> Btw, should there be any difference in speed whether I use mpiuni and
> ifort or mpi and mpif90? I tried on ex2f (below) and there's only a
> small difference. If there is a large difference (mpi being slower),
> then does it mean there's something wrong in the code?

For one - you are not using MPIUNI. You are using
--with-mpi-dir=/lsftmp/g0306332/mpich2. However - if compilers are the
same & compiler options are the same, I would expect the same
performance in both cases. Do you get such different times for
different runs of the same binary?

MatMult 384 vs 423
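
To put those two figures in perspective, here is a minimal sketch that
expresses the gap as a fraction of the slower value. It assumes 384 and
423 are the MatMult Mflop/s rates from the two logs; relative_spread is
a hypothetical helper name, not something from PETSc.

```python
def relative_spread(rates):
    """Return (max - min) / min for a list of measured rates."""
    if not rates or min(rates) <= 0:
        raise ValueError("need at least one positive rate")
    return (max(rates) - min(rates)) / min(rates)


# The two MatMult figures quoted above (assumed to be Mflop/s).
spread = relative_spread([384, 423])
print(f"spread: {spread:.1%}")
```

If repeated runs of one binary on one node already vary by about this
much (roughly 10% here), the difference between the two builds is
probably noise.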

What if you run both of the binaries on the same machine? [as a single
job?].

If you are using the PBS scheduler, I suggest:
- qsub -I [to get interactive access to the nodes]
- log in to each node to check that no one else is using the scheduled nodes.
- run multiple jobs during this single allocation for comparison.
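
The last step can be scripted so that both binaries run back to back on
the same nodes. A rough sketch, assuming Python is available on the
compute node; time_command is a hypothetical helper, and the binary
names and mpiexec command line in the comment are placeholders, not the
actual setup from this thread.

```python
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd (a list of arguments) `repeats` times; return wall times in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the run fails, so a broken run is not timed silently.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


# Hypothetical usage inside one interactive allocation (names are placeholders):
#   for exe in ["./ex2f-build-a", "./ex2f-build-b"]:
#       print(exe, time_command(["mpiexec", "-n", "1", exe, "-m", "600", "-n", "600"]))
```

Wall time includes process startup, so the per-event figures in the
PETSc log remain the better measure; this only checks whether repeated
runs are consistent with each other.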

These are general tips to help you debug performance on your cluster.

BTW: I get:
ex2f-600-1p.log:MatMult             1192 1.0 9.7109e+00 1.0 3.86e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 11  0  0  0  14 11  0  0  0   397

You get:
log.1:MatMult             1879 1.0 2.8137e+01 1.0 3.84e+08 1.0 0.0e+00 0.0e+00 0.0e+00 12 11  0  0  0  12 11  0  0  0   384


There is a difference in the number of iterations. Are you sure you are
using the same ex2f with -m 600 -n 600 options?
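
The comparison of the two log lines can be done mechanically. A minimal
sketch, assuming the event-line layout seen above (event name, call
count, ratio, max time, ratio, max flops, ratio, ..., with Mflop/s as the
last column) and a grep-style "file:" prefix; parse_event_line is a
hypothetical helper, not part of PETSc.

```python
def parse_event_line(line):
    """Extract a few columns from one PETSc event line of the form shown above."""
    fields = line.split()
    # grep prefixes each match with "filename:"; strip it if present.
    fields[0] = fields[0].split(":", 1)[-1]
    return {
        "event": fields[0],
        "count": int(fields[1]),      # number of calls
        "time_s": float(fields[3]),   # max time over processes
        "mflops": float(fields[-1]),  # last column: Mflop/s
    }


mine = parse_event_line(
    "ex2f-600-1p.log:MatMult             1192 1.0 9.7109e+00 1.0 3.86e+09 1.0 "
    "0.0e+00 0.0e+00 0.0e+00 14 11  0  0  0  14 11  0  0  0   397"
)
yours = parse_event_line(
    "log.1:MatMult             1879 1.0 2.8137e+01 1.0 3.84e+08 1.0 "
    "0.0e+00 0.0e+00 0.0e+00 12 11  0  0  0  12 11  0  0  0   384"
)

if mine["count"] != yours["count"]:
    print(f"MatMult counts differ: {mine['count']} vs {yours['count']}")
```

A differing MatMult count for the same solver options usually points to
a different problem size or a different number of iterations, which is
worth ruling out before comparing rates.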

Satish

