[MPICH] slowdown in replica exchange dynamics calculation

Tue Feb 19 15:06:56 CST 2008

> I am running replica exchange combined with langevin dynamics
> using a single fortran program.  For example, on a set of 10
> nodes, a langevin dynamics trajectory is calculated on each
> node for 10,000 steps.  After all the nodes have completed
> their number of steps, the nodes communicate with one another
> via MPI and determine whether to swap temperatures.  Once
> temperatures have been swapped, the pattern repeats.
> 
> It turns out though, that the dynamics calculation portion of
> the above program takes longer than it would if I were to just
> run a langevin dynamics calculation on a single node for 10000
> steps.  For 10 nodes, it is approximately 5-10X slower.  While
> no MPI commands are being executed during this time, the
> duration of the dynamics portion appears to depend on the
> number of nodes.  For this reason, it was suspected that this
> slowdown may be due to a suboptimal distribution of tasks,
> unnecessary barriers, or IO waits.

Is the computation in the single-node case identical to the computation on
10 nodes? If not, you can't really compare.

If there was no MPI in this phase, there should be no barriers either, and
even suboptimal distribution of tasks should not cause a slowdown if it is
all local computation. Is there any file I/O?
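
One rough way to check for I/O or scheduling waits is to record both wall-clock
time and CPU time around the dynamics phase on each process. If wall time is
much larger than CPU time, the process is waiting on something (file I/O, or
sharing a CPU with another process) rather than computing. Below is a minimal
sketch of the idea in Python; fake_dynamics and fake_io_wait are hypothetical
stand-ins, not your code. In the Fortran program the analogous calls would be
MPI_Wtime for wall time and the cpu_time intrinsic for CPU time.

```python
import time

def time_phase(work):
    """Run work() and return (wall_seconds, cpu_seconds) for this process."""
    wall0 = time.perf_counter()   # wall-clock timer
    cpu0 = time.process_time()    # CPU time used by this process only
    work()
    return time.perf_counter() - wall0, time.process_time() - cpu0

def fake_dynamics(steps=200_000):
    # Hypothetical stand-in for the Langevin steps: pure local arithmetic.
    x = 0.0
    for i in range(steps):
        x += (i % 7) * 1e-3
    return x

def fake_io_wait(seconds=0.2):
    # Hypothetical stand-in for a blocking file or network wait.
    time.sleep(seconds)

if __name__ == "__main__":
    for name, phase in [("compute", fake_dynamics), ("io wait", fake_io_wait)]:
        wall, cpu = time_phase(phase)
        # Wall time close to CPU time suggests real computation;
        # wall time far above CPU time suggests waiting.
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

If the two numbers stay close in the 10-node run but both are larger than in
the single-node run, the extra time is real computation, which points back to
the first question about whether the two cases do the same work.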

Rajeev