[MPICH] slowdown in replica exchange dynamics calculation

Rajeev Thakur thakur at mcs.anl.gov
Tue Feb 19 16:26:53 CST 2008


The I/O could be the culprit. Disable the writes, ignore the initial read
time, and see whether the 10-node run speeds up. (It should.) If it does,
you will need a file system faster than NFS-mounted home directories; you
could look into PVFS, www.pvfs.org.
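
A minimal sketch of the kind of per-rank timing that would separate the
compute phase from the archive writes is below (in C; run_dynamics() and
write_archive() are placeholders for your Fortran subroutine and coordinate
dump, not routines from your actual program):

#include <stdio.h>
#include <mpi.h>

/* Placeholder for the Langevin do-loop: burns some CPU locally. */
static void run_dynamics(int steps)
{
    volatile double x = 0.0;
    for (int i = 0; i < steps * 10000; i++)
        x += 1e-9;
}

/* Placeholder for the periodic coordinate dump to the archive file. */
static void write_archive(int rank, int cycle)
{
    char name[64];
    snprintf(name, sizeof(name), "archive.%d", rank);
    FILE *f = fopen(name, "a");
    if (f) {
        fprintf(f, "cycle %d\n", cycle);
        fclose(f);
    }
}

int main(int argc, char **argv)
{
    int rank;
    double t0, t_compute = 0.0, t_io = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int cycle = 0; cycle < 20; cycle++) {
        t0 = MPI_Wtime();
        run_dynamics(500);              /* local computation only */
        t_compute += MPI_Wtime() - t0;

        t0 = MPI_Wtime();
        write_archive(rank, cycle);     /* touches the shared file system */
        t_io += MPI_Wtime() - t0;
    }

    printf("rank %d: compute %.2f s, I/O %.2f s\n", rank, t_compute, t_io);
    MPI_Finalize();
    return 0;
}

If each rank prints its compute and I/O totals separately, you can see
directly whether the extra time on 10 nodes is going into the writes.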

Rajeev

> -----Original Message-----
> From: baxa at uchicago.edu [mailto:baxa at uchicago.edu] 
> Sent: Tuesday, February 19, 2008 4:14 PM
> To: Rajeev Thakur; mpich-discuss at mcs.anl.gov
> Subject: RE: [MPICH] slowdown in replica exchange dynamics calculation
> 
> Hi Rajeev,
> 
> Yes.  Both the single-node computation and the 10-node computation are
> do-loops calling the same subroutine.
> 
> As for input and output, both programs initially read in a
> coordinate file and various parameter files.  Every fixed
> number of timesteps (e.g. every 500 steps), the coordinates are
> written to an archive file.
> affected by decreasing the number of time frames saved to the
> archive file.
> 
> Regards,
> 
> Michael
> 
> ---- Original message ----
> >Date: Tue, 19 Feb 2008 15:06:56 -0600
> >From: "Rajeev Thakur" <thakur at mcs.anl.gov>  
> >Subject: RE: [MPICH] slowdown in replica exchange dynamics calculation
> >To: <baxa at uchicago.edu>, <mpich-discuss at mcs.anl.gov>
> >
> >> I am running replica exchange combined with langevin dynamics
> >> using a single fortran program.  For example, on a set of 10
> >> nodes, a langevin dynamics trajectory is calculated on each
> >> node for 10,000 steps.  After all the nodes have completed
> >> their number of steps, the nodes communicate with one another
> >> via MPI and determine whether to swap temperatures.  Once
> >> temperatures have been swapped, the pattern repeats.
> >> 
> >> It turns out, though, that the dynamics calculation portion of
> >> the above program takes longer than it would if I were to just
> >> run a langevin dynamics calculation on a single node for 10000
> >> steps.  For 10 nodes, it is approximately 5-10X slower.  While
> >> no MPI commands are being executed during this time, the
> >> duration of the dynamics portion appears to depend on the
> >> number of nodes.  For this reason, it was suspected that this
> >> slowdown may be due to a suboptimal distribution of tasks,
> >> unnecessary barriers, or IO waits.
> >
> >Is the computation in the single-node case identical to the computation
> >on 10 nodes? If not, you can't really compare.
> >
> >If there is no MPI in this phase, there should be no barriers either, and
> >even a suboptimal distribution of tasks should not cause a slowdown if it
> >is all local computation. Is there any file I/O?
> >
> >Rajeev
> >
> >
> >
> 
> 
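
For reference, a minimal sketch of the compute/exchange cycle described in
the quoted message is below. The subroutine name, the use of MPI_Allgather,
and the simplified swap test are illustrative assumptions, not details of
the actual Fortran program; the point is that phase 1 is purely local and
only phase 2 touches MPI.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* Placeholder for the Langevin integrator; returns a stand-in "energy". */
static double run_dynamics(int steps)
{
    volatile double e = 0.0;
    for (int i = 0; i < steps * 1000; i++)
        e += 1e-9;
    return e;
}

int main(int argc, char **argv)
{
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double temperature = 300.0 + 10.0 * rank;   /* one replica per rank */
    double *energies = malloc(nprocs * sizeof(double));
    double *temps    = malloc(nprocs * sizeof(double));

    for (int cycle = 0; cycle < 10; cycle++) {
        /* Phase 1: purely local dynamics, no MPI traffic. */
        double e = run_dynamics(10000);

        /* Phase 2: share energies and temperatures, then apply a
           simplified, deterministic stand-in for the Metropolis test. */
        MPI_Allgather(&e, 1, MPI_DOUBLE, energies, 1, MPI_DOUBLE,
                      MPI_COMM_WORLD);
        MPI_Allgather(&temperature, 1, MPI_DOUBLE, temps, 1, MPI_DOUBLE,
                      MPI_COMM_WORLD);

        int partner = (cycle % 2 == rank % 2) ? rank + 1 : rank - 1;
        if (partner >= 0 && partner < nprocs) {
            double delta = (1.0 / temps[rank] - 1.0 / temps[partner])
                         * (energies[rank] - energies[partner]);
            if (delta > 0.0)     /* both partners reach the same decision */
                temperature = temps[partner];
        }
    }

    printf("rank %d finished at T = %.1f\n", rank, temperature);
    free(energies);
    free(temps);
    MPI_Finalize();
    return 0;
}

Timing phase 1 alone on 1 node and on 10 nodes should give nearly identical
numbers if no file I/O is involved; if it does not, the difference has to be
coming from something outside the MPI calls.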



