[mpich-discuss] mpirun on 1500~2000 cores

Rajeev Thakur thakur at mcs.anl.gov
Sat Jul 4 23:24:33 CDT 2009


A recently applied fix significantly reduces the time taken by MPI_Init on large numbers of processes when using the
Nemesis communication channel and the MPD process manager (both are the default options). The fix will be included in the 1.1.1
release, due out late next week.

In the meantime, please try one of the nightly snapshots of the svn source from
www.mcs.anl.gov/mpich2/downloads/tarballs/nightly/trunk/ and let us know whether it improves the time taken to start your job.
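
One way to compare startup time before and after the fix is to time the launch command from a small wrapper. The following is only a minimal sketch in Python: `time_launch` is a hypothetical helper, not part of MPICH2, and the launch command shown in the comment is an assumed example. It measures the wall-clock time of the whole run, so it is most informative with a trivial program that does little more than call MPI_Init and MPI_Finalize.

```python
import subprocess
import time


def time_launch(cmd, repeats=3):
    """Run `cmd` (a list of arguments) `repeats` times.

    Returns the wall-clock duration of each run in seconds.
    Raises subprocess.CalledProcessError if a run exits with a non-zero status.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the job's output so that printing does not skew the timing.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


# Assumed example (adjust the launcher, process count, and binary to your site):
#   runs = time_launch(["mpiexec", "-n", "1500", "./init_only"])
#   print("best of", len(runs), "runs:", round(min(runs), 2), "s")
```

Running the same command against the current install and against a nightly snapshot, and reporting the best of a few runs for each, gives a like-for-like number to send back to the list.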

Thanks,
Rajeev


 

> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov 
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of dvg
> Sent: Saturday, July 04, 2009 10:04 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] mpirun on 1500~2000 cores
> 
> Hello,
> 
> What would be considered a reasonable time for mpirun to 
> start a job on 1500~2000 cores on a 1 GigE cluster?
> 
> Are there any kernel (Linux) or Ethernet-related parameters that 
> can be tuned to speed it up?  The MPICH2 libraries were compiled 
> with most/all optimization options enabled.
> 
> Thank you,
> Dmitry
> 
> 


