[mpich-discuss] MPICH2 (or MPI_Init) limitation | scalability

Darius Buntinas buntinas at mcs.anl.gov
Tue Jan 10 12:20:36 CST 2012


I think Dave has the right idea.  You may not have enough shared memory available to support that many processes.  MPICH2 can allocate shared memory in two ways: System V or mmap.  System V typically has very low limits on the size of shared memory regions, so we use mmap by default.  To make sure mmap is being used, send us the output of:

grep "shared memory" src/mpid/ch3/channels/nemesis/config.log

Thanks,
-d


On Jan 10, 2012, at 9:39 AM, Dave Goodell wrote:

> On Jan 10, 2012, at 7:20 AM, Bernard Chambon wrote:
> 
>> On Jan 10, 2012, at 00:52, Dave Goodell wrote:
>> 
>>> Make sure you include a call to MPI_Finalize in your test program as well.
>>> 
>>> -Dave
>> 
>> 
>> I'm afraid that's not the problem.
>> 
>> The question is whether there is a limitation in the MPICH2 software or, more probably, in my OS or machine, but I can't find it.
>> There is clearly a limit at 152 tasks, even after removing the limits (*) and increasing the shared memory values (**).
> [snip]
>> >sysctl -A | grep kernel.sh
>> kernel.shmmni = 16000
>> kernel.shmall = 8388608000
>> kernel.shmmax = 33554432
> 
> 
> This number (shmmax) looks too low to me.  It's only 32 MiB, which is pretty small.  Try increasing it by a factor of 8 or so and see if that fixes your problem.
> 
> How many processes are you launching on each node?
> 
> -Dave
> 
> _______________________________________________
> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
> To manage subscription options or unsubscribe:
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
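
For reference, Dave's suggested factor-of-8 increase works out as in the sketch below. It only does the arithmetic on the shmmax value quoted in the thread; the variable names are made up for illustration, and the sysctl line is left as a comment because it needs root. Note that shmmax limits System V shared memory only, so it should matter only if the build is not using mmap.

```sh
# shmmax value reported in the thread, in bytes (32 MiB).
current=33554432

# Factor suggested by Dave ("a factor of 8 or so").
factor=8

# Proposed new limit, in bytes and in MiB.
new=$((current * factor))
new_mib=$((new / 1024 / 1024))

echo "current shmmax:  $current bytes ($((current / 1024 / 1024)) MiB)"
echo "proposed shmmax: $new bytes ($new_mib MiB)"

# To apply until the next reboot (as root; not executed here):
#   sysctl -w kernel.shmmax=$new
```

A setting made with sysctl -w does not survive a reboot; making it permanent usually means adding it to /etc/sysctl.conf, though the exact file can vary by distribution.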


