[mpich-discuss] MPICH2 (or MPI_Init) limitation | scalability
Bernard Chambon
bernard.chambon at cc.in2p3.fr
Tue Jan 10 09:20:46 CST 2012
On 10 Jan 2012, at 00:52, Dave Goodell wrote:
> Make sure you include a call to MPI_Finalize in your test program as well.
>
> -Dave
I'm afraid that is not the problem.
The question is whether there is a limitation in the MPICH2 software or, more probably, in my OS or machine, but I can't find it.
There is clearly a limit at 152 tasks, even after removing the shell limits (*) and increasing the shared memory values (**):
>mpiexec -genvall -profile -np 152 bin/my_test (my_test = MPI_Init + MPI_Finalize)
================================================================================
[mpiexec@ccdvli10] Number of PMI calls seen by the server: 306
================================================================================
>mpiexec -genvall -profile -np 153 bin/my_test
Assertion failed in file /scratch/BC/mpich2-1.4.1p1/src/util/wrappers/mpiu_shm_wrappers.h at line 889: seg_sz > 0
internal ABORT - process 0
[proxy:0:0@ccdvli10] send_cmd_downstream (./pm/pmiserv/pmip_pmi_v1.c:80): assert (!closed) failed
[proxy:0:0@ccdvli10] fn_get (./pm/pmiserv/pmip_pmi_v1.c:349): error sending PMI response
[proxy:0:0@ccdvli10] pmi_cb (./pm/pmiserv/pmip_cb.c:327): PMI handler returned error
[proxy:0:0@ccdvli10] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@ccdvli10] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
[mpiexec@ccdvli10] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
[mpiexec@ccdvli10] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@ccdvli10] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@ccdvli10] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
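
For anyone who wants to locate such a threshold on their own machine without trying every value by hand, here is a minimal sketch in Python. It assumes that mpiexec exits with a nonzero status when the run aborts, and that failures are monotonic (once a process count fails, every larger count fails too). The function names runs_ok and largest_working_np and the upper bound of 1024 are hypothetical, chosen only for illustration; bin/my_test is the test program from the commands above.

```python
import subprocess


def runs_ok(np, program="bin/my_test"):
    """Return True if mpiexec exits with status 0 for this process count.

    Assumption: an aborted run makes mpiexec return a nonzero status.
    """
    result = subprocess.run(
        ["mpiexec", "-genvall", "-np", str(np), program],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_np(ok, lo=1, hi=1024):
    """Binary search for the largest n in [lo, hi] for which ok(n) is true.

    Assumes failures are monotonic. Returns None if even lo fails.
    """
    if not ok(lo):
        return None
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if ok(mid):
            lo = mid       # mid works, so the answer is mid or larger
        else:
            hi = mid - 1   # mid fails, so the answer is below mid
    return lo
```

Calling largest_working_np(runs_ok) would run mpiexec roughly ten times for the default range. The search itself can be checked without MPI by passing any predicate, for example lambda n: n <= 152.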
(*)
>limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 1000000
memorylocked unlimited
maxproc 409600
(**)
>sysctl -A | grep kernel.sh
kernel.shmmni = 16000
kernel.shmall = 8388608000
kernel.shmmax = 33554432
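
To make these numbers easier to read, the sketch below parses the sysctl output and converts it to MiB. On Linux, kernel.shmmax is the maximum size of a single segment in bytes, kernel.shmall is the system-wide total in pages, and kernel.shmmni is the maximum number of segments. The 4096-byte page size is an assumption (check with getconf PAGE_SIZE), and the helper names parse_sysctl and summarize are hypothetical.

```python
# The three values as printed by `sysctl -A | grep kernel.sh` above.
SYSCTL_OUTPUT = """\
kernel.shmmni = 16000
kernel.shmall = 8388608000
kernel.shmmax = 33554432
"""

MIB = 1024 * 1024


def parse_sysctl(text):
    """Turn 'key = value' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = int(value.strip())
    return values


def summarize(values, page_size=4096):
    """Convert the limits to MiB. page_size is an assumed value."""
    return {
        "max_segment_mib": values["kernel.shmmax"] / MIB,              # bytes -> MiB
        "max_total_mib": values["kernel.shmall"] * page_size / MIB,    # pages -> MiB
        "max_segments": values["kernel.shmmni"],
    }


summary = summarize(parse_sysctl(SYSCTL_OUTPUT))
for name, value in summary.items():
    print(name, value)
```

With these inputs the per-segment limit comes out as 32 MiB, while the system-wide total is far larger.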
Best regards
---------------
Bernard CHAMBON
IN2P3 / CNRS
04 72 69 42 18