[mpich-discuss] Failure when specifying -iface

Bernard Chambon bernard.chambon at cc.in2p3.fr
Fri Jan 13 09:52:37 CST 2012


Hi,

I get a failure when specifying -iface. It may also depend on the number of tasks, but I am not sure about that last point.

To be more precise: here is a small test between 2 machines, with no resource limits (*), using the following machines file:
>more /tmp/machines 
ccwpge0061:128
ccwpge0062:128
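
As a rough illustration of what this file implies for -n 150, here is a minimal Python sketch (not part of the test program). It assumes the block placement described in the Hydra documentation for host:slots entries, where each host is filled up to its slot count before the next host is used. The helper names parse_machines and place_ranks are hypothetical, chosen only for this example.

```python
def parse_machines(text):
    """Parse 'host:slots' lines; a host without ':slots' is counted as 1 slot."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, slots = line.partition(":")
        hosts.append((name, int(slots) if slots else 1))
    return hosts


def place_ranks(hosts, ntasks):
    """Count tasks per host under the ASSUMED block placement:
    fill each host's slots in file order, wrapping around if needed."""
    if sum(slots for _, slots in hosts) <= 0:
        raise ValueError("machines file provides no slots")
    counts = {name: 0 for name, _ in hosts}
    remaining = ntasks
    while remaining > 0:
        for name, slots in hosts:
            take = min(slots, remaining)
            counts[name] += take
            remaining -= take
            if remaining == 0:
                break
    return counts


# Contents of /tmp/machines as shown above
MACHINES = """\
ccwpge0061:128
ccwpge0062:128
"""

hosts = parse_machines(MACHINES)
placement = place_ranks(hosts, 150)
for name, count in placement.items():
    print(name, count)
```

Under that assumption, 128 tasks or fewer would all land on ccwpge0061, so varying -n around 128 is one way to check whether the failure depends on the task count itself or on tasks being started on the second machine.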

1/ Without specifying -iface, it is OK (more than 10 tries)

>mpiexec -f /tmp/machines -n 150 bin/advance_test
bchambon at ccwpge0062's password: 

I am there 
Running MPI version 2, subversion 2 
ref_message is ready 
I am the master task 0 sur ccwpge0061, for 149 slaves tasks, we will exchange a buffer of 1 MB

slave number 1, iteration = 1
slave number 2, iteration = 1
slave number 3, iteration = 1
…

>echo $status
0


2/ When specifying -iface eth0 (or eth2, which is 10 Gb/s), I always get a failure (assert (!closed) failed). As before, more than 10 tries.

>mpiexec -iface eth0 -f /tmp/machines -n 150 bin/advance_test
bchambon at ccwpge0062's password: 

Segmentation fault
[mpiexec at ccwpge0061] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
[mpiexec at ccwpge0061] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec at ccwpge0061] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec at ccwpge0061] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion

See the attachment for the output obtained with the -verbose option.

Best regards.


(*) 
>limit
cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    unlimited
coredumpsize unlimited
memoryuse    unlimited
vmemoryuse   unlimited
descriptors  1000000 
memorylocked unlimited
maxproc      409600 




>mpich2version 
MPICH2 Version:    	1.4.1p1
MPICH2 Release date:	Thu Sep  1 13:53:02 CDT 2011
MPICH2 Device:    	ch3:nemesis
MPICH2 configure: 	--prefix=/scratch/BC/mpich2-1.4 --enable-threads=multiple
MPICH2 CC: 	/usr/bin/gcc -m64   -O2
MPICH2 CXX: 	c++ -m64  -O2
MPICH2 F77: 	/usr/bin/f77   -O2
MPICH2 FC: 	f95  



>mpiexec -verbose -iface eth0 -f /tmp/machines -n 150 bin/advance_test 

---------------
Bernard CHAMBON
IN2P3 / CNRS
04 72 69 42 18

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mpich2-1.4.1p1_iface.stderr.txt
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20120113/8d1d7bb4/attachment-0001.txt>