[mpich-discuss] Failure when specifying -iface
Bernard Chambon
bernard.chambon at cc.in2p3.fr
Fri Jan 13 09:52:37 CST 2012
Hi,
I encounter a failure when specifying -iface. It may also be related to the number of tasks, but I am not sure about that last point.
To be more precise, here is a small test between 2 machines, without any resource limits (*), using the following machines file:
>more /tmp/machines
ccwpge0061:128
ccwpge0062:128
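For reference, the machines file can be checked against the -n value used in the runs. This is a minimal sketch in POSIX sh, assuming Hydra's host:count syntax (a host with no count is taken as 1) and assuming ranks fill the hosts in file order. total_slots and placement are hypothetical helper names, not part of MPICH2.

```shell
#!/bin/sh
# Sum the per-host process counts in a machines file made of
# "host:count" lines. Assumption: a host without ":count" counts as 1.
total_slots() {
    awk -F: 'NF && $1 !~ /^#/ { n += ($2 == "" ? 1 : $2) } END { print n + 0 }' "$1"
}

# Print "host ranks" pairs for N ranks, assuming hosts are filled in
# file order up to their count. Ranks beyond the total are not handled.
placement() {
    awk -F: -v n="$2" 'NF && $1 !~ /^#/ {
        slots = ($2 == "" ? 1 : $2)
        take = (n < slots ? n : slots)
        if (take > 0) print $1, take
        n -= take
    }' "$1"
}

# Example (not run here):
#   total_slots /tmp/machines
#   placement /tmp/machines 150
```

With the file above, total_slots gives 256, and placement with 150 ranks gives 128 on ccwpge0061 and 22 on ccwpge0062. Under these assumptions -n 150 stays within the declared capacity and the run spans both machines.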
1/ Without specifying -iface, it works (more than 10 tries):
>mpiexec -f /tmp/machines -n 150 bin/advance_test
bchambon@ccwpge0062's password:
I am there
Running MPI version 2, subversion 2
ref_message is ready
I am the master task 0 sur ccwpge0061, for 149 slaves tasks, we will exchange a buffer of 1 MB
slave number 1, iteration = 1
slave number 2, iteration = 1
slave number 3, iteration = 1
…
>echo $status
0
2/ When specifying -iface eth0 (or eth2, the 10 Gb/s interface), it always fails with "assert (!closed) failed" (as before, more than 10 tries):
>mpiexec -iface eth0 -f /tmp/machines -n 150 bin/advance_test
bchambon@ccwpge0062's password:
Segmentation fault
[mpiexec@ccwpge0061] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
[mpiexec@ccwpge0061] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@ccwpge0061] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@ccwpge0061] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
See the attachment for the output with the -verbose option.
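The "more than 10 tries" comparison between the two cases can be scripted. This is a minimal sketch in POSIX sh (the prompts in this message appear to be csh/tcsh, so it would live in a separate script). count_failures is a hypothetical helper name, and since the runs prompt for an ssh password, it assumes either passwordless ssh or typing the password on each try.

```shell
#!/bin/sh
# Run a command N times and print how many runs exited non-zero.
# The command's own output is discarded; only the exit status is used.
count_failures() {
    tries=$1; shift
    failures=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example (not run here):
#   count_failures 10 mpiexec -f /tmp/machines -n 150 bin/advance_test
#   count_failures 10 mpiexec -iface eth0 -f /tmp/machines -n 150 bin/advance_test
```

Repeating the second example with smaller -n values would be one way to check whether the failure depends on the number of tasks.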
Best regards.
(*)
>limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize unlimited
coredumpsize unlimited
memoryuse unlimited
vmemoryuse unlimited
descriptors 1000000
memorylocked unlimited
maxproc 409600
>mpich2version
MPICH2 Version: 1.4.1p1
MPICH2 Release date: Thu Sep 1 13:53:02 CDT 2011
MPICH2 Device: ch3:nemesis
MPICH2 configure: --prefix=/scratch/BC/mpich2-1.4 --enable-threads=multiple
MPICH2 CC: /usr/bin/gcc -m64 -O2
MPICH2 CXX: c++ -m64 -O2
MPICH2 F77: /usr/bin/f77 -O2
MPICH2 FC: f95
>mpiexec -verbose -iface eth0 -f /tmp/machines -n 150 bin/advance_test
---------------
Bernard CHAMBON
IN2P3 / CNRS
04 72 69 42 18
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mpich2-1.4.1p1_iface.stderr.txt
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20120113/8d1d7bb4/attachment-0001.txt>