[mpich-discuss] mpich-discuss Digest, Vol 34, Issue 28
c cook
csecook at gmail.com
Tue Jul 26 15:00:16 CDT 2011
Hi,
Some time ago I had problems running a parallel application using mpich2
with the mpd daemon. One of the users on the mpich list suggested I should
install the new version of mpich with the hydra process manager.
Now I can run the application, but at some point it stops with this error:
InitMesh: Mesh cutoff (required, used) = 400.000 418.568 Ry
=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
[proxy:0:1@cn102.cluster.local] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
[proxy:0:1@cn102.cluster.local] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:1@cn102.cluster.local] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
[proxy:0:3@cn104.cluster.local] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
[proxy:0:3@cn104.cluster.local] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:3@cn104.cluster.local] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
[mpiexec@headnode.cluster.local] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
[mpiexec@headnode.cluster.local] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec@headnode.cluster.local] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:189): launcher returned error waiting for completion
[mpiexec@headnode.cluster.local] main (./ui/mpich/mpiexec.c:397): process manager error waiting for completion
I am using a cluster with 8 nodes (cn101 to cn108) having 2 procs each.
The cpi example works fine.
Any idea what could be the problem?
Thank you, Eli