[mpich-discuss] error running mpi program
Pavan Balaji
balaji at mcs.anl.gov
Fri Oct 28 08:18:17 CDT 2011
Please keep mpich-discuss cc'ed.
Apart from the fact that something is wrong with pexe-note, it's hard to
guess anything else from the error message below.
Does a non-MPI program such as /bin/hostname work correctly?
% mpiexec -f machinefile -n 10 /bin/hostname
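
If the combined run fails, launching the same non-MPI program on one host at a time may help confirm which machine is at fault. The following is only a rough sketch, not part of MPICH: the function names `list_hosts` and `check_hosts` and the `MPIEXEC` override variable are hypothetical, and it assumes Hydra's `mpiexec` (with its `-hosts` option) and a machinefile with one host per line, optionally written as `host:count`, with `#` starting a comment.

```shell
#!/bin/sh
# Hypothetical helper: print the host names found in a machinefile.
# Assumes one host per line, an optional ":count" suffix, and "#" comments.
list_hosts() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' -e 's/:.*//' "$1"
}

# Hypothetical helper: launch /bin/hostname on each host separately and
# report which hosts succeed. MPIEXEC may be overridden to point at a
# different launcher binary; it defaults to "mpiexec".
check_hosts() {
    for host in $(list_hosts "$1"); do
        if ${MPIEXEC:-mpiexec} -hosts "$host" -n 1 /bin/hostname; then
            echo "OK: $host"
        else
            echo "FAILED: $host"
        fi
    done
}
```

Calling `check_hosts machinefile` prints one OK or FAILED line per host, based on the exit status of each individual launch.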
-- Pavan
On 10/27/2011 11:03 PM, Charles Sartori wrote:
> Rajeev and Pavan, both hellow executables are in the same location with
> permissions.
> I added another machine, so the machinefile now contains:
>
> pexe-pc
> loiva-note
> pexe-note
>
>
> When I run the cpi/hellow examples with pexe-pc and loiva-note, everything
> works fine, but when I try to run the cpi example with all 3 nodes I get this:
>
> pexe-note@pexe-pc:~$ mpiexec -f machinefile -n 10 ./mpich2-1.4/examples/cpi
> Process 1 of 10 is on pexe-pc
> Process 4 of 10 is on pexe-pc
> Process 7 of 10 is on pexe-pc
> Process 5 of 10 is on pexe-note
> Process 8 of 10 is on pexe-note
> Process 3 of 10 is on loiva-note
> Process 6 of 10 is on loiva-note
> Process 2 of 10 is on pexe-note
> Process 0 of 10 is on loiva-note
> Process 9 of 10 is on loiva-note
>
> =====================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = EXIT CODE: 11
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> =====================================================================================
> [proxy:0:1@pexe-pc] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
> [proxy:0:1@pexe-pc] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:1@pexe-pc] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
> [proxy:0:0@loiva-note] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
> [proxy:0:0@loiva-note] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:0@loiva-note] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
> [mpiexec@pexe-pc] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
> [mpiexec@pexe-pc] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> [mpiexec@pexe-pc] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:189): launcher returned error waiting for completion
> [mpiexec@pexe-pc] main (./ui/mpich/mpiexec.c:397): process manager error waiting for completion
> pexe-note@pexe-pc:~$
>
>
> --
> Charles Sartori
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji