[mpich-discuss] Error when calling mpiexec from within a process

Reuti reuti at staff.uni-marburg.de
Thu Oct 6 18:28:32 CDT 2011


Hi,

On 06.10.2011 at 22:06, Pramod wrote:

> Hi,
> 
> I have an application where I need to call mpiexec from within a child
> process launched by mpiexec. I am using "system()" to call the mpiexec
> process from the child process. I am using mpich2-1.4.1 and the hydra
> process manager. The errors I see are below. I am attaching the source
> file main.c. Let me know what I am doing wrong here and if you need
> more information.
> 
> To compile:
> 
> /home/install/mpich/mpich2-1.4.1/linux_x86_64//bin/mpicc   main.c
> -I/home/install/mpich/mpich2-1.4.1/linux_x86_64/include
> 
> When I run the test on multiple nodes I get the following errors:
> mpiexec -n 3 -f hosts.list a.out

What exactly do you want to achieve? Would you like to use a different host list for this nested call, so that each child decides on its own where to start its grandchild processes?

Is spawning the additional processes from within MPI (e.g. with MPI_Comm_spawn) not an option?

-- Reuti


> [proxy:0:0 at machine3] HYDU_create_process
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/utils/launch/launch.c:36):
> dup2 error (Bad file descriptor)
> [proxy:0:0 at machine3] launch_procs
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:751):
> create process returned error
> [proxy:0:0 at machine3] HYD_pmcd_pmip_control_cmd_cb
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:935):
> launch_procs returned error
> [proxy:0:0 at machine3] HYDT_dmxu_poll_wait_for_event
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/tools/demux/demux_poll.c:77):
> callback returned error status
> [proxy:0:0 at machine3] main
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmip.c:226):
> demux engine error waiting for event
> [mpiexec at machine1.abc.com] control_cb
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmiserv_cb.c:215):
> assert (!closed) failed
> [mpiexec at machine1.abc.com] HYDT_dmxu_poll_wait_for_event
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/tools/demux/demux_poll.c:77):
> callback returned error status
> [mpiexec at machine1.abc.com] HYD_pmci_wait_for_completion
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:181):
> error waiting for event
> [mpiexec at machine1.abc.com] main
> (/home/install/mpich/src/mpich2-1.4.1/src/pm/hydra/ui/mpich/mpiexec.c:405):
> process manager error waiting for completion
> 
> ------
> On a single node I get the following.
> mpiexec -n 3 a.out
> [proxy:0:0 at machine1.abc.com] [proxy:0:0 at machine1.abc.com] Killed
> <main.c>


