[mpich-discuss] Hydra issues
Pavan Balaji
balaji at mcs.anl.gov
Wed Aug 26 09:49:53 CDT 2009
> On a side note, starting the proxies and then immediately calling
> mpiexec.hydra fails:
>
> $ time mpiexec.hydra -boot-proxies -f hosts && time mpiexec.hydra
> -use-persistent -f hosts -n 16 $PWD/IMB-MPI1 -npmin 16
>
> real 0m0.004s
> user 0m0.000s
> sys 0m0.002s
> HYDU_sock_connect (141): connect error (Connection refused)
> launch_helper (57): unable to connect to proxy
> HYDU_sock_connect (141): connect error (Connection refused)
> launch_helper (57): unable to connect to proxy
> HYDU_sock_read (276): read errno (Transport endpoint is not connected)
> HYD_PMCD_pmi_serv_control_cb (271): unable to read status from proxy
> HYD_DMX_wait_for_event (167): callback returned error status
> HYD_PMCI_wait_for_completion (479): error waiting for event
> main (248): process manager error waiting for completion
Thanks for reporting this. I've created a ticket for it. Note that
persistent proxies are experimental right now. We are happy for you to
try them out and report problems, but if you want a quick working
solution, there's the runtime launching option, which is the default
anyway. You should not see much difference in launch time between the
two, except on very large systems.
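
Until the ticket is resolved, one possible stopgap is to retry the
persistent launch after a short delay. This is only a sketch and it
assumes the "Connection refused" errors come from mpiexec connecting
before the freshly booted proxies are listening, which has not been
confirmed. The retry helper below is hypothetical and not part of
Hydra; the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hydra): run a command up to
# $1 times, sleeping $2 seconds between failed attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the command exactly as given; stop on first success.
        "$@" && return 0
        # Only wait if another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Intended use with the commands from the report (left commented out
# so the helper can be sourced without an MPICH installation):
#
#   mpiexec.hydra -boot-proxies -f hosts &&
#       retry 5 1 mpiexec.hydra -use-persistent -f hosts -n 16 \
#           "$PWD/IMB-MPI1" -npmin 16
```

If the launch still fails after all attempts, the helper returns
non-zero, so a genuine configuration problem is not hidden by the
retries.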
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji