[mpich-discuss] Hydra issues

Pavan Balaji balaji at mcs.anl.gov
Wed Aug 26 09:49:53 CDT 2009


> On a side note, starting the proxies and then immediately calling 
> mpiexec.hydra fails:
> 
> $ time mpiexec.hydra -boot-proxies -f hosts && time mpiexec.hydra 
> -use-persistent -f hosts -n 16 $PWD/IMB-MPI1 -npmin 16
> 
> real    0m0.004s
> user    0m0.000s
> sys    0m0.002s
> HYDU_sock_connect (141): connect error (Connection refused)
> launch_helper (57): unable to connect to proxy
> HYDU_sock_connect (141): connect error (Connection refused)
> launch_helper (57): unable to connect to proxy
> HYDU_sock_read (276): read errno (Transport endpoint is not connected)
> HYD_PMCD_pmi_serv_control_cb (271): unable to read status from proxy
> HYD_DMX_wait_for_event (167): callback returned error status
> HYD_PMCI_wait_for_completion (479): error waiting for event
> main (248): process manager error waiting for completion

Thanks for reporting this. I've created a ticket for it. Note that 
persistent proxies are experimental right now. We are happy for you to 
try them out and report problems, but if you want a quick working 
solution, use runtime launching, which is the default anyway. You 
should not see much difference in launch time between the two, except 
on very large systems.
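
Until the ticket is resolved, a rough stopgap for anyone who still wants 
to experiment with persistent proxies is to retry the launch a few times. 
This is only a sketch: it assumes the "Connection refused" errors come 
from the proxies not yet listening when the second command starts (not 
confirmed), and that mpiexec.hydra exits non-zero on this failure. The 
retry helper is a hypothetical shell function, not part of Hydra; the 
mpiexec.hydra flags are the ones from the report above.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of Hydra): runs COMMAND until it exits 0
# or MAX_ATTEMPTS attempts have been made, sleeping DELAY_SECONDS between
# attempts. Returns 0 on success, 1 if every attempt failed.
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Persistent proxies (experimental), retrying the launch up to 5 times:
#   mpiexec.hydra -boot-proxies -f hosts &&
#       retry 5 1 mpiexec.hydra -use-persistent -f hosts -n 16 \
#           "$PWD/IMB-MPI1" -npmin 16
#
# Runtime launching (the default), i.e. the same run without the
# persistent-proxy flags:
#   mpiexec.hydra -f hosts -n 16 "$PWD/IMB-MPI1" -npmin 16
```

If the failure turns out not to be a startup race, retrying will not 
help, and runtime launching remains the safer choice.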

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

