[mpich-discuss] Hydra issues
Scott Atchley
atchley at myri.com
Wed Aug 26 08:58:22 CDT 2009
On Aug 26, 2009, at 9:22 AM, Scott Atchley wrote:
> On Aug 26, 2009, at 7:14 AM, Scott Atchley wrote:
>
>> On Aug 25, 2009, at 9:48 PM, Pavan Balaji wrote:
>>
>>> Scott,
>>>
>>> Are you using Hydra from mpich2-1.1.1p1? Are other programs
>>> running fine?
>>
>> Yes, this is based on 1.1.1p1. I have the same issue with 1.1.1p1's
>> ch3:nemesis:mx.
>
> I meant to add that I have not tried any other programs yet.
>
> Scott
Ok, I was not patient enough. If I let it run, it eventually starts.
The actual walltime is nearly the same as when I use proxies, but the
stdout is delayed until the application completes.

On a side note, starting the proxies and then immediately calling
mpiexec.hydra fails:
$ time mpiexec.hydra -boot-proxies -f hosts && \
  time mpiexec.hydra -use-persistent -f hosts -n 16 $PWD/IMB-MPI1 -npmin 16
real 0m0.004s
user 0m0.000s
sys 0m0.002s
HYDU_sock_connect (141): connect error (Connection refused)
launch_helper (57): unable to connect to proxy
HYDU_sock_connect (141): connect error (Connection refused)
launch_helper (57): unable to connect to proxy
HYDU_sock_read (276): read errno (Transport endpoint is not connected)
HYD_PMCD_pmi_serv_control_cb (271): unable to read status from proxy
HYD_DMX_wait_for_event (167): callback returned error status
HYD_PMCI_wait_for_completion (479): error waiting for event
main (248): process manager error waiting for completion
real 0m0.006s
user 0m0.001s
sys 0m0.004s
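A possible stopgap, sketched under the assumption that the failure is a
startup race (the boot command returns in a few milliseconds, perhaps
before the proxies are listening, hence "Connection refused"). This has
not been confirmed as the cause. The retry helper below is hypothetical
and not part of Hydra; it simply reruns a command until it succeeds or a
limit is reached. The attempt count and delay are arbitrary.

```shell
# retry MAX DELAY CMD...: run CMD until it exits 0, at most MAX times,
# sleeping DELAY seconds between attempts. Returns 1 if every attempt fails.
# (Hypothetical helper for illustration; not provided by MPICH2 or Hydra.)
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Intended usage with the commands from this message (not executed here):
#   mpiexec.hydra -boot-proxies -f hosts &&
#   retry 5 1 mpiexec.hydra -use-persistent -f hosts -n 16 $PWD/IMB-MPI1 -npmin 16
```

Retrying the whole launch is only reasonable if a failed attempt leaves no
partially started job behind. If the delay really is in proxy startup, a
single short sleep between the two commands may be enough.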
Scott