[mpich-discuss] good system gone bad..
SULLIVAN David (AREVA)
David.Sullivan at areva.com
Tue Dec 7 07:22:54 CST 2010
Darius,
Thank you again. I looked at SOME of the NFS mounts before I posted, but
obviously not enough of them. One of the nodes did not properly mount
/global. Embarrassingly simple, but I am so happy that it's up and
running that I will deal with it.
Dave
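
[Editor's note] The root cause above was one node missing its /global NFS mount. A minimal sketch of a check that would have caught it, assuming a Linux node with /proc/mounts (the /global path is from the thread; on a cluster this would be run on each node, e.g. via ssh):

```shell
#!/bin/sh
# Report whether each directory is an active mount point by scanning
# /proc/mounts (field 2 is the mount point).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' /proc/mounts
}

for dir in / /global; do
    if is_mounted "$dir"; then
        echo "$dir: mounted"
    else
        echo "$dir: NOT mounted"
    fi
done
```

A node where this prints "/global: NOT mounted" would show exactly the "No such file or directory" symptom reported below.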
-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Darius Buntinas
Sent: Monday, December 06, 2010 9:54 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] good system gone bad..
I wonder if something got messed up in the restart. Can you try ssh'ing
into each node like this:
ssh a_node ls -l /global/mpich2-1.3/bin/hydra_pmi_proxy
-d
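
[Editor's note] The one-off ssh check above generalizes to a loop over all nodes. A sketch, assuming a hostfile named `hosts` with one node name per line (the hostfile name and the BatchMode option are assumptions; the proxy path is from the thread):

```shell
#!/bin/sh
# Verify the hydra_pmi_proxy binary exists and is executable on every
# node. Locally the check is just `test -x`; remotely it runs over ssh.
PROXY=${PROXY:-/global/mpich2-1.3/bin/hydra_pmi_proxy}

check_proxy() {   # usage: check_proxy [node]; with no node, check locally
    if [ -n "$1" ]; then
        ssh -o BatchMode=yes "$1" test -x "$PROXY"
    else
        test -x "$PROXY"
    fi
}

if [ -f hosts ]; then
    while read -r node; do
        check_proxy "$node" && echo "$node: ok" \
                            || echo "$node: MISSING $PROXY"
    done < hosts
fi
```

Any node reporting MISSING is the one whose mount (or install) needs fixing.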
On Dec 6, 2010, at 6:30 PM, SULLIVAN David (AREVA) wrote:
> Yes, it is. The installation was built in an NFS-mounted folder. As I
> said, this worked in the morning; after a restart it fails to run.
>
> Dave
>
>
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov on behalf of Darius Buntinas
> Sent: Mon 12/6/2010 5:53 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] good system gone bad..
>
> Check that the hydra proxy is installed (and accessible) on every node.
>
> -d
>
> On Dec 6, 2010, at 4:27 PM, SULLIVAN David (AREVA) wrote:
>
>> All,
>>
>> I am trying to troubleshoot a problem with my MPICH2 install. I
>> compiled version 1.3 with gcc and Intel Fortran. When I try to execute
>> anything I am greeted with the following:
>>
>> bash: /global/mpich2-1.3/bin/hydra_pmi_proxy: No such file or directory
>>
>> It is lying, though:
>>
>> total 2916
>> -rwxr-xr-x 1 root root   1656 Dec  6 16:54 bt2line
>> -rwxr-xr-x 1 root root  10008 Dec  6 16:54 check_callstack
>> -rwxr-xr-x 1 root root  56622 Dec  6 16:54 clog2_join
>> -rwxr-xr-x 1 root root   1970 Dec  6 16:54 clog2print
>> -rwxr-xr-x 1 root root  53798 Dec  6 16:54 clog2_print
>> -rwxr-xr-x 1 root root  53799 Dec  6 16:54 clog2_repair
>> -rwxr-xr-x 1 root root   1959 Dec  6 16:54 clog2TOslog2
>> -rwxr-xr-x 1 root root   1960 Dec  6 16:54 clogprint
>> -rwxr-xr-x 1 root root   1956 Dec  6 16:54 clogTOslog2
>> -rwxr-xr-x 1 root root  84887 Dec  6 16:54 hwloc-bind
>> -rwxr-xr-x 1 root root  83889 Dec  6 16:54 hwloc-calc
>> -rwxr-xr-x 1 root root  79196 Dec  6 16:54 hwloc-distrib
>> lrwxrwxrwx 1 root root      6 Dec  6 16:54 hwloc-info -> lstopo
>> lrwxrwxrwx 1 root root      6 Dec  6 16:54 hwloc-ls -> lstopo
>> lrwxrwxrwx 1 root root     10 Dec  6 16:54 hwloc-mask -> hwloc-calc
>> -rwxr-xr-x 1 root root 422199 Dec  6 16:54 hydra_nameserver
>> -rwxr-xr-x 1 root root 420869 Dec  6 16:54 hydra_persist
>> -rwxr-xr-x 1 root root 586873 Dec  6 16:54 hydra_pmi_proxy
>> -rwxr-xr-x 1 root root   1946 Dec  6 16:54 jumpshot
>> -rwxr-xr-x 1 root root   1944 Dec  6 16:54 logconvertor
>> -rwxr-xr-x 1 root root 136310 Dec  6 16:54 lstopo
>> lrwxrwxrwx 1 root root      6 Dec  6 16:54 mpic++ -> mpicxx
>> -rwxr-xr-x 1 root root   8974 Dec  6 16:54 mpicc
>> -rwxr-xr-x 1 root root   8874 Dec  6 16:54 mpich2version
>> -rwxr-xr-x 1 root root   8659 Dec  6 16:54 mpicxx
>> lrwxrwxrwx 1 root root     13 Dec  6 16:54 mpiexec -> mpiexec.hydra
>> -rwxr-xr-x 1 root root 748519 Dec  6 16:54 mpiexec.hydra
>> -rwxr-xr-x 1 root root  10640 Dec  6 16:54 mpif77
>> -rwxr-xr-x 1 root root  12615 Dec  6 16:54 mpif90
>> lrwxrwxrwx 1 root root     13 Dec  6 16:54 mpirun -> mpiexec.hydra
>> -rwxr-xr-x 1 root root   3430 Dec  6 16:54 parkill
>> -rwxr-xr-x 1 root root  20427 Dec  6 16:54 plpa-info
>> -rwxr-xr-x 1 root root  40775 Dec  6 16:54 plpa-taskset
>> -rwxr-xr-x 1 root root   1965 Dec  6 16:54 slog2filter
>> -rwxr-xr-x 1 root root   1983 Dec  6 16:54 slog2navigator
>> -rwxr-xr-x 1 root root   2015 Dec  6 16:54 slog2print
>> -rwxr-xr-x 1 root root   1974 Dec  6 16:54 slog2updater
>>
>> Clearly it IS there, and it is accessible:
>>
>> [dsullivan at athos bin]$ which hydra_pmi_proxy
>> /global/mpich2-1.3/bin/hydra_pmi_proxy
>> [dsullivan at athos bin]$
>> This all worked this morning and little has changed since (a restart,
>> maybe). Any direction would be greatly appreciated.
>>
>>
>> Regards,
>>
>> David Sullivan
>>
>>
>>
>> AREVA NP INC
>> 400 Donald Lynch Boulevard
>> Marlborough, MA, 01752
>> Phone: (508) 573-6721
>> Fax: (434) 382-5597
>> David.Sullivan at AREVA.com
>>
>> The information in this e-mail is AREVA property and is intended
>> solely for the addressees. Reproduction and distribution are
>> prohibited. Thank you.
>>
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>