[mpich-discuss] ssh, mpiexec, and path to working dir

Thomas Ruedas ruedas at dtm.ciw.edu
Fri Feb 5 22:40:51 CST 2010


Hi again,
now that the cluster is usable again after some filesystem and hardware 
problems, I am trying to get my parallel jobs going again. Last time 
(immediately before those system failures, and probably because of them) 
I could no longer use mpdboot; that seems to work now. The problem this 
time is that the processes on the nodes do not end up in the working 
directory.
I have my executable and the files and directories the program should 
use (and used to use correctly) in some sub-subdirectory of my HOME, 
which I find to be mounted seemingly correctly on the head node and the 
other nodes. The PATH variable is also apparently set correctly 
everywhere. Nonetheless, when I go to that sub-subdirectory and start 
the job:
mpiexec -machinefile machines -n 8 myprog < /dev/null >& scr.out &
It fails with this error:
problem with execution of myprog  on  compute-0-2.local:  [Errno 2] No such file or directory
... (and so on, for all nodes)
indicating that it can't find the executable myprog. However, when I run
mpiexec -machinefile machines -n 8 ~/sub/subdir/myprog < /dev/null >& scr.out &
it starts, but then crashes because it can't find its input file in 
~/sub/subdir/: its working directory is apparently still somewhere else 
(presumably /, see below).
Another parallel test program that doesn't read input files can be 
started correctly if I use the full path as in the previous example.
System commands work ok, e.g.
mpiexec -machinefile machines -n 8 hostname
gives correct results. Remarkably,
mpiexec -machinefile machines -n 8 ls
gives the contents of / rather than of the (NFS-mounted, or so it should 
be) directory in which I invoke it on the head node.
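To narrow this down further, a small script like the following could be 
launched in place of ls, so that every process reports its host and 
working directory on one line. This is only a sketch: it assumes a 
Python interpreter is installed on every node, and the file name 
whereami.py and the function name where_am_i are hypothetical.

```python
# whereami.py (hypothetical name): report where each launched process runs.
import os
import socket


def where_am_i():
    """Return 'hostname: current working directory' for this process."""
    return f"{socket.gethostname()}: {os.getcwd()}"


if __name__ == "__main__":
    # One line per process, so the output of all ranks can be compared.
    print(where_am_i())
```

Started with the full path, e.g. 
mpiexec -machinefile machines -n 8 python ~/sub/subdir/whereami.py
each line should end in the directory from which I invoked mpiexec if 
the working directory were passed on correctly; given the ls result 
above, I would expect it to print / instead.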
Something must be wrong with the way ssh works, but I don't know what. 
Does anybody have an idea what the problem is and how I could try to fix it?
Thanks,
Thomas
-- 
-----------------------------------
Thomas Ruedas
Department of Terrestrial Magnetism
Carnegie Institution of Washington
http://www.dtm.ciw.edu/users/ruedas/

