[mpich-discuss] Using mpd MPICH2-1.0.8 on 64-bit Mac cluster under SGE
Edric Ellis
Edric.Ellis at mathworks.co.uk
Thu Mar 18 10:51:31 CDT 2010
Hi all,
I'm trying to get an MPD build of MPICH2-1.0.8 working on a 64-bit Mac cluster, with jobs being scheduled by SGE.
I'm submitting a shell script which calls mpdboot, using "rsh", based on the hosts allocated by SGE. When I then attempt to run my process under mpiexec, the process that gets launched on the remote nodes seems to be "broken" in some way. For example, imagine I get allocated nodes "node01" and "node02". If I were to run (from within the shell script that I've submitted to SGE)
mpiexec -n 2 whoami
this would give me the expected output on "node01", where SGE has launched my wrapper script, but on "node02" I just see a number from whoami rather than a user name. Also, if I attempt to do something more adventurous like
mpiexec -n 2 ping -c 1 node01
this shows that the process launched on "node02" cannot access the network. If I try to launch my actual application, it is completely broken by this lack of network access.
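For reference, the kind of wrapper described above could be sketched roughly as follows. This is not the actual script: it assumes SGE exports $PE_HOSTFILE with one line per host (hostname first, then a slot count), and the helper name sge_hosts and the temporary file name are made up for illustration. How mpdboot counts the local host within -n may also need adjusting for a given setup.

```shell
#!/bin/sh
# Sketch only: boot an mpd ring on the hosts SGE allocated, then run a test.
# Assumption: each $PE_HOSTFILE line starts with a hostname.

# Print the unique hostnames from an SGE-style host file, one per line.
# (sge_hosts is a hypothetical helper, not part of SGE or MPICH2.)
sge_hosts() {
    awk '{ print $1 }' "$1" | sort -u
}

# Only start the ring when running under an SGE parallel environment.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    hosts_file="${TMPDIR:-/tmp}/mpd.hosts.$$"
    sge_hosts "$PE_HOSTFILE" > "$hosts_file"
    nhosts=$(wc -l < "$hosts_file" | tr -d ' ')

    # -n: number of mpds to start, -f: hosts file, -r: remote shell command
    mpdboot -n "$nhosts" -f "$hosts_file" -r rsh
    mpiexec -n "$nhosts" whoami
    mpdallexit
    rm -f "$hosts_file"
fi
```

Outside SGE the guarded block is skipped, so the script does nothing unless $PE_HOSTFILE is set and readable.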
Has anyone seen anything like this? I have no idea if the problem is with MPICH2, SGE, Mac, or something else. Any clues gratefully received. (At the moment, I haven't been able to try mpiexec outside of the control of SGE to rule that piece out, but I should be able to do that.)
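One way to compare the two cases (inside and outside SGE) is a small probe that each launched process runs, so the local and remote results can be read side by side. The sketch below is only an illustration: the function name probe and the file name probe.sh are hypothetical, and it treats a user name identical to the numeric uid as a sign that the name lookup failed, which is an assumption rather than a guaranteed symptom.

```shell
#!/bin/sh
# Sketch only: report what a launched process can see about itself.

probe() {
    uid=$(id -u)
    # Fall back to the numeric uid if the name lookup fails outright.
    name=$(id -un 2>/dev/null || echo "$uid")
    # Assumption: a "name" equal to the uid means the lookup did not resolve.
    if [ "$name" = "$uid" ]; then
        lookup=unresolved
    else
        lookup=resolved
    fi
    printf 'host=%s uid=%s user=%s lookup=%s\n' \
        "$(uname -n)" "$uid" "$name" "$lookup"
}

probe
```

Saved somewhere all nodes can read, it could be run as "mpiexec -n 2 sh ./probe.sh" both from the SGE job script and from an interactive shell, and the two sets of output compared.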
Cheers,
Edric.