[mpich-discuss] Using mpd MPICH2-1.0.8 on 64-bit Mac cluster under SGE
Reuti
reuti at Staff.Uni-Marburg.DE
Thu Mar 18 10:58:43 CDT 2010
Hi,
On 18.03.2010 at 16:51, Edric Ellis wrote:
> I’m trying to get an MPD build of MPICH2-1.0.8 working on a 64-bit
> Mac cluster, with jobs being scheduled by SGE.
>
> I’m submitting a shell script which calls mpdboot based on the
> hosts allocated by SGE, using “rsh”. When I then attempt to run my
> process under mpiexec, the process that gets launched on the remote
> nodes seems to be “broken” in some way. For example, imagine I get
> allocated nodes “node01” and “node02”. If I were to run (from within
> my shell script that I’ve submitted to SGE)
>
Using one MPD ring per job on a unique port might help:
http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
I never tried it on a Mac though.
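
A minimal sketch of what a per-job ring could look like inside the submitted script, in case it helps. It is untested on a Mac and rests on several assumptions: that the parallel environment sets PE_HOSTFILE, NSLOTS and JOB_ID; that the mpd hosts file accepts "host:ncpus" entries; that mpdboot tolerates the local host appearing in that file; and that MPD_CON_EXT gives the ring its own console name. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: start a private MPD ring for one SGE job, run a command,
# then tear the ring down. All function names here are hypothetical.

# Convert SGE's PE_HOSTFILE (one "host slots queue processors" line per
# host) into "host:ncpus" lines for an mpd hosts file.
pe_to_mpd_hosts() {
    awk 'NF > 0 { print $1 ":" $2 }' "$1"
}

# Count the hosts listed in the PE_HOSTFILE (one mpd is started per host).
count_hosts() {
    awk 'NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Boot the ring, run the given command under mpiexec, always clean up.
run_in_private_ring() {
    hosts_file="${TMPDIR:-/tmp}/mpd.hosts.$JOB_ID"
    pe_to_mpd_hosts "$PE_HOSTFILE" > "$hosts_file"

    # Assumption: a per-job console extension keeps this ring separate
    # from any other ring the same user has running on these nodes.
    MPD_CON_EXT="sge_$JOB_ID"
    export MPD_CON_EXT

    mpdboot --totalnum="$(count_hosts "$PE_HOSTFILE")" \
            --file="$hosts_file" --rsh=rsh || return 1

    mpiexec -n "$NSLOTS" "$@"
    status=$?

    mpdallexit
    rm -f "$hosts_file"
    return "$status"
}
```

Inside the job script this would be called as, for example, `run_in_private_ring whoami`. Note that a plain rsh started this way runs outside SGE's control; the howto above describes how to route the startup through SGE instead.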
-- Reuti
> mpiexec -n 2 whoami
>
> this would give me the expected output on “node01” where SGE has
> launched my wrapper script, but on “node02”, I just see a numeric
> result from whoami. Also, if I attempt to do something more
> adventurous like
>
> mpiexec -n 2 ping -c node01
>
> this shows that the process launched on “node02” cannot access the
> network. If I try to launch my actual application, it is
> completely broken by this lack of access to the network.
>
> Has anyone seen anything like this? I have no idea if the problem
> is with MPICH2, SGE, Mac, or something else. Any clues gratefully
> received. (At the moment, I haven’t been able to run mpiexec outside
> of SGE’s control to rule that piece out, but I should be able to do
> that.)
>
> Cheers,
>
> Edric.
>