[mpich-discuss] sge and mpich2
Dave Goodell
goodell at mcs.anl.gov
Mon Jun 20 17:02:20 CDT 2011
It looks like you are having problems with mpd: http://wiki.mcs.anl.gov/mpich2/index.php/Frequently_Asked_Questions#Q:_I_don.27t_like_.3CWHATEVER.3E_about_mpd.2C_or_I.27m_having_a_problem_with_mpdboot.2C_can_you_fix_it.3F
Please use hydra instead of MPD: http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager
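As a rough sketch of what a Hydra-based job script might do under SGE: the pe_hostfile that SGE writes lists one "host slots queue processors" entry per line, while Hydra's mpiexec accepts a host file with "host:slots" entries via -f. The helper name pe_to_hydra_hosts and the hosts.$JOB_ID file name are made up for illustration, and the mpiexec.hydra invocation assumes Hydra is installed under that name on your cluster.

```shell
#!/bin/bash
# Sketch only: convert an SGE pe_hostfile ("host slots queue processors"
# per line) into the "host:slots" format that Hydra reads with -f.
# pe_to_hydra_hosts is a hypothetical helper, not part of SGE or MPICH2.
pe_to_hydra_hosts() {
    # Keep the first two fields (host name, slot count); skip short lines.
    awk 'NF >= 2 { print $1 ":" $2 }' "$1"
}

# Inside the SGE job script the helper might be used like this
# (assumed usage, left commented out so the sketch has no side effects):
#   pe_to_hydra_hosts "$PE_HOSTFILE" > hosts.$JOB_ID
#   mpiexec.hydra -f hosts.$JOB_ID -n $NSLOTS siesta <input> output
```

With Hydra there is no mpd ring to boot beforehand; mpiexec launches the processes on the listed hosts itself, so it is worth checking that the launcher it uses (ssh by default) works without a password between the nodes.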
-Dave
On Jun 20, 2011, at 4:25 PM CDT, c cook wrote:
> Hello,
>
> I have been using sge6.01u4 to run serial jobs for some time.
>
> The cluster I am using has 8+1 nodes with Opteron procs.
>
> I wanted to take advantage of this as the software I am using has a parallel version.
> So I installed mpich2 as the parallel environment and started the mpd daemon. Running mpdtrace -l shows all 8 slave nodes plus the head node.
> Now when I am submitting the job using this script:
>
> #!/bin/bash
> #$ -S /bin/bash
> #$ -o test.log
> #$ -e test.err
> #$ -N TEST_Parallel
> #$ -pe mpich 2
> #$ -cwd
>
> mpiexec -n $NSLOTS siesta <input> output
>
> The scheduler submits the job, and qstat shows it as running, but no output is produced. This goes on for days: nothing happens, and the job stays in the queue with status "r" forever.
> The only information I get is in the test.log file:
>
> -catch_rsh /home/sge6.01u4/default/spool/cn105/active_jobs/4049.1/pe_hostfile
> cn105
> cn102
>
> So it seems that the scheduler did its job.
> There is nothing in test.err; the output file is created, but it is empty.
> The nodes are named cn101 to cn108.
>
>
> The serial version works fine. This is the script I am using:
>
> #!/bin/bash
> #$ -S /bin/bash
> #$ -o test.log
> #$ -e test.err
> #$ -N TEST
> #$ -cwd
>
> siesta <input> output
>
>
> I may have missed something during the installation of mpich2.
>
> Maybe some of you have encountered similar problems; any ideas are welcome.
>
> Thanks,
> Eli