[mpich-discuss] mpich2 does not work with SGE
tilakraj dattaram
tilakraj1985 at gmail.com
Tue Jul 26 01:37:36 CDT 2011
Hi Reuti
Here is how we submitted the jobs through a defined SGE queue.
1. Submit the job using a job script (job_lammps.sh):

$ qsub -q molsim.q job_lammps.sh

(In my previous message there was a typo: I had mistakenly written
"$ qsub -q queue ./a.out < input > output".)

The job script looks like this:
----------------------------------------------------------
#!/bin/sh
# request Bourne shell as shell for job
#$ -S /bin/sh
# Name of the job
#$ -N mpich2_lammps_test
# Name of the output log file
#$ -o lammps_test.log
# Combine output/error messages into one file
#$ -j y
# Use current working directory
#$ -cwd
# Specify the parallel environment (pe)
#$ -pe mpich2 8
# Commands to be executed
mpirun ./lmp_g++ < in.shear > thermo_shear
----------------------------------------------------------
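As a sanity check (this is not part of the original script), a few diagnostic
lines could be added just before the mpirun call to confirm that the job
really sees the SGE allocation; $NSLOTS and $PE_HOSTFILE are the standard SGE
variables, everything else is only a sketch:
----------------------------------------------------------
# Hypothetical debug lines for the job script
which mpirun                      # confirm which MPICH2 install is found first
echo "slots granted: $NSLOTS"     # should match the -pe request (8)
echo "SGE host file: $PE_HOSTFILE"
cat "$PE_HOSTFILE"                # one line per granted host with its slot count
----------------------------------------------------------
With a working tight integration, Hydra's mpirun started without -np should
pick up exactly this allocation.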
2. Below is the output of qstat; it shows the job running on compute-0-5 in
molsim.q:
$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/16         0.00     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-10.local       BIP   0/0/16         0.09     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-2.local        BIP   0/0/16         0.03     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-3.local        BIP   0/0/16         0.00     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-4.local        BIP   0/0/16         0.00     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-5.local        BIP   0/0/16         0.02     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-6.local        BIP   0/0/16         0.01     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-7.local        BIP   0/0/16         0.02     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-8.local        BIP   0/0/16         0.04     lx26-amd64
---------------------------------------------------------------------------------
all.q@compute-0-9.local        BIP   0/0/16         0.04     lx26-amd64
---------------------------------------------------------------------------------
guru.q@compute-0-10.local      BIP   0/0/16         0.09     lx26-amd64
---------------------------------------------------------------------------------
guru.q@compute-0-7.local       BIP   0/0/16         0.02     lx26-amd64
---------------------------------------------------------------------------------
guru.q@compute-0-8.local       BIP   0/0/16         0.04     lx26-amd64
---------------------------------------------------------------------------------
guru.q@compute-0-9.local       BIP   0/0/16         0.04     lx26-amd64
---------------------------------------------------------------------------------
molsim.q@compute-0-1.local     BIP   0/0/16         0.00     lx26-amd64
---------------------------------------------------------------------------------
molsim.q@compute-0-2.local     BIP   0/0/16         0.03     lx26-amd64
---------------------------------------------------------------------------------
molsim.q@compute-0-3.local     BIP   0/0/16         0.00     lx26-amd64
---------------------------------------------------------------------------------
molsim.q@compute-0-4.local     BIP   0/0/16         0.00     lx26-amd64
---------------------------------------------------------------------------------
molsim.q@compute-0-5.local     BIP   0/8/16         0.02     lx26-amd64
    301 0.55500 mpich2_lam ajay         r     07/26/2011 19:40:11     8
---------------------------------------------------------------------------------
molsim.q@compute-0-6.local     BIP   0/0/16         0.01     lx26-amd64
---------------------------------------------------------------------------------
test_mpi.q@compute-0-10.local  BIP   0/0/8          0.09     lx26-amd64
---------------------------------------------------------------------------------
test_mpi.q@compute-0-9.local   BIP   0/0/8          0.04     lx26-amd64
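A quick way to cross-check the slot placement (not shown in the original run)
is:

$ qstat -g t

which shows, per queue instance, whether a job holds its MASTER task or SLAVE
tasks there; here all tasks of job 301 should appear under
molsim.q@compute-0-5.local.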
3. I log into compute-0-5 and run ps -e f:

compute-0-5 ~]$ ps -e f

The output shows the job processes bound to sge_shepherd:
3901 ? S 0:00 \_ hald-runner
3909 ? S 0:00 \_ hald-addon-acpi: listening on acpid socket /v
3915 ? S 0:00 \_ hald-addon-keyboard: listening on /dev/input/
4058 ? Ssl 0:00 automount
4122 ? Sl 0:00 /opt/gridengine/bin/lx26-amd64/sge_execd
5466 ? S 0:00 \_ sge_shepherd-301 -bg
5467 ? Ss 0:00 \_ -sh
/opt/gridengine/default/spool/execd/compute-0-5/job_scripts/301
5609 ? S 0:00 \_ mpirun ./lmp_g++
5610 ? S 0:00     \_ /opt/mpich2/gnu/bin//hydra_pmi_proxy --control-port compute-0-5.loc
5611 ? R 0:25 \_ ./lmp_g++
5612 ? R 0:25 \_ ./lmp_g++
5613 ? R 0:25 \_ ./lmp_g++
5614 ? R 0:25 \_ ./lmp_g++
5615 ? R 0:25 \_ ./lmp_g++
5616 ? R 0:25 \_ ./lmp_g++
5617 ? R 0:25 \_ ./lmp_g++
5618 ? R 0:25 \_ ./lmp_g++
4143 ? Sl 0:00 /usr/sbin/snmpd -Lsd -Lf /dev/null -p /var/run/snmpd.
4158 ? Ss 0:00 /usr/sbin/sshd
5619 ? Ss 0:00 \_ sshd: ajay [priv]
5621 ? S 0:00 \_ sshd: ajay@pts/0
5622 pts/0 Ss 0:00 \_ -bash
5728 pts/0 R+ 0:00 \_ ps -e f
The following line shows that mpirun points to the correct location, i.e. the
mpirun invoked inside the job script is the intended MPICH2 installation:
5610 ? S 0:00     \_ /opt/mpich2/gnu/bin//hydra_pmi_proxy --control-port compute-0-5.loc
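A generic way to double-check how this Hydra build can start processes (just
a suggestion, not something from the run above) is:

$ /opt/mpich2/gnu/bin/mpiexec -info

which prints the build details, including the launchers and resource-management
kernels Hydra was compiled with (ssh, sge, ...). Under SGE, Hydra normally
detects the PE environment and starts its proxies via qrsh instead of plain
ssh, which is the startup Reuti mentions below.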
4. However, the compute time is still much slower (159 seconds) than for the
same job run from the command line with mpiexec (42 seconds, using:
mpiexec -f hostfile -np 8 ./lmp_g++ < in.shear > thermo.shear).
I can't understand why there should be such a large difference between plain
mpiexec and a run started through SGE.
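For reference, the hostfile used in the command-line runs is not shown in this
thread; a plain Hydra hostfile of the assumed form below (hostnames and slot
counts are only illustrative) would be typical:
----------------------------------------------------------
# hostfile (assumed layout) -- one host per line, optional :slots
compute-0-5:16
compute-0-6:16
----------------------------------------------------------
Note that the mpich2 PE uses $fill_up, which packed all 8 slots onto
compute-0-5 above; if the hostfile run distributed its 8 ranks differently,
the two timings are not strictly like-for-like.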
Thanks in advance
Regards
Tilak
>
> ------------------------------
>
> Message: 2
> Date: Mon, 25 Jul 2011 13:47:01 +0200
> From: Reuti <reuti at staff.uni-marburg.de>
> Subject: Re: [mpich-discuss] mpich2 does not work with SGE
> To: mpich-discuss at mcs.anl.gov
> Message-ID:
> <6724EB30-A6E2-4172-A073-222820574652 at staff.uni-marburg.de>
> Content-Type: text/plain; charset=us-ascii
>
> Hi,
>
> On 25.07.2011 at 11:16, tilakraj dattaram wrote:
>
> > Thanks a lot for all your help.
> >
> > Now we can run parallel jobs through SGE (using a script file and
> > qsub). We submitted some test jobs and recorded the timings in order to
> > compare the speeds with mpirun executed from the command line.
> >
> > For some reason (which I can't find out) running jobs through SGE is
> > much slower than the command line. Is it expected that the command line
> > works faster than SGE?
> >
> > This is the comparison table,
> >
> > CASE 1
> > =======
> > mpiexec (both mpiexec and mpirun point to the same link, i.e.
> > mpiexec.hydra) from command line with a hostfile
> >
> > # mpiexec -f hostfile -np n ./a.out < input
> >
> ..................................................................................
> > 16 cores per node
> >
> > N cpus time (in secs) speedup
> >
> > 1 262.27 1
> > 2 161.34 1.63
> > 4 82.41 3.18
> > 8 42.45 6.18
> > 16 33.56 7.82
> > 24 25.33 10.35
> > 32 20.38 12.87
> >
> ..................................................................................
> >
> > CASE 2
> > =======
> > mpich2 integrated with SGE
> >
> > mpirun executed from within the job script
> >
> > This is our PE
> >
> > pe_name mpich2
> > slots 999
> > user_lists NONE
> > xuser_lists NONE
> > start_proc_args NONE
> > stop_proc_args NONE
> > allocation_rule $fill_up
> > control_slaves TRUE
> > job_is_first_task FALSE
> > urgency_slots min
> > accounting_summary FALSE
> >
> > # qsub -q queue ./a.out<input>output (the PE is defined inside the
> > script file using the following option, #$ -pe mpich2 n)
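Editorial note on the quoted PE (not part of the original message): for the
integration to take effect, the PE also has to appear in the queue's pe_list.
A quick way to verify this, assuming the molsim.q queue from above:

$ qconf -sq molsim.q | grep pe_list     # should list mpich2
$ qconf -mq molsim.q                    # edit the queue to add it if missing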
> >
> .................................................................................
> > qsub [ 16 cores per node ]
> >
> > N cpus time (in secs) speedup
> >
> > 1 383.5 1
> > 2 205.1 1.87
> > 4 174.3 2.2
> > 8 159.3 2.4
> > 16 123.2 3.1
> > 24 136.8 2.8
> > 32 124.6 3.1
> >
> .................................................................................
> >
> >
> > As you can notice, the speedup is only about 3-fold for a job run on
> > 16 processors when submitted through SGE, whereas it is nearly 8-fold
> > when the parallel jobs are started from the command line using mpirun.
> > Another thing to note is that the speedup nearly saturates around 3 for
> > SGE, whereas it keeps increasing to around 13 at 32 processors for
> > command-line execution. In fact, we had earlier found that the speedup
> > keeps increasing up to about 144 processors, where it gives a maximum
> > speedup of about 20-fold over the serial job.
>
> no, this shouldn't be. There might be a small delay for the very first
> startup, as SGE's integrated `qrsh` startup will be used instead of a
> plain `ssh`, but this shouldn't create such a huge difference. Once it's
> started, the usual communication inside MPICH2 will be used and there
> shouldn't be any noticeable difference.
>
> Can you please check on the various nodes, whether the allocation of the
> processes of your job is correct:
>
> $ ps -e f
>
> (f w/o -) and that all processes are bound to the sge_shepherd on the
> intended nodes and no node is overloaded? Did you define a special queue for
> this (in SGE it's not necessary but possible, depending on your personal
> taste), and limit the available cores per node across all queues to avoid
> oversubscription?
>
> -- Reuti
>
>
> > Your help will be greatly appreciated!
> >
> > Thank you
> >
> > Regards
> > Tilak
> >
>