[Swift-user] cray (Edison)

Mihael Hategan hategan at mcs.anl.gov
Tue Dec 23 16:35:53 CST 2014


Hi Ketan,

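I would try nodeGranularity/maxNodes = 600 and mppnppn = 24. The term
"node" there is a slight misnomer: as your submit script shows, that
value ends up as mppwidth (and aprun -n), which counts tasks rather
than physical nodes.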

Mihael
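
The arithmetic behind that suggestion, as a minimal Python sketch. The
helper names (mpp_resources, pbs_resource_line) are made up for
illustration and are not part of Swift or PBS; the sketch only assumes
one task per core, so 25 nodes with 24 cores each give a width of 600.

```python
def mpp_resources(nodes, cores_per_node, depth=None):
    """Return Cray mpp* values for a job that places one task on every core."""
    width = nodes * cores_per_node  # total tasks (processing elements), not nodes
    resources = {"mppwidth": width, "mppnppn": cores_per_node}
    if depth is not None:
        resources["mppdepth"] = depth
    return resources


def pbs_resource_line(resources):
    """Format the values as a '#PBS -l' directive like the one in the submit script."""
    return "#PBS -l " + ",".join(f"{key}={value}" for key, value in resources.items())


if __name__ == "__main__":
    # 25 nodes x 24 cores per node = 600 tasks
    print(pbs_resource_line(mpp_resources(nodes=25, cores_per_node=24)))
```

For the 25-node, 24-core case this prints
"#PBS -l mppwidth=600,mppnppn=24", the request the original message
was after.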

On Thu, 2014-12-18 at 09:00 -0600, Ketan Maheshwari wrote:
> Hi,
> 
> I am trying to submit to a Cray machine (Edison) with 24 cores per node. I
> am looking to submit a 25-node, 600-task job, but with my sites
> configuration it results in a 25-node, 25-task submission.
> 
> The sites file bits are:
> 
>     <profile namespace="globus" key="jobsPerNode">24</profile>
>     <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
>     <profile namespace="globus" key="maxTime">1000</profile>
>     <profile namespace="globus" key="wallTime">00:30:00</profile>
>     <profile namespace="globus" key="slots">1</profile>
>     <profile namespace="globus" key="nodeGranularity">25</profile>
>     <profile namespace="globus" key="maxNodes">25</profile>
> 
> The resulting job is:
> 
>                                                                       Req'd   Req'd        Elap
> Job ID            Username  Queue  Jobname          SessID  NDS  TSK  Memory  Time      S  Time
> ----------------  --------  -----  ---------------  ------  ---  ---  ------  --------  -  ----
> 2204800.edique02  ketan     debug  B1218-5106290-0  --      25   25   --      00:16:00  C  --
> 
> The resulting submit script is:
> 
> #PBS -S /bin/bash
> #PBS -N B1218-5106290-0
> #PBS -m n
> #PBS -l mppwidth=25,mppnppn=1,mppdepth=24
> #PBS -l walltime=00:16:00
> #PBS -q debug
> #PBS -o /scratch2/scratchdirs/ketan/WRF_LES/WRFV3/ketanrun/run001/scripts/PBS8090417396448920648.submit.stdout
> #PBS -e /scratch2/scratchdirs/ketan/WRF_LES/WRFV3/ketanrun/run001/scripts/PBS8090417396448920648.submit.stderr
> export WORKER_LOGGING_LEVEL=NONE
> #PBS -v WORKER_LOGGING_LEVEL
> cd / && aprun -n 25 -N 1 -cc none -d 24 -F exclusive /bin/sh -c '/usr/bin/perl /global/homes/k/ketan/.globus/coasters/cscript6966010727500767046.pl http://10.10.20.170:58984,http://10.100.100.52:58984,http://10.141.1.2:58984,http://127.0.0.2:58984,http://128.55.34.2:58984,http://128.55.72.100:58984,http://128.55.72.22:58984 1218-5106290-000000 NOLOGGING'
> /bin/echo $? >/scratch2/scratchdirs/ketan/WRF_LES/WRFV3/ketanrun/run001/scripts/PBS8090417396448920648.submit.exitcode
> 
> 
> I am looking to get mppwidth and the -n switch of aprun set to 600.
> 
> Thanks for any suggestions.
> 
> Ketan
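
One way to confirm what a generated submit script actually requests is
to pull mppwidth and the aprun -n value out of it. This is a rough
Python sketch, not anything Swift provides: requested_widths is a
hypothetical helper, and SAMPLE is an abbreviated copy of the script
quoted above with the worker command cut down to a placeholder.

```python
import re


def requested_widths(script_text):
    """Return (mppwidth, aprun -n) found in a PBS submit script, or None where absent."""
    mpp = re.search(r"^#PBS -l .*\bmppwidth=(\d+)", script_text, re.MULTILINE)
    aprun = re.search(r"\baprun\b.*?\s-n\s+(\d+)", script_text)
    return (
        int(mpp.group(1)) if mpp else None,
        int(aprun.group(1)) if aprun else None,
    )


# Abbreviated from the quoted submit script; the quoted worker command is a placeholder.
SAMPLE = """#PBS -l mppwidth=25,mppnppn=1,mppdepth=24
#PBS -l walltime=00:16:00
cd / && aprun -n 25 -N 1 -cc none -d 24 -F exclusive /bin/sh -c 'worker-command'
"""

if __name__ == "__main__":
    print(requested_widths(SAMPLE))
```

On the sample this reports (25, 25), matching the 25-task submission
described in the message; a script generated after changing the sites
file can be checked the same way.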
> 




