[ExM Users] Submitting tasks on Raven
Scott Krieder
skrieder at iit.edu
Thu Apr 25 16:47:43 CDT 2013
Hi Mike,
It was a problem with the number of nodes and cores requested via qsub.
Once we got GeMTC working with the latest version of Swift in trunk, we were
able to use the new turbine-aprun-run.zsh, which made running on Raven much
easier.
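
For anyone who hits the same apsched error, here is a minimal sketch of the
check Mike suggested in the quoted message below. It assumes the usual Cray
mapping of aprun -n, -N, and -d to mppwidth, mppnppn, and mppdepth.
check_aprun_fits is a hypothetical helper name, not part of Turbine, PBS, or
ALPS.

```shell
# Sketch only: check_aprun_fits is a hypothetical helper.
# Assumed mapping: aprun -n <-> mppwidth, -N <-> mppnppn, -d <-> mppdepth.
check_aprun_fits() {
  # Arguments: mppwidth mppnppn mppdepth n N d
  local width=$1 nppn=$2 depth=$3 n=$4 per_node=$5 d=$6
  local status=0
  if [ "$n" -gt "$width" ]; then
    echo "aprun -n $n exceeds mppwidth=$width"
    status=1
  fi
  if [ "$per_node" -gt "$nppn" ]; then
    echo "aprun -N $per_node exceeds mppnppn=$nppn"
    status=1
  fi
  if [ "$d" -gt "$depth" ]; then
    echo "aprun -d $d exceeds mppdepth=$depth"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "ok"
  fi
  return "$status"
}

# Values from the original aprun.sh: mppwidth=1,mppnppn=3,mppdepth=1
# against aprun -n 3 -N 3 -d 1.
check_aprun_fits 1 3 1 3 3 1 || true

# The same aprun line against a request with mppwidth=3.
check_aprun_fits 3 3 1 3 3 1 || true
```

With the values from the original script, the first call reports that
aprun -n 3 exceeds mppwidth=1, which is presumably the mismatch behind the
error. The second call prints "ok".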
-Scott
On Thu, Apr 25, 2013 at 4:13 PM, Michael Wilde <wilde at mcs.anl.gov> wrote:
> What was causing the error message "apsched: claim exceeds reservation's
> memory" ?
>
> ----- Original Message -----
> > From: "Justin M Wozniak" <wozniak at mcs.anl.gov>
> > To: exm-user at lists.mcs.anl.gov
> > Sent: Thursday, April 25, 2013 1:19:19 PM
> > Subject: Re: [ExM Users] Submitting tasks on Raven
> >
> >
> > This definitely works now; we were able to launch tasks on 6 GPUs on
> > Raven.
> >
> > On 04/25/2013 09:40 AM, Michael Wilde wrote:
> > > Can someone help Scott with this?
> > >
> > > Scott, I think #PBS -m is the email notification flag, not memory.
> > >
> > > It's complaining that your aprun command is asking for more resources
> > > than the qsub command requested for the job.
> > >
> > > Check that the aprun -n, -N, and -d values do not exceed mppwidth,
> > > mppnppn, and mppdepth from the #PBS -l line.
> > >
> > > Test your aprun args with a qsub -I asking for the correct number of
> > > nodes and cores.
> > >
> > > On 4/24/13, Scott Krieder <skrieder at iit.edu> wrote:
> > >> Hi All,
> > >>
> > > >> I'm trying to run noop.tcl with turbine on Raven. I keep getting a
> > > >> memory error:
> > >> apsched: claim exceeds reservation's memory
> > >>
> > > >> I tried a few different values (100, 100M) for
> > >> #PBS -m
> > >> but I keep getting the error.
> > >>
> > >> Is there a way to let the PBS job take as much memory as it needs?
> > >>
> > >> Thanks,
> > >> Scott
> > >>
> > >> =====aprun.sh script that I'm running=====
> > >> # USAGE: qsub aprun.sh
> > >>
> > > >> # The user should copy and edit the parameters throughout this
> > > >> # script marked USER:
> > >>
> > >> # USER: Directory available from compute nodes:
> > >> USER_WORK=/ufs/home/users/p01684
> > >>
> > >> # USER: (optional) Change the qstat name
> > >> #PBS -N turbine
> > >> # USER: Set the job size
> > >> #PBS -l mppwidth=1,mppnppn=3,mppdepth=1
> > >> # USER: Set the wall time
> > >> #PBS -l walltime=10:00
> > > >> # USER: (optional) Redirect output from its default location ($PWD)
> > >> #PBS -o /ufs/home/users/p01684/pbs.out
> > >>
> > >> #PBS -j oe
> > >> #PBS -m n
> > >>
> > >> # USER: Set configuration of Turbine processes
> > >> export TURBINE_ENGINES=1
> > >> export ADLB_SERVERS=1
> > >>
> > >> echo "Turbine: aprun.sh"
> > >> date "+%m/%d/%Y %I:%M%p"
> > >> echo
> > >>
> > >> # Be sure we are in an accessible directory
> > >> cd $PBS_O_WORKDIR
> > >>
> > >> set -x
> > >> # USER: Set Turbine installation path
> > >> export TURBINE_HOME=${USER_WORK}/Public/sfw/turbine
> > >> # USER: Select program name
> > >> # PROGRAM=${USER_WORK}/adlb-data.tcl
> > >> PROGRAM=${TURBINE_HOME}/test/noop.tcl
> > >>
> > >> source ${TURBINE_HOME}/scripts/turbine-config.sh
> > > >> if [[ ${?} != 0 ]]
> > > >> then
> > > >>   echo "turbine: configuration error!"
> > > >>   exit 1
> > > >> fi
> > >>
> > >> # Send environment variables to PBS job:
> > >> #PBS -v TURBINE_ENGINES ADLB_SERVERS TURBINE_HOME
> > >> # USER: Set aprun parameters to agree with PBS -l settings
> > >> # aprun -n 1 -N 1 -cc none -d 1 ${TCLSH} ${PROGRAM}
> > > >> aprun -n 3 -N 3 -cc none -d 1 ${TURBINE_HOME}/bin/turbine ${PROGRAM}
> > >>
> > >> --
> > >> Scott J. Krieder
> > >> C: 419-685-0410
> > >> E: skrieder at iit.edu
> > >> http://datasys.cs.iit.edu/~skrieder/
> > >>
> >
> >
> > --
> > Justin M Wozniak
> >
> > _______________________________________________
> > ExM-user mailing list
> > ExM-user at lists.mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/exm-user
> >