[Swift-user] Job bundles

Ioan Raicu iraicu at cs.uchicago.edu
Tue Nov 6 12:36:48 CST 2007


Here is what I get at the UC/ANL TG site:
qsub -I -l nodes=1:ppn=1:ia32-compute,walltime=0:30:00 -A TG-CCR070008T
qsub -I -l nodes=1:ppn=2:ia32-compute,walltime=0:30:00 -A TG-CCR070008T

iraicu at tg-viz-login2:~> showq -u iraicu

active jobs------------------------
JOBID              USERNAME      STATE PROCS   REMAINING            STARTTIME

1574623              iraicu    Running     2    00:29:55  Tue Nov  6 12:34:23
1574621              iraicu    Running     2    00:29:21  Tue Nov  6 12:33:49

2 active jobs             4 of 242 processors in use by local jobs (1.65%)
                         20 of 121 nodes active      (16.53%)

eligible jobs----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME


0 eligible jobs  

blocked jobs-----------------------
JOBID              USERNAME      STATE PROCS     WCLIMIT            QUEUETIME


0 blocked jobs  

Total jobs:  2

Notice that both jobs have 2 processors allocated!  These same commands 
on TeraPort would have yielded one allocation with 1 processor and 
another with 2 processors.  This is what I meant by "it's a policy 
thing": PBS can be configured to ignore the ppn field.
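One way to see what the scheduler actually granted (as opposed to what ppn requested) is to count lines in $PBS_NODEFILE, which standard PBS populates with one line per allocated processor slot.  A small sketch, simulated here with a temp file standing in for a 1-node, ppn=2 grant (the node name tg-c001 is made up):

```shell
# Assumption: standard PBS behavior -- $PBS_NODEFILE lists the granted
# allocation, one line per processor slot, so its line count shows what
# the scheduler actually gave you.  Simulated below for illustration.
PBS_NODEFILE=$(mktemp)
printf 'tg-c001\ntg-c001\n' > "$PBS_NODEFILE"   # node repeated once per slot
slots=$(wc -l < "$PBS_NODEFILE")
echo "slots allocated: $slots"
sort "$PBS_NODEFILE" | uniq -c                  # slots per node
rm -f "$PBS_NODEFILE"
```

Inside a real job you would drop the mktemp/printf lines and just read the $PBS_NODEFILE the LRM hands you.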

Ioan

Ben Clifford wrote:
> That's what the ppn parameter specifies to PBS.
>
> On Tue, 6 Nov 2007, Ioan Raicu wrote:
>
>> Right, it's not that PBS doesn't support it; it's more of a policy thing.
>> On the TeraGrid, my experience has been that when PBS (or whatever LRM is
>> being used) allocates CPUs, it always allocates at the machine level, not
>> at the CPU level.  That means if you have an 8-processor machine and you
>> get 1 processor on that machine, then you get (and are charged for) the
>> whole machine, as you have exclusive rights to it for the duration of
>> your reservation.  I have seen this behave differently in other
>> environments, such as TeraPort, where PBS was allocating at the processor
>> level, not the machine level.  This is why I said that I think Swift
>> would need to handle this somehow in the worker node scripts, and not
>> necessarily rely on the LRM doing it.
>> Ioan
>>
>> Ben Clifford wrote:
>>     
>>> On Tue, 6 Nov 2007, Ioan Raicu wrote:
>>>
>>>> 2) the LRM allows the partitioning of the SMP machine into smaller
>>>> pieces; for example, with an 8-processor node, if it lets you submit 8
>>>> jobs that each need only 1 processor, and it launches 8 different jobs
>>>> on the same node, then you are fine... the parallelism will be done
>>>> automatically by the LRM, as long as you ask for only 1 process at a
>>>> time; on the TG at least, I don't think this is how things work, and
>>>> when you get a node, regardless of how many processors it has, you get
>>>> full access to all processors, not just the ones you asked for.
>>> PBS allows the specification of multiple processes per node, like this
>>> (grabbed from Google):
>>>
>>>> qsub -l walltime=15:00,nodes=1:ppn=1 script.pbs
>>>
>>> It looks like abe runs PBS.
>>>
>>> So I think you could specify a globus profile key in the sites.xml, perhaps
>>> something like this:
>>>
>>>  <profile namespace="globus" key="ppn">8</profile>
>>>
>>> I haven't tried this myself, but I'd be interested to hear your results.

-- 
============================================
Ioan Raicu
Ph.D. Student
============================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
============================================
Email: iraicu at cs.uchicago.edu
Web:   http://www.cs.uchicago.edu/~iraicu
       http://dsl.cs.uchicago.edu/
============================================
