[Swift-devel] pilot job paper

Allan Espinosa aespinosa at cs.uchicago.edu
Mon Jul 18 01:22:23 CDT 2011


I ran a bunch of tests last year that varied the walltime (actually,
the sleep job durations) and then measured the total CPU hours
consumed over a 24-hour window.  That was a different set of
metrics, but I think we can get the data needed here with the same methods.


2011/7/17 Mihael Hategan <hategan at mcs.anl.gov>:
> On Sun, 2011-07-17 at 19:53 -0500, Ketan Maheshwari wrote:
>>
>> On Sun, Jul 17, 2011 at 4:23 PM, Mihael Hategan <hategan at mcs.anl.gov>
>> wrote:
>>         On Sun, 2011-07-17 at 16:19 -0500, Ketan Maheshwari wrote:
>>
>>         >
>>         >         I'm not sure where cybershake fits in. The problem would be
>>         >         to submit dummy jobs of various walltimes to various
>>         >         clusters and record the amount of time they sit in the
>>         >         queue.
>>         >
>>         > It is just a use-case. I thought, stats based on an (any) real
>>         > application will have more merit compared to dummy jobs.
>>         >
>>
>>         I don't see how it would in this case. We're trying to measure
>>         queuing time vs walltime in a general sense. I think it makes more
>>         sense to not have a specific application, which would simply be
>>         irrelevant information.
>>
>> Ok, we can use the bionimbus cloud with Mike's catsnsleep script, with
>> the following possible variations:
>>
>> Job Completion times: sleep parameter 1s-100s
>> Data : cat file size 1M-1G
>> Bionimbus VMs 1-32
>>
>> Number of jobs n, 1-1000
>
> I need queuing time vs. advertised job walltime on various clusters
> (with various/random degrees of utilization). That's to see whether it's
> useful to have coasters at all.
>
> The number of jobs is an orthogonal dimension (i.e. we may want to
> measure the queuing time vs. #of jobs for various walltimes, but later).
> The actual job duration is not relevant. The amount of data is not
> relevant.
>
> Clouds are an interesting environment, but not for this particular
> problem. That's because we need to see how long it takes to acquire
> resources, not how fast some job middleware is after we got hold of
> those resources.
>
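
A minimal sketch of how the measurement described above (queuing time vs.
advertised walltime, per cluster) might be summarized once the dummy jobs
have run. It assumes each job yields a record of (cluster, advertised
walltime, submit time, start time). The record layout, the function names,
and the sample values are all hypothetical placeholders, not data from any
cluster mentioned in this thread; collecting the timestamps from a real
scheduler is left out.

```python
from collections import defaultdict
from statistics import median


def queue_seconds(submit_ts, start_ts):
    """Queue time is the gap between submission and the start of execution."""
    if start_ts < submit_ts:
        raise ValueError("job started before it was submitted")
    return start_ts - submit_ts


def median_queue_by_walltime(records):
    """Group records by (cluster, advertised walltime) and return the
    median queue time in seconds for each group.

    Each record is a tuple: (cluster, walltime_s, submit_ts, start_ts).
    The actual job duration and data size are deliberately not recorded,
    since only the time spent waiting for resources matters here.
    """
    groups = defaultdict(list)
    for cluster, walltime_s, submit_ts, start_ts in records:
        groups[(cluster, walltime_s)].append(queue_seconds(submit_ts, start_ts))
    return {key: median(values) for key, values in groups.items()}


# Made-up placeholder records, purely to show the shape of the input.
records = [
    ("clusterA", 600, 0, 30),
    ("clusterA", 600, 100, 190),
    ("clusterA", 3600, 0, 400),
    ("clusterA", 3600, 50, 1050),
    ("clusterB", 600, 0, 10),
]

summary = median_queue_by_walltime(records)
for (cluster, walltime_s), q in sorted(summary.items()):
    print(f"{cluster} walltime={walltime_s}s median_queue={q}s")
```

The median is used rather than the mean because queue times tend to have
long tails; that choice is an assumption of this sketch, not something
settled in the discussion.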


