[Fwd: Re: [Swift-devel] Re: swift-falkon problem... plots to explain plateaus...]

Tue Mar 25 10:16:21 CDT 2008

Great, thanks Mihael.  That's a useful step. I'll test.

- Mike

On 3/25/08 10:09 AM, Mihael Hategan wrote:
> I just wrote a version of the wrapper that opens the log in a descriptor
> (so opening happens once). I need to test it first, but I'll commit
> shortly.
> 
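
The difference between redirecting on every log call and opening the log once on a descriptor can be sketched as follows. This is only an illustration: INFO, log_reopen and log_fd are hypothetical names, and the wrapper version referred to above may be structured differently.

```shell
#!/bin/bash
# Sketch only: INFO, log_reopen and log_fd are hypothetical names,
# not taken from wrapper.sh.
INFO=$(mktemp)

# Variant 1: every call redirects, so bash opens, appends to and
# closes the file each time.
log_reopen() {
    echo "$1" >> "$INFO"
}

# Variant 2: open the file once on descriptor 3 and reuse it.
exec 3>> "$INFO"
log_fd() {
    echo "$1" >&3
}

log_reopen "LOG_START"
log_fd "CREATE_JOBDIR"
log_fd "EXECUTE_DONE"

exec 3>&-    # close the descriptor once logging is finished

# Count the lines that reached the file (arithmetic strips padding).
LINES_WRITTEN=$(( $(wc -l < "$INFO") ))
echo "lines written: $LINES_WRITTEN"
```

On a shared filesystem each open and close is a metadata operation, so the second variant should reduce that traffic to one open per job; how much it saves would need to be measured.
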
> On Tue, 2008-03-25 at 10:06 -0500, Michael Wilde wrote:
>> One thing I'll test is generating the info file on /tmp, and moving it 
>> when done to the final job dir.
>>
>> I can see adjusting wrapper.sh to go from very light to very logged with 
>> a few increments in the middle that would be most useful.
>>
>> The main option I think we want to leave for users to toggle in common 
>> usage, is whether to run the app with its jobdir on local disk, 
>> typically below /tmp, or on shared disk.  The user would decide based on 
>> the job's I/O profile and on local disk space availability.
>>
>> Also, I recall some discussion on the success file. That's acceptable 
>> overhead for all but the tiniest of jobs, but when a BGP is eventually 
>> running 100K+ short jobs at once, the rate of success file creation 
>> could become a bottleneck. Seems like we could have an option that 
>> avoids creating and expecting the success file if that proved useful - 
>> need to measure.
>>
>> - Mike
>>
>>
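
The idea of generating the info file on /tmp and moving it to the final job dir when done might look roughly like this. All paths and names (SHARED, JOBID, FINAL_INFO, LOCAL_INFO) are hypothetical stand-ins, and a temporary directory plays the role of the shared filesystem.

```shell
#!/bin/bash
# Sketch only: every path and variable name here is a hypothetical stand-in.
SHARED=$(mktemp -d)                    # stands in for the shared workflow dir
JOBID="job-0001"
FINAL_INFO="$SHARED/info/$JOBID-info"

# The info file lives on local disk while the job runs.
LOCAL_INFO=$(mktemp /tmp/info.XXXXXX)

echo "LOG_START" >> "$LOCAL_INFO"
echo "EXECUTE_DONE" >> "$LOCAL_INFO"

# Only one mkdir and one move touch the shared filesystem, at the end.
mkdir -p "$(dirname "$FINAL_INFO")"
mv "$LOCAL_INFO" "$FINAL_INFO"
```

One trade-off to keep in mind: if the job is killed before the final move, the log stays on the node's local disk and would have to be collected from there.
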
>> On 3/25/08 9:32 AM, Mihael Hategan wrote:
>>> Problem may be that, as a quick test shows, bash opens and closes the
>>> info file every time a redirect is done.
>>>
>>> On Tue, 2008-03-25 at 08:44 -0500, Michael Wilde wrote:
>>>> I did runs the day before with a modified wrapper that bypassed the INFO 
>>>> logging. It saved a good amount - I recall about 30% but need to 
>>>> re-check the numbers.
>>>>
>>>> Yes, I came to the same conclusion on the mkdirs.  I'm looking at 
>>>> reducing these, likely moving the jobdir to /tmp.  I think I can do that 
>>>> within the current structure.  wrapper.sh is very clear and nicely 
>>>> written. (Ben: yes, eyeballing the log #s was easy and no problem).
>>>>
>>>> First thing I want to do, though, is run some large scale tests on our 
>>>> two science workflows, increasing the petro-modelling one (the 
>>>> sub-second application) to a larger runtime through app-level batching.
>>>>
>>>> Zhao's latest tests indicate that if we do batches of 40, bringing the 
>>>> jobs from .5 sec to 20 sec, we can saturate the BGP's 4K cores and keep 
>>>> it running efficiently. Given the extra wrapper.sh overhead, I might 
>>>> need to increase that another 10X, but once the app is wrapped in a 
>>>> loop, it makes little difference to the user how big we make that.
>>>>
>>>> The other app is a molecule-docking app, that can be batched similarly.
>>>>
>>>> Once we get those running nicely at a larger, less brutal job time, I'll 
>>>> come back to wrapper.sh tuning.  If you or Ben want to do this in the 
>>>> meantime, though, that would be great.  We have the use-local-disk 
>>>> scenario on our development stack anyways - this would be a good time to 
>>>> do it.  If I do it, it will be only a prototype for measurement purposes.
>>>>
>>>> Mike
>>>>
>>>>
>>>>
>>>>
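
The app-level batching described above (40 invocations of a 0.5 sec app giving a job of about 40 x 0.5 = 20 sec) amounts to wrapping the application in a loop. A minimal sketch, in which run_app and BATCH_SIZE are hypothetical and run_app is only a placeholder for the real application:

```shell
#!/bin/bash
# Sketch only: run_app and BATCH_SIZE are hypothetical names;
# run_app stands in for one ~0.5 sec application invocation.
BATCH_SIZE=40

run_app() {
    echo "result for input $1"
}

OUT=$(mktemp)

# One wrapper invocation now covers BATCH_SIZE application runs.
# The output file is opened once for the whole loop.
for i in $(seq 1 "$BATCH_SIZE"); do
    run_app "$i"
done > "$OUT"
```

Since the per-job wrapper overhead is paid once per batch, raising BATCH_SIZE spreads that cost over more application runs.
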
>>>> On 3/25/08 8:34 AM, Mihael Hategan wrote:
>>>>> On Tue, 2008-03-25 at 08:16 -0500, Michael Wilde wrote:
>>>>>> On 3/25/08 3:31 AM, Mihael Hategan wrote:
>>>>>>> On Tue, 2008-03-25 at 00:28 -0500, Michael Wilde wrote:
>>>>>>>> I eyeballed the wrapperlogs to get a rough idea of what was happening.
>>>>>>>>
>>>>>>>> I ran with wrapperlog saving and no other changes for wf's of 10, 100 
>>>>>>>> and 500 jobs, to see how the exec time grew.  At 500 jobs it grew to 
>>>>>>>> about 30+ seconds for a core app exec time of about 1 sec. (I'm just 
>>>>>>>> recollecting the times as at this point I didn't write much down).
>>>>>>>>
>>>>>>> I would personally like to see those logs.
>>>>>> I listed all the runs in the previous mail (below), Mihael. They are on 
>>>>>> CI NFS at ~benc/swift-logs/wilde/run{345-350}.
>>>>> Sorry about that.
>>>>>
>>>>>>  Let us know what you find.
>>>>>>
>>>>> It looks like this:
>>>>> - 5 seconds between LOG_START and CREATE_JOBDIR. Likely hogs:
>>>>> mkdir -p $WFDIR/info/$JOBDIR
>>>>> mkdir -p $WFDIR/status/$JOBDIR
>>>>> and the creation of the info file.
>>>>> - 2.5 seconds between CREATE_JOBDIR and CREATE_INPUTDIR. Likely problem:
>>>>> mkdir -p $DIR
>>>>> (on a very fuzzy note, if one mkdir takes 2.5 seconds, two will take 5,
>>>>> which seems to roughly fit the observed numbers).
>>>>> - 3.5 seconds for COPYING_OUTPUTS
>>>>> - 2.5 seconds for RM_JOBDIR
>>>>>
>>>>> I'd be curious to know how much of the time is actually spent writing to
>>>>> the logs. That's because I see one second between EXECUTE_DONE and
>>>>> COPYING_OUTPUTS, a place where the only meaningful things that are done
>>>>> are two log messages.
>>>>>
>>>>> Perhaps it may be useful to run the whole thing through strace -T.
>>>>>
>>>>> Mihael
>>>>>
>>>>>
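
To check whether the mkdir calls listed above really account for the observed gaps, each one can be timed on its own. The sketch below uses the bash `time` keyword; WFDIR and JOBDIR are given hypothetical values and time_step is a hypothetical helper. Here WFDIR is a temporary directory, so it would need to point at the shared filesystem to reproduce the measurement.

```shell
#!/bin/bash
# Sketch only: the WFDIR and JOBDIR values and the time_step helper are
# hypothetical. Point WFDIR at the shared filesystem to measure it.
WFDIR=$(mktemp -d)
JOBDIR="jobs/a/job-0001"

TIMEFORMAT=%R    # make the bash `time` keyword print elapsed seconds only

time_step() {
    # Run the given command, discard its output, print its wall-clock time.
    { time "$@" >/dev/null 2>&1 ; } 2>&1
}

T_INFO=$(time_step mkdir -p "$WFDIR/info/$JOBDIR")
T_STATUS=$(time_step mkdir -p "$WFDIR/status/$JOBDIR")

echo "mkdir info:   ${T_INFO}s"
echo "mkdir status: ${T_STATUS}s"
```

For a per-system-call view, strace -T prints the time spent in each call, and its -f and -o options follow child processes and write the trace to a file.
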