[Fwd: Re: [Swift-devel] Re: swift-falkon problem... plots to explain plateaus...]
Michael Wilde
wilde at mcs.anl.gov
Tue Mar 25 00:28:44 CDT 2008
I eyeballed the wrapperlogs to get a rough idea of what was happening.
I ran with wrapperlog saving and no other changes for wf's of 10, 100
and 500 jobs, to see how the exec time grew. At 500 jobs it grew to
about 30+ seconds for a core app exec time of about 1 sec. (I'm just
recollecting the times, as I didn't write much down at this point.)
First results showed more time spent in the app wrapper than in
wrapper.sh. I remedied this by using /tmp as the app-wrapper's working
dir, and caching the app binary on /tmp. This brought a 20+ sec app
exec time down to about 3 seconds.
With this fixed, the total time in wrapper.sh including the app is now
about 15 seconds, with 3 being in the app-wrapper itself. The time seems
about evenly spread over the several wrapper.sh operations, which is not
surprising when 500 wrappers hit NFS all at once.
I then tried 3 more tests:
- a run to see if the app-executable caching on /tmp had an effect
(it didn't)
- a run to see if turning off wrapperlog retrieval had an effect
- a run with data operation throttles (both) set to 100 from 10
None of these last three things had a significant effect.
Tomorrow I will try some mods to the wrapper script. Turning off wrapper
logging in a previous trial yesterday *seemed* to shave 20-30% off the
run time. I need to verify this.
I'm also going to try to use /tmp for the jobdir and reduce wrapper.sh
overhead; also will leave the (tiny) job output on /tmp for later
aggregation (will have some swift questions on that).
Ben, if you want to look at any of these logs, the runs are in
swift-logs/wilde in the order described above (w/comment files):
346: 10 job workflow
347: 100 job wf
348: 500 job wf
349: 500 jobs w/ improved app-wrapper
350: 500 jobs w/ improved app-wrapper & executable on /tmp
351: 500 jobs, wrapperlog saving off
352: 500 jobs, wrapperlog saving off, data throttles at 100 (from 20)
All but the first of these should have falkon logs saved as well.
I have several ideas on how to proceed, but welcome advice and any
discoveries from log analysis.
Thanks,
Mike
On 3/24/08 10:15 PM, Michael Wilde wrote:
> Ben, do you have a script to sum the time spent per step of wrapper.sh,
> over a set of -info files?
>
> On 3/24/08 6:36 PM, Ben Clifford wrote:
>> On Mon, 24 Mar 2008, Mihael Hategan wrote:
>>
>>> As far as I can remember, Ben added fairly comprehensive logging to the
>>> wrapper. That may shed some light on the issue.
>>
>> Indeed I did; and that logging information can be sent back to the
>> submit host by enabling wrapperlog.always.transfer=true
>>
>
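
The question above, about summing the time spent per wrapper.sh step over a set of -info files, could be approached roughly as follows. This is only a sketch under stated assumptions: the thread does not show the -info file format, so the line layout used here ("Progress <ISO timestamp> <STEP>") and the step names in the comments are hypothetical and would need to be adapted to what wrapper.sh actually writes. Each step is charged the gap between its own timestamp and the next event in the same file.

```python
from collections import defaultdict
from datetime import datetime


def parse_events(lines):
    """Yield (timestamp, step) pairs.

    ASSUMED line format (not taken from the thread):
        Progress 2008-03-24T22:15:01 CREATE_JOBDIR
    Lines that do not match are skipped.
    """
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or parts[0] != "Progress":
            continue
        yield datetime.fromisoformat(parts[1]), parts[2]


def step_durations(lines):
    """Seconds per step for one file: the gap until the next logged event."""
    events = list(parse_events(lines))
    totals = defaultdict(float)
    for (t0, step), (t1, _next_step) in zip(events, events[1:]):
        totals[step] += (t1 - t0).total_seconds()
    return dict(totals)


def sum_over_files(paths):
    """Add up per-step seconds across many -info files."""
    grand = defaultdict(float)
    for path in paths:
        with open(path) as f:
            for step, secs in step_durations(f).items():
                grand[step] += secs
    return dict(grand)


# Possible use (the glob pattern is an assumption about file naming):
#   import glob
#   for step, secs in sorted(sum_over_files(glob.glob("*-info")).items()):
#       print(f"{step:20s} {secs:10.1f}")
```

The last event in each file gets no duration, since nothing follows it. With 500 wrappers hitting NFS at once, dividing each total by the number of files would give a mean per-step cost that can be compared across runs.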