[Swift-devel] Performance problem with CDM direct processing

Jonathan Monette jonmon at mcs.anl.gov
Mon Aug 22 13:02:42 CDT 2011


Using bash to do the wildcard matching was one of the ideas we came up with. 

----- Reply message -----
From: "Justin M Wozniak" <wozniak at mcs.anl.gov>
Date: Mon, Aug 22, 2011 12:46 pm
Subject: [Swift-devel] Performance problem with CDM direct processing
To: "Jonathan Monette" <jonmon at mcs.anl.gov>
Cc: "Michael Wilde" <wilde at mcs.anl.gov>, "Jonathan Monette" <jon.monette at gmail.com>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>



This has to do with the way the _swiftwrap shell script looks up those 
files.  To avoid the external use of perl, I will take a look at using 
bash to do the wildcard matching and lookup.  Either that or I will batch 
multiple lookups into one perl call.
 	Justin
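
A minimal sketch of the bash-only matching idea, not the actual _swiftwrap code. The rule table, the glob patterns, and the function name lookup_policy are hypothetical, and the real CDM policy format may differ. It only illustrates that [[ ... == pattern ]] is a bash builtin, so looking up a file does not spawn a perl process:

```shell
#!/usr/bin/env bash
# Hypothetical rule table: glob pattern -> policy name.
# The real CDM policy file format and lookup code may differ.
PATTERNS=( "*.fits" "tmp/*" "*" )
POLICIES=( "DIRECT" "LOCAL" "DEFAULT" )

# Set REPLY to the policy of the first pattern that matches $1.
# Only bash builtins are used, so no child process is forked per file.
lookup_policy() {
    local file=$1 i
    for (( i = 0; i < ${#PATTERNS[@]}; i++ )); do
        # An unquoted right-hand side of == is treated as a glob pattern.
        if [[ $file == ${PATTERNS[i]} ]]; then
            REPLY=${POLICIES[i]}
            return 0
        fi
    done
    REPLY=
    return 1
}
```

The result comes back in REPLY rather than on stdout so that callers do not need $(...), which would fork a subshell for every file and bring back part of the per-file cost.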

On Mon, 22 Aug 2011, Jonathan Monette wrote:

> Correct. I suspect that if we can improve the performance of this section, we 
> can go from a 12-hour run to a 6-8 hour run.
>
> The number of files being processed by the CDM lookup is 320K. 
> What we observed was that several processes were spawned for each file and 
> took maybe a second to run (I think that was the time).
>
> Mike and I had a discussion on how we can replicate it with a simple 
> test case to show the delay, as well as some simple fixes to try out.
>
> ----- Reply message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> Date: Mon, Aug 22, 2011 10:41 am
> Subject: [Swift-devel] Performance problem with CDM direct processing
> To: "Jonathan Monette" <jon.monette at gmail.com>, "Justin M Wozniak" <wozniak at mcs.anl.gov>
> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>
>
>
> Justin,
>
> In testing Montage, Jon observed what looks like a performance bottleneck in the processing of CDM direct output passing.
>
> I *think* what was happening was that a large number of jobs (say 25,000 or more, but I don't recall the exact number; it may have been larger) each produced an output file, and all those files were being passed as input to a merge job.
>
> What we observed was that the scripts being called from _swiftwrap (and perhaps some processing at the vdl-int.k level??? as well) were running very slowly, and that a fairly large number of scripts were being invoked per file. I think (but am not sure) that the high overhead was being observed at the start of the merge job in CDM scripts called by _swiftwrap.
>
> Jon, can you explain what you know about this problem, and then let's see if we can enhance the performance?  This is now the main bottleneck in this application, which is otherwise performing quite well.
>
> Thanks,
>
> - Mike
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

-- 
Justin M Wozniak