[Swift-devel] CPU usage with provider-deef

Ian Foster foster at mcs.anl.gov
Sat Jun 16 10:59:17 CDT 2007


It seems important that Ioan sit down with Mihael and work through the
Falkon code to see where it can be simplified or improved. I am sure
this will identify and fix problems that would otherwise cost us time
later.

Mihael Hategan wrote:
> YourKit (www.yourkit.com) offers free licenses for its profiler to open
> source projects. Point them to a Globus web page that has your name, and
> they'll send you the license. Alternatively, there are other profilers
> out there, and I strongly recommend using them on such issues.
>
> Mihael
>
> On Sat, 2007-06-16 at 10:00 -0500, Ioan Raicu wrote:
>   
>> Nope, I think this is a different problem, or at least a subset of the
>> problems we were having before.
>>
>> Since we fixed the CPU utilization, and we moved to a bigger box (4
>> CPUs with 2GB of memory), everything is happening in a timely fashion
>> (a few ms per notification delivery throughout the experiment).  Plus,
>> I believe the view is consistent (the same tasks look complete on both
>> ends) between Falkon and Swift, but we are still checking on this as
>> the run was made just last night for the 100 mol run.  We'll keep you
>> posted with what we find.
>>
>> Ioan
>>
>> Ben Clifford wrote: 
>>     
>>> On Sat, 16 Jun 2007, Ioan Raicu wrote:
>>>
>>>   
>>>       
>>>> having problems with the 100 molecule run in MolDyn.  It's not clear where the
>>>> problem is; on the surface Falkon looks fine... we are looking into where
>>>> everything breaks to cause Swift to not continue with the workflow to
>>>> completion!
>>>>     
>>>>         
>>> The same problem that you showed me the other day or different?
>>>
>>> with 'the same problem' being that Falkon thinks all the jobs are done,
>>> but that Falkon's measured response time for sending completion
>>> notifications gets approximately linearly longer over time and the Swift
>>> JVM uses ~100% CPU and doesn't indicate job completion at all after a certain
>>> period.
>>>
>>> or different symptoms now?
>>>
>>>   
>>>       
>> -- 
>> ============================================
>> Ioan Raicu
>> Ph.D. Student
>> ============================================
>> Distributed Systems Laboratory
>> Computer Science Department
>> University of Chicago
>> 1100 E. 58th Street, Ryerson Hall
>> Chicago, IL 60637
>> ============================================
>> Email: iraicu at cs.uchicago.edu
>> Web:   http://www.cs.uchicago.edu/~iraicu
>>        http://dsl.cs.uchicago.edu/
>> ============================================
>>     
>
>   

-- 

   Ian Foster, Director, Computation Institute
Argonne National Laboratory & University of Chicago
Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
      Globus Alliance: www.globus.org.
