[Swift-devel] CPU usage with provider-deef

Mike Wilde wilde at mcs.anl.gov
Sat Jun 16 12:03:26 CDT 2007


This should be fun, and a nice break from the I2U2 work that you've 
been immersed in, Mihael.

Want to do a read-through soon, and send out comments for discussion 
that can turn into a list of code improvements to bugzilize?

What I think is important about Falkon is that it's working, it's 
proving out the value of the provisioned direct-scheduling approach 
with numbers, and that it's working for Ioan as a vehicle for his 
research.

What we want to get from the effort is a) Ioan progresses towards 
his PhD; b) the immediate needs of our app-users get met; and c) we 
learn what's needed in architecture, protocol and algorithm for a 
successful long-term approach to running Swift programs efficiently.

The point is that everyone is open to changes and to an eventual 
re-design and re-write. This, Mihael, would be where you can 
propose, design and implement the ideas you've expressed about 
implementing provisioned direct-scheduling using Karajan's remote 
execution mechanisms.

- Mike




Ian Foster wrote, On 6/16/2007 10:59 AM:
> It seems important that Ioan sit down with Mihael and work through the 
> Falkon code to see where it can be simplified, improved, etc. I am sure 
> that this will result in problems being identified and fixed that will 
> otherwise cost us time later.
> 
> Mihael Hategan wrote:
>> Yourkit (www.yourkit.com) has free licenses for open source projects for
>> their profiler. Point them to a globus web page that has your name, and
>> they'll send you the license. Alternatively, there are other profilers
>> out there, and I strongly recommend using them on such issues.
>>
>> Mihael
>>
>> On Sat, 2007-06-16 at 10:00 -0500, Ioan Raicu wrote:
>>   
>>> Nope, I think this is a different problem, or at least a subset of the
>>> problems we were having before.
>>>
>>> Since we fixed the CPU utilization, and we moved to a bigger box (4
>>> CPUs with 2GB of memory), everything is happening in a timely fashion
>>> (a few ms per notification delivery throughout the experiment).  Plus,
>>> I believe the view is consistent (the same tasks look complete on both
>>> ends) between Falkon and Swift, but we are still checking on this as
>>> the run was made just last night for the 100 mol run.  We'll keep you
>>> posted with what we find.
>>>
>>> Ioan
>>>
>>> Ben Clifford wrote: 
>>>     
>>>> On Sat, 16 Jun 2007, Ioan Raicu wrote:
>>>>
>>>>   
>>>>       
>>>>> having problems with the 100 molecule run in MolDyn.  It's not clear where the
>>>>> problem is; on the surface Falkon looks fine... we are looking into where
>>>>> everything breaks to cause Swift to not continue with the workflow to
>>>>> completion!
>>>>>     
>>>>>         
>>>> The same problem that you showed me the other day or different?
>>>>
>>>> with 'the same problem' being that falkon thinks all the jobs are done; 
>>>> but that falkon's measured response time for sending completion 
>>>> notifications gets approximately linearly longer over time and the swift 
>>>> JVM uses ~100% CPU and doesn't indicate job completion at all after a certain 
>>>> period.
>>>>
>>>> or different symptoms now?
>>>>
>>>>   
>>>>       
>>> -- 
>>> ============================================
>>> Ioan Raicu
>>> Ph.D. Student
>>> ============================================
>>> Distributed Systems Laboratory
>>> Computer Science Department
>>> University of Chicago
>>> 1100 E. 58th Street, Ryerson Hall
>>> Chicago, IL 60637
>>> ============================================
>>> Email: iraicu at cs.uchicago.edu
>>> Web:   http://www.cs.uchicago.edu/~iraicu
>>>        http://dsl.cs.uchicago.edu/
>>> ============================================
>>>     
>>
>>   
> 
> -- 
> 
>    Ian Foster, Director, Computation Institute
> Argonne National Laboratory & University of Chicago
> Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
> Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
> Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
>       Globus Alliance: www.globus.org.
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel

-- 
Mike Wilde
Computation Institute, University of Chicago
Math & Computer Science Division
Argonne National Laboratory
Argonne, IL   60439    USA
tel 630-252-7497 fax 630-252-1997


