[Swift-devel] CPU usage with provider-deef

Ioan Raicu iraicu at cs.uchicago.edu
Sat Jun 16 14:47:02 CDT 2007


Yes, I know I need to clean up the code and remove unused (dead) code.
Can this wait for the next version I am working on, so I don't do this
clean-up twice?  The version currently out in testing is v0.8; my
development version is v0.9.  I have been distracted lately from
finishing v0.9, but it's not far from being complete.  Mihael, when do
you get back in town?

If this is more urgent, then perhaps I can get you a cleaned-up
version of v0.8 in the coming week.

Ioan

Mihael Hategan wrote:
> On Sat, 2007-06-16 at 12:03 -0500, Mike Wilde wrote:
>   
>> This should be fun, and a nice break from the I2U2 work that you've 
>> been immersed in, Mihael.
>>     
>
> I've already looked at the Falkon code and it's... a lot of code that
> does stuff that I understand only in principle. What you want isn't
> easy, and I have my reservations towards the amount of fun it involves.
>
> That being said, Ioan, would it be possible to have a cleaned-up version
> of the code with no duplicate classes? With duplicates, it's hard for me
> to figure out what's relevant and what isn't. And perhaps with dead
> code/comments removed?
>
> Mihael
>
>   
>> Want to do a read-through soon, and send out comments for discussion 
>> that can turn into a list of code improvements to bugzilize?
>>
>> What I think is important about Falkon is that it's working, it's 
>> proving out the value of the provisioned direct-scheduling approach 
>> with numbers, and it's working for Ioan as a vehicle for his 
>> research.
>>
>> What we want to get from the effort is a) Ioan progresses towards 
>> his PhD; b) the immediate needs of our app-users get met; and c) we 
>> learn what's needed in architecture, protocol and algorithm for a 
>> successful long-term approach to running swift programs efficiently.
>>
>> Point is that everyone is open to changes and towards an eventual 
>> re-design and re-write. This, Mihael, would be where you can 
>> propose, design and implement the ideas you've expressed about 
>> implementing provisioned direct-scheduling using Karajan's remote 
>> execution mechanisms.
>>
>> - Mike
>>
>>
>>
>>
>> Ian Foster wrote, On 6/16/2007 10:59 AM:
>>     
>>> It seems important that Ioan sit down with Mihael and work through the 
>>> Falkon code to see where it can be simplified, improved, etc. I am sure 
>>> that this will result in problems being identified and fixed that will 
>>> otherwise cost us time later.
>>>
>>> Mihael Hategan wrote:
>>>       
>>>> Yourkit (www.yourkit.com) has free licenses for open source projects for
>>>> their profiler. Point them to a globus web page that has your name, and
>>>> they'll send you the license. Alternatively, there are other profilers
>>>> out there, and I strongly recommend using them on such issues.
>>>>
>>>> Mihael
>>>>
>>>> On Sat, 2007-06-16 at 10:00 -0500, Ioan Raicu wrote:
>>>>   
>>>>         
>>>>> Nope, I think this is a different problem, or at least a subset of the
>>>>> problems we were having before.
>>>>>
>>>>> Since we fixed the CPU utilization and moved to a bigger box (4
>>>>> CPUs with 2GB of memory), everything is happening in a timely fashion
>>>>> (a few ms per notification delivery throughout the experiment).  Plus,
>>>>> I believe the view is consistent (the same tasks look complete on both
>>>>> ends) between Falkon and Swift, but we are still checking on this, as
>>>>> the 100-molecule run was made just last night.  We'll keep you
>>>>> posted with what we find.
>>>>>
>>>>> Ioan
>>>>>
>>>>> Ben Clifford wrote: 
>>>>>     
>>>>>           
>>>>>> On Sat, 16 Jun 2007, Ioan Raicu wrote:
>>>>>>
>>>>>>   
>>>>>>       
>>>>>>             
>>>>>>> having problems with the 100 molecule run in MolDyn.  It's not clear where the
>>>>>>> problem is; on the surface Falkon looks fine... we are looking into where
>>>>>>> everything breaks to cause Swift to not continue with the workflow to
>>>>>>> completion!
>>>>>>>     
>>>>>>>         
>>>>>>>               
>>>>>> The same problem that you showed me the other day or different?
>>>>>>
>>>>>> with 'the same problem' being that falkon thinks all the jobs are done, 
>>>>>> but falkon's measured response time for sending completion 
>>>>>> notifications gets approximately linearly longer over time, and the swift 
>>>>>> JVM uses ~100% CPU and doesn't indicate job completion at all after a certain 
>>>>>> period.
>>>>>>
>>>>>> or different symptoms now?
>>>>>>
>>>>>>   
>>>>>>       
>>>>>>             
>>>>> -- 
>>>>> ============================================
>>>>> Ioan Raicu
>>>>> Ph.D. Student
>>>>> ============================================
>>>>> Distributed Systems Laboratory
>>>>> Computer Science Department
>>>>> University of Chicago
>>>>> 1100 E. 58th Street, Ryerson Hall
>>>>> Chicago, IL 60637
>>>>> ============================================
>>>>> Email: iraicu at cs.uchicago.edu
>>>>> Web:   http://www.cs.uchicago.edu/~iraicu
>>>>>        http://dsl.cs.uchicago.edu/
>>>>> ============================================
>>>>> ============================================
>>>>>     
>>>>>           
>>>>   
>>>>         
>>> -- 
>>>
>>>    Ian Foster, Director, Computation Institute
>>> Argonne National Laboratory & University of Chicago
>>> Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
>>> Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
>>> Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
>>>       Globus Alliance: www.globus.org.
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Swift-devel mailing list
>>> Swift-devel at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>>       
>

-- 
============================================
Ioan Raicu
Ph.D. Student
============================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
============================================
Email: iraicu at cs.uchicago.edu
Web:   http://www.cs.uchicago.edu/~iraicu
       http://dsl.cs.uchicago.edu/
============================================
============================================
