[Swift-devel] running 1 billion tasks through Falkon

Ian Foster foster at anl.gov
Thu Oct 16 14:27:37 CDT 2008


Way to go -- GigaJobs!


On Oct 16, 2008, at 2:09 PM, Ioan Raicu wrote:

> Hi again,
> Up to a few days ago, the largest test I ever did with Falkon, in  
> terms of number of tasks, was 20 million tasks, which worked as  
> expected. To push Falkon further, perhaps to the point where it
> might break, I repeated that experiment, but now with 1B (billion)
> tasks. Note that all 1 billion tasks came from a single invocation
> of the Falkon command-line client.
>
> On a separate note, I noticed that with simple sleep 0 tasks I
> can't seem to saturate the 8 CPU cores where I run the service;
> usually only 1~2 cores are utilized. So I decided to run 4 Falkon
> services on the same machine and use the load-balancing client
> (designed for the BG/P) to run the 1B tasks (from another node with
> dual CPUs) across these 4 services (running on the 8-core node).
> Each service managed 32 CPUs in the ANL/UC TG cluster, for a total
> of 128 CPUs, and each task was a simple sleep 0, with no I/O.
>
> Here is the plot of the run.
> http://people.cs.uchicago.edu/~iraicu/projects/Falkon/plots/Falkon-1B-4serv-128cpu.jpg
>
> The good news is that the test ran great, averaging 15558
> tasks/sec. The bad news is that the throughput dropped from about
> 17000 tasks/sec (at the beginning) to about 15500 tasks/sec (at the
> end). The explanation for the drop comes from Java memory management
> on the client side, which apparently was spending 5~10 seconds in
> garbage collection every 60 seconds or so, while the amount of free
> heap space was steadily decreasing on average. See the graph
> (http://people.cs.uchicago.edu/~iraicu/projects/Falkon/plots/Falkon-1B-4serv-128cpu-mem.jpg),
> which shows the free heap size shrinking to around 200MB (from the
> max of 1536MB). At the current rate of the potential memory leak in
> the client, the free heap would reach 0MB at about 1.5 billion
> tasks.
>
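
As a quick sanity check on the figures above, here is a minimal Python sketch of the arithmetic. It uses only numbers quoted in the message (1 billion tasks, 15558 tasks/sec on average, roughly 17000 and 15500 tasks/sec at the start and end, 5~10 seconds of GC per 60 seconds or so). The names are made up for illustration and are not part of Falkon, and the sketch assumes throughput falls in proportion to the time spent paused in GC, which is a simplification.

```python
# Back-of-the-envelope check of the numbers quoted above.
# All names are illustrative only; nothing here is part of Falkon.

TOTAL_TASKS = 1_000_000_000
AVG_RATE = 15558      # tasks/sec, average over the run
START_RATE = 17000    # tasks/sec, near the beginning
END_RATE = 15500      # tasks/sec, near the end


def run_hours(tasks, rate):
    """Wall-clock hours needed to push `tasks` through at `rate` tasks/sec."""
    return tasks / rate / 3600.0


def gc_fraction(pause_s, period_s=60.0):
    """Fraction of wall-clock time lost if GC pauses `pause_s` every `period_s`."""
    return pause_s / period_s


# Relative drop in throughput between the start and the end of the run.
observed_drop = (START_RATE - END_RATE) / START_RATE

print(f"run time at average rate:  {run_hours(TOTAL_TASKS, AVG_RATE):.1f} hours")
print(f"observed throughput drop:  {observed_drop:.1%}")
print(f"GC share at 5 s per 60 s:  {gc_fraction(5):.1%}")
print(f"GC share at 10 s per 60 s: {gc_fraction(10):.1%}")
```

Under that assumption the observed drop of just under 9% sits at the low end of the roughly 8% to 17% of wall-clock time that 5~10 seconds of GC per minute would cost, so the GC explanation looks at least plausible. The whole run works out to a little under 18 hours at the average rate.
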
> I just thought this was an interesting experiment; it revealed a
> memory leak in the client that did not show up in the smaller tests
> I had done so far.
>
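
For the leak itself, a rough two-point linear extrapolation might look like the sketch below. It is only an illustration: it assumes the free heap started at the full 1536MB and fell linearly to 200MB over the 1 billion tasks, which the message does not state (the plot shows the real curve), and the function name is hypothetical. Because of that assumption the result need not match the 1.5 billion figure quoted above.

```python
def tasks_at_heap_exhaustion(free_start_mb, free_now_mb, tasks_done):
    """Linearly extrapolate the task count at which free heap reaches 0 MB.

    Assumes free heap shrinks by a constant amount per task (a simplification).
    """
    leaked_mb = free_start_mb - free_now_mb
    if leaked_mb <= 0:
        raise ValueError("free heap did not shrink; nothing to extrapolate")
    mb_per_task = leaked_mb / tasks_done
    return free_start_mb / mb_per_task


# Assumed starting point: all 1536 MB free at task 0 (not stated in the message).
estimate = tasks_at_heap_exhaustion(1536, 200, 1_000_000_000)
print(f"free heap would reach 0 MB after about {estimate / 1e9:.2f} billion tasks")
```

With those inputs the estimate lands near 1.15 billion tasks. A different starting value for the free heap, or a trend that is not linear, changes the answer, so the plot remains the better guide.
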
> Cheers,
> Ioan
>
> -- 
> ===================================================
> Ioan Raicu
> Ph.D. Candidate
> ===================================================
> Distributed Systems Laboratory
> Computer Science Department
> University of Chicago
> 1100 E. 58th Street, Ryerson Hall
> Chicago, IL 60637
> ===================================================
> Email: iraicu at cs.uchicago.edu
> Web:   http://www.cs.uchicago.edu/~iraicu
> http://dev.globus.org/wiki/Incubator/Falkon
> http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
> ===================================================
>
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel



