[Swift-devel] Clustering and Temp Dirs with Swift

Veronika Nefedova nefedova at mcs.anl.gov
Fri Oct 26 16:17:01 CDT 2007


Andrew,

I am not sure I understand you correctly. If you want all your
working directories to be on a local disk, why don't you specify that
local directory as the 'workdirectory' in your sites.xml file? All
temp dirs will be created relative to the workdirectory given there.
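
For example, here is a minimal sites.xml sketch (the pool handle,
hostname and path are made up -- only the workdirectory element
matters here, and the path has to be visible on the compute nodes):

    <pool handle="teraport">
      <gridftp url="gsiftp://tp-grid.uchicago.edu"/>
      <jobmanager universe="vanilla"
                  url="tp-grid.uchicago.edu/jobmanager-pbs" major="2"/>
      <workdirectory>/tmp/yourname/swiftwork</workdirectory>
    </pool>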

Nika

On Oct 26, 2007, at 4:05 PM, Andrew Robert Jamieson wrote:

> Ioan,
>
>   Thanks for the explanation.  It seems like you characterized  
> what is going on pretty well.
>
> One question I have: does this slowdown occur only when everything  
> is in the same directory, or anywhere on the shared GPFS at any  
> given time?
>
> Furthermore, why can't the short-lived directory live somewhere  
> under the local node's /tmp?  I have wrapped all my programs to  
> ensure that they execute ONLY in local node directories,  
> specifically to avoid this type of problem (roughly along the  
> lines of the sketch below). Now Swift seems to be making that  
> effort irrelevant.
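>
> Each of my wrappers does something like this (a simplified sketch;  
> program names and paths are placeholders):
>
>     import os, shutil, subprocess, tempfile
>
>     def run_locally(inputs, outputs, cmd, dest):
>         # stage inputs to scratch space on the node-local disk
>         scratch = tempfile.mkdtemp(prefix="job-", dir="/tmp")
>         try:
>             for f in inputs:
>                 shutil.copy(f, scratch)
>             # execute the real program entirely inside local /tmp
>             subprocess.check_call(cmd, cwd=scratch)
>             # only the final results touch the shared filesystem
>             for f in outputs:
>                 shutil.copy(os.path.join(scratch, f), dest)
>         finally:
>             shutil.rmtree(scratch)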
>
> Does this seem reasonable?
>
> Thanks,
> Andrew
>
> On Fri, 26 Oct 2007, Ioan Raicu wrote:
>
>> I am not sure what configuration exists on TP, but on the TeraGrid  
>> ANL/UC cluster, with 8 servers behind GPFS, the wrapper script  
>> throughput (create dir, create symbolic links, remove  
>> directory... all on GPFS) is anywhere between 20 and 40  
>> invocations per second, depending on how many nodes are doing  
>> this concurrently.  The throughput increases at first as you add  
>> nodes, but then drops back to about 20/sec with 20-30+ nodes.  
>> What this means is that even if you bundle jobs up, you will not  
>> get any better throughput than this, regardless of how short the  
>> jobs are.  Now, if TP has fewer than 8 servers, it is likely that  
>> the throughput it can sustain is even lower, and if you push it  
>> over the edge it can start thrashing, at which point the  
>> throughput becomes extremely small.   I don't have any  
>> suggestions for getting around this, other than making your jobs  
>> larger on average, and hence running fewer jobs over the same  
>> period of time.
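>>
>> If you want to see what TP's GPFS can sustain, a quick test like  
>> this (a rough sketch; the base path is a placeholder) mimics the  
>> directory churn that the wrapper generates per job:
>>
>>     import os, time
>>
>>     def churn(base, n=200):
>>         # per job: make a dir, symlink an input, then clean up
>>         t0 = time.time()
>>         for i in range(n):
>>             d = os.path.join(base, "jobdir-%d" % i)
>>             os.mkdir(d)
>>             os.symlink("/etc/hosts", os.path.join(d, "input"))
>>             os.unlink(os.path.join(d, "input"))
>>             os.rmdir(d)
>>         return n / (time.time() - t0)
>>
>>     print "dirs/sec:", churn("/gpfs/scratch/yourname")
>>
>> Run it from several nodes at once to see the aggregate throughput  
>> peak and then fall off.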
>>
>> Ioan
>>
>> Andrew Robert Jamieson wrote:
>>> I am kind of at a standstill for getting anything done on TP  
>>> right now with this problem. Are there any suggestions for  
>>> overcoming this in the meantime?
>>> On Fri, 26 Oct 2007, Andrew Robert Jamieson wrote:
>>>> Hello all,
>>>>  I am encountering the following problem on Teraport.  I submit  
>>>> a clustered Swift workflow which amounts to something on the  
>>>> order of 850x3 individual jobs in total. I have clustered the  
>>>> jobs because they are very fast (somewhere around 20 sec to 1  
>>>> min long).  When I submit the workflow on TP things start out  
>>>> fantastic: I get tens of output files in a matter of seconds,  
>>>> and nodes start and finish clustered batches in minutes or  
>>>> less. However, after about 3-5 minutes, when clustered jobs  
>>>> begin to line up in the queue and more start running at the  
>>>> same time, output slows to a trickle.
>>>> One thing I noticed: when I try a simple ls on TP in the Swift  
>>>> temp running directory where the temp job dirs are created and  
>>>> destroyed, it takes a very long time, and when it finishes only  
>>>> five or so things are in the dir (this is the dir containing  
>>>> "info kickstart shared status wrapper.log").  What I think is  
>>>> happening is that TP's filesystem can't handle this extremely  
>>>> rapid creation/destruction of directories in that shared  
>>>> location. From what I have been told, these temp dirs come and  
>>>> go as long as the job runs successfully.
>>>> What I am wondering is whether there is any way to move that  
>>>> dir to the local node's tmp directory rather than the shared  
>>>> file system while the job is running, and, if something fails,  
>>>> have it sent to the appropriate place afterwards.
>>>> Or, another layer of temp dir wrapping could be applied,  
>>>> labeled perhaps with respect to the clustered job grouping and  
>>>> not the individual jobs (since there are thousands being  
>>>> computed at once). That way these dirs would only be created  
>>>> and deleted every 5 or 10 minutes (if clustered properly on my  
>>>> part) instead of one event every millisecond or what have you.
>>>> I don't know which of these solutions is feasible, or whether  
>>>> any are at all, but this seems to be a major problem for my  
>>>> workflows.  In general it is never good to have a million  
>>>> things coming and going in one place on a shared file system,  
>>>> from my experience at least.
>>>> Thanks,
>>>> Andrew
>>
>> -- 
>> ============================================
>> Ioan Raicu
>> Ph.D. Student
>> ============================================
>> Distributed Systems Laboratory
>> Computer Science Department
>> University of Chicago
>> 1100 E. 58th Street, Ryerson Hall
>> Chicago, IL 60637
>> ============================================
>> Email: iraicu at cs.uchicago.edu
>> Web:   http://www.cs.uchicago.edu/~iraicu
>>      http://dsl.cs.uchicago.edu/
>> ============================================
>>
>>