[Swift-devel] Clustering and Temp Dirs with Swift

Mihael Hategan hategan at mcs.anl.gov
Fri Oct 26 16:29:45 CDT 2007


On Fri, 2007-10-26 at 16:20 -0500, Ioan Raicu wrote:
> Nika,
> Can it really be that simple?  How does the data then move from the 
> local disk scratch directory to the shared directory on GPFS?  At the 
> very least, you'd have to modify the wrapper script to not do symbolic 
> linking, but actually to copy the input data to the local disk temporary 
> scratch directory.

Yes, you would.
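
To make that concrete, here is a rough sketch of the kind of change being
described (this is not the actual Swift wrapper script; the paths and the
*.out output pattern are placeholders):

  #!/bin/sh
  # Per-job scratch directory on the local disk (placeholder path).
  JOBDIR=/tmp/$USER/swiftjob.$$
  # Shared work directory on GPFS (placeholder path).
  SHARED=/gpfs/home/$USER/swiftwork/shared

  mkdir -p "$JOBDIR"

  # Copy each input file from GPFS into local scratch instead of
  # creating a symbolic link back to the shared directory.
  for f in "$@"; do
      cp "$SHARED/$f" "$JOBDIR/$f"
  done

  # ... run the application with its working directory set to $JOBDIR ...

  # Copy outputs back to the shared directory, then clean up.
  for o in "$JOBDIR"/*.out; do
      [ -e "$o" ] && cp "$o" "$SHARED/"
  done
  rm -rf "$JOBDIR"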

> 
> Ioan
> 
> Veronika Nefedova wrote:
> > Andrew,
> >
> > I am not sure if I understand you correctly. If you want all your
> > working directories to be on a local disk, why don't you specify that
> > local directory in your sites.xml file as the 'workdirectory'? All temp
> > dirs will be relative to that workdirectory from the sites.xml file.
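> >
> > As a concrete illustration (the handle and path below are only
> > placeholders, and a real pool entry has other elements as well), a
> > sites.xml entry of that shape looks roughly like this:
> >
> >   <pool handle="teraport">
> >     <!-- execution provider, gridftp, etc. go here -->
> >     <workdirectory>/tmp/yourusername/swiftwork</workdirectory>
> >   </pool>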
> >
> > Nika
> >
> > On Oct 26, 2007, at 4:05 PM, Andrew Robert Jamieson wrote:
> >
> >> Ioan,
> >>
> >>   Thanks for the explanation.  It seems like you characterized what
> >> is going on pretty well.
> >>
> >> One question I have: does this contention occur only when everything
> >> is happening in the same directory, or anywhere on the shared GPFS at
> >> any given time?
> >>
> >> Furthermore, why can't the short-lived directory live somewhere under
> >> the local node's /tmp/*?  I have wrapped all my programs to ensure
> >> that things are ONLY executed in the local node's directories,
> >> specifically to avoid this type of problem. Now it seems Swift is
> >> making that effort irrelevant.
> >>
> >> Does this seem reasonable?
> >>
> >> Thanks,
> >> Andrew
> >>
> >> On Fri, 26 Oct 2007, Ioan Raicu wrote:
> >>
> >>> I am not sure what configuration exists on TP, but on the TeraGrid
> >>> ANL/UC cluster, with 8 servers behind GPFS, the wrapper script
> >>> performance (create dir, create symbolic links, remove directory...
> >>> all on GPFS) is anywhere between 20 and 40 per second, depending on
> >>> how many nodes are doing this concurrently.  The throughput
> >>> increases at first as you add nodes, but then decreases to about
> >>> 20/sec with 20-30+ nodes.  What this means is that even if you
> >>> bundle jobs up, you will not get anything better than this,
> >>> throughput-wise, regardless of how short the jobs are.  Now, if TP
> >>> has fewer than 8 servers, it's likely that the throughput it can
> >>> sustain is even lower, and if you push it over the edge it may start
> >>> thrashing, at which point the throughput can become extremely small.
> >>> I don't have any suggestions for getting around this, other than
> >>> making your jobs larger on average, and hence having fewer jobs over
> >>> the same period of time.
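> >>>
> >>> As an illustration, one rough way to measure that rate on a given
> >>> file system (the path below is a placeholder) is to time the same
> >>> create-dir/symlink/remove cycle the wrapper performs:
> >>>
> >>>   #!/bin/sh
> >>>   # Time N wrapper-style cycles (mkdir, symlink an input, rm -rf)
> >>>   # on a shared file system and report how long they took.
> >>>   N=200
> >>>   BASE=/gpfs/scratch/$USER/dirbench    # placeholder GPFS path
> >>>   mkdir -p "$BASE"
> >>>   touch "$BASE/input.dat"
> >>>   start=`date +%s`
> >>>   i=0
> >>>   while [ $i -lt $N ]; do
> >>>       mkdir "$BASE/job$i"
> >>>       ln -s "$BASE/input.dat" "$BASE/job$i/input.dat"
> >>>       rm -rf "$BASE/job$i"
> >>>       i=`expr $i + 1`
> >>>   done
> >>>   end=`date +%s`
> >>>   echo "$N cycles in `expr $end - $start` seconds"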
> >>>
> >>> Ioan
> >>>
> >>> Andrew Robert Jamieson wrote:
> >>>> I am kind of at a standstill for getting anything done on TP right
> >>>> now because of this problem. Are there any suggestions for working
> >>>> around it for the time being?
> >>>> On Fri, 26 Oct 2007, Andrew Robert Jamieson wrote:
> >>>>> Hello all,
> >>>>>  I am encountering the following problem on Teraport.  I submit a 
> >>>>> clustered swift WF which should amount to something on the order 
> >>>>> of 850x3 individual jobs total. I have clustered the jobs because 
> >>>>> they are very fast (somewhere around 20 sec to 1 min long).  When 
> >>>>> I submit the WF on TP things start out fantastic, I get 10s of 
> >>>>> output files in a matter of seconds and nodes would start and 
> >>>>> finish clustered batches in a matter of minutes or less. However, 
> >>>>> after waiting about 3-5 mins, when clustered jobs are begin to 
> >>>>> line up in the queue and more start running at the same time, 
> >>>>> things start to slow down to a trickle in terms of output.
> >>>>> One thing I noticed is when I try a simply ls on TP in the swift 
> >>>>> temp running directory where the temp job dirs are created and 
> >>>>> destroyed, it take a very long time.  And when it is done only 
> >>>>> five or so things are in the dir. (this is the dir with "info  
> >>>>> kickstart  shared  status wrapper.log" in it).  What I think is 
> >>>>> happening is that TP's filesystem cant handle this extremely rapid 
> >>>>> creation/destruction of directories in that shared location. From 
> >>>>> what I have been told these temp dirs come and go as long as the 
> >>>>> job runs successfully.
> >>>>> What I am wondering is if there is anyway to move that dir to the 
> >>>>> local node tmp diretory not the shared file system, while it is 
> >>>>> running and if something fails then have it sent to the 
> >>>>> appropriate place.
> >>>>> Or, if another layer of temp dir wrapping could be applied with 
> >>>>> labeld perhaps with respect to the clustered job grouping and not 
> >>>>> simply the individual jobs (since there are thousands being 
> >>>>> computed at once).
> >>>>> That these things would only be generated/deleted every 5 mins or 
> >>>>> 10 mins (if clustered properly on my part) instead of one event 
> >>>>> every milli second or what have you.
> >>>>> I don't know which solution is feasible or if any are at all, but 
> >>>>> this seems to be a major problem for my WFs.  In general it is 
> >>>>> never good to have a million things coming and going on a shared 
> >>>>> file system in one place, from my experience at least.
> >>>>> Thanks,
> >>>>> Andrew
> >>>
> >>> -- 
> >>> ============================================
> >>> Ioan Raicu
> >>> Ph.D. Student
> >>> ============================================
> >>> Distributed Systems Laboratory
> >>> Computer Science Department
> >>> University of Chicago
> >>> 1100 E. 58th Street, Ryerson Hall
> >>> Chicago, IL 60637
> >>> ============================================
> >>> Email: iraicu at cs.uchicago.edu
> >>> Web:   http://www.cs.uchicago.edu/~iraicu
> >>>      http://dsl.cs.uchicago.edu/
> >>> ============================================
> >>> ============================================
> >>>
> >>>
> >
> >
> 



