<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

But the scenario is probably more parallelizable, in theory.  There is

some common path, say /shared/common/path, and then you have x

directories that you want to create in path, say dir1, dir2, ... ,

dirx.  If the meta-data information is distributed over the 8 I/O

servers, then creating these x directories should be load balanced

across the 8 I/O servers.  If the meta-data is centralized, they will

all hit the same server.  In the end, it doesn't really matter.  What

matters is that it limits the job granularity you can really have, as

the cost of the mkdir and rmdir can quickly outpace the cost of

computation and data staging in and out.  It would be great to have

some alternatives, for workflows that need more throughput than GPFS

can handle.<br>
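<br>
As a rough back-of-the-envelope sketch of that granularity limit (the function names and the simple model are my own assumptions: the shared file system sustains a fixed rate of wrapper cycles per second, and every job needs exactly one cycle):<br>
<pre wrap="">
```python
# Back-of-the-envelope model (an assumption, not a measurement): the shared
# file system sustains a fixed rate of wrapper cycles per second (mkdir,
# symlinks, rmdir), and every job needs exactly one such cycle.


def max_busy_nodes(wrapper_rate_per_sec, job_seconds):
    """Upper bound on nodes that can be kept busy at a given job length."""
    # Each node finishes 1/job_seconds jobs per second, so the file system
    # can feed at most wrapper_rate_per_sec * job_seconds nodes.
    return wrapper_rate_per_sec * job_seconds


def min_job_seconds(wrapper_rate_per_sec, nodes):
    """Shortest job length that still keeps `nodes` nodes busy."""
    return nodes / wrapper_rate_per_sec


if __name__ == "__main__":
    rate = 20.0  # wrapper cycles/sec, the low end quoted in this thread
    for job_len in (1, 20, 60):
        print(job_len, "s jobs:", max_busy_nodes(rate, job_len), "nodes at most")
    print("100 nodes need jobs of at least", min_job_seconds(rate, 100), "s")
```
</pre>
Under this model, at 20 cycles/sec, 1-second jobs can keep at most 20 nodes busy, while 60-second jobs allow up to 1200.  Real behavior is likely worse, since the measured rate drops as more nodes compete.<br>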

<br>

Ioan<br>

<br>

Mihael Hategan wrote:

<blockquote cite="mid:1193459247.21362.6.camel@blabla.mcs.anl.gov"

 type="cite">

  <pre wrap="">On Fri, 2007-10-26 at 23:02 -0500, Ioan Raicu wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">If it doesn't apply to meta-data operations, such as directories, then

it means that meta-data changes in the file system are rather

centralized (maybe this explains the relatively poor performance for

creating and removing directories).

    </pre>

  </blockquote>

  <pre wrap=""><!---->

On GPFS, according to my understanding of their documentation, exactly

one node controls access to one file at any given time. If, for all

observable aspects of the implementation, a directory is a file with a

bunch of metadata for the files it contains, then doing things in a

directory from multiple places is similar to accessing the same file

from multiple places.

Unless I'm blatantly wrong. Probably some complications of that model

exist even if I'm not.

  </pre>

  <blockquote type="cite">

    <pre wrap="">  I would be curious to see how well the solution works to move data

to the local disk first prior to processing, to avoid working from the

shared file system (including the creation and removal of the scratch

temp directory on GPFS).
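
A minimal sketch of that local-disk approach, assuming a hypothetical
callable `work` that runs the job inside the scratch directory and
returns the output files it produced there (none of these names come
from Swift or its wrapper script):

```python
import shutil
import tempfile
from pathlib import Path


def run_on_local_scratch(inputs, shared_out_dir, work, local_tmp=None):
    """Stage inputs to node-local disk, run `work` there, copy results back.

    `work` is a hypothetical callable taking the scratch directory and
    returning the paths of the output files it produced there.
    """
    shared_out_dir = Path(shared_out_dir)
    # With local_tmp=None, tempfile picks the system temp location (usually
    # /tmp, assumed here to be node-local), so creating and removing the
    # scratch directory never touches the shared file system.
    with tempfile.TemporaryDirectory(dir=local_tmp) as scratch:
        scratch = Path(scratch)
        for src in inputs:
            shutil.copy2(src, scratch / Path(src).name)
        outputs = work(scratch)
        copied = []
        for out in outputs:
            dest = shared_out_dir / Path(out).name
            shutil.copy2(out, dest)
            copied.append(dest)
    return copied
```

The shared file system then only sees the input reads and the final
output copies, not a mkdir/rmdir pair per job.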

Ioan  

Mihael Hategan wrote: 

    </pre>

    <blockquote type="cite">

      <pre wrap="">On Fri, 2007-10-26 at 15:11 -0500, Ioan Raicu wrote:

      </pre>

      <blockquote type="cite">

        <pre wrap="">I am not sure what configuration exists on TP, but on the TeraGrid 

ANL/UC cluster, with 8 servers behind GPFS, the wrapper script 

performance (create dir, create symbolic links, remove directory... all 

on GPFS) is anywhere between 20~40 / sec, depending on how many nodes 

you have doing this concurrently.  The throughput increases first as you 

add nodes, but then decreases down to about 20/sec with 20~30+ nodes.  

What this means is that even if you bundle jobs up, you will not get 

anything better than this, throughput wise, regardless of how short the 

jobs are.  Now, if TP has fewer than 8 servers, it's likely that the 

throughput it can sustain is even lower,

        </pre>

      </blockquote>

      <pre wrap="">Perhaps in terms of bytes/s. But I wouldn't be so sure that this applies

to other file stuff.

      </pre>

      <blockquote type="cite">

        <pre wrap="">and if you push it over the 

edge, even to the point of thrashing where the throughput can be 

extremely small.   I don't have any suggestions of how you can get 

around this, with the exception of making your job sizes larger on 

average, and hence have fewer jobs over the same period of time.
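
For anyone who wants to reproduce a number like this on their own file
system, here is a minimal single-process sketch.  It only imitates the
wrapper's create dir / create symlinks / remove dir cycle; the function
name, the number of links, and the file names are assumptions, not what
the real wrapper script does.

```python
import os
import tempfile
import time


def wrapper_cycles_per_sec(base_dir, cycles=200, links=3):
    """Time `cycles` rounds of mkdir + symlinks + cleanup under base_dir."""
    target = os.path.join(base_dir, "input.dat")
    with open(target, "w") as f:
        f.write("x")
    start = time.perf_counter()
    for i in range(cycles):
        job_dir = os.path.join(base_dir, "job%d" % i)
        os.mkdir(job_dir)
        names = [os.path.join(job_dir, "link%d" % j) for j in range(links)]
        for name in names:
            os.symlink(target, name)
        # A directory must be empty before rmdir, so drop the links first.
        for name in names:
            os.unlink(name)
        os.rmdir(job_dir)
    elapsed = time.perf_counter() - start
    os.unlink(target)
    return cycles / elapsed


if __name__ == "__main__":
    # Point this at a directory on the shared file system to measure it;
    # a local temp directory is used here only so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as d:
        print("%.0f cycles/sec" % wrapper_cycles_per_sec(d))
```

Running one copy per node against the same shared directory would show
how the aggregate rate changes with concurrency.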

Ioan

Andrew Robert Jamieson wrote:

        </pre>

        <blockquote type="cite">

          <pre wrap="">I am kind of at a standstill for getting anything done on TP right 

now with this problem. Are there any suggestions to overcome this for 

the time being?

On Fri, 26 Oct 2007, Andrew Robert Jamieson wrote:

          </pre>

          <blockquote type="cite">

            <pre wrap="">Hello all,

 I am encountering the following problem on Teraport.  I submit a 

clustered swift WF which should amount to something on the order of 

850x3 individual jobs total. I have clustered the jobs because they 

are very fast (somewhere around 20 sec to 1 min long).  When I submit 

the WF on TP things start out fantastic, I get 10s of output files in 

a matter of seconds and nodes would start and finish clustered 

batches in a matter of minutes or less. However, after waiting about 

3-5 mins, when clustered jobs begin to line up in the queue and 

more start running at the same time, things start to slow down to a 

trickle in terms of output.

One thing I noticed is when I try a simple ls on TP in the swift temp 

running directory where the temp job dirs are created and destroyed, 

it takes a very long time.  And when it is done only five or so things 

are in the dir. (this is the dir with "info  kickstart  shared  

status wrapper.log" in it).  What I think is happening is that TP's 

filesystem can't handle this extremely rapid creation/destruction of 

directories in that shared location. From what I have been told these 

temp dirs come and go as long as the job runs successfully.

What I am wondering is if there is any way to move that dir to the 

local node tmp directory, not the shared file system, while it is 

running and if something fails then have it sent to the appropriate 

place.

Or, if another layer of temp dir wrapping could be applied, 

labeled perhaps with respect to the clustered job grouping and not 

simply the individual jobs (since there are thousands being computed 

at once).

That way these things would only be generated/deleted every 5 mins or 10 

mins (if clustered properly on my part) instead of one event every 

millisecond or what have you.

I don't know which solution is feasible or if any are at all, but 

this seems to be a major problem for my WFs.  In general it is never 

good to have a million things coming and going on a shared file 

system in one place, from my experience at least.

Thanks,

Andrew

_______________________________________________

Swift-devel mailing list

<a class="moz-txt-link-abbreviated" href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a>

<a class="moz-txt-link-freetext" href="http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel">http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel</a>

            </pre>

          </blockquote>


        </blockquote>

        <pre wrap="">-- 

============================================

Ioan Raicu

Ph.D. Student

============================================

Distributed Systems Laboratory

Computer Science Department

University of Chicago

1100 E. 58th Street, Ryerson Hall

Chicago, IL 60637

============================================

Email: <a class="moz-txt-link-abbreviated" href="mailto:iraicu@cs.uchicago.edu">iraicu@cs.uchicago.edu</a>

Web:   <a class="moz-txt-link-freetext" href="http://www.cs.uchicago.edu/~iraicu">http://www.cs.uchicago.edu/~iraicu</a>

       <a class="moz-txt-link-freetext" href="http://dsl.cs.uchicago.edu/">http://dsl.cs.uchicago.edu/</a>

============================================

============================================


        </pre>

      </blockquote>

      <pre wrap="">

      </pre>

    </blockquote>


  </blockquote>

  <pre wrap=""><!---->

  </pre>

</blockquote>

<br>


</body>

</html>