<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
<br>
Mihael Hategan wrote:
<blockquote cite="mid:1228170919.3817.11.camel@localhost" type="cite">
<pre wrap="">On Mon, 2008-12-01 at 16:15 -0600, Zhao Zhang wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Desired Data Flow: 1st stage of computation knows the output data will
be used as the input for the next
stage, thus the data is not copied back to GPFS, then the 2nd stage task
arrived and consumed this data.
</pre>
</blockquote>
<pre wrap=""><!---->
This assumes a sequential workflow (t1 -> t2 -> ... -> tn). For anything
more complex, this becomes a nasty scheduling problem. For example, with
(t1, t2) -> t3, which of t1's and t2's outputs should not be copied back?
</pre>
<blockquote type="cite">
<pre wrap="">Key Issue: the 2nd stage task has no idea of where the 1st stage output
data is.
</pre>
</blockquote>
<pre wrap=""><!---->
I beg to disagree. Swift provides the mechanism to record where data is.
The key issue is that queuing systems don't allow control over the exact
nodes that tasks go to.
</pre>
</blockquote>
Well, Falkon with data diffusion gives you that level of control :)<br>
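As a minimal sketch of the kind of decision data diffusion makes (hypothetical
names, not the actual Falkon scheduler), the dispatcher prefers a worker that
already caches the task's input and only falls back to GPFS on a miss:<br>
<pre wrap="">
# Sketch only: hypothetical names, not the Falkon API.
# Maps a file name to the compute nodes holding a cached copy.
cache_index = {"stage1.out": {"cn-041"}}

def dispatch(task_input, idle_workers):
    cached_on = cache_index.get(task_input, set())
    preferred = cached_on.intersection(idle_workers)
    if preferred:
        return preferred.pop(), "local cache"     # data-aware hit
    return idle_workers.pop(), "fetch from GPFS"  # no cached copy available

print(dispatch("stage1.out", {"cn-041", "cn-042"}))  # ('cn-041', 'local cache')
</pre>
<br>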
<blockquote cite="mid:1228170919.3817.11.camel@localhost" type="cite">
<pre wrap="">
Another key issue is that you may not even want to do so, because that
node may be better used running a different task (scheduling problem
again).
</pre>
<blockquote type="cite">
<pre wrap="">Design Alternatives:
1. Data aware task scheduling:
Both swift and falkon need to be data aware. Swift should know where
the output of 1st stage is, which
means, which pset, or say which falkon service.
And the falkon service should know which CN has the data for the 2nd
stage computation.
2. Swift patch jobs vertically
Before sending out any jobs, swift knows those 2 stage jobs has data
dependency, thus send out 1 batched
job as 1 to each worker.
3. Collective IO
Build a shared file system which could be accessed by all CN, instead
of writing output data to GPFS, workers
copy intermediate output data to this shared ram-disk. And retrieve
the data from IFS.
</pre>
</blockquote>
<pre wrap=""><!---->
That seems awfully close to implementing a distributed filesystem, which
I think is a fairly bad idea. If you're trying to avoid GPFS contention,
then avoid it by carefully sticking your data in different directories.
And do keep in mind that most operating systems cache filesystem data in
memory, so a read after write of a reasonably small file will be very
fast with any filesystem.
</pre>
</blockquote>
I don't think you realize how expensive GPFS access is at 100K-CPU
scale. Simple operations that should take milliseconds take tens of
seconds to complete, maybe more. For example, GPFS locking of writes to
a single directory can take thousands of seconds at only 16K-CPU scale.
The idea of creating these islands of shared file systems, each
localized to a small portion of the total number of workers, seems like
a viable way to let more data-intensive applications scale. The problem
is how to express the collective I/O (CIO) so that it works well,
reliably, and transparently. We also have to do more measurements to see
how much performance we gain for the effort we are throwing at the
problem.<br>
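A minimal sketch of what the worker-side staging could look like (hypothetical
paths and names; in practice the location map would live in the Falkon service
or in Swift rather than in the worker itself):<br>
<pre wrap="">
# Sketch only: hypothetical paths and names, not the actual Falkon/Swift code.
import os, socket

SCRATCH = "/dev/shm/ifs"         # ram-disk local to the compute node (IFS)
GPFS    = "/gpfs/scratch/run42"  # shared file system, slow under contention
location = {}                    # file name -> node that holds the data

def stage1_write(name, data):
    # Keep the 1st-stage output on local scratch instead of pushing to GPFS.
    os.makedirs(SCRATCH, exist_ok=True)
    with open(os.path.join(SCRATCH, name), "w") as f:
        f.write(data)
    location[name] = socket.gethostname()  # record where the data lives

def stage2_open(name):
    # The 2nd stage reads locally when the data is here; otherwise it falls
    # back to the copy on GPFS (or pulls it from the owning node).
    if location.get(name) == socket.gethostname():
        return open(os.path.join(SCRATCH, name))
    return open(os.path.join(GPFS, name))
</pre>
<br>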
<br>
Ioan<br>
<blockquote cite="mid:1228170919.3817.11.camel@localhost" type="cite">
<pre wrap="">
_______________________________________________
Swift-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a>
<a class="moz-txt-link-freetext" href="http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel">http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel</a>
</pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
===================================================
Ioan Raicu
Ph.D. Candidate
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: <a class="moz-txt-link-abbreviated" href="mailto:iraicu@cs.uchicago.edu">iraicu@cs.uchicago.edu</a>
Web: <a class="moz-txt-link-freetext" href="http://www.cs.uchicago.edu/~iraicu">http://www.cs.uchicago.edu/~iraicu</a>
<a class="moz-txt-link-freetext" href="http://dev.globus.org/wiki/Incubator/Falkon">http://dev.globus.org/wiki/Incubator/Falkon</a>
<a class="moz-txt-link-freetext" href="http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page">http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page</a>
===================================================
===================================================
</pre>
</body>
</html>