<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
<br>
Mihael Hategan wrote:
<blockquote cite="mid:1228170919.3817.11.camel@localhost" type="cite">
<pre wrap="">On Mon, 2008-12-01 at 16:15 -0600, Zhao Zhang wrote:
</pre>
<blockquote type="cite">
<pre wrap="">Desired Data Flow: 1st stage of computation knows the output data will
be used as the input for the next
stage, thus the data is not copied back to GPFS, then the 2nd stage task
arrived and consumed this data.
</pre>
</blockquote>
<pre wrap=""><!---->
This assumes a sequential workflow (t1 -> t2 -> ... -> tn). For anything
more complex, this becomes a nasty scheduling problem. For example, with
(t1, t2) -> t3, which of t1's and t2's outputs should not be copied back?
</pre>
<blockquote type="cite">
<pre wrap="">Key Issue: the 2nd stage task has no idea of where the 1st stage output
data is.
</pre>
</blockquote>
<pre wrap=""><!---->
I beg to disagree. Swift provides the mechanism to record where data is.
The key issue is that queuing systems don't allow control over the exact
nodes that tasks go to.
</pre>
</blockquote>
Well, Falkon with data diffusion gives you that level of control :)<br>
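As a minimal sketch of the kind of decision data diffusion makes (hypothetical
names, not the actual Falkon scheduler), the dispatcher prefers a worker that
already caches the task's input and only falls back to GPFS on a miss:<br>
<pre wrap="">
# Sketch only: hypothetical names, not the Falkon API.
# Maps a file name to the compute nodes holding a cached copy.
cache_index = {"stage1.out": {"cn-041"}}

def dispatch(task_input, idle_workers):
    cached_on = cache_index.get(task_input, set())
    preferred = cached_on.intersection(idle_workers)
    if preferred:
        return preferred.pop(), "local cache"     # data-aware hit
    return idle_workers.pop(), "fetch from GPFS"  # no cached copy available

print(dispatch("stage1.out", {"cn-041", "cn-042"}))  # ('cn-041', 'local cache')
</pre>
<br>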
<blockquote cite="mid:1228170919.3817.11.camel@localhost" type="cite">
<pre wrap="">
Another key issue is that you may not even want to do so, because that
node may be better used running a different task (scheduling problem
again).
</pre>
<blockquote type="cite">
<pre wrap="">Design Alternatives:
1. Data aware task scheduling:
Both swift and falkon need to be data aware. Swift should know where
the output of 1st stage is, which
means, which pset, or say which falkon service.
And the falkon service should know which CN has the data for the 2nd
stage computation.
2. Swift patch jobs vertically
Before sending out any jobs, swift knows those 2 stage jobs has data
dependency, thus send out 1 batched
job as 1 to each worker.
3. Collective IO
Build a shared file system which could be accessed by all CN, instead
of writing output data to GPFS, workers
copy intermediate output data to this shared ram-disk. And retrieve
the data from IFS.
</pre>
</blockquote>
<pre wrap=""><!---->
That seems awfully close to implementing a distributed filesystem, which
I think is a fairly bad idea. If you're trying to avoid GPFS contention,
then avoid it by carefully sticking your data in different directories.
And do keep in mind that most operating systems cache filesystem data in
memory, so a read after write of a reasonably small file will be very
fast with any filesystem.
</pre>
</blockquote>
I don't think you realize how expensive GPFS access is at 100K-CPU
scale. Simple operations that should take milliseconds take tens of
seconds to complete, maybe more. For example, GPFS locking of writes to
a single directory can take thousands of seconds at only 16K-CPU scale.
The idea of creating these islands of shared file systems, each
localized to a small portion of the total number of workers, seems like
a viable way to let more data-intensive applications scale. The problem
is how to express the collective I/O (CIO) so that it works well,
reliably, and transparently. We also have to do more measurements to see
how much performance we gain for the effort we are throwing at the
problem.<br>
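A minimal sketch of what the worker-side staging could look like (hypothetical
paths and names; in practice the location map would live in the Falkon service
or in Swift rather than in the worker itself):<br>
<pre wrap="">
# Sketch only: hypothetical paths and names, not the actual Falkon/Swift code.
import os, socket

SCRATCH = "/dev/shm/ifs"         # ram-disk local to the compute node (IFS)
GPFS    = "/gpfs/scratch/run42"  # shared file system, slow under contention
location = {}                    # file name -> node that holds the data

def stage1_write(name, data):
    # Keep the 1st-stage output on local scratch instead of pushing to GPFS.
    os.makedirs(SCRATCH, exist_ok=True)
    with open(os.path.join(SCRATCH, name), "w") as f:
        f.write(data)
    location[name] = socket.gethostname()  # record where the data lives

def stage2_open(name):
    # The 2nd stage reads locally when the data is here; otherwise it falls
    # back to the copy on GPFS (or pulls it from the owning node).
    if location.get(name) == socket.gethostname():
        return open(os.path.join(SCRATCH, name))
    return open(os.path.join(GPFS, name))
</pre>
<br>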
<br>
Ioan<br>
<blockquote cite="mid:1228170919.3817.11.camel@localhost" type="cite">
<pre wrap="">
_______________________________________________
Swift-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a>
<a class="moz-txt-link-freetext" href="http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel">http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel</a>
</pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
===================================================
Ioan Raicu
Ph.D. Candidate
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: <a class="moz-txt-link-abbreviated" href="mailto:iraicu@cs.uchicago.edu">iraicu@cs.uchicago.edu</a>
Web: <a class="moz-txt-link-freetext" href="http://www.cs.uchicago.edu/~iraicu">http://www.cs.uchicago.edu/~iraicu</a>
<a class="moz-txt-link-freetext" href="http://dev.globus.org/wiki/Incubator/Falkon">http://dev.globus.org/wiki/Incubator/Falkon</a>
<a class="moz-txt-link-freetext" href="http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page">http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page</a>
===================================================
===================================================
</pre>
</body>
</html>