[Swift-devel] Re: How to wait on functions that return no data?
Mihael Hategan
hategan at mcs.anl.gov
Wed Mar 26 10:51:33 CDT 2008
> I suspect you're not going to like this idea on first consideration. But
> it's related to ideas on how to leverage map-reduce, as I mentioned
> earlier, and Ian's suggestion to explore collective operations. Mihael
> thought my take on this was inelegant and inconsistent with data flow.
Somewhat. What I thought you suggested was pretty much "I don't want to
write my program as dataflow but I want to implement it in a dataflow
language". "And if it doesn't work, then the language should be changed
so that I can".
[...]
>
> If a swift job could efficiently return a set of swift objects without
> using a file
In the context of Globus, it seems a bit difficult.
> (specifically without placing files back in the shared
> directory) then many of these apps could work beautifully, by returning
> strings or numeric objects, possibly as structs and/or arrays, that
> travel back through the job submission interface rather than getting
> fetched via the data provider. If a cluster of jobs could return data
> efficiently in a single "package" from the cluster, then we could pretty
> readily do map-reduce in swift, efficiently, in perfect concordance with
> the current dataflow model.
One more time: we CAN do map-reduce in Swift. Stop saying we can't.
Please. It's getting silly.
The efficiency issue comes from the fact that the overhead for
distributing very, very small tasks across a wide area network is
very high compared to the task run time. And in the current Swift
implementation it is higher than in the implementation you seem to
have in mind.
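
To put rough numbers on the overhead argument, here is a minimal sketch
in Python. The 0.5 s task run time and the 10 s per-job dispatch
overhead are made-up illustrative values, not measurements of Swift or
Globus, and efficiency() is a hypothetical helper. It shows how the
useful fraction of wall time changes when several tasks share one
dispatch.

```python
def efficiency(run_time_s, overhead_s, tasks_per_job=1):
    """Fraction of wall time spent on useful work when `tasks_per_job`
    tasks are shipped together and pay the dispatch overhead once."""
    useful = run_time_s * tasks_per_job
    return useful / (useful + overhead_s)


# Assumed, illustrative numbers: 0.5 s per task, 10 s overhead per job.
for batch in (1, 10, 100):
    print(batch, round(efficiency(0.5, 10.0, batch), 3))
# prints:
# 1 0.048
# 10 0.333
# 100 0.833
```

Under these assumed numbers a single tiny task spends about 95% of its
wall time on overhead, which is why grouping tasks per submission
matters more than the format the results come back in.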
>
> Perhaps this latter approach is the best to consider: I suspect it could
> be readily implemented, could use a simple file to contain an arbitrary
> set of swift object return values, possibly in a format similar to that
> of readdata().
How is this different from the current scheme (besides the data files
being in a different format)?
>
> - Mike
>
> On 3/25/08 6:04 PM, Ben Clifford wrote:
> > On Tue, 25 Mar 2008, Michael Wilde wrote:
> >
> >> From a pure language point of view, we should permit the return of data that
> >> can be grouped (batched) into files in arbitrary chunks, determined and
> >> optimized by the implementation. Map-reduce tuples seem to work well for this
> >> model, and it seems that Swift could encompass it with minimal semantic change
> >> to the current language.
> >
> > For your example, what way do you want to store the data on the remote
> > side - I'm assuming not individual files.
> >
> > The present dataset model should fairly easily accommodate the description
> > of places to store data that aren't files - there's an abstraction in the
> > implementation to help with that at the moment (DSHandle, which is what
> > deals with the difference between in-memory values and on-disk files; and
> > could fairly straightforwardly deal with other storage forms).
> >
> > One of the project ideas I put in for the Google Summer of Code was to
> > play around with this, in fact.
> >
>