[Swift-devel] on the semantics of 'array closing'
Mike Wilde
wilde at mcs.anl.gov
Sat Jun 16 10:50:25 CDT 2007
also to note: Ian has suggested several times that we explore
map-reduce. I think this is worth doing: it's possible/likely that
swift is already pretty close to m-r in many ways, and could benefit
from a more detailed comparison and assessment of what we can
borrow, adapt, and/or integrate.
We should use this as a chance to create a "swift library" page
where we post good papers that we can cite in our discussions to get
ourselves on a common page.
Some of these might be good material for Thu Grad seminar discussions
as well.
- Mike
Mike Wilde wrote, On 6/16/2007 10:05 AM:
> Hi all,
>
> I'm jumping in late; I re-read the thread a few times but may have
> missed something. So correct me as needed. Also, rather than spending
> more time polishing the thoughts below I just put them out here for
> discussion.
>
> This discussion seems to me very important, as it can close down several
> of the major open issues that are very critical to the language, both to
> give it complete and consistent semantics and to make it practical for
> the problems that we are applying it to.
>
> Four important but missing aspects of this discussion are: pipelining,
> error handling, restart, and mapping.
>
> I feel that swift needs the following semantics:
>
> 1. Pipelining:
>
> The data dependency aspects of swift are carried out at the atomic level
> in a pipelined manner.
>
> -- elements of an array are written into the stream
>
> -- readers of the array consume the stream
>
> -- the entire program remains active in parallel, across function
> boundaries
>
> Array elements [k,v] are identified by their index, k, which can be an
> int or string.
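
[Editorial sketch of the pipelining idea above. It is an illustration in Python, not Swift code and not how Swift implements arrays. The names StreamedArray, square_stage, increment_stage and run_pipeline are hypothetical. Each array is a stream of [k, v] elements; a downstream stage consumes elements as they arrive, and a "close" marker tells it the producer has shut down.]

```python
import queue
import threading

_CLOSED = object()  # hypothetical sentinel: the producer has shut down


class StreamedArray:
    """An array whose [k, v] elements are consumed as they are written."""

    def __init__(self):
        self._q = queue.Queue()

    def put(self, k, v):
        self._q.put((k, v))

    def close(self):
        self._q.put(_CLOSED)

    def __iter__(self):
        # Block until the next element arrives; stop once the array is closed.
        while True:
            item = self._q.get()
            if item is _CLOSED:
                return
            yield item


def square_stage(n, out):
    # Producer: writes elements into the stream one at a time, then closes.
    for k in range(n):
        out.put(k, k * k)
    out.close()


def increment_stage(src, out):
    # Reader of one array and producer of the next; it starts work
    # before the upstream array is closed.
    for k, v in src:
        out.put(k, v + 1)
    out.close()


def run_pipeline(n):
    a, b = StreamedArray(), StreamedArray()
    stages = [
        threading.Thread(target=square_stage, args=(n, a)),
        threading.Thread(target=increment_stage, args=(a, b)),
    ]
    for t in stages:
        t.start()
    result = {k: v for k, v in b}  # consumes b until it is closed
    for t in stages:
        t.join()
    return result
```

[Both stages stay active at once, so the second stage can process element 0 while the first is still producing later elements.]
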
>
> 2. Error handling
>
> In practice, many large-scale foreach() operations will never complete,
> yet they will deliver a lot of useful results that we want subsequent
> statements in a program to continue to operate on. Thus closing needs to
> permit criteria other than just "finishing".
>
> An array is "closed" when its producer function/foreach "shuts down".
> Can we permit shutdown/closing to occur based on finishing, time, or
> quota/threshold? These would be parameters of the foreach statements
> that could be overridden.
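
[Editorial sketch of the three closing criteria named above (finishing, time, quota/threshold). It is a Python illustration under stated assumptions, not Swift syntax: the function close_reason and the parameters max_seconds and min_fraction are hypothetical names standing in for the overridable foreach parameters the message proposes.]

```python
from typing import Optional


def close_reason(done: int, total: int, elapsed: float,
                 max_seconds: Optional[float] = None,
                 min_fraction: Optional[float] = None) -> Optional[str]:
    """Return why the array should be closed now, or None to keep it open."""
    if done >= total:
        return "finished"      # every element was produced
    if max_seconds is not None and elapsed >= max_seconds:
        return "time"          # time limit reached; stragglers are abandoned
    if min_fraction is not None and done / total >= min_fraction:
        return "threshold"     # enough elements exist for readers to proceed
    return None
```

[For example, close_reason(95, 100, 5.0, min_fraction=0.95) closes the array on the threshold even though five stragglers are still outstanding.]
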
>
> (For some practical examples, see map-reduce; it has similar problems:
> parallel computations reach a level where there is lots of parallelism,
> and as it proceeds, gets to a point where only the "stragglers" are
> left - things waiting in slow queues or for hung data transfers, etc.
> I've read this in m/r papers, and found that our experiences match those
> reported by the Google m/r people).
>
> 3. Restart
>
> We want computations to be restartable. If 50% of a large array/dataset
> gets created in a 10-hour run, and then fails, we want the run to be
> restartable and continue where it left off with minimal loss of
> "completed" results.
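
[Editorial sketch of one way to get the restart behaviour described above: append each completed element to a log as soon as it is produced, and skip logged elements on the next run. This is a Python illustration, not Swift's restart mechanism; run_with_restart, compute and log_path are hypothetical names, and the JSON-lines log format is an assumption.]

```python
import json
import os


def run_with_restart(keys, compute, log_path):
    """Compute one value per key, skipping keys already recorded in the log."""
    done = {}
    if os.path.exists(log_path):
        with open(log_path) as f:
            for line in f:
                rec = json.loads(line)
                done[rec["k"]] = rec["v"]  # completed in an earlier run

    with open(log_path, "a") as log:
        for k in keys:
            if k in done:
                continue                   # already completed; do not redo
            v = compute(k)                 # may raise and abort this run
            log.write(json.dumps({"k": k, "v": v}) + "\n")
            log.flush()                    # record before moving on
            done[k] = v
    return done
```

[If a run fails partway through, calling the function again with the same log_path recomputes only the elements that were not yet logged.]
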
>
> 4. Mapping
>
> Lastly, swift mapping should be connected to this whole process: the
> mapped contents of a dataset should be a stream of xml elements rather
> than a "completed" xml document, so that we can practically handle very
> large datasets. So when a foreach() statement processes an array, it's
> processing the mapped stream of the array. Mappers should be parallel
> processes that produce and consume these streams of xml elements.
>
> - Mike
>
>
>
>
> Ben Clifford wrote, On 6/16/2007 8:34 AM:
>>
>> On Sat, 16 Jun 2007, Mihael Hategan wrote:
>>
>>>> It works because Swift implicitly marks arrays returned from
>>>> compound procedures as closed (which may or may not be correct).
>>> We defined it as correct. Something created in one scope cannot be
>>> modified in a parent scope.
>>
>> That's fine - what was unintuitive to me was that something created in
>> one scope cannot be referred to in that same scope. i.e. you can
>> create a piecewise using a[...]=... but cannot then refer to a.
>>
>
--
Mike Wilde
Computation Institute, University of Chicago
Math & Computer Science Division
Argonne National Laboratory
Argonne, IL 60439 USA
tel 630-252-7497 fax 630-252-1997