[Swift-devel] Multiple output files

Mihael Hategan hategan at mcs.anl.gov
Thu Mar 27 21:11:13 CDT 2014


I committed the changes.

This works only with normal Swift staging and provider staging. It can be
made to work with wrapper staging, but I have not done so.

As far as I can tell, there are only two mappers that can map arrays
based on what's on the filesystem rather than what the mapper parameters
are saying: FilesysMapper and SimpleMapper (well, there's also
ConcurrentMapper and FMRIMapper, and, while they should be usable for
collecting output data, it's unlikely that anyone will try to guess how
to name files so that they match what those mappers are expecting).

To use the feature, do the obvious:

file[] a <SimpleMapper; location="outs", prefix="foo", suffix=".out">;
// OR
// file[] a <FilesysMapper; location="outs", pattern="*.out">;
// OR
// file[] a <FilesysMapper; location="outs", pattern="foo????.out">;

app (file[] oa) gen(int i) {
        // writes a bunch of files of the form foo????.out to outs/
        gen i;
}

a = gen(3);
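
As a rough illustration of which names the two FilesysMapper patterns above
would pick up, here is a small Python sketch (not Swift code). It assumes the
mapper's "*" and "?" behave like ordinary shell globs, which Python's fnmatch
module approximates. The file names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names that gen might leave in outs/ (invented for this sketch).
produced = ["foo0001.out", "foo0002.out", "foo12.out", "bar0001.out", "foo0003.log"]

def matching(pattern, names):
    """Return the names that match a shell-style glob, case-sensitively."""
    return [n for n in names if fnmatchcase(n, pattern)]

# "*" matches any run of characters; each "?" matches exactly one character.
star = matching("*.out", produced)
fixed = matching("foo????.out", produced)

print(star)
print(fixed)
```

Under that assumption, "*.out" also collects foo12.out and bar0001.out, while
"foo????.out" accepts only names with exactly four characters between "foo"
and ".out".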

I've also committed tests for this. Yadu, if you have some time, can you
check why the check script doesn't find the output from the run? I don't
remember how that worked.

There are probably going to be a few bugs, since a lot of how the Swift
staging data is handled has changed. So please test this, as well as Swift
in general.

One other thing to note is that globbing is supported only in the file
name, not in directory components. For example, /a/b/??/c/*.txt won't work,
but /a/b/x/c/*.txt will. That restriction applies only when you use a
pattern with the FilesysMapper. Complex structures mapped with SimpleMapper
should work. For example, you should be able to do this:

type struct {
  file a;
  file[] b;
}

app (struct[] o) f(...) {}

It should figure out the ranges for both o[] and all of o[x].b[].
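
The file-name-only restriction on FilesysMapper patterns mentioned above can
be checked with something like the following Python sketch. This is an
illustration only, not the code in Swift: the function name is hypothetical,
and it treats just "*" and "?" as glob characters since those are the only
ones used in this thread.

```python
import posixpath

# Glob characters considered here; an assumption based on the examples above.
GLOB_CHARS = set("*?")

def glob_only_in_file_name(pattern):
    """Return True if no directory component of the pattern contains a glob."""
    directory, _name = posixpath.split(pattern)
    return not any(ch in GLOB_CHARS for ch in directory)

print(glob_only_in_file_name("/a/b/??/c/*.txt"))  # glob in a directory part
print(glob_only_in_file_name("/a/b/x/c/*.txt"))   # glob only in the file name
```

The first pattern is rejected because "??" sits in a directory component; the
second is accepted because the only wildcard is in the final component.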

Mihael

On Wed, 2014-03-26 at 12:04 -0700, Mihael Hategan wrote:
> Fortunately this would not be something that the users would see.
> 
> But if they did, a*b -> c*d seems pretty intuitive to me. Maybe even
> more so than 
> a(.*)b -> c\1d. It's the implementation of this that is more difficult.
> 
> Mihael
> 
> On Wed, 2014-03-26 at 13:42 -0500, Michael Wilde wrote:
> > Excellent!
> > 
> > The path-difference challenge sounds like the same things we faced in 
> > CDM, revisited.
> > 
> > I suspect we can find ways to smooth this out in a way that works 
> > naturally for users.
> > 
> > - Mike
> > 
> > 
> > On 3/26/14, 1:24 PM, Mihael Hategan wrote:
> > > Almost there.
> > >
> > > It works for non-provider staging.
> > > I'm working on provider staging now. The difficulty is in that providers
> > > must now support glob-pattern staging with a twist. The twist is that
> > > local and remote path names are different. For example, without glob
> > > patterns, a stageout could look like this:
> > >
> > > __root__/dir/a.txt -> /dir/a.txt
> > >
> > > With globs, something like this is possible:
> > >
> > > __root__/dir/a_????_b_????.txt -> /dir/a_????_b_????.txt
> > >
> > > In this case the code needs to recursively glob things and substitute
> > > each glob group in the destination for the respective matching glob in
> > > the source.
> > >
> > > Luckily there are only two providers that support staging at this point:
> > > local and coasters. Unfortunately this has to be implemented twice: once
> > > in Java for local, and once in Perl for coasters.
> > >
> > > Mihael
> > >
> > > On Mon, 2014-03-24 at 08:46 -0700, Mihael Hategan wrote:
> > >> An update on this.
> > >>
> > >> I'm still working on it, but here is the basic idea:
> > >>
> > >> - non-static mappers now support a method that, given a data type,
> > >> returns a list of glob patterns that can be used to search for files
> > >> that could be mapped by that mapper. The list (as opposed to one glob
> > >> pattern) is necessary because there might be cases when you have:
> > >>      type s { file a; file[] b; }
> > >>      s[] x;
> > >>      Then s[] could match either s_????.a or s_????.????
> > >> - this (possibly empty) list gets sent to _swiftwrap
> > >> - after the job is done, _swiftwrap creates a list of files matching
> > >> those patterns
> > >> - swift-int copies that list back and the files in it and uses the list
> > >> to populate data in a fashion similar to what is done for input
> > >> variables
> > >>
> > >> This is without provider staging.
> > >>
> > >> For provider staging, providers that support staging need to be modified
> > >> to support staging out of files using glob patterns. There might be some
> > >> complications there due to the local vs. remote path naming conventions.
> > >>
> > >> Mihael
> > >>
> > >> On Sat, 2014-03-22 at 14:08 -0500, Michael Wilde wrote:
> > >>> On 3/22/14, 1:38 PM, Mihael Hategan wrote:
> > >>>> My opinion is that this problem is NOT a language/mapper issue. It is an
> > >>>> issue of implementation: how do you get the information about files to a
> > >>>> place where it can be used.
> > >>>>
> > >>>> So I believe that whether we add a new mapper or make it work with
> > >>>> existing mappers, we still need to fix that other more complex problem.
> > >>>> This is the reason why I believe we shouldn't add anything to the
> > >>>> language.
> > >>> Mihael and I discussed this in a chat just now, and I think we are in
> > >>> fact *in* sync.
> > >>> So he's going to push forward on this.
> > >>>
> > >>> - Mike
> > >>
> > >> _______________________________________________
> > >> Swift-devel mailing list
> > >> Swift-devel at ci.uchicago.edu
> > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > >
> > 
> 
> 




