[Swift-user] many-to-one mapping

Mihael Hategan hategan at mcs.anl.gov
Mon Dec 10 20:23:15 CST 2012


Comments inline, all the way to the end.

On Mon, 2012-12-10 at 13:58 -0600, Neil Best wrote:
> On Fri, Dec 7, 2012 at 3:21 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > On Fri, 2012-12-07 at 15:08 -0600, Neil Best wrote:
> >> On Fri, Dec 7, 2012 at 2:35 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >> >
> >> > foreach {
> >> >    string m = @strcat(...);
> >> >    file[] foo <...;..., match = m,...>;
> >> > }
> >> >
> >>
> >> Shouldn't m also be an array?
> >
> > It's the pattern, so I'm guessing no.
> 
> I think there is a disconnect here because in my scenario there would
> have to be a different regex for each year.  I need to match the
> annual subset of all inputs (which span many years) to the
> corresponding year.

Right. Declaring m inside the foreach year {} says "give me a different
m for each year".

> 
> Similarly the members of foo[] would have to be different for each
> year.  I called it foo originally because I was treating it like a
> throw-away.  Essentially what I need is a list of lists that maps
> years (the loop variable) to subsets of files from nc1[].  I'm not
> sure whether Swift needs an actual mapping of these files or if
> passing the long string that is the concatenation of all of their
> names to the leaf app is adequate.
> 
> So now I am doing this:
> 
> app (file n1) cdo( string op, string ifiles) {
>   cdo op ifiles @n1;
> }
> 
> file annual[]<simple_mapper;
>   location="data/nc/annual/swift",
>   suffix=".nc">;
> 
> string m[];

You don't really need the array outside of the foreach here: declaring m
as a plain string inside the loop already gives you a separate m for
each year, as mentioned above.

> 
> foreach year in [1979:2011] {
>   m[ year] = @strcat( "^(data/nc/", year, "../narr-a_221_", year,
> "...._..00_000.single.nc)$");
>   file foo[]<structured_regexp_mapper;
>     source= nc1,
>     match= m[ year],
>     transform= "\\1">;
>   annual[ year]= cdo( "-O mergetime", @filename( foo));
> }
> 
> where nc1 is a mapped array of files from a previously successful
> Swift run, so the processing statements are commented out.

> I suspect that the files in foo[] need to be mapped, not just have
> their names passed as a concatenated string to the app, which brings
> me back to the opening question: how to do a many to one mapping,
> especially in the context of a looping parameter.  I hope I am asking
> the right question, but I would not be surprised if something salient
> about Swift semantics is eluding me.  Thanks.

OK, so I think there's some confusion about what the structured regexp
mapper does. It doesn't select a subset of an existing collection of
data. It only generates a new set of file names, one from EACH of the
file names in the source array. The "match" parameter is something of a
misnomer: it indicates which parts of each source file name should be
captured, not which files should be considered and which shouldn't. If
ANY of the file names in source fail to match the provided regexp,
Swift throws an error.
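
To make that behaviour concrete, here is a rough sketch in Python (not
Swift, and not the mapper's actual implementation). The function name
regexp_map and the sample file names are made up for illustration; the
only assumption is the behaviour described above: every source name is
rewritten, and a single non-matching name is an error.

```python
import re

def regexp_map(source_names, match, transform):
    """Rewrite EVERY source name; fail if any name does not match.

    Hypothetical stand-in for the behaviour described above, not Swift code.
    """
    pattern = re.compile(match)
    mapped = []
    for name in source_names:
        m = pattern.search(name)
        if m is None:
            # A non-matching name is an error, not a silently skipped file.
            raise ValueError(f"no match for {name!r}")
        mapped.append(m.expand(transform))
    return mapped

# Made-up file names spanning two years.
names = ["data/nc/1979/a.nc", "data/nc/1980/b.nc"]

# A pattern that matches every name simply renames each one.
renamed = regexp_map(names, r"^data/nc/(\d{4})/(.*)$", r"\1-\2")

# A year-specific pattern does not filter; it fails on the 1980 file.
try:
    regexp_map(names, r"^(data/nc/1979/.*)$", r"\1")
    filtered_ok = True
except ValueError:
    filtered_ok = False
```

In other words, a per-year regexp handed to this kind of mapper cannot
act as a filter over nc1.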

What you probably want is the filesys mapper:

file foo[] <filesys_mapper; location=@strcat("data/nc/", year, "/"),
pattern=@strcat("narr-a_221_1_", year, "????_??00_000.single.nc")>;

If you have that inside the foreach year loop, and if year = 2000, it
should give you all files that would be returned by "ls
data/nc/2000/narr-a_221_1_2000????_??00_000.single.nc". Am I
understanding what you need correctly this time?
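
As a quick way to sanity-check the wildcard pattern outside of Swift,
here is a small Python sketch that applies the same shell-style pattern
to a list of names. The helper select_year and the sample file names
are hypothetical; I'm only assuming that "?" matches exactly one
character, as it does in ls.

```python
from fnmatch import fnmatchcase

def select_year(names, year):
    """Return the names that match the per-year wildcard pattern."""
    # Same pattern as in the mapper declaration above; '?' = one character.
    pattern = f"narr-a_221_1_{year}????_??00_000.single.nc"
    return [n for n in names if fnmatchcase(n, pattern)]

# Made-up file names, only for illustration.
names = [
    "narr-a_221_1_20000101_0000_000.single.nc",
    "narr-a_221_1_20000101_0300_000.single.nc",
    "narr-a_221_1_20010101_0000_000.single.nc",
]

files_2000 = select_year(names, 2000)
```

Unlike the regexp case, names that don't match are simply left out,
which is the per-year selection you are after.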

Mihael



