[Swift-user] using dual filesys_mappers

Michael Wilde wilde at mcs.anl.gov
Fri Jul 20 13:48:40 CDT 2012


Thanks, Robin. I agree, Swift should have warned you. I've entered this into Bugzilla:

  https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=801

- Mike


----- Original Message -----
> From: "Robin Weiss" <robinweiss at uchicago.edu>
> To: swift-user at ci.uchicago.edu
> Sent: Friday, July 20, 2012 11:50:39 AM
> Subject: Re: [Swift-user] using dual filesys_mappers
> So it turns out this was a PEBKAC issue. While I think I know why
> swift is doing what it's doing, this behavior is a little tricky to
> diagnose and I was totally looking at the problem but not seeing it.
> Let me explain.
> 
> 
> Here is the offending Swift script and the output from running it. The
> directory 'in' contains a number of .dat files and corresponding .meta
> files (the names differ only in extension). The expected output in the
> outFile files is "-meta fileX.meta -dat fileX.dat", "-meta
> fileY.meta -dat fileY.dat", etc.
> 
> 
> 
> 
> type datFile;
> type metaFile;
> type outFile;
> 
> app (outFile out) testApp (datFile dat, metaFile meta) {
>     echo "-meta" @meta "-dat" @dat stdout=@out;
> }
> 
> datFile datFiles[] <filesys_mapper;location="in",sufix="dat">;
> metaFile metaFiles[] <filesys_mapper;location="in",sufix="meta">;
> 
> foreach f,i in datFiles {
>     outFile out <concurrent_mapper;prefix="out", suffix=".txt">;
>     out = testApp(f, metaFiles[i]);
> }
> 
> 
> 
> 
> [robinweiss at midway037 bad_swift]$ pwd
> /home/robinweiss/bad_swift
> [robinweiss at midway037 bad_swift]$ cd in
> [robinweiss at midway037 in]$ ls
> file0.dat file0.meta file1.dat file1.meta file2.dat file2.meta
> file3.dat file3.meta
> [robinweiss at midway037 in]$ cd ..
> [robinweiss at midway037 bad_swift]$ ./runLocal.sh
> Swift 0.93 swift-r5483 cog-r3339
> 
> 
> RunID: 20120720-1623-cpgd5xr9
> (input): found 8 files
> (input): found 8 files
> Progress: time: Fri, 20 Jul 2012 16:23:48 +0000 Initializing:1
> Final status: time: Fri, 20 Jul 2012 16:23:49 +0000 Finished successfully:8
> [robinweiss at midway037 bad_swift]$ cd _concurrent/
> [robinweiss at midway037 _concurrent]$ ls
> out-4-0.txt out-4-1.txt out-4-2.txt out-4-3.txt out-4-4.txt
> out-4-5.txt out-4-6.txt out-4-7.txt
> [robinweiss at midway037 _concurrent]$ cat *
> -meta in/file0.meta -dat in/file0.meta
> -meta in/file0.dat -dat in/file0.dat
> -meta in/file1.meta -dat in/file1.meta
> -meta in/file1.dat -dat in/file1.dat
> -meta in/file2.dat -dat in/file2.dat
> -meta in/file3.dat -dat in/file3.dat
> -meta in/file3.meta -dat in/file3.meta
> -meta in/file2.meta -dat in/file2.meta
> 
> 
> 
> 
> Can you spot the problem in the script? Hint: "suffix" has two f's,
> not one. Because the parameter 'sufix' is meaningless to
> filesys_mapper, it ends up just mapping everything in 'location'. It
> would seem mappers will accept pretty much any garbage you put
> inside the < >'s, as long as it's an assignment. For example:
> 
> 
> datFile datFiles[] <filesys_mapper;location="in",suffix="dat",
>     foo="bar",biz="baz",garbage="more garbage">;
> 
> 
> works just fine also.
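
The behaviour described above can be modelled in a few lines. This is only a sketch in Python, not SwiftScript and not Swift's actual mapper code: map_files, KNOWN_PARAMS and the strict flag are hypothetical names, and the list of accepted parameters is an assumption. It shows how a misspelt 'sufix' is silently dropped so that every file matches, and how a stricter mapper could reject the unknown name instead.

```python
# Hypothetical set of accepted parameter names; an assumption for this
# sketch, not taken from the Swift source.
KNOWN_PARAMS = {"location", "prefix", "suffix", "pattern"}


def map_files(listing, strict=False, **params):
    """Return the names in `listing` that match the given prefix/suffix.

    With strict=False, unknown parameters are ignored (the behaviour
    reported in this thread). With strict=True they raise ValueError.
    """
    unknown = sorted(set(params) - KNOWN_PARAMS)
    if unknown and strict:
        raise ValueError("unknown mapper parameter(s): " + ", ".join(unknown))
    prefix = params.get("prefix", "")
    # A misspelt "sufix" never reaches this lookup, so the filter is empty
    # and every name ends with it.
    suffix = params.get("suffix", "")
    return [n for n in listing if n.startswith(prefix) and n.endswith(suffix)]


# Same directory contents as 'in' above: file0..file3, .dat and .meta.
listing = ["file%d.%s" % (i, ext) for i in range(4) for ext in ("dat", "meta")]

print(len(map_files(listing, location="in", sufix="dat")))   # typo: all files
print(len(map_files(listing, location="in", suffix="dat")))  # only .dat files
```

Calling map_files(listing, strict=True, location="in", sufix="dat") raises a ValueError that names 'sufix', which is the kind of complaint one would hope to see.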
> 
> 
> 
> 
> And now for the good run:
> 
> 
> type datFile;
> type metaFile;
> type outFile;
> 
> app (outFile out) testApp (datFile dat, metaFile meta) {
>     echo "-meta" @meta "-dat" @dat stdout=@out;
> }
> 
> datFile datFiles[] <filesys_mapper;location="in",suffix="dat">;
> metaFile metaFiles[] <filesys_mapper;location="in",suffix="meta">;
> 
> foreach f,i in datFiles {
>     outFile out <concurrent_mapper;prefix="out", suffix=".txt">;
>     out = testApp(f, metaFiles[i]);
> }
> 
> 
> [robinweiss at midway037 bad_swift]$ pwd
> /home/robinweiss/bad_swift
> [robinweiss at midway037 bad_swift]$ cd in
> [robinweiss at midway037 in]$ ls
> file0.dat file0.meta file1.dat file1.meta file2.dat file2.meta
> file3.dat file3.meta
> [robinweiss at midway037 in]$ cd ..
> [robinweiss at midway037 bad_swift]$ ./runLocal.sh
> Swift 0.93 swift-r5483 cog-r3339
> 
> 
> RunID: 20120720-1626-ou1juqf5
> (input): found 4 files
> (input): found 4 files
> Progress: time: Fri, 20 Jul 2012 16:26:51 +0000 Initializing:1
> Final status: time: Fri, 20 Jul 2012 16:26:51 +0000 Finished successfully:4
> [robinweiss at midway037 bad_swift]$ cd _concurrent/
> [robinweiss at midway037 _concurrent]$ ls
> out-4-0.txt out-4-1.txt out-4-2.txt out-4-3.txt
> [robinweiss at midway037 _concurrent]$ cat *
> -meta in/file0.meta -dat in/file0.dat
> -meta in/file1.meta -dat in/file1.dat
> -meta in/file3.meta -dat in/file2.dat
> -meta in/file2.meta -dat in/file3.dat
> [robinweiss at midway037 _concurrent]$
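
Note that even in this run the last two output lines pair file3.meta with file2.dat and file2.meta with file3.dat. The two mapped arrays evidently did not come back in the same order, so metaFiles[i] is not a reliable partner for datFiles[i]. A more robust approach is to derive the partner from the file name rather than from the array index. The sketch below illustrates that idea in Python, not SwiftScript. pair_by_stem is a hypothetical helper, and the input lists are just one ordering consistent with the output above.

```python
import os


def pair_by_stem(dat_files, meta_files):
    """Pair each .dat path with the .meta path that shares its base name."""
    # Index the meta files by path-without-extension, e.g. "in/file2".
    meta_by_stem = {os.path.splitext(m)[0]: m for m in meta_files}
    pairs = []
    for d in sorted(dat_files):
        stem = os.path.splitext(d)[0]
        if stem not in meta_by_stem:
            raise KeyError("no .meta file for " + d)
        pairs.append((d, meta_by_stem[stem]))
    return pairs


# One ordering consistent with the run above (an assumption): the meta
# list has file2 and file3 swapped relative to the dat list.
dats = ["in/file0.dat", "in/file1.dat", "in/file2.dat", "in/file3.dat"]
metas = ["in/file0.meta", "in/file1.meta", "in/file3.meta", "in/file2.meta"]

for d, m in pair_by_stem(dats, metas):
    print("-meta", m, "-dat", d)
```

Pairing by name also reports a .dat file that has no matching .meta file instead of silently pairing it with a neighbour.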
> 
> 
> 
> 
> So, like I said, this whole issue boils down to a typo. I would
> suggest that Swift should emit some sort of warning if you pass
> something to one of the pre-defined mappers that is not defined
> as one of its parameters. I would have expected Swift to complain that
> "sufix" is an unknown parameter of filesys_mapper.
> 
> 
> Cheers,
> Robin
> 
> 
> 
> 
> --
> 
> Robin M. Weiss
> Research Programmer
> Research Computing Center
> The University of Chicago
> 6030 S. Ellis Ave., Suite 289C
> Chicago, IL 60637
> robinweiss at uchicago.edu
> 773.702.9030
> 
> 
> From: Robin Weiss < robinweiss at uchicago.edu >
> Date: Mon, 16 Jul 2012 10:43:29 -0500
> To: < swift-user at ci.uchicago.edu >
> Subject: using dual filesys_mappers
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Hello Swifters,
> 
> 
> 
> 
> I have a question about using two filesys_mappers and the foreach
> construct. I have attached the offending .swift script I am working
> with for reference. Here's the gist of what I'm trying to do and what
> appears to be happening instead:
> 
> I have a program called 'boot' that takes as command-line arguments 4
> file paths (log, out, data, and meta) and a number of other params
> (all numerical, and they seem to be getting passed in correctly, so no
> worries there). 'log' and 'out' are output files and 'data' and 'meta'
> are input files (located in directories called 'out' and 'in',
> respectively). The problem I'm having is with getting the correct
> values for 'data' and 'meta' passed in to my app call. I have an app
> section called processData that invokes boot. I am assuming that
> the directory 'in' contains identically named data and meta files that
> differ only in their suffix ('.dat' or '.meta', respectively). This
> may or may not be safe, but for now it is fine and I'll cross that
> bridge when I come to it. Here's the relevant snippet from my script:
> 
> 
> 
> 
> 
> 
> 
> app (clusFile out) processData (dataFile data, metaFile meta, logFile log) {
>     boot "-log" @log "-results" @out "-meta" @meta "-data" @data
>          "-kmin" kmin "-kmax" kmax "-eps" eps "-bootpct" bootpct
>          "-maxiterations" maxiterations "-maxtries" maxtries;
> }
> 
> dataFile dataFiles[] <filesys_mapper;location="in",sufix="dat">;
> metaFile metaFiles[] <filesys_mapper;location="in",sufix="meta">;
> 
> foreach f,i in dataFiles {
>     clusFile o <single_file_mapper; location="out",
>         file=@strcat("out/clusFile.",i,".clus")>;
>     logFile l <single_file_mapper; location="out",
>         file=@strcat("out/logFile.",i,".log")>;
>     o = processData(f, metaFiles[i], l);
> }
> 
> 
> 
> 
> This configuration causes processData to be invoked as:
> 
> 
> 
> 
> out/clusFile.0.clus = processData(dataFile0.dat, dataFile0.dat,
> out/logFile.0.log);
> 
> 
> 
> 
> If I switch the order of the filesys_mappers so that the snippet
> reads:
> 
> 
> 
> 
> metaFile metaFiles[] <filesys_mapper;location="in",sufix="meta">;
> dataFile dataFiles[] <filesys_mapper;location="in",sufix="dat">;
> 
> foreach f,i in dataFiles {
>     clusFile o <single_file_mapper; location="out",
>         file=@strcat("out/clusFile.",i,".clus")>;
>     logFile l <single_file_mapper; location="out",
>         file=@strcat("out/logFile.",i,".log")>;
>     o = processData(f, metaFiles[i], l);
> }
> 
> 
> 
> 
> the app is invoked as:
> 
> 
> 
> 
> out/clusFile.0.clus = processData(dataFile0.meta, dataFile0.meta,
> out/logFile.0.log);
> 
> 
> 
> 
> I guess the real question is this: what is the most appropriate way in
> Swift to pass two corresponding input files into an app invocation?
> Ideally, it would be something like 'foreach file1,file2,i in
> inputFiles[][] { … }', but that doesn't really make too much sense
> either.
> 
> 
> 
> 
> Anyway, any ideas would be appreciated.
> 
> 
> 
> 
> Cheers,
> 
> Robin
> 
> 
> 
> 
> 

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



