[Swift-user] Output Files, ReadData and Order of Execution

Michael Wilde wilde at mcs.anl.gov
Fri Sep 7 14:52:43 CDT 2012


Carolyn, to follow up on your question: below is an example of using the "direct" file access mode.

- Mike

#----- The swift script

$ cat catsndirect.swift

type file;

app (file o) cat (file i)
{
  cat @i stdout=@o;
}

file out[]<simple_mapper; location="/tmp/wilde/outdir", prefix="f.",suffix=".out">;

foreach j in [1:@toint(@arg("n","1"))] {
  file data<"/tmp/wilde/indir/data.txt">;
  out[j] = cat(data);
}
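
The output names in the listing further down (f.0001.out through f.0010.out) follow from simple_mapper's default of four-digit indices. The shell function below is only a sketch for predicting those names. mapped_name is a hypothetical helper, not part of Swift, and it reproduces just the array naming patterns seen in this thread.

```shell
#!/bin/sh
# mapped_name PREFIX INDEX SUFFIX [FIELD]
# Hypothetical helper (not part of Swift): predicts the file name that
# simple_mapper appears to produce for an array element, based only on the
# names seen in this thread (f.0001.out and 0.0001.output.job).
mapped_name() {
    prefix=$1
    index=$2
    suffix=$3
    field=${4:-}
    if [ -n "$field" ]; then
        # Struct member: the field name appears between index and suffix.
        printf '%s%04d.%s%s\n' "$prefix" "$index" "$field" "$suffix"
    else
        # Plain file array: prefix, zero-padded 4-digit index, suffix.
        printf '%s%04d%s\n' "$prefix" "$index" "$suffix"
    fi
}

# Names for the n=10 run, using the prefix and suffix from the mapper above.
j=1
while [ "$j" -le 10 ]; do
    mapped_name "f." "$j" ".out"
    j=$((j + 1))
done
```

Calling mapped_name "0." 1 ".job" output gives the name from the launchjob error in the quoted thread, which is a quick way to check what an app script is expected to create.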

#----- The "cdm" file:

$ cat direct
rule  .* DIRECT /

#----- The command line:

$ swift -config cf -cdm.file direct -tc.file tc -sites.file sites.xml catsndirect.swift -n=10
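
For anyone reproducing this, the pieces shown in this message can be put in place with a few shell commands before running swift. This is a sketch under assumptions: setup_direct_demo is a hypothetical helper, the contents of data.txt are a placeholder (only its 8-byte size appears in the listing below), and creating the output directory ahead of time is a precaution rather than a documented requirement. The cf, tc and sites.xml files are not shown in this message and are not created here.

```shell
#!/bin/sh
# setup_direct_demo [BASE] [CDM_FILE]
# Hypothetical helper: creates the input/output directories and the CDM
# file used by the example. BASE defaults to the path used in this message.
setup_direct_demo() {
    base=${1:-/tmp/wilde}
    cdm_file=${2:-direct}

    # Input and output locations referenced by the swift script.
    mkdir -p "$base/indir" "$base/outdir"

    # Placeholder input: 8 bytes including the newline (content is assumed).
    printf 'testing\n' > "$base/indir/data.txt"

    # The one-line CDM file, copied from the "cdm" file shown above.
    printf 'rule  .* DIRECT /\n' > "$cdm_file"
}
```

Running setup_direct_demo with no arguments from the directory that holds catsndirect.swift writes ./direct and populates /tmp/wilde, after which the command line above can be used unchanged.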

#----- The output and input dirs:

$ ls -lr /tmp/wilde/{in,out}dir

/tmp/wilde/outdir:

total 40
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0010.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0009.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0008.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0007.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0006.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0005.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0004.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0003.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0002.out
-rw-r--r-- 1 wilde ci-users 8 Sep  7 14:11 f.0001.out

/tmp/wilde/indir:

total 4
-rw-r--r-- 1 wilde ci-users 8 Sep  7 13:47 data.txt
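
On the launchjob side (see the quoted message below), the key point is that the third argument is the path where Swift expects the output, and without DIRECT mode that path is relative to the job directory. If the per-job working directory is still wanted, one option is to resolve that path before changing directory. A minimal sketch, assuming a hypothetical JOB_ROOT variable that says where the Job.<pass>.<index> directories go. This is a variation on the advice below, not a tested replacement for the real script.

```shell
#!/bin/sh
# launchjob PASSES PINDEX OUTPUT
# Sketch only: JOB_ROOT is a hypothetical variable, not a Swift setting.
launchjob() {
    passes=$1
    pindex=$2
    out=$3

    # Resolve the output path while still in the directory Swift started us
    # in, so a relative path keeps pointing where Swift will look for it.
    case "$out" in
        /*) abs_out=$out ;;
        *)  abs_out=$PWD/$out ;;
    esac

    # Precaution: make sure the parent directory of the output exists.
    mkdir -p "$(dirname "$abs_out")"

    # Per-job working directory, as in the original launchjob script.
    workdir=${JOB_ROOT:-$PWD}/Job.$passes.$pindex
    mkdir -p "$workdir"

    # Do the work in a subshell so the caller's directory is unchanged.
    ( cd "$workdir" && pwd > "$abs_out" )
}
```

In a real script the body would end with launchjob "$1" "$2" "$3". With DIRECT mode the third argument is presumably already absolute, in which case the case statement simply passes it through.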


----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "Carolyn Phillips" <cphillips at mcs.anl.gov>
> Cc: swift-user at ci.uchicago.edu
> Sent: Wednesday, September 5, 2012 8:40:28 AM
> Subject: Re: [Swift-user] Output Files, ReadData and Order of Execution
> Hi Carolyn,
> 
> I think this error is due to the fact that the launchjob script is not
> coded to match Swift's file management conventions.
> 
> Unless you declare that you want to use "Direct" file management,
> Swift will expect output files to be created relative to the directory
> in which it runs your app() scripts. That's why it was pulling off the
> leading "/". By putting a // at the front of the pathname, you were
> inadvertently causing your launchjob script to place its output file
> in a different directory than where Swift was expecting it.
> 
> There's a further mismatch I think between the mapped filename from
> simple_mapper (which defaults to 4-digit strings for indices) and the
> names that launchjob is trying to create.
> 
> I'll need to send further clarification later, but for now, could you
> try the following:
> 
> - go back to using a single leading "/"
> - comment out the mkdir and cd in launchjob, as $3 contains the
> correct pathname to write to (which will be a long relative pathname
> without the leading "/")
> 
> I think what you really want here is "DIRECT" file management mode,
> explained at:
> 
> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_policy_descriptions
> 
> We need to enhance the User Guide to explain this clearly and fully.
> 
> - Mike
> 
> ----- Original Message -----
> > From: "Carolyn Phillips" <cphillips at mcs.anl.gov>
> > To: "Mihael Hategan" <hategan at mcs.anl.gov>
> > Cc: swift-user at ci.uchicago.edu
> > Sent: Sunday, September 2, 2012 10:16:05 PM
> > Subject: Re: [Swift-user] Output Files, ReadData and Order of
> > Execution
> > You are right. That was my problem. (dumb!)
> >
> > Anyway. My next issue is that Swift is telling me it can't find a
> > file that exists. Perhaps this is because it does not understand
> > absolute directory paths the way I am specifying them?
> >
> > The short version is that, using a simple mapper, I specify the
> > location as location="//scratch/midway/phillicl/SwiftJobs/". Note
> > that I had to put two slashes at the beginning because the first
> > slash gets removed for some reason. Then I pass that file name to a
> > script that writes to that file. But then Swift doesn't see the file.
> >
> > Here is the more detailed version
> >
> > I have a script called launchjob that does the following:
> > > cd /scratch/midway/phillicl/SwiftJobs
> > >
> > > mkdir Job.${1}.${2}
> > > cd Job.${1}.${2}
> > >
> > > # Copy in some files and do some work
> > >
> > > pwd > ${3}
> >
> >
> >
> > Here is the swift script
> >
> > > # Types
> > > type file;
> > > type unlabeleddata;
> > > type labeleddata;
> > > type errorlog;
> > >
> > > # Structured Types
> > > type pointfile {
> > > unlabeleddata points;
> > > errorlog error;
> > > }
> > >
> > > type simulationfile {
> > > file output;
> > > }
> > >
> > > # Apps
> > > app (file o) cat (file i)
> > > {
> > >   cat @i stdout=@o;
> > > }
> > >
> > > app (file o) cat2 (file i)
> > > {
> > >   systeminfo stdout=@o;
> > > }
> > >
> > > app (pointfile o) generatepoints (file c, labeleddata f, string
> > > mode, int Npoints)
> > > {
> > >   matlab_callgeneratepoints @c @f mode Npoints @o.points @o.error;
> > > }
> > >
> > > app (simulationfile o) runSimulation(string p,int passes, int
> > > pindex)
> > > {
> > >   launchjob passes pindex @o.output;
> > > }
> > >
> > > #Files (using single file mapper)
> > > file config <"designspace.config">;
> > > labeleddata labeledpoints <"emptypoints.dat">;
> > >
> > > type pointlog;
> > >
> > > # Loop
> > > iterate passes {
> > >
> > >     # Generate Parameters
> > >     pointfile np <simple_mapper;prefix="mypoints.",suffix=".dat">;
> > >     np = generatepoints(config,labeledpoints, "uniform", 50);
> > >
> > >     errorlog fe = np.error;
> > >     int checkforerror = readData(fe);
> > >     tracef("%s: %i\n", "Generate Parameters Error Value",
> > >     checkforerror);
> > >
> > >     # Issue Jobs
> > >     simulationfile simfiles[]
> > >     <simple_mapper;location="//scratch/midway/phillicl/SwiftJobs/",prefix=@strcat(passes,"."),suffix=".job">;
> > >     if(checkforerror==0) {
> > >         unlabeleddata pl = np.points;
> > >     	string parameters[] =readData(pl);
> > >     	foreach p,pindex in parameters {
> > >       		tracef("Launch Job for Parameters: %s\n", p);
> > >                 simfiles[pindex] = runSimulation(p,passes,pindex);
> > >     	}
> > >      }
> > >
> > >     # Analyze Jobs
> > >
> > >     # Generate Prediction
> > >
> > >
> > >
> > >     # creates an array of datafiles named swifttest.<passes>.out to write to
> > >     file out[]<simple_mapper; location=".",
> > >     prefix=@strcat("swifttest.",passes,"."),suffix=".out">;
> > >
> > >     # creates a default of 10 files
> > >     foreach j in [1:@toInt(@arg("n","10"))] {
> > >       file data<"data.txt">;
> > >       out[j] = cat2(data);
> > >     }
> > >
> > >     # try writing the iteration to a log file
> > >     file passlog <"passes.log">;
> > >     passlog = writeData(passes);
> > >
> > >     # try reading from another log file
> > >     int readpasses = readData(passlog);
> > >
> > >     # Write to the Output Log
> > >     tracef("%s: %i\n", "Iteration :", passes);
> > >     tracef("%s: %i\n", "Iteration Read :", readpasses);
> > >
> > > #} until (readpasses == 2); # Determine if Done
> > > } until (passes == 1); # Determine if Done
> >
> >
> > And Here is the error I get:
> >
> > EXCEPTION Exception in launchjob:
> > Arguments: [0, 1,
> > /scratch/midway/phillicl/SwiftJobs/0.0001.output.job]
> > Host: pbs
> > Directory: test-20120903-0303-pedfpqu8/jobs/f/launchjob-fff1mjxk
> > stderr.txt:
> > stdout.txt:
> > ----
> >
> > sys:exception @ vdl-int.k, line: 601
> > sys:throw @ vdl-int.k, line: 600
> > sys:catch @ vdl-int.k, line: 567
> > sys:try @ vdl-int.k, line: 469
> > task:allocatehost @ vdl-int.k, line: 419
> > vdl:execute2 @ execute-default.k, line: 23
> > sys:ignoreerrors @ execute-default.k, line: 21
> > sys:parallelfor @ execute-default.k, line: 20
> > sys:restartonerror @ execute-default.k, line: 16
> > sys:sequential @ execute-default.k, line: 14
> > sys:try @ execute-default.k, line: 13
> > sys:if @ execute-default.k, line: 12
> > sys:then @ execute-default.k, line: 11
> > sys:if @ execute-default.k, line: 10
> > vdl:execute @ test.kml, line: 182
> > run_simulation @ test.kml, line: 480
> > sys:parallel @ test.kml, line: 465
> > foreach @ test.kml, line: 456
> > sys:parallel @ test.kml, line: 427
> > sys:then @ test.kml, line: 409
> > sys:if @ test.kml, line: 404
> > sys:sequential @ test.kml, line: 402
> > sys:parallel @ test.kml, line: 315
> > iterate @ test.kml, line: 229
> > vdl:sequentialwithid @ test.kml, line: 226
> > vdl:mainp @ test.kml, line: 225
> > mainp @ vdl.k, line: 118
> > vdl:mains @ test.kml, line: 223
> > vdl:mains @ test.kml, line: 223
> > rlog:restartlog @ test.kml, line: 222
> > kernel:project @ test.kml, line: 2
> > test-20120903-0303-pedfpqu8
> > Caused by: The following output files were not created by the
> > application: /scratch/midway/phillicl/SwiftJobs/0.0001.output.job
> >
> > Note that
> > > ls /scratch/midway/phillicl/SwiftJobs/0.0001.output.job
> > /scratch/midway/phillicl/SwiftJobs/0.0001.output.job
> >
> >
> > On Sep 1, 2012, at 5:45 PM, Mihael Hategan <hategan at mcs.anl.gov>
> > wrote:
> >
> > > The error comes from int checkforerror = readData(np.error);
> > >
> > > You have to use the workaround for both.
> > >
> > > On Sat, 2012-09-01 at 15:23 -0500, Carolyn Phillips wrote:
> > >> Sure
> > >>
> > >> There is a lot of extra stuff running around in the script, FYI
> > >>
> > >> # Types
> > >> type file;
> > >> type unlabeleddata;
> > >> type labeleddata;
> > >> type errorlog;
> > >>
> > >> # Structured Types
> > >> type pointfile {
> > >> unlabeleddata points;
> > >> errorlog error;
> > >> }
> > >>
> > >> type simulationfile {
> > >> file output;
> > >> }
> > >>
> > >> # Apps
> > >> app (file o) cat (file i)
> > >> {
> > >>  cat @i stdout=@o;
> > >> }
> > >>
> > >> app (file o) cat2 (file i)
> > >> {
> > >>  systeminfo stdout=@o;
> > >> }
> > >>
> > >> app (pointfile o) generatepoints (file c, labeleddata f, string
> > >> mode, int Npoints)
> > >> {
> > >>  matlab_callgeneratepoints @c @f mode Npoints @o.points @o.error;
> > >> }
> > >>
> > >> #app (simulationfile o) runSimulation(string p)
> > >> #{
> > >> # launchjob p @o.output;
> > >> #}
> > >>
> > >> #Files (using single file mapper)
> > >> file config <"designspace.config">;
> > >> labeleddata labeledpoints <"emptypoints.dat">;
> > >>
> > >> type pointlog;
> > >>
> > >> # Loop
> > >> iterate passes {
> > >>
> > >>    # Generate Parameters
> > >>    pointfile np <simple_mapper;prefix="mypoints.",suffix=".dat">;
> > >>    np = generatepoints(config,labeledpoints, "uniform", 50);
> > >>
> > >>    int checkforerror = readData(np.error);
> > >>    tracef("%s: %i\n", "Generate Parameters Error Value",
> > >>    checkforerror);
> > >>
> > >>    # Issue Jobs
> > >>    #simulationfile simfiles[] <simple_mapper;location="/scratch/midway/phillicl/SwiftJobs/",prefix=@strcat("output.",passes,"."),suffix=".job">;
> > >>    if(checkforerror==0) {
> > >>        unlabeleddata pl = np.points;
> > >>    	string parameters[] =readData(pl);
> > >>    	foreach p,pindex in parameters {
> > >>      		tracef("Launch Job for Parameters: %s\n", p);
> > >>                #simfiles[pindex] = runSimulation(p);
> > >>    	}
> > >>     }
> > >>
> > >>    # Analyze Jobs
> > >>
> > >>    # Generate Prediction
> > >>
> > >>
> > >>
> > >>    # creates an array of datafiles named swifttest.<passes>.out to write to
> > >>    file out[]<simple_mapper; location=".",
> > >>    prefix=@strcat("swifttest.",passes,"."),suffix=".out">;
> > >>
> > >>    # creates a default of 10 files
> > >>    foreach j in [1:@toInt(@arg("n","10"))] {
> > >>      file data<"data.txt">;
> > >>      out[j] = cat2(data);
> > >>    }
> > >>
> > >>    # try writing the iteration to a log file
> > >>    file passlog <"passes.log">;
> > >>    passlog = writeData(passes);
> > >>
> > >>    # try reading from another log file
> > >>    int readpasses = readData(passlog);
> > >>
> > >>    # Write to the Output Log
> > >>    tracef("%s: %i\n", "Iteration :", passes);
> > >>    tracef("%s: %i\n", "Iteration Read :", readpasses);
> > >>
> > >> #} until (readpasses == 2); # Determine if Done
> > >> } until (passes == 1); # Determine if Done
> > >>
> > >>
> > >> On Sep 1, 2012, at 1:57 PM, Mihael Hategan <hategan at mcs.anl.gov>
> > >> wrote:
> > >>
> > >>> Can you post the entire script?
> > >>>
> > >>> On Sat, 2012-09-01 at 12:29 -0500, Carolyn Phillips wrote:
> > >>>> Yes, I tried that
> > >>>>
> > >>>>       unlabeleddata pl = np.points;
> > >>>>   	string parameters[] =readData(pl);
> > >>>>
> > >>>>
> > >>>> and I got
> > >>>>
> > >>>> Execution failed:
> > >>>> 	mypoints..dat (No such file or directory)
> > >>>>
> > >>>> On Aug 31, 2012, at 8:27 PM, Mihael Hategan
> > >>>> <hategan at mcs.anl.gov>
> > >>>> wrote:
> > >>>>
> > >>>>> On Fri, 2012-08-31 at 20:11 -0500, Carolyn Phillips wrote:
> > >>>>>> How would this line work for what I have below?
> > >>>>>>
> > >>>>>>>> string parameters[] =readData(np.points);
> > >>>>>>
> > >>>>>
> > >>>>> unlabeleddata tmp = np.points;
> > >>>>> string parameters[] = readData(tmp);
> > >>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Aug 31, 2012, at 7:49 PM, Mihael Hategan
> > >>>>>> <hategan at mcs.anl.gov> wrote:
> > >>>>>>
> > >>>>>>> Another bug.
> > >>>>>>>
> > >>>>>>> I committed a fix. In the mean time, the solution is:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> errorlog fe = np.error;
> > >>>>>>>
> > >>>>>>> int error = readData(fe);
> > >>>>>>>
> > >>>>>>> On Fri, 2012-08-31 at 19:29 -0500, Carolyn Phillips wrote:
> > >>>>>>>> Hi Mihael,
> > >>>>>>>>
> > >>>>>>>> the reason I added the "@" was because
> > >>>>>>>>
> > >>>>>>>> now this (similar) line
> > >>>>>>>>
> > >>>>>>>> if(checkforerror==0) {
> > >>>>>>>>     string parameters[] =readData(np.points);
> > >>>>>>>>    }
> > >>>>>>>>
> > >>>>>>>> gives me this:
> > >>>>>>>>
> > >>>>>>>> Execution failed:
> > >>>>>>>> 	mypoints..dat (No such file or directory)
> > >>>>>>>>
> > >>>>>>>> as in, now it's not getting the name of the file correct
> > >>>>>>>>
> > >>>>>>>> On Aug 31, 2012, at 7:17 PM, Mihael Hategan
> > >>>>>>>> <hategan at mcs.anl.gov> wrote:
> > >>>>>>>>
> > >>>>>>>>> @np.error means the file name of np.error which is known
> > >>>>>>>>> statically. So
> > >>>>>>>>> readData(@np.error) can run as soon as the script starts.
> > >>>>>>>>>
> > >>>>>>>>> You probably want to say readData(np.error).
> > >>>>>>>>>
> > >>>>>>>>> Mihael
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Fri, 2012-08-31 at 18:55 -0500, Carolyn Phillips wrote:
> > >>>>>>>>>> So I execute an atomic procedure to generate a datafile,
> > >>>>>>>>>> and then next I want to do something with that data file.
> > >>>>>>>>>> However, my program is trying to do something with the
> > >>>>>>>>>> datafile before it has been written to. So something with
> > >>>>>>>>>> order of execution is not working. I think the problem is
> > >>>>>>>>>> that the name of my file exists, but the file itself does
> > >>>>>>>>>> not yet, but execution proceeds anyway!
> > >>>>>>>>>>
> > >>>>>>>>>> Here are my lines
> > >>>>>>>>>>
> > >>>>>>>>>> type pointfile {
> > >>>>>>>>>> unlabeleddata points;
> > >>>>>>>>>> errorlog error;
> > >>>>>>>>>> }
> > >>>>>>>>>>
> > >>>>>>>>>> # Generate Parameters
> > >>>>>>>>>> pointfile np
> > >>>>>>>>>> <simple_mapper;prefix="mypoints.",suffix=".dat">;
> > >>>>>>>>>> np = generatepoints(config,labeledpoints, "uniform", 50);
> > >>>>>>>>>>
> > >>>>>>>>>> int checkforerror = readData(@np.error);
> > >>>>>>>>>>
> > >>>>>>>>>> This gives an error :
> > >>>>>>>>>> mypoints.error.dat (No such file or directory)
> > >>>>>>>>>>
> > >>>>>>>>>> If I comment out the last line, all the files show up in
> > >>>>>>>>>> the directory (e.g. mypoints.points.dat and
> > >>>>>>>>>> mypoints.error.dat), and if I forget to remove the .dat
> > >>>>>>>>>> files from a prior run, it also runs fine!
> > >>>>>>>>>>
> > >>>>>>>>>> How do you fix a problem like that?
> > >>>>>>>>>>
> > >>>>>>>>>> _______________________________________________
> > >>>>>>>>>> Swift-user mailing list
> > >>>>>>>>>> Swift-user at ci.uchicago.edu
> > >>>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>
> > >
> > >
> >
> 
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



