[Swift-user] Output Files, ReadData and Order of Execution

Michael Wilde wilde at mcs.anl.gov
Wed Sep 5 08:40:28 CDT 2012


Hi Carolyn,

I think this error is due to the fact that the launchjob script is not coded to match Swift's file management conventions.

Unless you declare that you want to use "Direct" file management, Swift will expect output files to be created relative to the directory in which it runs your app() scripts. That's why it was stripping the leading "/". By putting a "//" at the front of the pathname, you were inadvertently causing your launchjob script to place its output file in a different directory than the one where Swift was expecting it.
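
To make the effect of the extra slash concrete, here is a rough shell model of what is described above. It is only an illustration: strip_one_slash and where_written are hypothetical helper names, and this is not Swift's actual code.

```shell
#!/bin/sh
# Hypothetical model of the behavior described above (not Swift's implementation).

# Remove exactly one leading "/" from a mapped pathname.
strip_one_slash() {
    printf '%s\n' "${1#/}"
}

# Report whether a script writing to this pathname stays inside the job directory.
where_written() {
    case $1 in
        /*) echo "absolute: written outside the job directory" ;;
        *)  echo "relative: written inside the job directory" ;;
    esac
}

# One leading slash: the app receives a relative pathname.
where_written "$(strip_one_slash "/scratch/midway/phillicl/SwiftJobs/")"

# Two leading slashes: one slash survives, so the pathname is still absolute.
where_written "$(strip_one_slash "//scratch/midway/phillicl/SwiftJobs/")"
```

With "//" the script writes to the absolute location while Swift looks under its own job directory, which matches the "output files were not created" message.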

I think there's also a mismatch between the filenames mapped by simple_mapper (which defaults to 4-digit strings for array indices) and the names that launchjob is trying to create.
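
As a rough illustration of the naming, the sketch below rebuilds the filename that appears in your error message from its parts. The layout (prefix, 4-digit index, struct field name, suffix) is inferred from that one filename, and mapped_name is a hypothetical helper, not part of Swift.

```shell
#!/bin/sh
# Hypothetical helper: rebuild a simple_mapper-style name from its parts.
# The layout is inferred from the filename in the error message.
mapped_name() {
    prefix=$1   # e.g. "0." from @strcat(passes,".")
    index=$2    # array index, padded to 4 digits
    field=$3    # struct field name, e.g. "output"
    suffix=$4   # e.g. ".job"
    printf '%s%04d.%s%s\n' "$prefix" "$index" "$field" "$suffix"
}

mapped_name "0." 1 output ".job"   # prints 0.0001.output.job
```

launchjob, by contrast, builds its directory name as Job.${1}.${2}, which would be Job.0.1 for the same arguments, so the two naming schemes do not line up.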

I'll need to send further clarification later, but for now, could you try the following:

- go back to using a single leading "/"
- comment out the mkdir and cd in launchjob, since $3 already contains the correct pathname to write to (which will be a long relative pathname without the leading "/")
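
A minimal sketch of what launchjob might look like after those two changes, written as a shell function so it is easy to try by hand. The mkdir -p line is my own precaution and an assumption; it may be unnecessary if the parent directories already exist under the job directory.

```shell
#!/bin/sh
# Sketch of launchjob with the cd and mkdir commented out.
launchjob() {
    passes=$1   # iteration number
    pindex=$2   # parameter index
    out=$3      # relative output pathname supplied by Swift

    # cd /scratch/midway/phillicl/SwiftJobs
    # mkdir "Job.${passes}.${pindex}"
    # cd "Job.${passes}.${pindex}"

    # Copy in some files and do some work

    # Assumption: create the parent directories in case they are missing.
    mkdir -p "$(dirname "$out")"

    # Write the output where Swift expects it, relative to the current directory.
    pwd > "$out"
}
```

Run from a scratch directory as, for example, launchjob 0 1 scratch/midway/phillicl/SwiftJobs/0.0001.output.job, this should leave the output file under the current directory rather than under /scratch.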

I think what you really want here is "DIRECT" file management mode, explained at:

http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_policy_descriptions

We need to enhance the User Guide to explain this clearly and fully.

- Mike

----- Original Message -----
> From: "Carolyn Phillips" <cphillips at mcs.anl.gov>
> To: "Mihael Hategan" <hategan at mcs.anl.gov>
> Cc: swift-user at ci.uchicago.edu
> Sent: Sunday, September 2, 2012 10:16:05 PM
> Subject: Re: [Swift-user] Output Files, ReadData and Order of Execution
> You are right. That was my problem. (dumb!)
> 
> Anyway. My next issue is that Swift is telling me it can't find a file
> that exists. Perhaps this is because it does not understand absolute
> directory paths the way I am specifying them?
> 
> The short version is that, using a simple mapper, I specify the
> location as ;location="//scratch/midway/phillicl/SwiftJobs/" Note that
> I had to put two // at the beginning because the first backslash gets
> removed for some reason. Then I pass that file name to a script write
> to that file. But then Swift doesn't see the file
> 
> Here is the more detailed version
> 
> I have a script called launch jobs that does the following
> > cd /scratch/midway/phillicl/SwiftJobs
> >
> > mkdir Job.${1}.${2}
> > cd Job.${1}.${2}
> >
> > # Copy in some files and do some work
> >
> > pwd > ${3}
> 
> 
> 
> Here is the swift script
> 
> > # Types
> > type file;
> > type unlabeleddata;
> > type labeleddata;
> > type errorlog;
> >
> > # Structured Types
> > type pointfile {
> > unlabeleddata points;
> > errorlog error;
> > }
> >
> > type simulationfile {
> > file output;
> > }
> >
> > # Apps
> > app (file o) cat (file i)
> > {
> >   cat @i stdout=@o;
> > }
> >
> > app (file o) cat2 (file i)
> > {
> >   systeminfo stdout=@o;
> > }
> >
> > app (pointfile o) generatepoints (file c, labeleddata f, string
> > mode, int Npoints)
> > {
> >   matlab_callgeneratepoints @c @f mode Npoints @o.points @o.error;
> > }
> >
> > app (simulationfile o) runSimulation(string p,int passes, int
> > pindex)
> > {
> >   launchjob passes pindex @o.output;
> > }
> >
> > #Files (using single file mapper)
> > file config <"designspace.config">;
> > labeleddata labeledpoints <"emptypoints.dat">;
> >
> > type pointlog;
> >
> > # Loop
> > iterate passes {
> >
> >     # Generate Parameters
> >     pointfile np <simple_mapper;prefix="mypoints.",suffix=".dat">;
> >     np = generatepoints(config,labeledpoints, "uniform", 50);
> >
> >     errorlog fe = np.error;
> >     int checkforerror = readData(fe);
> >     tracef("%s: %i\n", "Generate Parameters Error Value",
> >     checkforerror);
> >
> >     # Issue Jobs
> >     simulationfile simfiles[]
> >     <simple_mapper;location="//scratch/midway/phillicl/SwiftJobs/",prefix=@strcat(passes,"."),suffix=".job">;
> >     if(checkforerror==0) {
> >         unlabeleddata pl = np.points;
> >     	string parameters[] =readData(pl);
> >     	foreach p,pindex in parameters {
> >       		tracef("Launch Job for Parameters: %s\n", p);
> >                 simfiles[pindex] = runSimulation(p,passes,pindex);
> >     	}
> >      }
> >
> >     # Analyze Jobs
> >
> >     # Generate Prediction
> >
> >
> >
> >     # creates an array of datafiles named swifttest.<passes>.out to
> >     write to
> >     file out[]<simple_mapper; location=".",
> >     prefix=@strcat("swifttest.",passes,"."),suffix=".out">;
> >
> >     # creates a default of 10 files
> >     foreach j in [1:@toInt(@arg("n","10"))] {
> >       file data<"data.txt">;
> >       out[j] = cat2(data);
> >     }
> >
> >     # try writing the iteration to a log file
> >     file passlog <"passes.log">;
> >     passlog = writeData(passes);
> >
> >     # try reading from another log file
> >     int readpasses = readData(passlog);
> >
> >     # Write to the Output Log
> >     tracef("%s: %i\n", "Iteration :", passes);
> >     tracef("%s: %i\n", "Iteration Read :", readpasses);
> >
> > #} until (readpasses == 2); # Determine if Done
> > } until (passes == 1); # Determine if Done
> 
> 
> And Here is the error I get:
> 
> EXCEPTION Exception in launchjob:
> Arguments: [0, 1,
> /scratch/midway/phillicl/SwiftJobs/0.0001.output.job]
> Host: pbs
> Directory: test-20120903-0303-pedfpqu8/jobs/f/launchjob-fff1mjxk
> stderr.txt:
> stdout.txt:
> ----
> 
> sys:exception @ vdl-int.k, line: 601
> sys:throw @ vdl-int.k, line: 600
> sys:catch @ vdl-int.k, line: 567
> sys:try @ vdl-int.k, line: 469
> task:allocatehost @ vdl-int.k, line: 419
> vdl:execute2 @ execute-default.k, line: 23
> sys:ignoreerrors @ execute-default.k, line: 21
> sys:parallelfor @ execute-default.k, line: 20
> sys:restartonerror @ execute-default.k, line: 16
> sys:sequential @ execute-default.k, line: 14
> sys:try @ execute-default.k, line: 13
> sys:if @ execute-default.k, line: 12
> sys:then @ execute-default.k, line: 11
> sys:if @ execute-default.k, line: 10
> vdl:execute @ test.kml, line: 182
> run_simulation @ test.kml, line: 480
> sys:parallel @ test.kml, line: 465
> foreach @ test.kml, line: 456
> sys:parallel @ test.kml, line: 427
> sys:then @ test.kml, line: 409
> sys:if @ test.kml, line: 404
> sys:sequential @ test.kml, line: 402
> sys:parallel @ test.kml, line: 315
> iterate @ test.kml, line: 229
> vdl:sequentialwithid @ test.kml, line: 226
> vdl:mainp @ test.kml, line: 225
> mainp @ vdl.k, line: 118
> vdl:mains @ test.kml, line: 223
> vdl:mains @ test.kml, line: 223
> rlog:restartlog @ test.kml, line: 222
> kernel:project @ test.kml, line: 2
> test-20120903-0303-pedfpqu8
> Caused by: The following output files were not created by the
> application: /scratch/midway/phillicl/SwiftJobs/0.0001.output.job
> 
> Note that
> > ls /scratch/midway/phillicl/SwiftJobs/0.0001.output.job
> /scratch/midway/phillicl/SwiftJobs/0.0001.output.job
> 
> 
> On Sep 1, 2012, at 5:45 PM, Mihael Hategan <hategan at mcs.anl.gov>
> wrote:
> 
> > The error comes from int checkforerror = readData(np.error);
> >
> > You have to use the workaround for both.
> >
> > On Sat, 2012-09-01 at 15:23 -0500, Carolyn Phillips wrote:
> >> Sure
> >>
> >> There are a lot of extra stuff running around in the script, fyi
> >>
> >> # Types
> >> type file;
> >> type unlabeleddata;
> >> type labeleddata;
> >> type errorlog;
> >>
> >> # Structured Types
> >> type pointfile {
> >> unlabeleddata points;
> >> errorlog error;
> >> }
> >>
> >> type simulationfile {
> >> file output;
> >> }
> >>
> >> # Apps
> >> app (file o) cat (file i)
> >> {
> >>  cat @i stdout=@o;
> >> }
> >>
> >> app (file o) cat2 (file i)
> >> {
> >>  systeminfo stdout=@o;
> >> }
> >>
> >> app (pointfile o) generatepoints (file c, labeleddata f, string
> >> mode, int Npoints)
> >> {
> >>  matlab_callgeneratepoints @c @f mode Npoints @o.points @o.error;
> >> }
> >>
> >> #app (simulationfile o) runSimulation(string p)
> >> #{
> >> # launchjob p @o.output;
> >> #}
> >>
> >> #Files (using single file mapper)
> >> file config <"designspace.config">;
> >> labeleddata labeledpoints <"emptypoints.dat">;
> >>
> >> type pointlog;
> >>
> >> # Loop
> >> iterate passes {
> >>
> >>    # Generate Parameters
> >>    pointfile np <simple_mapper;prefix="mypoints.",suffix=".dat">;
> >>    np = generatepoints(config,labeledpoints, "uniform", 50);
> >>
> >>    int checkforerror = readData(np.error);
> >>    tracef("%s: %i\n", "Generate Parameters Error Value",
> >>    checkforerror);
> >>
> >>    # Issue Jobs
> >>    #simulationfile simfiles[]
> >>    <simple_mapper;location="/scratch/midway/phillicl/SwiftJobs/",prefix=@strcat("output.",passes,"."),suffix=".job">;
> >>    if(checkforerror==0) {
> >>        unlabeleddata pl = np.points;
> >>    	string parameters[] =readData(pl);
> >>    	foreach p,pindex in parameters {
> >>      		tracef("Launch Job for Parameters: %s\n", p);
> >>                #simfiles[pindex] = runSimulation(p);
> >>    	}
> >>     }
> >>
> >>    # Analyze Jobs
> >>
> >>    # Generate Prediction
> >>
> >>
> >>
> >>    # creates an array of datafiles named swifttest.<passes>.out to
> >>    write to
> >>    file out[]<simple_mapper; location=".",
> >>    prefix=@strcat("swifttest.",passes,"."),suffix=".out">;
> >>
> >>    # creates a default of 10 files
> >>    foreach j in [1:@toInt(@arg("n","10"))] {
> >>      file data<"data.txt">;
> >>      out[j] = cat2(data);
> >>    }
> >>
> >>    # try writing the iteration to a log file
> >>    file passlog <"passes.log">;
> >>    passlog = writeData(passes);
> >>
> >>    # try reading from another log file
> >>    int readpasses = readData(passlog);
> >>
> >>    # Write to the Output Log
> >>    tracef("%s: %i\n", "Iteration :", passes);
> >>    tracef("%s: %i\n", "Iteration Read :", readpasses);
> >>
> >> #} until (readpasses == 2); # Determine if Done
> >> } until (passes == 1); # Determine if Done
> >>
> >>
> >> On Sep 1, 2012, at 1:57 PM, Mihael Hategan <hategan at mcs.anl.gov>
> >> wrote:
> >>
> >>> Can you post the entire script?
> >>>
> >>> On Sat, 2012-09-01 at 12:29 -0500, Carolyn Phillips wrote:
> >>>> Yes, I tried that
> >>>>
> >>>>       unlabeleddata pl = np.points;
> >>>>   	string parameters[] =readData(pl);
> >>>>
> >>>>
> >>>> and I got
> >>>>
> >>>> Execution failed:
> >>>> 	mypoints..dat (No such file or directory)
> >>>>
> >>>> On Aug 31, 2012, at 8:27 PM, Mihael Hategan <hategan at mcs.anl.gov>
> >>>> wrote:
> >>>>
> >>>>> On Fri, 2012-08-31 at 20:11 -0500, Carolyn Phillips wrote:
> >>>>>> How would this line work for what I have below?
> >>>>>>
> >>>>>>>> string parameters[] =readData(np.points);
> >>>>>>
> >>>>>
> >>>>> unlabeleddata tmp = np.points;
> >>>>> string parameters[] = readData(tmp);
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Aug 31, 2012, at 7:49 PM, Mihael Hategan
> >>>>>> <hategan at mcs.anl.gov> wrote:
> >>>>>>
> >>>>>>> Another bug.
> >>>>>>>
> >>>>>>> I committed a fix. In the mean time, the solution is:
> >>>>>>>
> >>>>>>>
> >>>>>>> errorlog fe = np.errorlog;
> >>>>>>>
> >>>>>>> int error = readData(fe);
> >>>>>>>
> >>>>>>> On Fri, 2012-08-31 at 19:29 -0500, Carolyn Phillips wrote:
> >>>>>>>> Hi Mihael,
> >>>>>>>>
> >>>>>>>> the reason I added the "@" was because
> >>>>>>>>
> >>>>>>>> now this (similar) line
> >>>>>>>>
> >>>>>>>> if(checkforerror==0) {
> >>>>>>>>     string parameters[] =readData(np.points);
> >>>>>>>>    }
> >>>>>>>>
> >>>>>>>> gives me this:
> >>>>>>>>
> >>>>>>>> Execution failed:
> >>>>>>>> 	mypoints..dat (No such file or directory)
> >>>>>>>>
> >>>>>>>> as in now its not getting the name of the file correct
> >>>>>>>>
> >>>>>>>> On Aug 31, 2012, at 7:17 PM, Mihael Hategan
> >>>>>>>> <hategan at mcs.anl.gov> wrote:
> >>>>>>>>
> >>>>>>>>> @np.error means the file name of np.error which is known
> >>>>>>>>> statically. So
> >>>>>>>>> readData(@np.error) can run as soon as the script starts.
> >>>>>>>>>
> >>>>>>>>> You probably want to say readData(np.error).
> >>>>>>>>>
> >>>>>>>>> Mihael
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Fri, 2012-08-31 at 18:55 -0500, Carolyn Phillips wrote:
> >>>>>>>>>> So I execute an atomic procedure to generate a datafile,
> >>>>>>>>>> and then next
> >>>>>>>>>> I want to do something with that data file. However, my
> >>>>>>>>>> program is
> >>>>>>>>>> trying to do something with the datafile before it has been
> >>>>>>>>>> written
> >>>>>>>>>> to. So something with order of execution is not working. I
> >>>>>>>>>> think the
> >>>>>>>>>> problem is that the name of my file exists, but the file
> >>>>>>>>>> itself does
> >>>>>>>>>> not yet, but execution proceeds anyway!
> >>>>>>>>>>
> >>>>>>>>>> Here are my lines
> >>>>>>>>>>
> >>>>>>>>>> type pointfile {
> >>>>>>>>>> unlabeleddata points;
> >>>>>>>>>> errorlog error;
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>> # Generate Parameters
> >>>>>>>>>> pointfile np
> >>>>>>>>>> <simple_mapper;prefix="mypoints.",suffix=".dat">;
> >>>>>>>>>> np = generatepoints(config,labeledpoints, "uniform", 50);
> >>>>>>>>>>
> >>>>>>>>>> int checkforerror = readData(@np.error);
> >>>>>>>>>>
> >>>>>>>>>> This gives an error :
> >>>>>>>>>> mypoints.error.dat (No such file or directory)
> >>>>>>>>>>
> >>>>>>>>>> If I comment out the last line.. all the files show up in
> >>>>>>>>>> the directory. (e.g. mypoints.points.dat and
> >>>>>>>>>> mypoints.error.dat) ) and if forget to remove the .dat
> >>>>>>>>>> files from a prior run, it also runs fine!
> >>>>>>>>>>
> >>>>>>>>>> How do you fix a problem like that?
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Swift-user mailing list
> >>>>>>>>>> Swift-user at ci.uchicago.edu
> >>>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



