[Swift-user] Advice on mapping input arguments

Michael Wilde wilde at mcs.anl.gov
Mon Jun 10 23:26:17 CDT 2013


TJ,

For the specific example you posted, the technique shown in the prior message works well.

Here's shoot.swift converted to use that approach.

- Mike

mid$ cat shoot.swift

type qvalfile; 
type coordfile;
type shotfile;

type shootargs{
    string coords;
    string qvals;
    int numphi;
    int numshots;
    int nummolec;
    float photons;
    int parallel;
    string output_filename;
}

app (shotfile outfile) shootsim (shootargs args, coordfile coords, qvalfile qvals) {   
     echo "polshoot" "-s" @coords "-f" @qvals "-x" args.numphi "-n" args.numshots 
                     "-m" args.nummolec "-g" args.photons "-p" args.parallel "-o" @outfile
                     stdout=@outfile; # Note double use of outfile, just for testing/demo
}

shootargs myargs[] = readData("particles.dat");

foreach a in myargs {
    shotfile  output <single_file_mapper; file=a.output_filename>; 
    coordfile coords <single_file_mapper; file=a.coords>;
    qvalfile  qvals  <single_file_mapper; file=a.qvals>;
    output = shootsim(a, coords, qvals);
}

mid$ cat particles.dat
coords        qvals          numphi numshots nummolec photons parallel output_filename
gold_5nm.coor gold_qvals.txt 3600   10       1        0.25    12       out0.ring
gold_5nm.coor gold_qvals.txt 3600   10       2        0.5     12       out1.ring
gold_5nm.coor gold_qvals.txt 3600   10       4        1.0     12       out2.ring
gold_5nm.coor gold_qvals.txt 3600   10       8        2.0     12       out3.ring
gold_5nm.coor gold_qvals.txt 3600   10       16       4.0     12       out4.ring
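
Note that particles.dat above is whitespace-delimited with a header row naming the struct fields, rather than the comma-separated layout of your original CSV. Here is a minimal sketch of one way to convert between the two, assuming no field contains an embedded comma or whitespace (the function name csv_to_readdata is just made up for illustration):

```shell
# Hypothetical helper: turn comma-separated rows into whitespace-separated rows.
# Assumes fields contain no embedded commas, quotes, or whitespace.
# Reads from stdin and writes to stdout.
csv_to_readdata() {
    tr ',' ' '
}
```

You would run it as something like: csv_to_readdata < particles_per_shot.csv > particles.dat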


mid$ swift shoot.swift
Swift 0.94 swift-r6414 (swift modified locally) cog-r3648

RunID: 20130611-0423-gfp7pohe
Progress:  time: Tue, 11 Jun 2013 04:23:18 +0000
Progress:  time: Tue, 11 Jun 2013 04:23:19 +0000  Stage in:1  Finished successfully:4
Final status: Tue, 11 Jun 2013 04:23:19 +0000  Finished successfully:5

mid$ more out?.ring
::::::::::::::
out0.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 1 -g 0.25 -p 12 -o out0.ring
::::::::::::::
out1.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 2 -g 0.5 -p 12 -o out1.ring
::::::::::::::
out2.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 4 -g 1 -p 12 -o out2.ring
::::::::::::::
out3.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 8 -g 2 -p 12 -o out3.ring
::::::::::::::
out4.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 16 -g 4 -p 12 -o out4.ring
mid$ 



----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "TJ Lane" <tjlane at stanford.edu>
> Cc: swift-user at ci.uchicago.edu
> Sent: Monday, June 10, 2013 10:58:55 PM
> Subject: Re: [Swift-user] Advice on mapping input arguments
> 
> Hi TJ,
> 
> Here's a quick initial thought. There might be better ways to do
> this.
> 
> The problem as you mention is that there is currently no mapper or
> function that lets you easily handle a mixture of scalars and files
> in a struct.  I think that readData and/or mappers should/could
> handle this, but I've been outvoted in past discussions of this.
> 
> So for now, assuming you have more scalar params than file params,
> you can just leave the filenames as strings in the parameter struct,
> then map the smaller number of files from these strings, and call
> your app() using the param struct to pass all the scalar params, and
> pass the mapped files as additional input and output parameters.
> 
> This retains, I think, the benefit of having a parameter file that
> provides a handy record of the run's parameters.
> 
> Here's an initial example that I *think* is in the spirit of the
> longer one you supplied below.
> 
> If this sounds like a reasonable approach I think we can use the same
> technique to run your example.
> 
> I think I've provided all the files you need to test this.
> 
> I'll look more carefully at your example now to see if this is indeed
> what you're trying to do.
> 
> - Mike
> 
> mid$ cat params.txt
> p1 p2 fn1    fn2    ofn
> 86 99 f1.dat f2.dat row1.out
> 87 98 f3.dat f4.dat row2.out
> 
> mid$ cat structs.swift
> 
> type file;
> 
> type params {
>   int p1;
>   int p2;
>   string fn1;
>   string fn2;
>   string ofn;
> };
> 
> app (file out) echo (params p, file f1, file f2)
> {
>   echo stdout=@out "params:" p.p1 p.p2 "files:" @f1 @f2;
> }
> 
> params plist[] = readData("params.txt");
> 
> foreach p, i in plist {
>   file f1 <single_file_mapper; file=p.fn1>;
>   file f2 <single_file_mapper; file=p.fn2>;
>   file of <single_file_mapper; file=p.ofn>;
>   of = echo(p,f1,f2);
> }
> 
> mid$ cat ~/.swift/swift.properties
> 
> sites.file=sites.xml
> tc.file=apps
> 
> status.mode=provider
> use.provider.staging=false
> use.wrapper.staging=false
> wrapperlog.always.transfer=true
> execution.retries=0
> lazy.errors=false
> provider.staging.pin.swiftfiles=false
> sitedir.keep=true
> file.gc.enabled=false
> #tcp.port.range=50000,51000
> 
> mid$ cat apps
> localhost echo echo
> 
> mid$ cat sites.xml
> <config>
>   <pool handle="localhost">
>     <execution provider="local"/>
>     <filesystem provider="local"/>
>     <workdirectory>/scratch/midway/{env.USER}/swiftwork</workdirectory>
>   </pool>
> </config>
> 
> mid$ ls -l f?.dat
> -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f1.dat
> -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f2.dat
> -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f3.dat
> -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f4.dat
> 
> mid$ swift structs.swift
> Swift 0.94 swift-r6414 (swift modified locally) cog-r3648
> 
> RunID: 20130611-0343-so454ho1
> Progress:  time: Tue, 11 Jun 2013 03:43:56 +0000
> Final status: Tue, 11 Jun 2013 03:43:56 +0000  Finished successfully:2
> 
> mid$ more row?.out
> ::::::::::::::
> row1.out
> ::::::::::::::
> params: 86 99 files: f1.dat f2.dat
> ::::::::::::::
> row2.out
> ::::::::::::::
> params: 87 98 files: f3.dat f4.dat
> mid$
> 
> 
> ----- Original Message -----
> > From: "TJ Lane" <tjlane at stanford.edu>
> > To: swift-user at ci.uchicago.edu
> > Sent: Monday, June 10, 2013 8:15:17 PM
> > Subject: [Swift-user] Advice on mapping input arguments
> > 
> > 
> > 
> > 
> > Swift Users,
> > 
> > I am wondering if I could get some advice on the best way to do the
> > following in Swift: I want to run a series of simulations performing
> > a parameter scan, for each parameter combination farming the work
> > out to clusters I have access to here at Stanford, and collect the
> > results back onto my desktop.
> > 
> > I've gotten some minimal working examples of swift up and running,
> > but hit a roadblock on something quite simple: what's the best way
> > to pass a large number of parameters into a swift script? I have a
> > big list of parameter combinations I'd like to run, and am searching
> > for a sane way to pass all of these into my swift app call.
> > 
> > Originally, I thought I'd be able to use the CSV mapper to pass a
> > bunch of arguments from a CSV file into swift -- it seemed perfect!
> > As a bonus, I hoped the CSV file would act as a record of my work,
> > namely what parameters were used to generate what file. But it seems
> > that the CSV mapper automatically maps the entries in the CSV file
> > to swift "mapper" objects -- i.e., it expects my CSV data fields are
> > all files, whereas I want some to be ints or floats that get passed
> > directly to the arguments of my command-line script on the slave
> > machine(s).
> > 
> > 
> > For concreteness, here is a test CSV I was working with:
> > 
> > 
> > coords,qvals,numphi,numshots,nummolec,photons,parallel,output_filename
> > gold_5nm.coor,gold_qvals.txt,3600,10,1,0.25,12,out0.ring
> > gold_5nm.coor,gold_qvals.txt,3600,10,2,0.5,12,out1.ring
> > gold_5nm.coor,gold_qvals.txt,3600,10,4,1.0,12,out2.ring
> > gold_5nm.coor,gold_qvals.txt,3600,10,8,2.0,12,out3.ring
> > gold_5nm.coor,gold_qvals.txt,3600,10,16,4.0,12,out4.ring
> > 
> > 
> > and my (non-functional) swift script, which will show what I was
> > trying to do:
> > 
> > # shoot.swift
> > 
> > type messagefile;
> > type pdbfile;
> > type shotfile;
> > 
> > type shootargs{
> > pdbfile coords;
> > messagefile qvals;
> > int numphi;
> > int numshots;
> > int nummolec;
> > int photons;
> > int parallel;
> > string output_filename;
> > }
> > 
> > app (shotfile outputfile) shootsim (shootargs args) {
> > polshoot "-s" @args.coords "-f" @args.qvals "-x" args.numphi "-n"
> > args.numshots "-m" args.nummolec "-g" args.photons "-p"
> > args.parallel "-o" @outputfile;
> > }
> > 
> > 
> > shootargs myargs[] <csv_mapper;file="particles_per_shot.csv">;
> > 
> > foreach a in myargs {
> > shotfile o; // this could be something like myargs.output_filename
> > o = shootsim(a);
> > }
> > 
> > 
> > I'm wondering if someone who's worked a bit with swift can give me a
> > recommendation on how to proceed. Right now I'm playing with just
> > writing a huge number of flat text files, each one containing the
> > parameter flags that will then get cat'd into the arguments of my
> > command-line script "polshoot" on the slave end. This is inelegant
> > for obvious reasons, since I'll have a huge number of input files
> > and no easy way to keep track of which input matches what output...
> > 
> > 
> > If anyone has advice, I'm all ears!
> > 
> > 
> > Thanks,
> > 
> > TJ
> > 
> > 
> > 
> > _______________________________________________
> > Swift-user mailing list
> > Swift-user at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> 


