[Swift-devel] Dynamic profiles and count
Mihael Hategan
hategan at mcs.anl.gov
Tue Sep 18 23:25:33 CDT 2012
David,
Can you try r5934?
Mihael
On Tue, 2012-09-18 at 20:42 -0700, Mihael Hategan wrote:
> Ah, thanks. I can check and see what's happening.
>
> On Tue, 2012-09-18 at 21:40 -0500, Justin M Wozniak wrote:
> > Just to clarify: this is based on changes already in svn regarding
> > dynamic profiles. That's why the map is there.
> >
> > http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_dynamic_profiles
> >
> > David is trying to solve a specific issue with using dynamic profiles
> > and "count": we need to know all about "count" to solve this.
> >
> > This is a paste of my recent email to David about dynamic profiles and
> > issues with "count":
> >
> > Ok, the place to look is in the Swift repo: svn diff -r5206:5207
> >
> > Apparently, no changes were made to CoG.
> >
> > The dynamic profiles should be visible in the KML.
> >
> > From there, you really need to know your Karajan semantics. The
> > attributes are known to execute2() and thus execute() and Swift's Execute.
> >
> > Now Java. You should be able to see the attributes in GridExec. They
> > are applied to the Task's JobSpecification. These are good places to add
> > some trace-level logging.
> >
> > Then, the Task is sent to PBSExecutor. You could add logging here too
> > to make sure the attributes made it.
> >
> > One thing to consider is the possibility that the attribute is being
> > overwritten by some other component. If you add logging to
> > JobSpecification.setAttribute(), you might be able to find that.
> >
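A minimal sketch of the kind of logging suggested above for catching an overwritten attribute. TracedSpec is a hypothetical stand-in written only for illustration; it is not the real CoG JobSpecification, and the real setAttribute() may differ. The idea is to log whenever an existing attribute is replaced by a different value and to attach a stack trace so the overwriting component can be identified.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a job specification; NOT the real CoG class.
class TracedSpec {
    private static final Logger LOG = Logger.getLogger(TracedSpec.class.getName());
    private final Map<String, Object> attributes = new LinkedHashMap<>();
    // Number of times an existing attribute was replaced by a different value.
    private int overwrites = 0;

    void setAttribute(String name, Object value) {
        Object old = attributes.put(name, value);
        if (old != null && !old.equals(value)) {
            overwrites++;
            // The Throwable carries a stack trace that points at the caller
            // responsible for the overwrite.
            LOG.log(Level.WARNING,
                    "attribute '" + name + "' overwritten: " + old + " -> " + value,
                    new Throwable("overwrite site"));
        } else {
            // Only visible if the logger is configured for FINE output.
            LOG.fine("attribute '" + name + "' set to " + value);
        }
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    int getOverwrites() {
        return overwrites;
    }
}

class TracedSpecDemo {
    public static void main(String[] args) {
        TracedSpec spec = new TracedSpec();
        spec.setAttribute("count", 2); // e.g. the value from the dynamic profile
        spec.setAttribute("count", 1); // e.g. a later component replacing it
        System.out.println("count=" + spec.getAttribute("count")
                + ", overwrites=" + spec.getOverwrites());
    }
}
```

In the real code the equivalent change would presumably be a couple of log statements inside the existing setter rather than a new class.
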
> > On 9/18/2012 4:58 PM, Mihael Hategan wrote:
> > > I'm not sure. The way I understand things, your change modifies the way
> > > execute is invoked from:
> > >
> > > execute(executable, ..., count=n)
> > >
> > > to:
> > >
> > > execute(executable, ..., attributes=map(map:entry("count", n), ...))
> > >
> > > But the semantics are the same. The task object, in both cases, will
> > > have an attribute named "count" equal to n.
> > >
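To illustrate the point about the two invocation forms, here is a rough sketch. SketchTask and its two factory methods are hypothetical names made up for this example and are not the Karajan or CoG API. It only models the claim above: whether "count" arrives as a dedicated argument or as an entry in a generic attribute map, the task ends up with the same "count" attribute.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a task's attribute store; NOT the real Karajan/CoG API.
class SketchTask {
    private final Map<String, Object> attributes = new HashMap<>();

    // Models execute(executable, ..., count=n): a dedicated named argument.
    static SketchTask withCountArg(int n) {
        SketchTask task = new SketchTask();
        task.attributes.put("count", n);
        return task;
    }

    // Models execute(executable, ..., attributes=map(...)): a generic map.
    static SketchTask withAttributeMap(Map<String, Object> attrs) {
        SketchTask task = new SketchTask();
        task.attributes.putAll(attrs);
        return task;
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}

class SketchTaskDemo {
    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("count", 2);

        SketchTask viaArg = SketchTask.withCountArg(2);
        SketchTask viaMap = SketchTask.withAttributeMap(attrs);

        // Both tasks carry the same "count" attribute.
        System.out.println(viaArg.getAttribute("count").equals(viaMap.getAttribute("count")));
    }
}
```

If the real behavior differs between the two forms, the difference would have to come from somewhere this sketch does not model, such as a component that reads or rewrites "count" along the way.
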
> > > Can you send me the full diff of your changes?
> > >
> > > Mihael
> > >
> > > On Tue, 2012-09-18 at 13:33 -0500, David Kelly wrote:
> > >> Hello,
> > >>
> > >> I have been working on a NAMD scaling test using Swift. I am using the plain PBS provider at the moment (no coasters). The Swift script sets a minimum and a maximum number of nodes, iterates through those values, and uses dynamic profiles to change the value of 'count', which sets the number of nodes to request. Here is the script:
> > >>
> > >> ---
> > >> type file;
> > >>
> > >> app (file out, file err) namd_wrapper (int numnodes, file psf_file, file pdb_file, file coord_restart_file,
> > >> file velocity_restart_file, file system_restart_file)
> > >> {
> > >> profile "count" = numnodes;
> > >> namd_wrapper @psf_file @pdb_file @coord_restart_file @velocity_restart_file @system_restart_file stdout=@out stderr=@err;
> > >> }
> > >>
> > >> # Range of nodes to test on
> > >> int minNodes=1;
> > >> int maxNodes=2;
> > >> int delta=1;
> > >>
> > >> # Files
> > >> file psf <"h0_solvion.psf">;
> > >> file pdb <"h0_solvion.pdb">;
> > >> file coord_restart <"h0_eq.0.restart.coor">;
> > >> file velocity_restart <"h0_eq.0.restart.vel">;
> > >> file system_restart <"h0_eq.0.restart.xsc">;
> > >>
> > >> foreach nodes in [minNodes:maxNodes:delta] {
> > >> file output <single_file_mapper; file=@strcat("logs/scaling-", nodes, ".out.txt")>;
> > >> file error <single_file_mapper; file=@strcat("logs/scaling-", nodes, ".err.txt")>;
> > >> (output, error) = namd_wrapper(nodes, psf, pdb, coord_restart, velocity_restart, system_restart);
> > >> }
> > >> ---
> > >>
> > >> In sites.xml, I also set the jobtype to "single" so that Swift doesn't start a worker on each node (NAMD uses MPI).
> > >>
> > >> The problem I'm running into is that, as is, dynamic profiles do not seem to allow you to modify the value of "count". My workaround removes the references to "count" in GridExec.java, Execute.java, and TCProfile.java. This works for this script and for a few other simple catsn-type scripts I've tested. I just wanted to double-check that this wouldn't cause any other issues before committing. Here are the changes:
> > >>
> > >> Index: modules/karajan/src/org/globus/cog/karajan/workflow/nodes/grid/GridExec.java
> > >> ===================================================================
> > >> --- modules/karajan/src/org/globus/cog/karajan/workflow/nodes/grid/GridExec.java (revision 3472)
> > >> +++ modules/karajan/src/org/globus/cog/karajan/workflow/nodes/grid/GridExec.java (working copy)
> > >> @@ -56,7 +56,7 @@
> > >> public static final Arg A_STDIN = new Arg.Optional("stdin");
> > >> public static final Arg A_PROVIDER = new Arg.Optional("provider");
> > >> public static final Arg A_SECURITY_CONTEXT = new Arg.Optional("securitycontext");
> > >> - public static final Arg A_COUNT = new Arg.Optional("count");
> > >> + // public static final Arg A_COUNT = new Arg.Optional("count");
> > >> public static final Arg A_HOST_COUNT = new Arg.Optional("hostcount");
> > >> public static final Arg A_JOBTYPE = new Arg.Optional("jobtype");
> > >> public static final Arg A_MAXTIME = new Arg.Optional("maxtime");
> > >> @@ -86,7 +86,8 @@
> > >> static {
> > >> setArguments(GridExec.class, new Arg[] { A_EXECUTABLE, A_ARGS, A_ARGUMENTS, A_HOST,
> > >> A_STDOUT, A_STDERR, A_STDOUTLOCATION, A_STDERRLOCATION, A_STDIN, A_PROVIDER,
> > >> - A_COUNT, A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> > >> + // A_COUNT,
> > >> + A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> > >> A_ENVIRONMENT, A_QUEUE, A_PROJECT, A_MINMEMORY, A_MAXMEMORY, A_REDIRECT,
> > >> A_SECURITY_CONTEXT, A_DIRECTORY, A_NATIVESPEC, A_DELEGATION, A_ATTRIBUTES,
> > >> C_ENVIRONMENT, A_FAIL_ON_JOB_ERROR, A_BATCH, C_STAGEIN, C_STAGEOUT, C_CLEANUP,
> > >> @@ -346,7 +347,7 @@
> > >> }
> > >> }
> > >>
> > >> - protected final static Arg[] MISC_ATTRS = new Arg[] { A_COUNT, A_HOST_COUNT, A_JOBTYPE,
> > >> + protected final static Arg[] MISC_ATTRS = new Arg[] { A_HOST_COUNT, A_JOBTYPE,
> > >> A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME, A_QUEUE, A_PROJECT, A_MINMEMORY, A_MAXMEMORY };
> > >>
> > >> protected void setMiscAttributes(JobSpecification js, VariableStack stack)
> > >>
> > >> Index: src/org/griphyn/vdl/karajan/lib/TCProfile.java
> > >> ===================================================================
> > >> --- src/org/griphyn/vdl/karajan/lib/TCProfile.java (revision 5930)
> > >> +++ src/org/griphyn/vdl/karajan/lib/TCProfile.java (working copy)
> > >> @@ -63,7 +63,6 @@
> > >>
> > >> static {
> > >> PROFILE_T = new HashMap<String, Arg>();
> > >> - PROFILE_T.put("count", GridExec.A_COUNT);
> > >> PROFILE_T.put("jobtype", GridExec.A_JOBTYPE);
> > >> PROFILE_T.put("maxcputime", GridExec.A_MAXCPUTIME);
> > >> PROFILE_T.put("maxmemory", GridExec.A_MAXMEMORY);
> > >>
> > >> Index: src/org/griphyn/vdl/karajan/lib/Execute.java
> > >> ===================================================================
> > >> --- src/org/griphyn/vdl/karajan/lib/Execute.java (revision 5930)
> > >> +++ src/org/griphyn/vdl/karajan/lib/Execute.java (working copy)
> > >> @@ -47,7 +47,7 @@
> > >> static {
> > >> setArguments(Execute.class, new Arg[] { A_EXECUTABLE, A_ARGS, A_ARGUMENTS, A_HOST,
> > >> A_STDOUT, A_STDERR, A_STDOUTLOCATION, A_STDERRLOCATION, A_STDIN, A_PROVIDER,
> > >> - A_COUNT, A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> > >> + A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> > >> A_ENVIRONMENT, A_QUEUE, A_PROJECT, A_MINMEMORY, A_MAXMEMORY, A_REDIRECT,
> > >> A_SECURITY_CONTEXT, A_DIRECTORY, A_NATIVESPEC, A_DELEGATION, A_ATTRIBUTES,
> > >> C_ENVIRONMENT, A_FAIL_ON_JOB_ERROR, A_BATCH, A_REPLICATION_GROUP,
> > >> _______________________________________________
> > >> Swift-devel mailing list
> > >> Swift-devel at ci.uchicago.edu
> > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel