[Swift-devel] Dynamic profiles and count
Mihael Hategan
hategan at mcs.anl.gov
Tue Sep 18 22:42:51 CDT 2012
Ah, thanks. I can check and see what's happening.
On Tue, 2012-09-18 at 21:40 -0500, Justin M Wozniak wrote:
> Just to clarify: this is based on changes already in svn regarding
> dynamic profiles. That's why the map is there.
>
> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_dynamic_profiles
>
> David is trying to solve a specific issue with using dynamic profiles
> and "count": we need to know all about "count" to solve this.
>
> This is a paste of my recent email to David about dynamic profiles and
> issues with "count":
>
> Ok, the place to look is in the Swift repo: svn diff -r5206:5207
>
> Apparently, no changes were made to CoG.
>
> The dynamic profiles should be visible in the KML.
>
> From there, you really need to know your Karajan semantics. The
> attributes are known to execute2() and thus execute() and Swift's Execute.
>
> Now Java. You should be able to see the attributes in GridExec. They
> are applied to the Task's JobSpecification. These are good places to add
> some trace-level logging.
>
> Then, the Task is sent to PBSExecutor. You could add logging here too
> to make sure the attributes made it.
>
> One thing to consider is the possibility that the attribute is being
> overwritten by some other component. If you add logging to
> JobSpecification.setAttribute(), you might be able to find that.
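
To make the overwrite check concrete, here is a minimal sketch of the kind of logging meant above. TracingSpec and the "source" parameter are hypothetical stand-ins invented for illustration; this is not the CoG JobSpecification class, and the real setAttribute() may well have a different signature. The idea is only that every write is recorded together with the value it replaces, so a second writer of "count" shows up in the trace.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a job specification; NOT the CoG JobSpecification class.
public class TracingSpec {
    private final Map<String, Object> attributes = new LinkedHashMap<>();
    private final List<String> trace = new ArrayList<>();

    // Record every write, noting the caller-supplied source and any value being replaced.
    public void setAttribute(String name, Object value, String source) {
        Object previous = attributes.put(name, value);
        if (previous == null) {
            trace.add(source + " set " + name + "=" + value);
        } else {
            trace.add(source + " overwrote " + name + ": " + previous + " -> " + value);
        }
    }

    public Object getAttribute(String name) {
        return attributes.get(name);
    }

    public List<String> getTrace() {
        return trace;
    }

    public static void main(String[] args) {
        TracingSpec spec = new TracingSpec();
        // Assumed scenario: the dynamic profile sets count first, then something else resets it.
        spec.setAttribute("count", 2, "dynamic profile");
        spec.setAttribute("count", 1, "site default");
        for (String line : spec.getTrace()) {
            System.out.println(line);
        }
    }
}
```

If the trace for a real run showed a second write to "count" after the dynamic profile value was applied, that would point at the component doing the overwriting.
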
>
> On 9/18/2012 4:58 PM, Mihael Hategan wrote:
> > I'm not sure. The way I understand things, your change modifies the way
> > execute is invoked from:
> >
> > execute(executable, ..., count=n)
> >
> > to:
> >
> > execute(executable, ..., attributes=map(map:entry("count", n), ...))
> >
> > But the semantics are the same. The task object, in both cases, will
> > have an attribute named "count" equal to n.
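
As a rough illustration of why the two call forms should be equivalent, here is a small model in Java. CountForms and both method names are hypothetical and exist only for this sketch; it does not use Karajan or CoG code. It only shows that a named "count" argument and a "count" entry in a generic attributes map end up as the same task attribute.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two call styles; not Karajan or CoG code.
public class CountForms {
    // Old style: "count" arrives as its own named argument.
    public static Map<String, Object> fromNamedArg(int count) {
        Map<String, Object> taskAttributes = new HashMap<>();
        taskAttributes.put("count", count);
        return taskAttributes;
    }

    // New style: "count" arrives as one entry of a generic attributes map.
    public static Map<String, Object> fromAttributesMap(Map<String, Object> attributes) {
        Map<String, Object> taskAttributes = new HashMap<>();
        taskAttributes.putAll(attributes);
        return taskAttributes;
    }

    public static void main(String[] args) {
        Map<String, Object> dynamic = new HashMap<>();
        dynamic.put("count", 2);
        // Both paths should yield the same attribute set on the task.
        System.out.println(fromNamedArg(2).equals(fromAttributesMap(dynamic)));
    }
}
```
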
> >
> > Can you send me the full diff of your changes?
> >
> > Mihael
> >
> > On Tue, 2012-09-18 at 13:33 -0500, David Kelly wrote:
> >> Hello,
> >>
> >> I have been working on a namd scaling test using Swift. I am using the plain PBS provider at the moment (no coasters). The swift script sets a minimum number of nodes, a maximum number of nodes, iterates through those values, then uses dynamic profiles to change the value of 'count' to modify the number of nodes to request. Here is the script:
> >>
> >> ---
> >> type file;
> >>
> >> app (file out, file err) namd_wrapper (int numnodes, file psf_file, file pdb_file, file coord_restart_file,
> >> file velocity_restart_file, file system_restart_file)
> >> {
> >> profile "count" = numnodes;
> >> namd_wrapper @psf_file @pdb_file @coord_restart_file @velocity_restart_file @system_restart_file stdout=@out stderr=@err;
> >> }
> >>
> >> # Range of nodes to test on
> >> int minNodes=1;
> >> int maxNodes=2;
> >> int delta=1;
> >>
> >> # Files
> >> file psf <"h0_solvion.psf">;
> >> file pdb <"h0_solvion.pdb">;
> >> file coord_restart <"h0_eq.0.restart.coor">;
> >> file velocity_restart <"h0_eq.0.restart.vel">;
> >> file system_restart <"h0_eq.0.restart.xsc">;
> >>
> >> foreach nodes in [minNodes:maxNodes:delta] {
> >> file output <single_file_mapper; file=@strcat("logs/scaling-", nodes, ".out.txt")>;
> >> file error <single_file_mapper; file=@strcat("logs/scaling-", nodes, ".err.txt")>;
> >> (output, error) = namd_wrapper(nodes, psf, pdb, coord_restart, velocity_restart, system_restart);
> >> }
> >> ---
> >>
> >> In sites.xml, I also set the jobtype to "single" so it doesn't start a worker on each node (namd uses MPI).
> >>
> >> The problem that I'm running into is that, as is, dynamic profiles do not seem to allow you to modify the value of "count". I have a workaround for this, which involves removing the references to "count" in GridExec.java, Execute.java, and TCProfile.java. This works for this script, and it also works with a few other simple catsn-type scripts I've tested. I just wanted to double-check that this wouldn't cause any other issues before committing. Here are the changes:
> >>
> >> Index: modules/karajan/src/org/globus/cog/karajan/workflow/nodes/grid/GridExec.java
> >> ===================================================================
> >> --- modules/karajan/src/org/globus/cog/karajan/workflow/nodes/grid/GridExec.java (revision 3472)
> >> +++ modules/karajan/src/org/globus/cog/karajan/workflow/nodes/grid/GridExec.java (working copy)
> >> @@ -56,7 +56,7 @@
> >> public static final Arg A_STDIN = new Arg.Optional("stdin");
> >> public static final Arg A_PROVIDER = new Arg.Optional("provider");
> >> public static final Arg A_SECURITY_CONTEXT = new Arg.Optional("securitycontext");
> >> - public static final Arg A_COUNT = new Arg.Optional("count");
> >> + // public static final Arg A_COUNT = new Arg.Optional("count");
> >> public static final Arg A_HOST_COUNT = new Arg.Optional("hostcount");
> >> public static final Arg A_JOBTYPE = new Arg.Optional("jobtype");
> >> public static final Arg A_MAXTIME = new Arg.Optional("maxtime");
> >> @@ -86,7 +86,8 @@
> >> static {
> >> setArguments(GridExec.class, new Arg[] { A_EXECUTABLE, A_ARGS, A_ARGUMENTS, A_HOST,
> >> A_STDOUT, A_STDERR, A_STDOUTLOCATION, A_STDERRLOCATION, A_STDIN, A_PROVIDER,
> >> - A_COUNT, A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> >> + // A_COUNT,
> >> + A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> >> A_ENVIRONMENT, A_QUEUE, A_PROJECT, A_MINMEMORY, A_MAXMEMORY, A_REDIRECT,
> >> A_SECURITY_CONTEXT, A_DIRECTORY, A_NATIVESPEC, A_DELEGATION, A_ATTRIBUTES,
> >> C_ENVIRONMENT, A_FAIL_ON_JOB_ERROR, A_BATCH, C_STAGEIN, C_STAGEOUT, C_CLEANUP,
> >> @@ -346,7 +347,7 @@
> >> }
> >> }
> >>
> >> - protected final static Arg[] MISC_ATTRS = new Arg[] { A_COUNT, A_HOST_COUNT, A_JOBTYPE,
> >> + protected final static Arg[] MISC_ATTRS = new Arg[] { A_HOST_COUNT, A_JOBTYPE,
> >> A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME, A_QUEUE, A_PROJECT, A_MINMEMORY, A_MAXMEMORY };
> >>
> >> protected void setMiscAttributes(JobSpecification js, VariableStack stack)
> >>
> >> Index: src/org/griphyn/vdl/karajan/lib/TCProfile.java
> >> ===================================================================
> >> --- src/org/griphyn/vdl/karajan/lib/TCProfile.java (revision 5930)
> >> +++ src/org/griphyn/vdl/karajan/lib/TCProfile.java (working copy)
> >> @@ -63,7 +63,6 @@
> >>
> >> static {
> >> PROFILE_T = new HashMap<String, Arg>();
> >> - PROFILE_T.put("count", GridExec.A_COUNT);
> >> PROFILE_T.put("jobtype", GridExec.A_JOBTYPE);
> >> PROFILE_T.put("maxcputime", GridExec.A_MAXCPUTIME);
> >> PROFILE_T.put("maxmemory", GridExec.A_MAXMEMORY);
> >>
> >> Index: src/org/griphyn/vdl/karajan/lib/Execute.java
> >> ===================================================================
> >> --- src/org/griphyn/vdl/karajan/lib/Execute.java (revision 5930)
> >> +++ src/org/griphyn/vdl/karajan/lib/Execute.java (working copy)
> >> @@ -47,7 +47,7 @@
> >> static {
> >> setArguments(Execute.class, new Arg[] { A_EXECUTABLE, A_ARGS, A_ARGUMENTS, A_HOST,
> >> A_STDOUT, A_STDERR, A_STDOUTLOCATION, A_STDERRLOCATION, A_STDIN, A_PROVIDER,
> >> - A_COUNT, A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> >> + A_HOST_COUNT, A_JOBTYPE, A_MAXTIME, A_MAXWALLTIME, A_MAXCPUTIME,
> >> A_ENVIRONMENT, A_QUEUE, A_PROJECT, A_MINMEMORY, A_MAXMEMORY, A_REDIRECT,
> >> A_SECURITY_CONTEXT, A_DIRECTORY, A_NATIVESPEC, A_DELEGATION, A_ATTRIBUTES,
> >> C_ENVIRONMENT, A_FAIL_ON_JOB_ERROR, A_BATCH, A_REPLICATION_GROUP,
> >> _______________________________________________
> >> Swift-devel mailing list
> >> Swift-devel at ci.uchicago.edu
> >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
>
>