[Swift-devel] Problem with 0.92 sending jobs to OSG via Condor-G

Michael Wilde wilde at mcs.anl.gov
Thu Jan 13 10:20:34 CST 2011


Allan, you are right!

So the code in provider-condor is an obsolete fossil?

My earlier diffs were wrong because I diffed trunk against 0.92, but the problem occurred in the merge of stable *to* trunk (obviously now ;)

The error I think is in rev 2989:

--- modules/provider-localscheduler/src/org/globus/cog/abstraction/impl/scheduler/condor/CondorExecutor.java    (revision 2988)
+++ modules/provider-localscheduler/src/org/globus/cog/abstraction/impl/scheduler/condor/CondorExecutor.java    (working copy)

The working trunk version generates this Condor submit file:

universe = grid
grid_resource = gt2 ff-grid3.unl.edu/jobmanager-pbs
stream_output = False
stream_error  = False
Transfer_Executable = false
output = /home/wilde/.globus/scripts/Condor50896.submit.stdout
error = /home/wilde/.globus/scripts/Condor50896.submit.stderr

remote_initialdir = /panfs/panasas/CMS/data/engage/tmp/ff-grid3.unl.edu/catsn-20110113-1059-4xb6b31h
executable = /bin/bash
arguments = /panfs/panasas/CMS/data/engage/tmp/ff-grid3.unl.edu/catsn-20110113-1059-4xb6b31h/shared/_swiftwrap cat-fmk15f4k -jobdir f -scratch  -e /bin/cat -out outdir/f.0001.out -err stderr.txt -i -d outdir -if data.txt -of outdir/f.0001.out -k  -cdmfile  -status file -a data.txt
notification = Never
leave_in_queue = TRUE
queue

while the failing 0.92 version generates this:

universe = grid
grid_resource = gt2 belhaven-1.renci.org/jobmanager-condor
stream_output = False
stream_error  = False
Transfer_Executable = false
output = /home/wilde/.globus/scripts/Condor43688.submit.stdout
error = /home/wilde/.globus/scripts/Condor43688.submit.stderr

remote_initialdir = /nfs/osg-data/engage/tmp/belhaven-1.renci.org/catsn-20110113-1050-eskyjcb5
executable = /bin/bash
arguments = /nfs/osg-data/engage/tmp/belhaven-1.renci.org/catsn-20110113-1050-eskyjcb5/shared/_swiftwrap cat-kbmn4f4k -jobdir k -scratch "" -e /bin/cat -out outdir/f.0001.out -err stderr.txt -i -d outdir -if data.txt -of outdir/f.0001.out -k "" -cdmfile "" -status file -a data.txt
notification = Never
leave_in_queue = TRUE
queue

It is not yet clear to me if the older code is working because it *failed* to escape the quotes on the arguments line with \", or because it *omitted* the "". I need to look more closely to see if I'm being fooled by the .submit file text I pasted above (i.e. whether the \" is really there, or whether the "" is missing entirely).
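
One way to settle the question is to run the quote() logic from the diff below in isolation. The following is only a sketch: the method body is copied from the diff, while the class name CondorQuoteSketch and the joinArgs() helper are hypothetical and assume the executor simply joins the quoted arguments with single spaces.

```java
// Standalone copy of the quote() logic from the diff, for experimentation only.
// CondorQuoteSketch and joinArgs() are hypothetical names, not part of CoG.
final class CondorQuoteSketch {
    private static final boolean[] TRIGGERS = new boolean[128];

    static {
        // Characters that force an argument to be wrapped in \" ... \"
        for (char c : new char[] {' ', '\n', '\t', '\\', '>', '<', '"'}) {
            TRIGGERS[c] = true;
        }
    }

    static String quote(String s) {
        if (s.isEmpty()) {
            return ""; // an empty argument is emitted as nothing at all
        }
        boolean needsQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 128 && TRIGGERS[c]) {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) {
            return s;
        }
        StringBuilder sb = new StringBuilder("\\\""); // opening \"
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\'); // backslash-escape embedded quotes and backslashes
            }
            sb.append(c);
        }
        sb.append("\\\""); // closing \"
        return sb.toString();
    }

    // Assumption: arguments are quoted one by one and joined with single spaces.
    static String joinArgs(String... args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(quote(args[i]));
        }
        return sb.toString();
    }

    public static void main(String[] unused) {
        System.out.println(joinArgs("-scratch", "", "-e", "/bin/cat"));
        System.out.println(quote("a b"));
    }
}
```

Under these assumptions an empty argument produces no text at all, leaving two adjacent spaces. That would be consistent with the "-scratch  -e" and "-k  -cdmfile  -status" runs in the first submit file above, i.e. with the "" being omitted rather than escaped.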

At any rate - Mihael, can you sync up with me on this (i.e. whichever of us gets to it first should fix it)? Or Sarah, David, Justin, or Allan?

Mihael, I think your top prio should be the coaster staging timing issue that Allan and Justin are both encountering (we think).

We need to add a test for how this works and verify that it's creating a valid submit file.
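
As a starting point for such a test, here is a rough sketch of a check that could be run over a generated submit file. It assumes only the simplified rule implied by the condor_submit error quoted further down: in an arguments value that does not itself start with a double quote, every double quote must be preceded by a backslash. The real condor_submit parser may well be stricter or differ in detail, and the class and method names (SubmitArgsCheck, argumentsValue, firstUnescapedQuote) are made up for illustration.

```java
// Hypothetical helper for a submit-file test; not part of CoG or Condor.
final class SubmitArgsCheck {

    /** Returns the value of the first "arguments = ..." line, or null if there is none. */
    static String argumentsValue(String submitText) {
        for (String line : submitText.split("\n")) {
            String t = line.trim();
            if (t.toLowerCase().startsWith("arguments")) {
                int eq = t.indexOf('=');
                if (eq >= 0) {
                    return t.substring(eq + 1).trim();
                }
            }
        }
        return null;
    }

    /** Index of the first double quote not directly preceded by a backslash, or -1. */
    static int firstUnescapedQuote(String value) {
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) == '"' && (i == 0 || value.charAt(i - 1) != '\\')) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] unused) {
        String submit = "universe = grid\n"
                + "arguments = _swiftwrap -scratch \"\" -e /bin/cat\n"
                + "queue\n";
        String args = argumentsValue(submit);
        int bad = firstUnescapedQuote(args);
        System.out.println(bad < 0 ? "arguments look OK" : "unescaped quote at index " + bad);
    }
}
```

A test could generate the submit file for a job with one or more empty arguments, pass its text through argumentsValue(), and fail if firstUnescapedQuote() returns anything other than -1.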

Thanks,

- Mike



The diffs are below:

--- modules/provider-localscheduler/src/org/globus/cog/abstraction/impl/scheduler/condor/CondorExecutor.java    (revision 2988)
+++ modules/provider-localscheduler/src/org/globus/cog/abstraction/impl/scheduler/condor/CondorExecutor.java    (working copy)
@@ -116,97 +116,6 @@
                wr.close();
        }
 
-       private static final boolean[] TRIGGERS;
-
-       static {
-               TRIGGERS = new boolean[128];
-               TRIGGERS[' '] = true;
-               TRIGGERS['\n'] = true;
-               TRIGGERS['\t'] = true;
-               TRIGGERS['\\'] = true;
-               TRIGGERS['>'] = true;
-               TRIGGERS['<'] = true;
-               TRIGGERS['"'] = true;
-       }
-
-       protected String quote(String s) {
-               if ("".equals(s)) {
-                       return "";
-               }
-               boolean quotes = false;
-               for (int i = 0; i < s.length(); i++) {
-                       char c = s.charAt(i);
-                       if (c < 128 && TRIGGERS[c]) {
-                               quotes = true;
-                               break;
-                       }
-               }
-               if (!quotes) {
-                       return s;
-               }
-               StringBuffer sb = new StringBuffer();
-               if (quotes) {
-                       sb.append('\\');
-                       sb.append('"');
-               }
-               for (int i = 0; i < s.length(); i++) {
-                       char c = s.charAt(i);
-                       if (c == '"' || c == '\\') {
-                               sb.append('\\');
-                       }
-                       sb.append(c);
-               }
-               if (quotes) {
-                       sb.append('\\');
-                       sb.append('"');
-               }
-               return sb.toString();
-       }
-
-       protected String replaceVars(String str) {
-               StringBuffer sb = new StringBuffer();
-               boolean escaped = false;
-               for (int i = 0; i < str.length(); i++) {
-                       char c = str.charAt(i);
-                       if (c == '\\') {
-                               if (escaped) {
-                                       sb.append('\\');
-                               }
-                               else {
-                                       escaped = true;
-                               }
-                       }
-                       else {
-                               if (c == '$' && !escaped) {
-                                       if (i == str.length() - 1) {
-                                               sb.append('$');
-                                       }
-                                       else {
-                                               int e = str.indexOf(' ', i);
-                                               if (e == -1) {
-                                                       e = str.length();
-                                               }
-                                               String name = str.substring(i + 1, e);
-                                               Object attr = getSpec().getAttribute(name);
-                                               if (attr != null) {
-                                                       sb.append(attr.toString());
-                                               }
-                                               else {
-                                                       sb.append('$');
-                                                       sb.append(name);
-                                               }
-                                               i = e;
-                                       }
-                               }
-                               else {
-                                       sb.append(c);
-                               }
-                               escaped = false;
-                       }
-               }
-               return sb.toString();
-       }
-
        protected String getName() {
                return "Condor";
        }


----- Original Message -----
> I think my diffs were wrong. Please ignore this thread till I re-do
> them.
> 
> - Mike
> 
> ----- Original Message -----
> > ----- Original Message -----
> > > Shouldn't we be looking at the diffs in provider-localscheduler?
> >
> > I don't *think* so - my tests were using Condor-G directly:
> >
> > <profile namespace="globus" key="jobType">grid</profile>
> > <profile namespace="globus" key="gridResource">gt2 ff-grid3.unl.edu/jobmanager-pbs</profile>
> >
> > But in any case, I diff'ed the entire cog and swift trees, and saw
> > almost *no* diffs (see later msg). The only one I am suspicious of at
> > the moment is the @Override patch.
> >
> > I need to find when that change was made and whether I somehow
> > compiled *with* the Overrides in place in the older working copy.
> >
> > - Mike
> >
> > >
> > > -Allan (mobile)
> > >
> > > On Jan 13, 2011 11:17 AM, "Michael Wilde" < wilde at mcs.anl.gov >
> > > wrote:
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > > > I need to check what local mods I had applied, but I think it's
> > > > > > > more likely that some Condor submit file quoting fix fell off in
> > > > > > > 0.92 integration.
> > > > >
> > > > > Yeah. A svn diff > somefile would help.
> > > >
> > > > Hmmm. So far svn diffs show no changes within provider-condor,
> > > > neither between trunk and 0.92 branch nor within my working copies
> > > > of those two on engage-submit, which seem to behave differently
> > > > regarding Condor quoting.
> > > >
> > > > Could the change(s) that were made a long time ago to fix Condor
> > > > quoting be in a different module than provider-condor? If so,
> > > > what's a likely place to look?
> > > >
> > > > I'll check vdl-int.k next.
> > > >
> > > > - Mike
> > > >
> > > > > >
> > > > > > So Marc, sorry - this release is not usable for you yet.
> > > > > >
> > > > > > - Mike
> > > > > >
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > > > I'm trying my first tests of 0.92 on engage-submit, sending 100
> > > > > > > > trivial cat jobs to 10 OSG sites.
> > > > > > > >
> > > > > > > > My jobs seem to be all dying with the error "Found illegal
> > > > > > > > unescaped double-quote" (see below).
> > > > > > > >
> > > > > > > > Has anyone successfully run a Condor-G job on OSG with 0.92?
> > > > > > > >
> > > > > > > > I'll dig deeper and try the same test with the older version of
> > > > > > > > trunk that Marc has been using here with better success. Will
> > > > > > > > also try a single job run and capture a simpler log and the
> > > > > > > > condor-g submit file.
> > > > > > > >
> > > > > > > > Allan, have you tried 0.92 against Condor-G? If not, could you?
> > > > > > > >
> > > > > > > > Sarah, we should add some Condor-G-to-GT2 testing to 0.92
> > > > > > > > validation I think.
> > > > > > >
> > > > > > > - Mike
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > > Caused by:
> > > > > > > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
> > > > > > > > Cannot submit job: Could not submit job (condor_submit reported an exit code of 1). Submitting job(s)
> > > > > > > > Found illegal unescaped double-quote: "" -e /bin/cat -out outdir/f.0065.out -err stderr.txt -i -d outdir -if data.txt -of outdir/f.0065.out -k "" -cdmfile "" -status file -a data.txt
> > > > > > > > The full arguments you specified were:
> > > > > > > > /osg/data/engage/tmp/osg.hpc.ufl.edu/catsn-20110113-0025-vv4p4up3/shared/_swiftwrap cat-ajxnee4k -jobdir a -scratch "" -e /bin/cat -out outdir/f.0065.out -err stderr.txt -i -d outdir -if data.txt -of outdir/f.0065.out -k "" -cdmfile "" -status file -a data.txt
> > > > > > >
> > > > > > >
> > > > > > > Script is:
> > > > > > >
> > > > > > > e$ cat catsn.swift
> > > > > > > type file;
> > > > > > >
> > > > > > > app (file o) cat (file i)
> > > > > > > {
> > > > > > > cat @i stdout=@o;
> > > > > > > }
> > > > > > >
> > > > > > > file out[]<simple_mapper; location="outdir",
> > > > > > > prefix="f.",suffix=".out">;
> > > > > > > foreach j in [1:@toint(@arg("n","1"))] {
> > > > > > > file data<"data.txt">;
> > > > > > > out[j] = cat(data);
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Michael Wilde
> > > > > > > Computation Institute, University of Chicago
> > > > > > > Mathematics and Computer Science Division
> > > > > > > Argonne National Laboratory
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Swift-devel mailing list
> > > > > > > Swift-devel at ci.uchicago.edu
> > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > > >

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



