[Swift-user] SGE swift job

Altaweel, Mark m.altaweel at ucl.ac.uk
Fri May 1 14:31:12 CDT 2015


Hi,

Further to the message below, I received this from our cluster administrator:

There isn't a parallel environment called 1way on Legion. Ian spent some time modifying the source code for Swift 0.94.1 to get it to submit jobs on Legion, but then found that those modifications no longer work for the later versions.

Here's what he had to do for the earlier version:

Modify the code in swift-0.94/cog/modules/provider-localscheduler/src/org/globus/cog/abstraction/impl/scheduler/sge/SGEExecutor.java:

   * find the function private void verifyQueueInformation() and delete the contents, replacing with just return;
   * delete the line: writeAttr("queue", "-q ", wr);

and then, because it expects to be able to inspect the queue settings itself to determine cores per node and jobs per node, I ham-fistedly changed those to be hard-coded to 1 in the same file, changing:

       String queue = (String)spec.getAttribute("queue");

       int coresPerNode = Integer.valueOf(getAttribute(spec, "coresPerNode",
           String.valueOf(poller.getQueueInformation(queue).getSlots())));
       int jobsPerNode = Integer.valueOf(getAttribute(spec, "jobsPerNode",
           String.valueOf(coresPerNode)));
       int coresToRequest = ( count * jobsPerNode + coresPerNode - 1) / coresPerNode * coresPerNode;

to

       //String queue = (String)spec.getAttribute("queue");

       int coresPerNode = 1;
       int jobsPerNode = 1;
       int coresToRequest = 1;
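
For reference, the original expression rounds count * jobsPerNode up to the next whole multiple of coresPerNode, using integer division. The sketch below reproduces only that arithmetic in isolation. The class name CoresToRequest and the sample values are hypothetical and are not taken from the Swift source.

```java
public class CoresToRequest {
    // Round count * jobsPerNode up to the next multiple of coresPerNode,
    // using the same integer arithmetic as the original expression.
    static int coresToRequest(int count, int jobsPerNode, int coresPerNode) {
        return (count * jobsPerNode + coresPerNode - 1) / coresPerNode * coresPerNode;
    }

    public static void main(String[] args) {
        // With everything set to 1 the result matches the hard-coded value.
        System.out.println(coresToRequest(1, 1, 1));   // 1
        // 12 requested slots fit exactly in one 12-core node.
        System.out.println(coresToRequest(3, 4, 12));  // 12
        // 16 requested slots are rounded up to two 12-core nodes.
        System.out.println(coresToRequest(4, 4, 12));  // 24
    }
}
```

Note that the hard-coded version no longer depends on count, so it would presumably request a single core even when count is greater than 1.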

In cog/modules/provider-localscheduler/src/org/globus/cog/abstraction/impl/scheduler/sge/SGEExecutor.java:

change:

       writeWallTime(wr);
       writeSoftWallTime(wr);

       if (spec.getStdInput() != null) {
           wr.write("#$ -i " + quote(spec.getStdInput()) + '\n');
       }
       wr.write("#$ -o " + quote(stdout) + '\n');
       wr.write("#$ -e " + quote(stderr) + '\n');

       if (!spec.getEnvironmentVariableNames().isEmpty()) {

to:

       writeWallTime(wr);
       writeSoftWallTime(wr);

       if (spec.getStdInput() != null) {
           wr.write("#$ -i " + quote(spec.getStdInput()) + '\n');
       }
       wr.write("#$ -o " + quote(stdout) + '\n');
       wr.write("#$ -e " + quote(stderr) + '\n');
       wr.write("#$ -jsv /shared/ucl/apps/sge_support/clean_variables_from_jobenv.jsv\n");

       if (!spec.getEnvironmentVariableNames().isEmpty()) {

As the above suggests, I've put that JSV script in /shared/ucl/apps/sge_support in case it's needed for anything else. The bug which makes it useful has been fixed in the version of SoGE we're using after the upgrade, so it shouldn't be necessary for too long. (That part is to do with stripping out variables containing % characters).
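
The JSV script itself is not shown in this thread. Purely to illustrate the idea of dropping variables that contain % characters, here is a minimal Java sketch. It assumes that a variable is dropped when either its name or its value contains a %, which may not be what the real script does. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StripPercentVars {
    // Return a copy of env without entries whose name or value contains '%'.
    // Assumption: the real JSV may apply a different rule.
    static Map<String, String> strip(Map<String, String> env) {
        Map<String, String> cleaned = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            boolean hasPercent = e.getKey().indexOf('%') >= 0
                    || e.getValue().indexOf('%') >= 0;
            if (!hasPercent) {
                cleaned.put(e.getKey(), e.getValue());
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("HOME", "/home/user");
        env.put("PROMPT", "%n@%m");
        // Only HOME survives, because PROMPT's value contains '%'.
        System.out.println(strip(env).keySet()); // [HOME]
    }
}
```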

On May 1, 2015, at 4:27 PM, Altaweel, Mark <tcrnma3 at live.ucl.ac.uk> wrote:

Hi,

I am trying to use Swift (swift-0.96-sge-mod) to run a script on SGE on our local cluster. Is there any clear reason for the error below?

My swift.conf setup is:

site.Legion {
        execution {
                type: "coaster"
                jobManager: "local:sge"
                URL : "localhost"
                options {
                        maxJobs: 2
                        nodeGranularity: 1
                        maxNodesPerJob: 2
                        tasksPerNode: 1
                        jobProject: "AllUsers"
                        jobQueue: "Tarvek"
                        maxJobTime: "1800"

                }
        }
        maxParallelTasks : 3
        initialParallelTasks : 2
        staging: local
        workDirectory: "/tmp/"${env.USER}
        app.ALL {
                executable: "*"
                maxWallTime: "00:05:00"
        }
}


I get the following error:

RunID: run009
Warning: The @ syntax for function invocation is deprecated
Warning: Variable spans, defined on line 52, might have multiple conflicting writers
Progress: Fri, 01 May 2015 16:19:29+0100
Number of parameter combinations: 2
Stride: 1
Begin: 1, End: 1
Begin: 2, End: 2
Progress: Fri, 01 May 2015 16:19:30+0100  Submitted:2
Error: No parallel environment specified

Could not submit job (qsub reported an exit code of 1).
Unable to run job: job rejected: the requested parallel environment "1way" does not exist.Exiting.

Execution failed:
Exception in sh:
    Arguments: [repast_instance.sh, /imports/home1/tcrnma3/Scratch/UrbanModel/, 1, 1, 1, urf_2.txt]
    Host: Legion
    Directory: repast-run009/jobs/9/sh-9xg6568m
exception @ swift-int-staging.k, line: 174
Caused by:
exception @ swift-int-staging.k, line: 170
Caused by: Block task failed: Error submitting block task
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not submit job (qsub reported an exit code of 1).
Unable to run job: job rejected: the requested parallel environment "1way" does not exist.Exiting.

at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.submit(AbstractJobSubmissionTaskHandler.java:62)
at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.submit(AbstractTaskHandler.java:45)
at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.submit(ExecutionTaskHandler.java:61)
at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.run(BlockTaskSubmitter.java:70)
Caused by: org.globus.cog.abstraction.impl.scheduler.common.ProcessException: Could not submit job (qsub reported an exit code of 1).
Unable to run job: job rejected: the requested parallel environment "1way" does not exist.Exiting.

at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.start(AbstractExecutor.java:116)
at org.globus.cog.abstraction.impl.scheduler.sge.SGEExecutor.start(SGEExecutor.java:192)
at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.submit(AbstractJobSubmissionTaskHandler.java:52)
... 3 more


