[Swift-devel] Swift 0.93 exception on Fusion

David Kelly davidk at ci.uchicago.edu
Wed Aug 17 14:37:37 CDT 2011


I am seeing the same issue when running directly on login.pads with pbs and coasters. Alberto and Jon both had success running from communicado to PADS using ssh:pbs, but using local:pbs with coasters seems to trigger whatever is causing this.

The files are in ~davidk/temp2.
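
For anyone who wants to check their own runs for the empty .submit file described in the quoted message below, here is a minimal Python sketch. It assumes the scripts live in ~/.globus/scripts and are named PBS*.submit, as in that message. The function name find_empty_submit_files is made up for illustration and is not part of Swift or CoG.

```python
from pathlib import Path


def find_empty_submit_files(scripts_dir):
    """Return the zero-length PBS*.submit files in scripts_dir, sorted by name."""
    scripts = Path(scripts_dir).expanduser()
    return sorted(
        p for p in scripts.glob("PBS*.submit")
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Assumed default location, taken from the quoted message below.
    for path in find_empty_submit_files("~/.globus/scripts"):
        print(path)
```

An empty result only means no zero-length submit script was left behind. It does not rule out other causes of the exception.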

David

----- Original Message -----
> From: "David Kelly" <davidk at ci.uchicago.edu>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: swift-devel at ci.uchicago.edu
> Sent: Wednesday, August 17, 2011 12:28:31 PM
> Subject: Re: [Swift-devel] Swift 0.93 exception on Fusion
> Two .submit files get created in .globus/scripts:
> PBS8380254050153377753.submit and PBS5442733515255786709.submit.
> 
> PBS5442733515255786709.submit is empty. I'm guessing this is where the
> problem is.
> 
> PBS8380254050153377753.submit looks ok. There is nothing out of the
> ordinary in the stderr and stdout files for this. I can resubmit this
> using qsub and it gets queued without any immediate errors.
> 
> I have attached a tar file of the PBS files.
> 
> David
> 
> ----- Original Message -----
> > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > To: "David Kelly" <davidk at ci.uchicago.edu>
> > Cc: swift-devel at ci.uchicago.edu
> > Sent: Wednesday, August 17, 2011 11:43:38 AM
> > Subject: Re: [Swift-devel] Swift 0.93 exception on Fusion
> > David, one possibility here is that (automatic) coasters is generating
> > an additional job with invalid PBS attributes.
> >
> > Can you turn on debug=true in etc/pbs.properties and take a look to
> > see if that's the case?
> >
> > The NPE is an issue in the pbs/localsched provider, but perhaps caused
> > by an unexpected problem or state in the pbs jobs.
> >
> > - Mike
> >
> >
> > ----- Original Message -----
> > > From: "David Kelly" <davidk at ci.uchicago.edu>
> > > To: swift-devel at ci.uchicago.edu
> > > Sent: Wednesday, August 17, 2011 11:34:06 AM
> > > Subject: [Swift-devel] Swift 0.93 exception on Fusion
> > > Hello,
> > >
> > > When testing 0.93 on Fusion, Swift throws an exception. I am running
> > > with the catsn script. It runs, creates the output, but then gives
> > > this error when cleaning up:
> > >
> > > Final status: time: Wed, 17 Aug 2011 11:17:38 -0500 Finished successfully:10
> > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot submit job
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.submit(AbstractJobSubmissionTaskHandler.java:67)
> > >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.submit(AbstractTaskHandler.java:45)
> > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.submit(ExecutionTaskHandler.java:57)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.LocalQueueProcessor.run(LocalQueueProcessor.java:40)
> > > Caused by: java.lang.NullPointerException
> > >     at org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor.makeName(PBSExecutor.java:304)
> > >     at org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor.writeScript(PBSExecutor.java:205)
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.buildCommandLine(AbstractExecutor.java:169)
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.start(AbstractExecutor.java:89)
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.submit(AbstractJobSubmissionTaskHandler.java:53)
> > >     ... 3 more
> > >
> > > The config and log files are attached. They can also be found on
> > > fusion in ~davidk/temp. This is filed in bugzilla as bug #515.
> > >
> > > David
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
> 


