[Swift-user] trunk-cobalt block task ended prematurely

Ketan Maheshwari ketan at mcs.anl.gov
Mon Mar 2 16:30:17 CST 2015


BG/Q accepts job descriptions only in the form of a qsub command, which is
why no script gets generated. I updated the provider from the old cqsub and
its options to the new qsub and its options. However, I am not sure why
script mode is not working.
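
For anyone who wants to check a logged qsub command line by hand, here is a
minimal sketch in Python. It only tests whether the command asks for script
mode. The function name has_script_mode and the two sample command lines are
made up for illustration and are not taken from the run log. Accepting the
"--mode=script" spelling in addition to "--mode script" is an assumption.

```python
import shlex


def has_script_mode(command: str) -> bool:
    """Return True if the qsub command line requests script mode."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        # Assumed alternative spelling: option and value joined by "=".
        if arg == "--mode=script":
            return True
        # Spelling seen in this thread: option and value as separate tokens.
        if arg == "--mode" and i + 1 < len(args) and args[i + 1] == "script":
            return True
    return False


# Hypothetical command lines, for illustration only.
with_mode = "qsub -n 512 -t 30 --mode script ./worker.sh"
without_mode = "qsub -n 512 -t 30 ./worker.sh"

print(has_script_mode(with_mode))
print(has_script_mode(without_mode))
```

Running this against the qsub line in the Swift log should show quickly
whether the provider dropped the mode option.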

--Ketan

On Mon, Mar 2, 2015 at 4:27 PM, Hategan-Marandiuc, Philip M. <
hategan at mcs.anl.gov> wrote:

> I looked at the Cobalt script in the run directory you sent and it was
> empty. When I look at the trunk code for the Cobalt provider, I don't
> see anything that would generate a script, so I'm not sure what's
> happening there. I vaguely remember that somebody wrote a patch for the
> Cobalt provider to use script mode, but I'm not sure where that ended
> up.
>
> Mihael
>
> On Mon, 2015-03-02 at 16:14 -0600, Ketan Maheshwari wrote:
> > This is the first time I am trying with 0.96.
> >
> > The generated qsub command indeed does not have "--mode script", which
> > seems to be causing the issue.
> >
> > Thanks,
> > Ketan
> >
> > On Mon, Mar 2, 2015 at 4:02 PM, Michael Wilde <wilde at anl.gov> wrote:
> >
> > >  Is this the first time you've tried running 0.96 on the BG/Q in subjob
> > > mode?
> > > (I.e., has this ever worked before?)
> > >
> > > Did you get a submission script in the run directory (or a log of the
> > > cobalt qsub or cqsub command) which you could test manually?
> > >
> > > > If 0.96 is rejecting the "script" property, it seems possible that 0.96
> > > > is generating an invalid qsub command and/or submission script.
> > >
> > > - Mike
> > >
> > >
> > >
> > > On 3/2/15 3:47 PM, Ketan Maheshwari wrote:
> > >
> > > > I am trying to run on BG/Q with local:cobalt using trunk, but Swift
> > > > crashes with the following error:
> > >
> > >  Caused by: Exception in bgsh:
> > >     Arguments: [/home/ketan/SwiftApps/subjobs/mpicatsnsleep/mpicatnap,
> > > /gpfs/mira-home/ketan/SwiftApps/subjobs/mpicatsnsleep/./data.txt,
> > >
> /gpfs/mira-home/ketan/SwiftApps/subjobs/mpicatsnsleep/./outdir/f.0002.out,
> > > 1]
> > >     Host: cluster
> > >     Directory: catsnsleepmpi-run001/jobs/b/bgsh-3nq3uc5m
> > >  exception @ swift-int-staging.k, line: 165
> > > Caused by:
> > >  exception @ swift-int-staging.k, line: 160
> > > Caused by: Block task failed: 0302-2109420-000000 Block task ended
> > > prematurely
> > >
> > > >  In the log, I see the qsub call being made and a job ID returned.
> > > > However, I could not figure out what causes the task to fail.
> > >
> > > >  One more thing I noticed when translating from the old sites conf to
> > > > the new one is that the new conf did not accept the property
> > > > "globus:mode = script".
> > >
> > >  A full run log is attached. Thanks for any suggestions.
> > >
> > >  Thanks,
> > > Ketan
> > >
> > >
> > > > _______________________________________________
> > > > Swift-user mailing list
> > > > Swift-user at ci.uchicago.edu
> > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> > >
> > >
> > > --
> > > Michael Wilde
> > > Mathematics and Computer Science          Computation Institute
> > > Argonne National Laboratory               The University of Chicago
> > >
> > >