[Swift-user] trunk-cobalt block task ended prematurely

Mihael Hategan hategan at mcs.anl.gov
Mon Mar 2 17:27:09 CST 2015


It would really be much more useful if you posted the full log.

Anyway, I believe that what you need to do is:
site.cluster.execution.options.workerLoggingLevel = "DEBUG"
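
In nested form, that would look roughly like the sketch below. This assumes
the site is named "cluster" (as the "Host: cluster" line in your error
suggests) and that swift.conf treats dotted keys and nested blocks as
equivalent, as in your app.bgsh block. Everything already in your execution
block stays as it is; only the options entry is added.

    site.cluster {
        execution {
            # ... existing execution settings unchanged ...
            options {
                # Set here rather than via env.* in the app block
                workerLoggingLevel: "DEBUG"
            }
        }
    }

If it takes effect, the qsub line in the swift log should show
WORKER_LOGGING_LEVEL=DEBUG instead of NONE.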

Mihael

On Mon, 2015-03-02 at 16:37 -0600, Ketan Maheshwari wrote:
> The qsub command from the log says:
> 
> qsub -e WORKER_LOGGING_LEVEL=NONE --proccount 32 -n 32 -t 40 --cwd ...
> 
> So the env variables set in swift.conf do not seem to take effect.
> 
> On Mon, Mar 2, 2015 at 4:33 PM, Hategan-Marandiuc, Philip M. <
> hategan at mcs.anl.gov> wrote:
> 
> > Well, we need to figure out why. Since the qsub command line is in the
> > swift log, and the qsub command line should reflect the setting, it
> > would be useful if you posted the swift log.
> >
> > Mihael
> >
> > On Mon, 2015-03-02 at 16:27 -0600, Ketan Maheshwari wrote:
> > > For workerlogs, I am trying:
> > >
> > >  app.bgsh {
> > >         executable: "/home/ketan/SwiftApps/subjobs/bg.sh"
> > >         maxWallTime: "00:04:00"
> > >         env.ENABLE_WORKER_LOGGING="TRUE"
> > >         env.WORKER_LOGGING_LEVEL="DEBUG"
> > >         env.WORKER_LOG_DIR="/home/ketan/workerlogs"
> > >     }
> > >
> > > This does not seem to trigger logging.
> > >
> > > Thanks,
> > > Ketan
> > >
> > > On Mon, Mar 2, 2015 at 4:07 PM, Hategan-Marandiuc, Philip M. <
> > > hategan at mcs.anl.gov> wrote:
> > >
> > > > > I would recommend enabling worker logging to see if we get any info
> > > > > from the worker process. Could be some simple thing, like the wrong
> > > > > IP address.
> > > >
> > > > Mihael
> > > >
> > > > On Mon, 2015-03-02 at 15:47 -0600, Ketan Maheshwari wrote:
> > > > > > I am trying to run on BG/Q with local:cobalt with trunk, but Swift
> > > > > > crashes with the following error:
> > > > >
> > > > > Caused by: Exception in bgsh:
> > > > > >     Arguments: [/home/ketan/SwiftApps/subjobs/mpicatsnsleep/mpicatnap,
> > > > > > /gpfs/mira-home/ketan/SwiftApps/subjobs/mpicatsnsleep/./data.txt,
> > > > > > /gpfs/mira-home/ketan/SwiftApps/subjobs/mpicatsnsleep/./outdir/f.0002.out,
> > > > > > 1]
> > > > >     Host: cluster
> > > > >     Directory: catsnsleepmpi-run001/jobs/b/bgsh-3nq3uc5m
> > > > > exception @ swift-int-staging.k, line: 165
> > > > > Caused by:
> > > > > exception @ swift-int-staging.k, line: 160
> > > > > Caused by: Block task failed: 0302-2109420-000000 Block task ended
> > > > > prematurely
> > > > >
> > > > > > In the log, I see the qsub call being made and a jobid is returned.
> > > > > > However, I could not figure out what causes the task to fail.
> > > > >
> > > > > > One more thing I noticed when translating from the old sites conf to
> > > > > > the new one is that the new conf did not accept the property
> > > > > > "globus:mode = script".
> > > > >
> > > > > A full run log is attached. Thanks for any suggestions.
> > > > >
> > > > > Thanks,
> > > > > Ketan
> > > > > _______________________________________________
> > > > > Swift-user mailing list
> > > > > Swift-user at ci.uchicago.edu
> > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user




