[Swift-devel] MODIS freezes on Midway
David Kelly
davidk at ci.uchicago.edu
Wed Mar 20 14:05:56 CDT 2013
Yadu,
As a test, I just untarred swiftdemo.v04.tgz, removed the SBATCH_RESERVATION line from setup.sh and was able to run on midway. Send me a message on Skype when you have a few minutes and we can take a closer look at this.
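For anyone repeating this test, the change described above can be sketched roughly as follows. This is only an illustration: the helper name disable_reservation is made up here, and it assumes setup.sh contains an "export SBATCH_RESERVATION=..." line like the one quoted later in this thread.

```shell
# Hypothetical helper (not part of swiftdemo): comment out the reservation
# export in a setup script and clear the variable from the current shell.
disable_reservation() {
    script="${1:-setup.sh}"   # path to the setup script; defaults to ./setup.sh

    if [ ! -f "$script" ]; then
        echo "disable_reservation: $script not found" >&2
        return 1
    fi

    # Prefix any 'export SBATCH_RESERVATION=...' line with '#'.
    # A backup copy is kept as "$script.bak".
    sed -i.bak 's/^[[:space:]]*export SBATCH_RESERVATION=/#&/' "$script"

    # sbatch reads SBATCH_RESERVATION from the environment, so also clear it
    # from the shell that will launch the run.
    unset SBATCH_RESERVATION
}
```

Define the function in the shell you run swift from (running it as a separate script would not change that shell's environment), call "disable_reservation ./setup.sh", and then re-run the test.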
David
----- Original Message -----
> From: "Yadu Nand" <yadudoc1729 at gmail.com>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "swift-devel" <swift-devel at ci.uchicago.edu>
> Sent: Wednesday, March 20, 2013 12:19:01 PM
> Subject: Re: [Swift-devel] MODIS freezes on Midway
> Hi everyone,
> David, please find the submit files attached to the mail.
> I am running the 5 different variants of modis (local, midway, beagle, uc3, multiple)
> from the test system we have. I am not setting the SBATCH_RESERVATION variable
> in the setup scripts.
> Ketan, for modis_local and modis_midway, I am setting <filesystem provider="local"/>
> and, interestingly, modis_local now works fine from the stress test apps group; the rest
> still fail, though. I think there is a different issue here now.
> It looks like most of the failures I'm seeing now are from perl and tc.data issues.
> I've attached the modis.stdout from the 5 test cases, if you'd like to take a look.
> The tc.data supplied is the same as the one that came with swiftdemo.
> -Yadu
> On Wed, Mar 20, 2013 at 5:26 AM, Michael Wilde < wilde at mcs.anl.gov >
> wrote:
> > Likely not needed if it's using provider staging.
>
> > > ----- Original Message -----
> > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >
> > > > To: "David Kelly" < davidk at ci.uchicago.edu >
> > > > Cc: "swift-devel" < swift-devel at ci.uchicago.edu >
> > > > Sent: Tuesday, March 19, 2013 6:04:44 PM
> > > > Subject: Re: [Swift-devel] MODIS freezes on Midway
> > > >
> > > > In addition to what David mentioned, from the logs it seems that your
> > > > sites file is missing this line:
> > > >
> > > > <filesystem provider="local"/>
> > > >
> > > > Or David, correct me if this is not required in this configuration.
> > > >
> > > > On Tue, Mar 19, 2013 at 6:00 PM, David Kelly < davidk at ci.uchicago.edu > wrote:
> > > >
> > > > Yadu,
> > > >
> > > > The setup.sh script sets this environment variable:
> > > >
> > > > export SBATCH_RESERVATION=osg
> > > >
> > > > I believe sbatch is picking up on this and trying to run with this
> > > > reservation, which is likely expired. Can you try unsetting
> > > > SBATCH_RESERVATION, commenting out that line in setup.sh, and trying
> > > > again?
> > > >
> > > > Thanks,
> > > > David
> > > >
> > > > From: "Yadu Nand" < yadudoc1729 at gmail.com >
> > > > To: "swift-devel" < swift-devel at ci.uchicago.edu >
> > > > Sent: Tuesday, March 19, 2013 5:35:41 PM
> > > > Subject: [Swift-devel] MODIS freezes on Midway
> > > >
> > > > Hi,
> > > >
> > > > I've been running the modis tests on Midway, from the demo that Mike
> > > > had shared:
> > > >
> > > > /home/wilde/osgdemo/modis/svn/swiftdemo.v04.tgz
> > > >
> > > > In the logs (please see attachment) I see a failure message:
> > > > "sbatch: error: Batch job submission failed: Requested reservation is invalid"
> > > >
> > > > The fact that no error messages are shown on stdout doesn't help;
> > > > on top of that, swift just seems to hang forever. * Please help! *
> > > >
> > > > I see test.midway show no progress, with just the same status for
> > > > about 20 mins:
> > > >
> > > > Progress: time: Tue, 19 Mar 2013 20:21:18 +0000 Selecting site:35 Submitted:65
> > > >
> > > > After this, I tried to kill it with Ctrl+C, and then I got a few error
> > > > messages:
> > > >
> > > > Failed to shut down block: Block 0319-5807480-000000 (16x3540.000s)
> > > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> > > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:196)
> > > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> > > >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> > > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:106)
> > > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:95)
> > > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
> > > >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:332)
> > > >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:312)
> > > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.shutdownBlocks(BlockQueueProcessor.java:800)
> > > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.shutdown(BlockQueueProcessor.java:789)
> > > >     at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.shutdown(JobQueue.java:119)
> > > >     at org.globus.cog.abstraction.coaster.service.CoasterService.shutdown(CoasterService.java:271)
> > > >     at org.globus.cog.abstraction.coaster.service.ServiceShutdownHandler.requestComplete(ServiceShutdownHandler.java:28)
> > > >     at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.receiveCompleted(RequestHandler.java:88)
> > > >     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleRequest(AbstractKarajanChannel.java:519)
> > > >     at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel.actualSend(AbstractPipedChannel.java:86)
> > > >     at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel$Sender.run(AbstractPipedChannel.java:115)
> > >
>
> > > --
>
> > > Yadu Nand B
>
> > > _______________________________________________
>
> > > Swift-devel mailing list
>
> > > Swift-devel at ci.uchicago.edu
>
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
> > >
>
> > >
>
> > > --
>
> > > Ketan
>
> > >
>
> > >
>
>
> --
> Yadu Nand B