[Swift-devel] MODIS freezes on Midway

Yadu Nand yadudoc1729 at gmail.com
Wed Mar 20 14:46:03 CDT 2013


Quick update.

The sites.xml file for modis.midway was corrupted while being copied. With
that fixed, the local and midway tests now run fine from the test suite.
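
Since a bad copy of sites.xml was the culprit here, a quick sanity check
before launching the tests might save some time. Below is a minimal Python
sketch, not part of the test suite. It only verifies that the file parses as
XML and reports any site entry lacking a <filesystem> element (the line Ketan
mentions further down; per Mike it may not be needed with provider staging).
The function name check_sites and the assumption that sites are declared as
<pool handle="..."> elements are mine, so adjust them to the actual file.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}pool' down to 'pool'."""
    return tag.rsplit("}", 1)[-1]


def check_sites(xml_text):
    """Return a list of problems found in a sites file given as a string.

    An empty list means the text parsed and every pool had a filesystem
    element. Assumes (hypothetically) that sites are <pool handle="...">.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # A truncated or mangled copy usually fails right here.
        return ["not well-formed XML: %s" % err]

    problems = []
    pools = [el for el in root.iter() if local_name(el.tag) == "pool"]
    if not pools:
        problems.append("no <pool> elements found")
    for pool in pools:
        children = {local_name(child.tag) for child in pool}
        if "filesystem" not in children:
            handle = pool.get("handle", "(no handle)")
            problems.append("pool %s has no <filesystem> element" % handle)
    return problems
```

To use it, read the file and pass the contents in, for example
check_sites(open("sites.xml").read()), and treat a non-empty result as a
reason to re-copy the file before running anything.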

I'm seeing test.beagle fail because there is no directory under my name on
Lustre. It fails with:

    Could not submit job
    Could not start coaster service
    Task ended before registration was received. Failed to download bootstrap jar.
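
Because failures like these tend to land in the logs and not always on
stdout, a small scan for known failure strings can help triage a run. Here is
a minimal Python sketch. The marker list is just the set of messages quoted in
this thread, and find_failures is a hypothetical helper name, not something
Swift provides.

```python
# Failure messages quoted in this thread; extend as new ones turn up.
MARKERS = (
    "Could not submit job",
    "Could not start coaster service",
    "Failed to download bootstrap jar",
    "Batch job submission failed",
)


def find_failures(log_text, markers=MARKERS):
    """Return (line_number, line) pairs for lines containing a known marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits
```

Running it over the contents of a run's log file gives the line numbers to
jump to. Plain substring matching is deliberate here, since the exact log
layout is not something I want to assume.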

On test.uc3, I can see jobs being submitted and completing, but it looks
like an exception thrown in Perl is halting the test.

Sorry about the formatting, I'm mailing from my phone.

-yadu
On Mar 21, 2013 12:43 AM, "Michael Wilde" <wilde at mcs.anl.gov> wrote:

> Also the code is now in svn:
>
>   https://svn.ci.uchicago.edu/svn/vdl2/SwiftTutorials/OSG_2013-03-11/MODIS
>
> - Mike
>
> ----- Original Message -----
> > From: "David Kelly" <davidk at ci.uchicago.edu>
> > To: "Yadu Nand" <yadudoc1729 at gmail.com>
> > Cc: "swift-devel" <swift-devel at ci.uchicago.edu>, "Michael Wilde" <wilde at mcs.anl.gov>
> > Sent: Wednesday, March 20, 2013 2:05:56 PM
> > Subject: Re: [Swift-devel] MODIS freezes on Midway
> >
> >
> > Yadu,
> >
> >
> > As a test, I just untarred swiftdemo.v04.tgz, removed the
> > SBATCH_RESERVATION line from setup.sh and was able to run on midway.
> > Send me a message on Skype when you have a few minutes and we can
> > take a closer look at this.
> >
> > David
> >
> >
> > ----- Original Message -----
> >
> >
> > From: "Yadu Nand" <yadudoc1729 at gmail.com>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "swift-devel" <swift-devel at ci.uchicago.edu>
> > Sent: Wednesday, March 20, 2013 12:19:01 PM
> > Subject: Re: [Swift-devel] MODIS freezes on Midway
> >
> > Hi everyone,
> >
> >
> > David, please find the submit files attached to the mail.
> > I am running the five variants of modis (local, midway, beagle, uc3,
> > multiple) from the test system we have. I am not setting the
> > SBATCH_RESERVATION variable in the setup scripts.
> >
> >
> > Ketan, for modis_local and modis_midway, I am setting
> > <filesystem provider="local"/>, and interestingly modis_local now works
> > fine from the stress test apps group. The rest still fail, though, so I
> > think there is a different issue here now.
> >
> >
> > It looks like most of the failures I'm seeing now come from Perl and
> > tc.data issues. I've attached the modis.stdout from the 5 test cases, if
> > you'd like to take a look. The tc.data supplied is the same as the one
> > that came with swiftdemo.
> >
> >
> > -Yadu
> >
> >
> >
> >
> > On Wed, Mar 20, 2013 at 5:26 AM, Michael Wilde < wilde at mcs.anl.gov >
> > wrote:
> >
> >
> > Likely not needed if it's using provider staging.
> >
> >
> > ----- Original Message -----
> > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >
> > > To: "David Kelly" < davidk at ci.uchicago.edu >
> > > Cc: "swift-devel" < swift-devel at ci.uchicago.edu >
> > > Sent: Tuesday, March 19, 2013 6:04:44 PM
> > > Subject: Re: [Swift-devel] MODIS freezes on Midway
> > >
> > >
> > >
> > > In addition to what David mentioned, from logs, it seems that your
> > > sites file is missing this line:
> > >
> > >
> > > <filesystem provider="local"/>
> > >
> > >
> > >
> > > Or David, correct me if this is not required in this configuration.
> > >
> > >
> > >
> > > On Tue, Mar 19, 2013 at 6:00 PM, David Kelly < davidk at ci.uchicago.edu >
> > > wrote:
> > >
> > >
> > >
> > >
> > > Yadu,
> > >
> > >
> > > The setup.sh script sets this environment variable:
> > >
> > >
> > > export SBATCH_RESERVATION=osg
> > >
> > >
> > > I believe sbatch is picking up on this and trying to run with this
> > > reservation, which is likely expired. Can you try unsetting
> > > SBATCH_RESERVATION, commenting out that line in setup.sh and trying
> > > again?
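
The suggestion above can be sketched as follows, in Python purely for
illustration. One helper comments out the export line in the text of a setup
script, the other builds an environment mapping without the variable for a
child process. Both helper names are hypothetical. This does not change the
shell you are already in; there, a plain "unset SBATCH_RESERVATION" is the
direct equivalent.

```python
import re


def comment_out_export(script_text, name="SBATCH_RESERVATION"):
    """Prefix every 'export NAME=...' line in a shell script with '#'."""
    pattern = re.compile(
        r"^([ \t]*export[ \t]+%s=.*)$" % re.escape(name), re.MULTILINE
    )
    return pattern.sub(r"#\1", script_text)


def env_without(name, environ):
    """Return a copy of an environment mapping with NAME removed."""
    return {key: value for key, value in environ.items() if key != name}
```

The result of env_without("SBATCH_RESERVATION", os.environ) can be passed as
the env= argument of subprocess.run when launching a test, so that sbatch
never sees the stale reservation.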
> > >
> > >
> > > Thanks,
> > > David
> > >
> > >
> > >
> > >
> >
> >
> > > From: "Yadu Nand" < yadudoc1729 at gmail.com >
> > > To: "swift-devel" < swift-devel at ci.uchicago.edu >
> > > Sent: Tuesday, March 19, 2013 5:35:41 PM
> > > Subject: [Swift-devel] MODIS freezes on Midway
> > >
> > >
> > >
> > >
> > > Hi,
> > >
> > > I've been running the modis tests on Midway, from the demo that Mike
> > > had shared:
> > >
> > > /home/wilde/osgdemo/modis/svn/swiftdemo.v04.tgz
> > >
> > >
> > > In the logs (please see attachment) I see a failure message: "sbatch:
> > > error: Batch job submission failed: Requested reservation is invalid"
> > >
> > >
> > > No error messages are shown on stdout, which doesn't help, and swift
> > > just seems to hang forever. Please help!
> > >
> > >
> > > test.midway shows no progress, sitting at the same status for about
> > > 20 minutes:
> > >
> > > Progress: time: Tue, 19 Mar 2013 20:21:18 +0000 Selecting site:35 Submitted:65
> > >
> > >
> > > After this, I tried to kill it with Ctrl+C and got a few error
> > > messages:
> > >
> > > Failed to shut down block: Block 0319-5807480-000000 (16x3540.000s)
> > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:196)
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> > >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:106)
> > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:95)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:332)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:312)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.shutdownBlocks(BlockQueueProcessor.java:800)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.shutdown(BlockQueueProcessor.java:789)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.shutdown(JobQueue.java:119)
> > >     at org.globus.cog.abstraction.coaster.service.CoasterService.shutdown(CoasterService.java:271)
> > >     at org.globus.cog.abstraction.coaster.service.ServiceShutdownHandler.requestComplete(ServiceShutdownHandler.java:28)
> > >     at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.receiveCompleted(RequestHandler.java:88)
> > >     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleRequest(AbstractKarajanChannel.java:519)
> > >     at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel.actualSend(AbstractPipedChannel.java:86)
> > >     at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel$Sender.run(AbstractPipedChannel.java:115)
> > >
> > > --
> > > Yadu Nand B
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > --
> > > Ketan
> > >
> > >
> > >
> >
> >
> >
> >
> > --
> > Yadu Nand B
> >
> >
>

