[Swift-devel] MODIS freezes on Midway

Michael Wilde wilde at mcs.anl.gov
Wed Mar 20 15:05:14 CDT 2013


Yadu, I don't think the beagle test requires you to have any directories on beagle.

But it does expect that you have set up your ssh keys so that from the midway login host you can do a password-less ssh to beagle.  Test that manually before trying test.beagle.
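
A minimal sketch of that manual check, assuming an OpenSSH client. The function name check_passwordless_ssh is illustrative and not part of the test suite. BatchMode=yes makes ssh fail instead of prompting when key authentication is not set up, so a missing key shows up as a non-zero exit status rather than a hang:

```shell
# check_passwordless_ssh HOST
# Succeeds only if ssh can log in to HOST without prompting for anything.
# (Hypothetical helper name; HOST is whatever name or alias you use for beagle.)
check_passwordless_ssh() {
    host="$1"
    # BatchMode=yes disables password and passphrase prompts.
    # ConnectTimeout=10 keeps an unreachable host from stalling the check.
    # The remote command "true" does nothing and exits 0 on a successful login.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true
}
```

For example, run `check_passwordless_ssh beagle && echo ok` from the midway login host, substituting the host name you normally use for beagle.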

- Mike

----- Original Message -----
> From: "Yadu Nand" <yadudoc1729 at gmail.com>
> To: "Michael Wilde" <wilde at mcs.anl.gov>
> Cc: "David Kelly" <davidk at ci.uchicago.edu>, "swift-devel" <swift-devel at ci.uchicago.edu>
> Sent: Wednesday, March 20, 2013 2:46:03 PM
> Subject: Re: [Swift-devel] MODIS freezes on Midway
> 
> 
> 
> Quick update.
> 
> The sites.xml file for modis.midway was messed up while copying. Now
> the tests for local and midway are running fine from the test suite.
> 
> I'm seeing test.beagle fail because there isn't a folder in my
> name on Lustre. It fails with:
>   Could not submit job
>   Could not start coaster service
>   Task ended before registration was received. Failed to download bootstrap jar.
> 
> On test.uc3, I can see jobs getting submitted and completed, but it
> looks like an exception is thrown in perl, halting the test.
> 
> 
> Sorry about the formatting, I'm mailing from my phone.
> 
> -yadu
> On Mar 21, 2013 12:43 AM, "Michael Wilde" < wilde at mcs.anl.gov >
> wrote:
> 
> 
> Also the code is now in svn:
> 
> https://svn.ci.uchicago.edu/svn/vdl2/SwiftTutorials/OSG_2013-03-11/MODIS
> 
> - Mike
> 
> ----- Original Message -----
> > From: "David Kelly" < davidk at ci.uchicago.edu >
> > To: "Yadu Nand" < yadudoc1729 at gmail.com >
> > Cc: "swift-devel" < swift-devel at ci.uchicago.edu >, "Michael Wilde"
> > < wilde at mcs.anl.gov >
> > Sent: Wednesday, March 20, 2013 2:05:56 PM
> > Subject: Re: [Swift-devel] MODIS freezes on Midway
> > 
> > 
> > Yadu,
> > 
> > 
> > As a test, I just untarred swiftdemo.v04.tgz, removed the
> > SBATCH_RESERVATION line from setup.sh and was able to run on
> > midway.
> > Send me a message on Skype when you have a few minutes and we can
> > take a closer look at this.
> > 
> > David
> > 
> > 
> > ----- Original Message -----
> > 
> > 
> > From: "Yadu Nand" < yadudoc1729 at gmail.com >
> > To: "Michael Wilde" < wilde at mcs.anl.gov >
> > Cc: "swift-devel" < swift-devel at ci.uchicago.edu >
> > Sent: Wednesday, March 20, 2013 12:19:01 PM
> > Subject: Re: [Swift-devel] MODIS freezes on Midway
> > 
> > Hi everyone,
> > 
> > 
> > David, please find the submit files attached to the mail.
> > I am running the 5 different variants of modis
> > (local, midway, beagle, uc3, multiple) from the test system we have.
> > I am not setting the SBATCH_RESERVATION variable in the setup scripts.
> > 
> > 
> > Ketan, for modis_local and modis_midway, I am setting
> > <filesystem provider="local"/>, and interestingly, modis_local works
> > fine from the stress test apps group now; the rest fail, though.
> > I think there is a different issue here now.
> > 
> > 
> > It looks like most failures I'm seeing now are from perl and tc.data
> > issues. I've attached the modis.stdout from the 5 test cases, if you'd
> > like to take a look. The tc.data supplied is the same as the one that
> > came with swiftdemo.
> > 
> > 
> > -Yadu
> > 
> > 
> > 
> > 
> > On Wed, Mar 20, 2013 at 5:26 AM, Michael Wilde < wilde at mcs.anl.gov > wrote:
> > 
> > 
> > Likely not needed if it's using provider staging.
> > 
> > 
> > ----- Original Message -----
> > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >
> > > To: "David Kelly" < davidk at ci.uchicago.edu >
> > > Cc: "swift-devel" < swift-devel at ci.uchicago.edu >
> > > Sent: Tuesday, March 19, 2013 6:04:44 PM
> > > Subject: Re: [Swift-devel] MODIS freezes on Midway
> > > 
> > > 
> > > 
> > > In addition to what David mentioned, from the logs it seems that
> > > your sites file is missing this line:
> > > 
> > > 
> > > <filesystem provider="local"/>
> > > 
> > > 
> > > 
> > > Or David, correct me if this is not required in this
> > > configuration.
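
A quick way to see whether a sites file already contains that element is a plain text match. This is only a sketch: the function name has_local_filesystem and the file name in the usage line are illustrative, and grep matches text rather than parsing the XML, so it would also match an element that sits inside a comment.

```shell
# has_local_filesystem SITES_FILE
# Succeeds if SITES_FILE contains a <filesystem provider="local" ...> element.
# (Hypothetical helper; a text match, not a real XML parse.)
has_local_filesystem() {
    grep -q '<filesystem[[:space:]]\{1,\}provider="local"' "$1"
}
```

For example: `has_local_filesystem sites.xml || echo "no local filesystem provider declared"`.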
> > > 
> > > 
> > > 
> > > On Tue, Mar 19, 2013 at 6:00 PM, David Kelly < davidk at ci.uchicago.edu > wrote:
> > > 
> > > 
> > > 
> > > 
> > > Yadu,
> > > 
> > > 
> > > The setup.sh script sets this environment variable:
> > > 
> > > 
> > > export SBATCH_RESERVATION=osg
> > > 
> > > 
> > > I believe sbatch is picking up on this and trying to run with this
> > > reservation, which is likely expired. Can you try unsetting
> > > SBATCH_RESERVATION, commenting out that line in setup.sh and trying
> > > again?
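
The two steps suggested above could be sketched as a small shell function. The name disable_reservation is illustrative, and the sed pattern assumes the line in setup.sh has the form `export SBATCH_RESERVATION=...` as quoted earlier in this message. The function has to run in the current shell (not a subshell) for the unset to take effect.

```shell
# disable_reservation SETUP_FILE
# Comments out "export SBATCH_RESERVATION=..." in SETUP_FILE and removes the
# variable from the current shell. (Hypothetical helper name.)
disable_reservation() {
    setup_file="$1"
    # -i.bak edits in place and keeps the original as SETUP_FILE.bak.
    # Only lines that start with optional whitespace followed by the export are
    # prefixed with '#', so lines that are already commented out are left alone.
    sed -i.bak 's/^\([[:space:]]*export[[:space:]]\{1,\}SBATCH_RESERVATION=\)/#\1/' "$setup_file"
    # Clear the value already exported into this shell session.
    unset SBATCH_RESERVATION
}
```

For example: `disable_reservation setup.sh`, then confirm with `env | grep SBATCH_RESERVATION`, which should print nothing.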
> > > 
> > > 
> > > Thanks,
> > > David
> > > 
> > > 
> > > 
> > > 
> > 
> > 
> > > From: "Yadu Nand" < yadudoc1729 at gmail.com >
> > > To: "swift-devel" < swift-devel at ci.uchicago.edu >
> > > Sent: Tuesday, March 19, 2013 5:35:41 PM
> > > Subject: [Swift-devel] MODIS freezes on Midway
> > > 
> > > 
> > > 
> > > 
> > > Hi,
> > > 
> > > I've been running the modis tests on Midway, from the demo that
> > > Mike had shared:
> > > 
> > > /home/wilde/osgdemo/modis/svn/swiftdemo.v04.tgz
> > > 
> > > 
> > > In the logs (please see attachment) I see a failure message:
> > > "sbatch: error: Batch job submission failed: Requested reservation is invalid"
> > > 
> > > 
> > > The fact that no error messages are shown on stdout doesn't help,
> > > plus, swift just seems to hang forever. * Please help! *
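
Since the failure appears only in the logs, one way to surface it is to search the log files for the sbatch error prefix quoted above. A minimal sketch: the function name find_sbatch_errors is illustrative, and which log files to pass depends on the run directory.

```shell
# find_sbatch_errors LOGFILE...
# Prints every line that reports an sbatch error, prefixed with file name and
# line number. Exits non-zero if no such line is found. (Hypothetical helper.)
find_sbatch_errors() {
    # -H shows the file name even for a single file, -n adds the line number.
    grep -Hn 'sbatch: error:' "$@"
}
```

For example: `find_sbatch_errors *.log` in the directory holding the run's logs.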
> > > 
> > > 
> > > I see test.midway show no progress, with just the same status for
> > > about 20 mins:
> > > 
> > > Progress: time: Tue, 19 Mar 2013 20:21:18 +0000 Selecting site:35 Submitted:65
> > > 
> > > 
> > > After this, I tried to kill it with Ctrl+C, and then I got a few error
> > > messages:
> > > 
> > > Failed to shut down block: Block 0319-5807480-000000 (16x3540.000s)
> > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:196)
> > >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> > >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:106)
> > >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:95)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:332)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:312)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.shutdownBlocks(BlockQueueProcessor.java:800)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.shutdown(BlockQueueProcessor.java:789)
> > >     at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.shutdown(JobQueue.java:119)
> > >     at org.globus.cog.abstraction.coaster.service.CoasterService.shutdown(CoasterService.java:271)
> > >     at org.globus.cog.abstraction.coaster.service.ServiceShutdownHandler.requestComplete(ServiceShutdownHandler.java:28)
> > >     at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.receiveCompleted(RequestHandler.java:88)
> > >     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleRequest(AbstractKarajanChannel.java:519)
> > >     at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel.actualSend(AbstractPipedChannel.java:86)
> > >     at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel$Sender.run(AbstractPipedChannel.java:115)
> > > /home/wilde/osgdemo/modis/svn/swiftdemo.v04.tgz
> > > 
> > > --
> > > Yadu Nand B
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > Ketan
> > > 
> > > 
> > > 
> > 
> > 
> > 
> > 
> > --
> > Yadu Nand B
> > 
> > 
> 


