[Swift-user] queue problem?

Michael Wilde wilde at mcs.anl.gov
Thu May 19 09:55:18 CDT 2011


Also, Sheri - are you using Swift 0.92.1? This looks a bit like the bug in 0.92 that was fixed in 0.92.1.

- Mike

----- Original Message -----
> Is this a SwiftScript that ran successfully on the MCS machines but fails
> on Fusion? If so, can you point me to the working directory for this run?
> Justin
> 
> On Mon, 16 May 2011, Sheri Mickelson wrote:
> 
> > I'm seeing a different error now:
> > > mapper.existing() returned a path [3] that it cannot subsequently map
> >
> > It starts up, but dies shortly after that. I attached the log file.
> >
> > -Sheri
> >
> > Justin M Wozniak wrote:
> >>
> >> That's probably a perms thing, I just reapplied the permissions,
> >> please try again.
> >>
> >> On Mon, 16 May 2011, Sheri Mickelson wrote:
> >>
> >>> Hi Justin,
> >>>
> >>> I'm getting this error when swift tries to run:
> >>>
> >>> Exception in thread "main" java.lang.NoClassDefFoundError: org/griphyn/vdl/karajan/Loader
> >>> Caused by: java.lang.ClassNotFoundException: org.griphyn.vdl.karajan.Loader
> >>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> >>>     at java.security.AccessController.doPrivileged(Native Method)
> >>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> >>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> >>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> >>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> >>> Could not find the main class: org.griphyn.vdl.karajan.Loader. Program will exit.
> >>>
> >>> -Sheri
> >>>
> >>> Justin M Wozniak wrote:
> >>>>
> >>>> Let's go with my trunk-based installation in the location below for now.
> >>>> I tried testing this again over the weekend but did not get through the
> >>>> queue. I have already set up the additional logging in this installation.
> >>>>
> >>>> /homes/wozniak/Public/cog/modules/swift/dist/swift-svn/bin/swift
> >>>>
> >>>>     Justin
> >>>>
> >>>> On Fri, 13 May 2011, Sheri Mickelson wrote:
> >>>>
> >>>>> Here's the log file.
> >>>>> This is the first time I'm running this version of swift on fusion. I
> >>>>> had done my development work with this swift version on an mcs compute
> >>>>> machine.
> >>>>>
> >>>>> -Sheri
> >>>>>
> >>>>> Justin M Wozniak wrote:
> >>>>>> Hello
> >>>>>>     Can you send the log for this run?
> >>>>>>     Is this a new issue that appeared after an update?
> >>>>>>     Also, in any future runs regarding this issue, please add
> >>>>>>
> >>>>>> log4j.logger.org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor = DEBUG
> >>>>>>
> >>>>>> (one line) to your etc/log4j.properties file.
> >>>>>>
> >>>>>>     Thanks
> >>>>>>     Justin
> >>>>>>
> >>>>>> On Fri, 13 May 2011, Sheri Mickelson wrote:
> >>>>>>
> >>>>>>> I'm running into a problem running swift version 0.92.1 on fusion with
> >>>>>>> coasters.
> >>>>>>> This is the error I'm seeing:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -----------------------------------------------------------------------------
> >>>>>>> Progress: Selecting site:168 Submitted:23 Active:2
> >>>>>>> Progress: Selecting site:168 Submitted:23 Active:1 Checking status:1
> >>>>>>> Progress: Selecting site:167 Stage in:1 Submitted:22 Active:2 Finished successfully:1
> >>>>>>> queuedsize > 0 but no job dequeued. Queued: {}
> >>>>>>> java.lang.Throwable
> >>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252)
> >>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520)
> >>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> >>>>>>> queuedsize > 0 but no job dequeued. Queued: {}
> >>>>>>> java.lang.Throwable
> >>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252)
> >>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520)
> >>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> >>>>>>> Shutting down worker
> >>>>>>>
> >>>>>>> Shutting down worker
> >>>>>>>
> >>>>>>> Shutting down worker
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -----------------------------------------------------------------------------
> >>>>>>> And here's my sites file:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -----------------------------------------------------------------------------
> >>>>>>> <config>
> >>>>>>> <pool handle="fusion">
> >>>>>>>  <execution jobmanager="local:pbs" provider="coaster" url="none"/>
> >>>>>>>  <profile namespace="globus" key="maxtime">3600</profile>
> >>>>>>>  <profile namespace="globus" key="workersPerNode">1</profile>
> >>>>>>>  <profile namespace="globus" key="slots">1</profile>
> >>>>>>>  <profile namespace="globus" key="nodeGranularity">4</profile>
> >>>>>>>  <profile namespace="globus" key="maxNodes">2</profile>
> >>>>>>>  <profile namespace="globus" key="queue">batch</profile>
> >>>>>>>  <profile namespace="karajan" key="jobThrottle">0.23</profile>
> >>>>>>>  <profile namespace="karajan" key="initialScore">10000</profile>
> >>>>>>>  <profile namespace="globus" key="project">parvis</profile>
> >>>>>>>  <profile namespace="globus" key="lowOverAllocation">100</profile>
> >>>>>>>  <profile namespace="globus" key="highOverAllocation">100</profile>
> >>>>>>>  <filesystem provider="local"/>
> >>>>>>>  <workdirectory>/fusion/gpfs/home/mickelso/amwg-swift/swift/</workdirectory>
> >>>>>>> </pool>
> >>>>>>> </config>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -----------------------------------------------------------------------------
> >>>>>>> Do you know what might be causing this?
> >>>>>>>
> >>>>>>> Thanks, Sheri
> >>>>>>> _______________________________________________
> >>>>>>> Swift-user mailing list
> >>>>>>> Swift-user at ci.uchicago.edu
> >>>>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
> 
> --
> Justin M Wozniak

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



