[Swift-user] queue problem?
Justin M Wozniak
wozniak at mcs.anl.gov
Thu May 19 09:48:58 CDT 2011
Is this a SwiftScript that ran successfully on the MCS machines but fails
on Fusion? If so, can you point me to the working directory for this run?
Justin
On Mon, 16 May 2011, Sheri Mickelson wrote:
> I'm seeing a different error now:
> mapper.existing() returned a path [3] that it cannot subsequently map
>
> It starts up, but dies shortly after that. I attached the log file.
>
> -Sheri
>
> Justin M Wozniak wrote:
>>
>> That's probably a permissions issue. I just reapplied the permissions, so
>> please try again.
>>
>> On Mon, 16 May 2011, Sheri Mickelson wrote:
>>
>>> Hi Justin,
>>>
>>> I'm getting this error when swift tries to run:
>>>
>>> Exception in thread "main" java.lang.NoClassDefFoundError: org/griphyn/vdl/karajan/Loader
>>> Caused by: java.lang.ClassNotFoundException: org.griphyn.vdl.karajan.Loader
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>> Could not find the main class: org.griphyn.vdl.karajan.Loader. Program will exit.
>>>
>>> -Sheri
>>>
>>> Justin M Wozniak wrote:
>>>>
>>>> Let's go with my trunk-based installation in the location below for now.
>>>> I tried testing this again over the weekend but did not get through the
>>>> queue. I have already set up the additional logging in this
>>>> installation.
>>>>
>>>> /homes/wozniak/Public/cog/modules/swift/dist/swift-svn/bin/swift
>>>>
>>>> Justin
>>>>
>>>> On Fri, 13 May 2011, Sheri Mickelson wrote:
>>>>
>>>>> Here's the log file.
>>>>> This is the first time I'm running this version of swift on fusion. I
>>>>> had done my development work with this swift version on an mcs compute
>>>>> machine.
>>>>>
>>>>> -Sheri
>>>>>
>>>>> Justin M Wozniak wrote:
>>>>>> Hello
>>>>>> Can you send the log for this run?
>>>>>> Is this a new issue that appeared after an update?
>>>>>> Also, in any future runs regarding this issue, please add
>>>>>>
>>>>>> log4j.logger.org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor=DEBUG
>>>>>>
>>>>>> (one line) to your etc/log4j.properties file.
>>>>>>
>>>>>> Thanks
>>>>>> Justin
>>>>>>
>>>>>> On Fri, 13 May 2011, Sheri Mickelson wrote:
>>>>>>
>>>>>>> I'm running into a problem running swift version 0.92.1 on fusion with
>>>>>>> coasters.
>>>>>>> This is the error I'm seeing:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----------------------------------------------------------------------------
>>>>>>> Progress: Selecting site:168 Submitted:23 Active:2
>>>>>>> Progress: Selecting site:168 Submitted:23 Active:1 Checking status:1
>>>>>>> Progress: Selecting site:167 Stage in:1 Submitted:22 Active:2 Finished successfully:1
>>>>>>> queuedsize > 0 but no job dequeued. Queued: {}
>>>>>>> java.lang.Throwable
>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252)
>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520)
>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
>>>>>>> queuedsize > 0 but no job dequeued. Queued: {}
>>>>>>> java.lang.Throwable
>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252)
>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520)
>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
>>>>>>> Shutting down worker
>>>>>>>
>>>>>>> Shutting down worker
>>>>>>>
>>>>>>> Shutting down worker
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----------------------------------------------------------------------------
>>>>>>> And here's my sites file:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----------------------------------------------------------------------------
>>>>>>> <config>
>>>>>>>   <pool handle="fusion">
>>>>>>>     <execution jobmanager="local:pbs" provider="coaster" url="none"/>
>>>>>>>     <profile namespace="globus" key="maxtime">3600</profile>
>>>>>>>     <profile namespace="globus" key="workersPerNode">1</profile>
>>>>>>>     <profile namespace="globus" key="slots">1</profile>
>>>>>>>     <profile namespace="globus" key="nodeGranularity">4</profile>
>>>>>>>     <profile namespace="globus" key="maxNodes">2</profile>
>>>>>>>     <profile namespace="globus" key="queue">batch</profile>
>>>>>>>     <profile namespace="karajan" key="jobThrottle">0.23</profile>
>>>>>>>     <profile namespace="karajan" key="initialScore">10000</profile>
>>>>>>>     <profile namespace="globus" key="project">parvis</profile>
>>>>>>>     <profile namespace="globus" key="lowOverAllocation">100</profile>
>>>>>>>     <profile namespace="globus" key="highOverAllocation">100</profile>
>>>>>>>     <filesystem provider="local"/>
>>>>>>>     <workdirectory>/fusion/gpfs/home/mickelso/amwg-swift/swift/</workdirectory>
>>>>>>>   </pool>
>>>>>>> </config>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----------------------------------------------------------------------------
>>>>>>> Do you know what might be causing this?
>>>>>>>
>>>>>>> Thanks, Sheri
--
Justin M Wozniak