[Swift-user] queue problem?

Sheri Mickelson mickelso at mcs.anl.gov
Thu May 19 10:09:28 CDT 2011


Hi Mike,

I was originally running 0.92.1, but I got the "mapper.existing() returned a path [3] that it cannot 
subsequently map" error using Justin's trunk version.

I went back to an older version of swift and I think I might have found what was causing the initial 
error (an error in one of my csh scripts that had the wrong path in it).  I'm still looking into it 
and will let you know how it goes.

Justin, the path to my working directory is /home/climate1/mickelso/amwg-swift/test-swift.

-Sheri

Michael Wilde wrote:
> Also, Sheri - are you using Swift 0.92.1?  This looks a bit like the bug in 0.92 that was fixed in 0.92.1.
> 
> - Mike
> 
> ----- Original Message -----
>> Is this a SwiftScript that ran successfully on the MCS machines but fails
>> on Fusion? If so, can you point me to the working directory for this run?
>> Justin
>>
>> On Mon, 16 May 2011, Sheri Mickelson wrote:
>>
>>> I'm seeing a different error now:
>>> mapper.existing() returned a path [3] that it cannot subsequently map
>>>
>>> It starts up, but dies shortly after that. I attached the log file.
>>>
>>> -Sheri
>>>
>>> Justin M Wozniak wrote:
>>>> That's probably a perms thing. I just reapplied the permissions;
>>>> please try again.
>>>>
>>>> On Mon, 16 May 2011, Sheri Mickelson wrote:
>>>>
>>>>> Hi Justin,
>>>>>
>>>>> I'm getting this error when swift tries to run:
>>>>>
>>>>> Exception in thread "main" java.lang.NoClassDefFoundError: org/griphyn/vdl/karajan/Loader
>>>>> Caused by: java.lang.ClassNotFoundException: org.griphyn.vdl.karajan.Loader
>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>> Could not find the main class: org.griphyn.vdl.karajan.Loader. Program will exit.
>>>>>
>>>>> -Sheri
>>>>>
>>>>> Justin M Wozniak wrote:
>>>>>> Let's go with my trunk-based installation in the location below
>>>>>> for now.
>>>>>> I tried testing this again over the weekend but did not get
>>>>>> through the
>>>>>> queue. I have already set up the additional logging in this
>>>>>> installation.
>>>>>>
>>>>>> /homes/wozniak/Public/cog/modules/swift/dist/swift-svn/bin/swift
>>>>>>
>>>>>>     Justin
>>>>>>
>>>>>> On Fri, 13 May 2011, Sheri Mickelson wrote:
>>>>>>
>>>>>>> Here's the log file.
>>>>>>> This is the first time I'm running this version of swift on
>>>>>>> fusion. I
>>>>>>> had done my development work with this swift version on an mcs
>>>>>>> compute
>>>>>>> machine.
>>>>>>>
>>>>>>> -Sheri
>>>>>>>
>>>>>>> Justin M Wozniak wrote:
>>>>>>>> Hello
>>>>>>>>     Can you send the log for this run?
>>>>>>>>     Is this a new issue that appeared after an update?
>>>>>>>>     Also, in any future runs regarding this issue, please add
>>>>>>>>
>>>>>>>> log4j.logger.org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor = DEBUG
>>>>>>>>
>>>>>>>> (one line) to your etc/log4j.properties file.
>>>>>>>>
>>>>>>>>     Thanks
>>>>>>>>     Justin
>>>>>>>>
>>>>>>>> On Fri, 13 May 2011, Sheri Mickelson wrote:
>>>>>>>>
>>>>>>>>> I'm running into a problem running swift version 0.92.1 on fusion with
>>>>>>>>> coasters.
>>>>>>>>> This is the error I'm seeing:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>> Progress: Selecting site:168 Submitted:23 Active:2
>>>>>>>>> Progress: Selecting site:168 Submitted:23 Active:1 Checking status:1
>>>>>>>>> Progress: Selecting site:167 Stage in:1 Submitted:22 Active:2 Finished successfully:1
>>>>>>>>> queuedsize > 0 but no job dequeued. Queued: {}
>>>>>>>>> java.lang.Throwable
>>>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252)
>>>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520)
>>>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
>>>>>>>>> queuedsize > 0 but no job dequeued. Queued: {}
>>>>>>>>> java.lang.Throwable
>>>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252)
>>>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520)
>>>>>>>>>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
>>>>>>>>> Shutting down worker
>>>>>>>>>
>>>>>>>>> Shutting down worker
>>>>>>>>>
>>>>>>>>> Shutting down worker
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>> And here's my sites file:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>> <config>
>>>>>>>>> <pool handle="fusion">
>>>>>>>>>  <execution jobmanager="local:pbs" provider="coaster" url="none"/>
>>>>>>>>>  <profile namespace="globus" key="maxtime">3600</profile>
>>>>>>>>>  <profile namespace="globus" key="workersPerNode">1</profile>
>>>>>>>>>  <profile namespace="globus" key="slots">1</profile>
>>>>>>>>>  <profile namespace="globus" key="nodeGranularity">4</profile>
>>>>>>>>>  <profile namespace="globus" key="maxNodes">2</profile>
>>>>>>>>>  <profile namespace="globus" key="queue">batch</profile>
>>>>>>>>>  <profile namespace="karajan" key="jobThrottle">0.23</profile>
>>>>>>>>>  <profile namespace="karajan" key="initialScore">10000</profile>
>>>>>>>>>  <profile namespace="globus" key="project">parvis</profile>
>>>>>>>>>  <profile namespace="globus" key="lowOverAllocation">100</profile>
>>>>>>>>>  <profile namespace="globus" key="highOverAllocation">100</profile>
>>>>>>>>>  <filesystem provider="local"/>
>>>>>>>>>  <workdirectory>/fusion/gpfs/home/mickelso/amwg-swift/swift/</workdirectory>
>>>>>>>>> </pool>
>>>>>>>>> </config>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>> Do you know what might be causing this?
>>>>>>>>>
>>>>>>>>> Thanks, Sheri
>>>>>>>>> _______________________________________________
>>>>>>>>> Swift-user mailing list
>>>>>>>>> Swift-user at ci.uchicago.edu
>>>>>>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>>>>>>>>
>> --
>> Justin M Wozniak
> 


