[Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks

ketan ketancmaheshwari at gmail.com
Thu Jun 2 11:07:50 CDT 2011


Is there a definitive way to check whether I am included in the Engage VOMRS?
I have been able to create a proxy with VO membership for quite some
time now.
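
One rough way to narrow this down is to pull the DN that the server failed
to map out of the error text and compare it with the identity on the proxy.
The following is a minimal Python sketch, assuming the error message has the
same shape as the one quoted below; the helper name unmapped_dn is made up
for illustration and is not part of Swift or Globus.

```python
import re

# Assumption: the server reply contains "Could not map <DN>" with the DN
# running to the end of that line, as in the error quoted in this thread.
_DN_RE = re.compile(r"Could not\s+map\s+(/[^\r\n]+)")


def unmapped_dn(message):
    """Return the certificate DN the server could not map, or None."""
    match = _DN_RE.search(message)
    if match is None:
        return None
    # DNs may contain spaces (e.g. in the CN), so only trim the ends.
    return match.group(1).strip()


if __name__ == "__main__":
    reply = (
        "530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: "
        "Could not map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821\n"
        "530-\n"
        "530 End."
    )
    print(unmapped_dn(reply))
```

If the extracted DN matches the identity that voms-proxy-info reports for
the proxy, the proxy itself is probably fine and the missing piece is more
likely the mapping on the site side.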

On 6/2/11 11:06 AM, Allan Espinosa wrote:
> Hi Ketan,
>
> It may be the case that you are not included in the Engage VOMRS.  If
> you are included, you could file an OSG ticket about this issue with
> the Grid Operations Center.
>
> -Allan
>
> 2011/6/2 ketan<ketancmaheshwari at gmail.com>:
>> OK, done. On another trial the workflow seems to progress; however,
>> after a few hours I get the following errors:
>>
>> Final status:  Initializing:2636  Failed:393  Finished successfully:395
>> The following errors have occurred:
>> 1. Server refused performing the request. Custom message: Bad password.
>> (error code 1) [Nested exception message:  Custom message: Unexpected reply:
>> 530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: Could not
>> map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821
>> 530-
>> 530 End.] (393 times)
>> 2. Application seispeak_agg not executed due to errors in dependencies (216
>> times)
>> 3. Application seispeak_local not executed due to errors in dependencies
>> (2420 times)
>>
>> While I am debugging, kindly let me know if these messages ring any bells.
>>
>> Regards,
>> Ketan
>>
>>
>> On 6/1/11 1:02 PM, Allan Espinosa wrote:
>>> Hi Ketan,
>>>
>>> Could you add debugging for Swift's vdl:stagein calls?  Also, is this
>>> using the stable branch?
>>>
>>> Here's the log4j.properties I always use:
>>>
>>> # Set root category priority to INFO and its appenders to CONSOLE and FILE.
>>> log4j.rootCategory=INFO, CONSOLE, FILE
>>>
>>> log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
>>> log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
>>> log4j.appender.CONSOLE.Threshold=INFO
>>> log4j.appender.CONSOLE.layout.ConversionPattern=%m%n
>>>
>>> log4j.appender.FILE=org.apache.log4j.FileAppender
>>> log4j.appender.FILE.File=swift.log
>>> log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
>>> log4j.appender.FILE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSSZZZZZ} %-5p %c{1} %m%n
>>>
>>> log4j.logger.swift=DEBUG
>>>
>>> log4j.logger.org.apache.axis.utils=ERROR
>>>
>>> log4j.logger.org.globus.swift.trace=INFO
>>>
>>> log4j.logger.org.griphyn.vdl.karajan.Loader=DEBUG
>>> log4j.logger.org.globus.cog.karajan.workflow.events.WorkerSweeper=WARN
>>> log4j.logger.org.globus.cog.karajan.workflow.nodes.FlowNode=WARN
>>>
>>> log4j.logger.org.globus.cog.karajan.scheduler.WeightedHostScoreScheduler=DEBUG
>>> log4j.logger.org.griphyn.vdl.toolkit.VDLt2VDLx=DEBUG
>>> log4j.logger.org.griphyn.vdl.karajan.VDL2ExecutionContext=DEBUG
>>> log4j.logger.org.globus.cog.abstraction.impl.common.task.TaskImpl=INFO
>>> log4j.logger.org.griphyn.vdl.karajan.lib.GetFieldValue=DEBUG
>>> log4j.logger.org.griphyn.vdl.engine.Karajan=INFO
>>> log4j.logger.org.globus.cog.abstraction.coaster.rlog=DEBUG
>>>
>>> # log4j.logger.org.globus.swift.data.Director=DEBUG
>>> log4j.logger.org.griphyn.vdl.karajan.lib=INFO
>>>
>>>
>>> log4j.logger.org.griphyn.vdl.karajan.lib.SetFieldValue=OFF
>>> log4j.logger.org.griphyn.vdl.mapping.AbstractDataNode=OFF
>>>
>>> # Transfer
>>> #log4j.logger.org.globus.ftp=DEBUG
>>> #log4j.logger.org.globus.gridftp=DEBUG
>>>
>>>
>>> -Allan
>>>
>>> 2011/6/1 ketan<ketancmaheshwari at gmail.com>:
>>>> Allan,
>>>>
>>>> I tried to run the postproc workflow on the OSG whitelisted resources.
>>>> However, the workflow seems to stop responding after completing the first
>>>> two tasks:
>>>>
>>>> I get something like this:
>>>>
>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>> ..
>>>> ..
>>>> ..
>>>>
>>>>
>>>> The sites.xml, tc.data and the log files are on bridled as follows:
>>>>
>>>> /home/ketan/osg-tg-effort/cybershake/condor_osg.xml
>>>>
>>>> /home/ketan/osg-tg-effort/cybershake/tc.data
>>>>
>>>> /home/ketan/osg-tg-effort/cybershake/postproc-20110601-0951-43n3a22g.log
>>>>
>>>> Swift is:
>>>>
>>>> [bridled.ci.uchicago.edu:cybershake]$ which swift
>>>> swift is /home/ketan/swift-0.92.1/bin/swift
>>>>
>>>> I have memcached on, sourced /opt/osg-1.x.x/setup.sh and have a valid
>>>> proxy.
>>>>
>>>> Could you suggest the first debugging steps I should take on OSG in this
>>>> situation?
>>>>
>>>>
>>>> Thanks,
>>>> Ketan
>>>>
>>>>
>>>
>>
>
>
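
The unchanging "Progress:" lines in the original report can be detected
mechanically. The following is a minimal Python sketch, assuming progress
lines look like the ones quoted above; the function names and the window of
5 identical snapshots are arbitrary choices for illustration, not anything
Swift provides.

```python
import re

# Matches "State name:count" pairs such as "Selecting site:248".
_COUNT_RE = re.compile(r"([A-Za-z][A-Za-z ]*?):(\d+)")


def parse_progress(line):
    """Turn a 'Progress: ...' line into a dict of state name -> count."""
    # Anything before "Progress:" (timestamps, quote markers) is ignored.
    _, _, rest = line.partition("Progress:")
    return {name.strip(): int(count) for name, count in _COUNT_RE.findall(rest)}


def is_stalled(lines, window=5):
    """Return True if the last `window` progress snapshots are identical."""
    snapshots = [parse_progress(l) for l in lines if "Progress:" in l]
    recent = snapshots[-window:]
    return len(recent) == window and all(s == recent[0] for s in recent)


if __name__ == "__main__":
    line = "Progress:  Selecting site:248  Stage in:2  Finished successfully:2"
    print(parse_progress(line))
    print(is_stalled([line] * 5))
```

A check like this only says that the counts stopped changing; finding out why
still needs the stage-in debugging discussed in the thread.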


