[Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks

Allan Espinosa aespinosa at cs.uchicago.edu
Thu Jun 2 11:16:36 CDT 2011


You could log in to the VOMRS webpage and have them identify your
membership through your DN.

I do occasionally have some sites reject my DN after a while, but I
haven't investigated the reason for it.

-Allan
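
To see exactly which DN a site failed to map, the DN can be pulled out of
the "Gridmap lookup failure" text quoted below and compared with the one
registered in VOMRS. A minimal Python sketch, assuming the error text is
available as a plain string; the helper name extract_unmapped_dn is
hypothetical and is not part of Swift or Globus:

```python
import re


# Hypothetical helper for illustration; not part of Swift or the Globus toolkit.
def extract_unmapped_dn(error_text):
    """Return the DN named in a 'Gridmap lookup failure' message, or None."""
    # Mail clients and terminals wrap long lines, so collapse all whitespace first.
    flat = " ".join(error_text.split())
    # The DN sits between "Could not map" and the next FTP "530-" reply marker.
    match = re.search(r"Could not map (/.+?) 530-", flat)
    return match.group(1) if match else None


# Error text copied from the report quoted in this thread (line wraps included).
sample = """530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: Could
not
map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821
530-
530 End."""

print(extract_unmapped_dn(sample))
```

If the extracted DN differs from the one registered with the VO, or a site
stops mapping a DN it accepted earlier, that would be the detail to include
in a ticket.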

2011/6/2 ketan <ketancmaheshwari at gmail.com>:
> Is there a definitive way to know whether I am included in the Engage VOMRS?
> I have been able to create a proxy with VO membership for quite some time
> now.
>
> On 6/2/11 11:06 AM, Allan Espinosa wrote:
>>
>> Hi Ketan,
>>
>> It may be the case that you are not included in the Engage VOMRS.  If
>> you are included, you could file an OSG ticket about this issue with
>> the Grid Operations Center.
>>
>> -Allan
>>
>> 2011/6/2 ketan<ketancmaheshwari at gmail.com>:
>>>
>>> OK, done. On another trial the workflow seems to progress; however,
>>> after a few hours I get the following errors:
>>>
>>> Final status:  Initializing:2636  Failed:393  Finished successfully:395
>>> The following errors have occurred:
>>> 1. Server refused performing the request. Custom message: Bad password.
>>> (error code 1) [Nested exception message:  Custom message: Unexpected reply: 530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: Could not map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821
>>> 530-
>>> 530 End.] (393 times)
>>> 2. Application seispeak_agg not executed due to errors in dependencies (216 times)
>>> 3. Application seispeak_local not executed due to errors in dependencies (2420 times)
>>>
>>> While I am debugging, kindly let me know if these messages ring any
>>> bells.
>>>
>>> Regards,
>>> Ketan
>>>
>>>
>>> On 6/1/11 1:02 PM, Allan Espinosa wrote:
>>>>
>>>> Hi Ketan,
>>>>
>>>> Could you add debugging for Swift's vdl:stagein calls?  Also, is this
>>>> using the stable branch?
>>>>
>>>> Here's the log4j.properties I always use:
>>>>
>>>> # Set root category priority to INFO and its appenders to CONSOLE and FILE.
>>>> log4j.rootCategory=INFO, CONSOLE, FILE
>>>>
>>>> log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
>>>> log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.CONSOLE.Threshold=INFO
>>>> log4j.appender.CONSOLE.layout.ConversionPattern=%m%n
>>>>
>>>> log4j.appender.FILE=org.apache.log4j.FileAppender
>>>> log4j.appender.FILE.File=swift.log
>>>> log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.FILE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSSZZZZZ} %-5p %c{1} %m%n
>>>>
>>>> log4j.logger.swift=DEBUG
>>>>
>>>> log4j.logger.org.apache.axis.utils=ERROR
>>>>
>>>> log4j.logger.org.globus.swift.trace=INFO
>>>>
>>>> log4j.logger.org.griphyn.vdl.karajan.Loader=DEBUG
>>>> log4j.logger.org.globus.cog.karajan.workflow.events.WorkerSweeper=WARN
>>>> log4j.logger.org.globus.cog.karajan.workflow.nodes.FlowNode=WARN
>>>>
>>>>
>>>> log4j.logger.org.globus.cog.karajan.scheduler.WeightedHostScoreScheduler=DEBUG
>>>> log4j.logger.org.griphyn.vdl.toolkit.VDLt2VDLx=DEBUG
>>>> log4j.logger.org.griphyn.vdl.karajan.VDL2ExecutionContext=DEBUG
>>>> log4j.logger.org.globus.cog.abstraction.impl.common.task.TaskImpl=INFO
>>>> log4j.logger.org.griphyn.vdl.karajan.lib.GetFieldValue=DEBUG
>>>> log4j.logger.org.griphyn.vdl.engine.Karajan=INFO
>>>> log4j.logger.org.globus.cog.abstraction.coaster.rlog=DEBUG
>>>>
>>>> # log4j.logger.org.globus.swift.data.Director=DEBUG
>>>> log4j.logger.org.griphyn.vdl.karajan.lib=INFO
>>>>
>>>>
>>>> log4j.logger.org.griphyn.vdl.karajan.lib.SetFieldValue=OFF
>>>> log4j.logger.org.griphyn.vdl.mapping.AbstractDataNode=OFF
>>>>
>>>> # Transfer
>>>> #log4j.logger.org.globus.ftp=DEBUG
>>>> #log4j.logger.org.globus.gridftp=DEBUG
>>>>
>>>>
>>>> -Allan
>>>>
>>>> 2011/6/1 ketan<ketancmaheshwari at gmail.com>:
>>>>>
>>>>> Allan,
>>>>>
>>>>> I tried to run the postproc workflow on the OSG whitelisted resources.
>>>>> However, the workflow seems to stop responding after completing the
>>>>> first two tasks.
>>>>>
>>>>> I get something like this:
>>>>>
>>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>>>> ..
>>>>> ..
>>>>> ..
>>>>>
>>>>>
>>>>> The sites.xml, tc.data and the log files are on bridled as follows:
>>>>>
>>>>> /home/ketan/osg-tg-effort/cybershake/condor_osg.xml
>>>>>
>>>>> /home/ketan/osg-tg-effort/cybershake/tc.data
>>>>>
>>>>>
>>>>> /home/ketan/osg-tg-effort/cybershake/postproc-20110601-0951-43n3a22g.log
>>>>>
>>>>> Swift is:
>>>>>
>>>>> [bridled.ci.uchicago.edu:cybershake]$ which swift
>>>>> swift is /home/ketan/swift-0.92.1/bin/swift
>>>>>
>>>>> I have memcached on, sourced /opt/osg-1.x.x/setup.sh and have a valid
>>>>> proxy.
>>>>>
>>>>> Could you indicate the first debugging steps I should take on OSG in
>>>>> this situation?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Ketan
>>>>>


