[Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks

Allan Espinosa aespinosa at cs.uchicago.edu
Thu Jun 2 11:06:04 CDT 2011


Hi Ketan,

It may be that you are not registered in the Engage VOMRS.  If you are
registered, you could file an OSG ticket about this issue with the
Grid Operations Center.
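
If you do file a ticket, it helps to quote the exact DN that failed the
gridmap lookup together with the task counts. Here is a minimal Python
sketch for pulling both out of the summary quoted below. It is not part
of Swift; the function names parse_status and unmapped_dn are my own,
and the regular expressions only assume the message layout seen in this
thread.

```python
import re


def parse_status(line):
    """Turn a Swift 'Final status:' or 'Progress:' line into {state: count}."""
    # Drop the leading label ("Final status" / "Progress") before the first colon.
    _, _, body = line.partition(":")
    # Each entry looks like "Finished successfully:395"; state names may contain spaces.
    return {
        m.group(1).strip(): int(m.group(2))
        for m in re.finditer(r"([A-Za-z][A-Za-z ]*?):(\d+)", body)
    }


def unmapped_dn(message):
    """Return the DN named in a 'Gridmap lookup failure' message, or None."""
    # Assumes the DN runs to the end of the line, as in the error quoted in this thread.
    m = re.search(r"Could not\s+map\s+(/[^\n]+)", message)
    return m.group(1).strip() if m else None


if __name__ == "__main__":
    status = "Final status:  Initializing:2636  Failed:393  Finished successfully:395"
    error = (
        "530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: "
        "Could not map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821\n"
        "530-\n530 End."
    )
    print(parse_status(status))
    print(unmapped_dn(error))
```

The same parse_status call also works on the repeated "Progress:" lines,
which makes it easy to see whether the counts change between samples.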

-Allan

2011/6/2 ketan <ketancmaheshwari at gmail.com>:
> OK, done. On another trial, the workflow seems to progress; however,
> after a few hours I get the following errors:
>
> Final status:  Initializing:2636  Failed:393  Finished successfully:395
> The following errors have occurred:
> 1. Server refused performing the request. Custom message: Bad password.
> (error code 1) [Nested exception message:  Custom message: Unexpected reply:
> 530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: Could not
> map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821
> 530-
> 530 End.] (393 times)
> 2. Application seispeak_agg not executed due to errors in dependencies (216
> times)
> 3. Application seispeak_local not executed due to errors in dependencies
> (2420 times)
>
> While I am debugging, kindly let me know if these messages ring any bells.
>
> Regards,
> Ketan
>
>
> On 6/1/11 1:02 PM, Allan Espinosa wrote:
>>
>> Hi Ketan,
>>
>> Could you add debugging for Swift's vdl:stagein calls?  Also, is this
>> using the stable branch?
>>
>> Here's the log4j.properties I always use:
>>
>> # Set root category priority to INFO and its appenders to CONSOLE and FILE.
>> log4j.rootCategory=INFO, CONSOLE, FILE
>>
>> log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
>> log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
>> log4j.appender.CONSOLE.Threshold=INFO
>> log4j.appender.CONSOLE.layout.ConversionPattern=%m%n
>>
>> log4j.appender.FILE=org.apache.log4j.FileAppender
>> log4j.appender.FILE.File=swift.log
>> log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
>> log4j.appender.FILE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSSZZZZZ} %-5p %c{1} %m%n
>>
>> log4j.logger.swift=DEBUG
>>
>> log4j.logger.org.apache.axis.utils=ERROR
>>
>> log4j.logger.org.globus.swift.trace=INFO
>>
>> log4j.logger.org.griphyn.vdl.karajan.Loader=DEBUG
>> log4j.logger.org.globus.cog.karajan.workflow.events.WorkerSweeper=WARN
>> log4j.logger.org.globus.cog.karajan.workflow.nodes.FlowNode=WARN
>>
>> log4j.logger.org.globus.cog.karajan.scheduler.WeightedHostScoreScheduler=DEBUG
>> log4j.logger.org.griphyn.vdl.toolkit.VDLt2VDLx=DEBUG
>> log4j.logger.org.griphyn.vdl.karajan.VDL2ExecutionContext=DEBUG
>> log4j.logger.org.globus.cog.abstraction.impl.common.task.TaskImpl=INFO
>> log4j.logger.org.griphyn.vdl.karajan.lib.GetFieldValue=DEBUG
>> log4j.logger.org.griphyn.vdl.engine.Karajan=INFO
>> log4j.logger.org.globus.cog.abstraction.coaster.rlog=DEBUG
>>
>> # log4j.logger.org.globus.swift.data.Director=DEBUG
>> log4j.logger.org.griphyn.vdl.karajan.lib=INFO
>>
>>
>> log4j.logger.org.griphyn.vdl.karajan.lib.SetFieldValue=OFF
>> log4j.logger.org.griphyn.vdl.mapping.AbstractDataNode=OFF
>>
>> # Transfer
>> #log4j.logger.org.globus.ftp=DEBUG
>> #log4j.logger.org.globus.gridftp=DEBUG
>>
>>
>> -Allan
>>
>> 2011/6/1 ketan <ketancmaheshwari at gmail.com>:
>>>
>>> Allan,
>>>
>>> I tried to run the postproc workflow on the OSG whitelisted resources.
>>> However, the workflow appears to stop responding after completing the
>>> first two tasks.
>>>
>>> I get something like this:
>>>
>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>> Progress:  Selecting site:248  Stage in:2  Finished successfully:2
>>> ..
>>> ..
>>> ..
>>>
>>>
>>> The sites.xml, tc.data and log files are on bridled at the following paths:
>>>
>>> /home/ketan/osg-tg-effort/cybershake/condor_osg.xml
>>>
>>> /home/ketan/osg-tg-effort/cybershake/tc.data
>>>
>>> /home/ketan/osg-tg-effort/cybershake/postproc-20110601-0951-43n3a22g.log
>>>
>>> Swift is:
>>>
>>> [bridled.ci.uchicago.edu:cybershake]$ which swift
>>> swift is /home/ketan/swift-0.92.1/bin/swift
>>>
>>> I have memcached running, have sourced /opt/osg-1.x.x/setup.sh, and have
>>> a valid proxy.
>>>
>>> Could you suggest the first debugging steps I should take on OSG in
>>> this situation?
>>>
>>>
>>> Thanks,
>>> Ketan
>>>
>>>
>>
>>
>
>



-- 
Allan M. Espinosa <http://amespinosa.wordpress.com>
PhD student, Computer Science
University of Chicago <http://people.cs.uchicago.edu/~aespinosa>


