[Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks

Michael Wilde wilde at mcs.anl.gov
Thu Jun 2 11:16:46 CDT 2011


Yes. Run globus-job-run under that proxy against any site in the Engage list, with /usr/bin/id as the command, to see which UNIX login and group you are mapped to.
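
A minimal sketch of that check, assuming a Bourne-style shell on the submit host. The contact string in SITE is a hypothetical placeholder (substitute a gatekeeper from the Engage list), and mapped_account is an illustrative helper, not part of Globus; it only trims the id output down to the login and group names.

```shell
#!/bin/sh
# SITE is a hypothetical contact string; replace it with a real gatekeeper
# taken from the Engage site list.
SITE="${SITE:-gatekeeper.example.org/jobmanager-fork}"

# Illustrative helper: reduce one line of id(1) output to "login group".
mapped_account() {
    sed -n 's/^uid=[0-9]*(\([^)]*\)) gid=[0-9]*(\([^)]*\)).*/\1 \2/p'
}

# Attempt the remote call only where the Globus client tools are installed.
if command -v globus-job-run >/dev/null 2>&1; then
    # Runs /usr/bin/id on the remote site under the current proxy.
    globus-job-run "$SITE" /usr/bin/id | mapped_account
fi
```

If the site cannot map your DN, I would expect the call to fail with an authentication error instead of printing an account, which is itself the answer to the question.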

- Mike


----- Original Message -----
> Is there a definitive way to know whether I am included in the Engage
> VOMRS? I ask because I have been able to create a proxy with VO
> membership for quite some time now.
> 
> On 6/2/11 11:06 AM, Allan Espinosa wrote:
> > Hi Ketan,
> >
> > It may be the case that you are not included in the Engage VOMRS. If
> > you are included, you could file an OSG ticket about this issue with
> > the Grid Operations Center.
> >
> > -Allan
> >
> > 2011/6/2 ketan<ketancmaheshwari at gmail.com>:
> >> OK, done. On another trial the workflow seems to progress; however,
> >> after a few hours I get the following errors:
> >>
> >> Final status: Initializing:2636 Failed:393 Finished successfully:395
> >> The following errors have occurred:
> >> 1. Server refused performing the request. Custom message: Bad password. (error code 1) [Nested exception message: Custom message: Unexpected reply: 530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: Could not map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821
> >> 530-
> >> 530 End.] (393 times)
> >> 2. Application seispeak_agg not executed due to errors in dependencies (216 times)
> >> 3. Application seispeak_local not executed due to errors in dependencies (2420 times)
> >>
> >> While I am debugging, kindly let me know if these messages ring any
> >> bells.
> >>
> >> Regards,
> >> Ketan
> >>
> >>
> >> On 6/1/11 1:02 PM, Allan Espinosa wrote:
> >>> Hi Ketan,
> >>>
> >>> Could you add debugging for Swift's vdl:stagein calls? Also, is
> >>> this using the stable branch?
> >>>
> >>> Here's the log4j.properties I always use:
> >>>
> >>> # Set root category priority to INFO and its appenders to CONSOLE and FILE.
> >>> log4j.rootCategory=INFO, CONSOLE, FILE
> >>>
> >>> log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
> >>> log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
> >>> log4j.appender.CONSOLE.Threshold=INFO
> >>> log4j.appender.CONSOLE.layout.ConversionPattern=%m%n
> >>>
> >>> log4j.appender.FILE=org.apache.log4j.FileAppender
> >>> log4j.appender.FILE.File=swift.log
> >>> log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
> >>> log4j.appender.FILE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSSZZZZZ} %-5p %c{1} %m%n
> >>>
> >>> log4j.logger.swift=DEBUG
> >>>
> >>> log4j.logger.org.apache.axis.utils=ERROR
> >>>
> >>> log4j.logger.org.globus.swift.trace=INFO
> >>>
> >>> log4j.logger.org.griphyn.vdl.karajan.Loader=DEBUG
> >>> log4j.logger.org.globus.cog.karajan.workflow.events.WorkerSweeper=WARN
> >>> log4j.logger.org.globus.cog.karajan.workflow.nodes.FlowNode=WARN
> >>>
> >>> log4j.logger.org.globus.cog.karajan.scheduler.WeightedHostScoreScheduler=DEBUG
> >>> log4j.logger.org.griphyn.vdl.toolkit.VDLt2VDLx=DEBUG
> >>> log4j.logger.org.griphyn.vdl.karajan.VDL2ExecutionContext=DEBUG
> >>> log4j.logger.org.globus.cog.abstraction.impl.common.task.TaskImpl=INFO
> >>> log4j.logger.org.griphyn.vdl.karajan.lib.GetFieldValue=DEBUG
> >>> log4j.logger.org.griphyn.vdl.engine.Karajan=INFO
> >>> log4j.logger.org.globus.cog.abstraction.coaster.rlog=DEBUG
> >>>
> >>> # log4j.logger.org.globus.swift.data.Director=DEBUG
> >>> log4j.logger.org.griphyn.vdl.karajan.lib=INFO
> >>>
> >>>
> >>> log4j.logger.org.griphyn.vdl.karajan.lib.SetFieldValue=OFF
> >>> log4j.logger.org.griphyn.vdl.mapping.AbstractDataNode=OFF
> >>>
> >>> # Transfer
> >>> #log4j.logger.org.globus.ftp=DEBUG
> >>> #log4j.logger.org.globus.gridftp=DEBUG
> >>>
> >>>
> >>> -Allan
> >>>
> >>> 2011/6/1 ketan<ketancmaheshwari at gmail.com>:
> >>>> Allan,
> >>>>
> >>>> I tried to run the postproc workflow on the OSG whitelisted
> >>>> resources. However, the workflow seems to stop responding after
> >>>> completing the first two tasks:
> >>>>
> >>>> I get something like this:
> >>>>
> >>>> Progress: Selecting site:248 Stage in:2 Finished successfully:2
> >>>> Progress: Selecting site:248 Stage in:2 Finished successfully:2
> >>>> Progress: Selecting site:248 Stage in:2 Finished successfully:2
> >>>> Progress: Selecting site:248 Stage in:2 Finished successfully:2
> >>>> Progress: Selecting site:248 Stage in:2 Finished successfully:2
> >>>> ..
> >>>> ..
> >>>> ..
> >>>>
> >>>>
> >>>> The sites.xml, tc.data and the log files are on bridled as
> >>>> follows:
> >>>>
> >>>> /home/ketan/osg-tg-effort/cybershake/condor_osg.xml
> >>>>
> >>>> /home/ketan/osg-tg-effort/cybershake/tc.data
> >>>>
> >>>> /home/ketan/osg-tg-effort/cybershake/postproc-20110601-0951-43n3a22g.log
> >>>>
> >>>> Swift is:
> >>>>
> >>>> [bridled.ci.uchicago.edu:cybershake]$ which swift
> >>>> swift is /home/ketan/swift-0.92.1/bin/swift
> >>>>
> >>>> I have memcached on, sourced /opt/osg-1.x.x/setup.sh and have a
> >>>> valid
> >>>> proxy.
> >>>>
> >>>> Could you indicate the first debugging steps I should take on
> >>>> OSG in this situation?
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Ketan
> >>>>
> >>>>
> >>>
> >>
> >
> >

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



