[Swift-devel] Notes from 0.93 meeting

Jonathan Monette jonmon at mcs.anl.gov
Fri Aug 26 20:02:30 CDT 2011


I don't think it matters, but it was a thought. When I was working with Swift and Globus Online I had to set it to an IP address and not the DNS name. But that might have been for a different reason.
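
For reference, a minimal sketch of pointing GLOBUS_HOSTNAME at an IP address rather than the DNS name. It assumes a POSIX shell with getent and awk available. The helper name pick_globus_hostname is made up for illustration, and whether the IP form actually helps in this case is untested.

```shell
# Hypothetical helper (not part of Swift or Globus): print the first address
# that getent reports for a host name, or the name itself if nothing resolves.
pick_globus_hostname() {
    name="$1"
    addr=$(getent hosts "$name" 2>/dev/null | awk '{ print $1; exit }') || addr=""
    if [ -n "$addr" ]; then
        printf '%s\n' "$addr"
    else
        printf '%s\n' "$name"
    fi
}

# Host name taken from this thread; substitute the submit host actually in use.
GLOBUS_HOSTNAME=$(pick_globus_hostname communicado.ci.uchicago.edu)
export GLOBUS_HOSTNAME
echo "GLOBUS_HOSTNAME=$GLOBUS_HOSTNAME"
```

Note that getent may hand back an IPv6 address on some hosts, so it is worth checking the echoed value before running swift.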

----- Reply message -----
From: "David Kelly" <davidk at ci.uchicago.edu>
Date: Fri, Aug 26, 2011 7:13 pm
Subject: [Swift-devel] Notes from 0.93 meeting
To: "Jonathan Monette" <jonmon at mcs.anl.gov>
Cc: "Mihael Hategan" <hategan at mcs.anl.gov>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>


I set it to communicado.ci.uchicago.edu. I'll try again with the IP address.

----- Original Message -----
> From: "Jonathan Monette" <jonmon at mcs.anl.gov>
> To: "David Kelly" <davidk at ci.uchicago.edu>
> Cc: "Mihael Hategan" <hategan at mcs.anl.gov>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> Sent: Friday, August 26, 2011 6:54:29 PM
> Subject: Re: [Swift-devel] Notes from 0.93 meeting
> Did you set GLOBUS_HOSTNAME to communicado.ci.uchicago.edu or, probably
> better, the IP address of communicado?
> On Aug 26, 2011, at 6:52 PM, David Kelly wrote:
> 
> > I tried setting GLOBUS_HOSTNAME on communicado. The gram log file is
> > no longer created, but I still don't see any jobs being submitted?
> >
> > There is a new set of logs at
> > www.ci.uchicago.edu/~davidk/ranger-gt2-logs2.tar.gz
> >
> > David
> >
> > ----- Original Message -----
> >> From: "Mihael Hategan" <hategan at mcs.anl.gov>
> >> To: "David Kelly" <davidk at ci.uchicago.edu>
> >> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>, "Jonathan Monette" <jonmon at mcs.anl.gov>
> >> Sent: Friday, August 26, 2011 1:42:13 PM
> >> Subject: Re: [Swift-devel] Notes from 0.93 meeting
> >> "The job manager failed to open stderr" tends to happen when you have
> >> GLOBUS_HOSTNAME set incorrectly.
> >>
> >> On Fri, 2011-08-26 at 13:38 -0500, David Kelly wrote:
> >>> When I am trying to run the script now, Swift does not seem to be
> >>> submitting the jobs correctly. Nothing is showing up in qstat.
> >>>
> >>> I noticed that a gram log gets created in my home directory that says:
> >>> ts=2011-08-26T17:30:03.910618Z id=27215 event=gram.job.end level=ERROR gramid=/16145868447994515851/17606392074284884670/ job_status=4 status=-73 reason="the job manager failed to open stdout"
> >>>
> >>> I'm guessing this is the cause of the problem. Bugs #153 and #215
> >>> were related to similar problems with stdout and gt2/sge.
> >>>
> >>> The full logs are at
> >>> http://www.ci.uchicago.edu/~davidk/ranger-gt2-logs.tar.gz
> >>>
> >>> Thanks,
> >>> David
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: "Mihael Hategan" <hategan at mcs.anl.gov>
> >>>> To: "Jonathan Monette" <jonmon at mcs.anl.gov>
> >>>> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> >>>> Sent: Thursday, August 25, 2011 5:31:34 PM
> >>>> Subject: Re: [Swift-devel] Notes from 0.93 meeting
> >>>> On Thu, 2011-08-25 at 17:18 -0500, Jonathan Monette wrote:
> >>>>> I can send mail to ci support and cc mike to it and ask what they
> >>>>> can do.
> >>>>>
> >>>>> Mihael, is there any way for Swift to give a little more feedback
> >>>>> besides unknown CA or is that a jglobus problem?
> >>>>
> >>>> It's a jglobus problem.
> >>>>
> >>>> That in itself may not be a big issue, but jglobus is now being heavily
> >>>> re-organized by the globus team, so I'm not sure what the best long-term
> >>>> strategy is here.
> >>>>>
> >>>>> ----- Reply message -----
> >>>>> From: "Sarah Kenny" <skenny at uchicago.edu>
> >>>>> Date: Thu, Aug 25, 2011 5:11 pm
> >>>>> Subject: [Swift-devel] Notes from 0.93 meeting
> >>>>> To: "Jonathan Monette" <jonmon at mcs.anl.gov>
> >>>>> Cc: "Mihael Hategan" <hategan at mcs.anl.gov>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> >>>>>
> >>>>>
> >>>>>
> >>>>> if i had a nickel for every time i dealt with this i'd be rich! :)
> >>>>> actually, now that i'm looking at our uci machines i actually have
> >>>>> them updating hourly...so, maybe you want to ask the admins to do that
> >>>>> to avoid a full day of confusion whenever they expire :P
> >>>>>
> >>>>> *usually* i can't gsissh either if the certs have expired but, yeah,
> >>>>> they must be using different CA's now for that on ranger as mihael
> >>>>> suggests...
> >>>>>
> >>>>> On Thu, Aug 25, 2011 at 2:46 PM, Jonathan Monette <jonmon at mcs.anl.gov> wrote:
> >>>>>        True. I did not think that each mechanism would use different
> >>>>>        CAs. We might want to ask ci support to update the grid certs
> >>>>>        more frequently then to avoid this situation.
> >>>>>
> >>>>>
> >>>>>        On Aug 25, 2011, at 4:42 PM, Mihael Hategan wrote:
> >>>>>
> >>>>>> On Thu, 2011-08-25 at 16:40 -0500, Jonathan Monette wrote:
> >>>>>>> That is weird. If you were able to gsissh to ranger I would assume
> >>>>>>> that you are able to globus-url-copy to ranger.
> >>>>>>
> >>>>>> Not if the two use different CAs. Or if a password was typed at the ssh
> >>>>>> login.
> >>>>>>
> >>>>>>> Anyways, what Sarah said should work. I would assume that ci would
> >>>>>>> update more frequently to avoid this problem.
> >>>>>>> On Aug 25, 2011, at 4:38 PM, Sarah Kenny wrote:
> >>>>>>>
> >>>>>>>> communicado's certs (/etc/grid-security/certificates) are
> >>>>>>>> out-of-date...if you copy ranger's /etc/grid-security/certificates
> >>>>>>>> directory to communicado and point yr X509_CERT_DIR to it you can
> >>>>>>>> get a job thru (a simple globus-job-run with my valid cert fails
> >>>>>>>> from communicado at the moment if i don't do this).
> >>>>>>>>
> >>>>>>>> i set our machines at uci to update daily...i think it's less
> >>>>>>>> frequently at ci...
> >>>>>>>>
> >>>>>>>> On Thu, Aug 25, 2011 at 2:17 PM, Mihael Hategan
> >>>>>>>> <hategan at mcs.anl.gov> wrote:
> >>>>>>>>       Can you try a globus-url-copy to gridftp.ranger?
> >>>>>>>>
> >>>>>>>>       gridftp.ranger seems to have the NCSA myproxy CA. You say
> >>>>>>>>       you have the proper certificates dir in your X509_CERT_DIR,
> >>>>>>>>       and that directory contains the TACC root cert. So it should
> >>>>>>>>       work. And so should swift.
> >>>>>>>>
> >>>>>>>>       Though I think that jglobus should be more clear about
> >>>>>>>>       "Unknown ca" errors. At least the name of the unknown CA
> >>>>>>>>       should be part of the error message.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>       On Thu, 2011-08-25 at 15:55 -0500, David Kelly wrote:
> >>>>>>>>> $ grid-proxy-info -all
> >>>>>>>>> subject : /C=US/O=National Center for Supercomputing Applications/CN=David Kelly
> >>>>>>>>> issuer : /C=US/O=National Center for Supercomputing Applications/OU=Certificate Authorities/CN=MyProxy
> >>>>>>>>> identity : /C=US/O=National Center for Supercomputing Applications/CN=David Kelly
> >>>>>>>>> type : end entity credential
> >>>>>>>>> strength : 1024 bits
> >>>>>>>>> path : /tmp/x509up_u1878
> >>>>>>>>> timeleft : 9:56:53
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: "Mihael Hategan" <hategan at mcs.anl.gov>
> >>>>>>>>>> To: "David Kelly" <davidk at ci.uchicago.edu>
> >>>>>>>>>> Cc: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> >>>>>>>>>> Sent: Thursday, August 25, 2011 3:42:57 PM
> >>>>>>>>>> Subject: Re: [Swift-devel] Notes from 0.93 meeting
> >>>>>>>>>> Odd. Can you paste the output of 'grid-proxy-info -all'?
> >>>>>>>>>>
> >>>>>>>>>> On Thu, 2011-08-25 at 15:18 -0500, David Kelly wrote:
> >>>>>>>>>>> Sure, here is the full log:
> >>>>>>>>>>>
> >>>>>>>>>>> http://www.ci.uchicago.edu/~davidk/001-catsn-ranger-20110825-1515-5tydro91.log
> >>>>>>>>>>>
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>> From: "Mihael Hategan" <hategan at mcs.anl.gov>
> >>>>>>>>>>>> To: "David Kelly" <davidk at ci.uchicago.edu>
> >>>>>>>>>>>> Cc: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> >>>>>>>>>>>> Sent: Thursday, August 25, 2011 2:43:31 PM
> >>>>>>>>>>>> Subject: Re: [Swift-devel] Notes from 0.93 meeting
> >>>>>>>>>>>> It's possible that the CA dir on Ranger is not properly set up. Can you
> >>>>>>>>>>>> post the full log?
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, 2011-08-25 at 13:56 -0500, David Kelly wrote:
> >>>>>>>>>>>>> Those environment variables were not set up. I have them defined
> >>>>>>>>>>>>> now, but I'm still getting the same error.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [davidk at communicado ranger]$ env |grep 509
> >>>>>>>>>>>>> X509_CERT_DIR=/opt/osg-1.2.16/globus/TRUSTED_CA
> >>>>>>>>>>>>> X509_CADIR=/opt/osg-1.2.16/globus/TRUSTED_CA
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [davidk at communicado ranger]$ swift -sites.file sites.xml -tc.file tc.data 001-catsn-ranger.swift
> >>>>>>>>>>>>> Swift svn swift-r4987 (swift modified locally) cog-r3229
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> RunID: 20110825-1352-f1v940b4
> >>>>>>>>>>>>> Progress: time: Thu, 25 Aug 2011 13:52:59 -0500
> >>>>>>>>>>>>> Progress: time: Thu, 25 Aug 2011 13:53:00 -0500 Selecting site:7
> >>>>>>>>>>>>> Initializing site shared directory:3
> >>>>>>>>>>>>> Execution failed:
> >>>>>>>>>>>>>     Authentication failed [Caused by: Failure unspecified at GSS-API level [Caused by: Unknown CA]]
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>> From: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> >>>>>>>>>>>>>> To: "David Kelly" <davidk at ci.uchicago.edu>
> >>>>>>>>>>>>>> Cc: "Jonathan Monette" <jonmon at mcs.anl.gov>, "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> >>>>>>>>>>>>>> Sent: Thursday, August 25, 2011 1:32:50 PM
> >>>>>>>>>>>>>> Subject: Re: [Swift-devel] Notes from 0.93 meeting
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Are your CADIR and CACERT env vars set up?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [communicado:swiftgrid]$ echo $X509_CADIR
> >>>>>>>>>>>>>> /opt/osg-1.2.16/globus/TRUSTED_CA
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [communicado:swiftgrid]$ echo $X509_CERT_DIR
> >>>>>>>>>>>>>> /opt/osg-1.2.16/globus/TRUSTED_CA
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Aug 25, 2011 at 1:29 PM, David Kelly <davidk at ci.uchicago.edu> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Jon,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Here is what happens when I try this from communicado:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [davidk at communicado ~]$ myproxy-logon -l dkelly -s myproxy.teragrid.org
> >>>>>>>>>>>>>> Enter MyProxy pass phrase:
> >>>>>>>>>>>>>> A credential has been received for user dkelly in /tmp/x509up_u1878.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [davidk at communicado ranger]$ swift -sites.file sites.xml -tc.file tc.data 001-catsn-ranger.swift
> >>>>>>>>>>>>>> Swift svn swift-r4987 (swift modified locally) cog-r3229
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> RunID: 20110825-1326-o3e38fe0
> >>>>>>>>>>>>>> Progress: time: Thu, 25 Aug 2011 13:26:43 -0500
> >>>>>>>>>>>>>> Progress: time: Thu, 25 Aug 2011 13:26:44 -0500 Selecting site:8
> >>>>>>>>>>>>>> Initializing site shared directory:2
> >>>>>>>>>>>>>> Execution failed:
> >>>>>>>>>>>>>> Authentication failed [Caused by: Failure unspecified at GSS-API level [Caused by: Unknown CA]]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Any ideas?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> David
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>>> Swift-devel mailing list
> >>>>>>>>>>>>>> Swift-devel at ci.uchicago.edu
> >>>>>>>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Ketan
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Sarah Kenny
> >>>>>>>> Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III
> >>>>>>>> University of California Irvine, Dept. of Neurology ~ 773-818-8300
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Sarah Kenny
> >>>>> Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III
> >>>>> University of California Irvine, Dept. of Neurology ~
> >>>>> 773-818-8300
> >>>>>
> >>>>
> >>>>