[Swift-devel] Swift jobs failing

Michael Wilde wilde at mcs.anl.gov
Thu Aug 25 22:25:09 CDT 2011


The CA certs in this directory work with the NCSA CA for accessing Ranger:

com$ export X509_CADIR=/home/wilde/TRUSTEDCA
com$ 
com$ 
com$ export X509_CERT_DIR=/home/wilde/TRUSTEDCA
com$ globus-job-run gatekeeper.ranger.tacc.teragrid.org:2119 /usr/bin/id
uid=455797(tg455797) gid=80243(G-80243) groups=80243(G-80243),81031(G-81031),81411(G-81411),81611(G-81611),81613(G-81613),81621(G-81621),81747(G-81747),81792(G-81792),800744(G-800744),800745(G-800745),800889(G-800889),800981(G-800981),800983(G-800983),801271(G-801271),801364(G-801364),801525(G-801525),801551(G-801551),801694(G-801694),801708(G-801708),801758(G-801758),801759(G-801759),801897(G-801897),802865(G-802865)
com$ 
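
Since both errors quoted below are about expired CRLs, one quick way to see which CRLs are stale is to print each one's nextUpdate date. This is only a minimal sketch: it assumes the CRLs sit in the CA directory as PEM files named <hash>.r0 (the usual Globus layout) and that the openssl command-line tool is installed. The function name check_crls and the fallback directory are illustrative assumptions, not something from this thread.

```shell
#!/bin/sh
# Sketch: report the nextUpdate time of every CRL in a CA directory.
# Assumption: CRLs are PEM files named <hash>.r0 inside the directory.

check_crls() {
    cadir="$1"
    found=0
    for crl in "$cadir"/*.r0; do
        # If the glob matched nothing, the literal pattern comes back; skip it.
        [ -e "$crl" ] || continue
        found=1
        # openssl prints a "nextUpdate=..." line; a date in the past means
        # the CRL has expired and needs to be refreshed.
        printf '%s: ' "$crl"
        openssl crl -in "$crl" -noout -nextupdate
    done
    [ "$found" -eq 1 ] || echo "no CRL files (*.r0) found in $cadir"
}

# Use the directory from the environment, else a commonly used default
# (the default path is an assumption).
check_crls "${X509_CERT_DIR:-/etc/grid-security/certificates}"
```

Any CRL whose nextUpdate is earlier than the current date would be a candidate for the "CRL ... has expired" failures quoted below.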



----- Original Message -----
> From: "Jonathan Monette" <jonmon at mcs.anl.gov>
> To: "Mihael Hategan" <hategan at mcs.anl.gov>
> Cc: "swift-devel Devel" <swift-devel at ci.uchicago.edu>
> Sent: Thursday, August 25, 2011 9:28:35 PM
> Subject: Re: [Swift-devel] Swift jobs failing
> Never mind. I found a similar error.
> [Caused by: Defective credential detected [Caused by: CRL for CA
> "DC=org,DC=DOEGrids,OU=Certificate Authorities,CN=DOEGrids CA 1" has
> expired.]]
> 
> I will email support at ci to let them know of the situation and ask
> whether they can update the CAs more frequently.
> 
> On Aug 25, 2011, at 8:17 PM, Mihael Hategan wrote:
> 
> > Authentication failed [Caused by: Defective credential detected
> > [Caused by: CRL for CA "C=US,O=National Center for Supercomputing
> > Applications,OU=Certificate Authorities,CN=MyProxy" has expired.]]
> >
> > Sorry, but things cannot work properly if certificates are messed
> > up.
> > Please complain to support at ci.
> >
> > In the meantime I committed a patch to print better error messages in
> > such cases (i.e. the above message). So please test that.
> >
> > On Thu, 2011-08-25 at 18:13 -0500, Jonathan Monette wrote:
> >> www.ci.uchicago.edu/~jonmon/logs/coasters.log.tar.gz
> >> On Aug 25, 2011, at 6:03 PM, Mihael Hategan wrote:
> >>
> >>> On Thu, 2011-08-25 at 17:51 -0500, Jonathan Monette wrote:
> >>>> my bad….try www.ci.uchicago.edu/~jonmon/logs/coasters.log
> >>>
> >>> Could you gzip that?
> >>>
> >>>>
> >>>> On Aug 25, 2011, at 5:41 PM, Mihael Hategan wrote:
> >>>>
> >>>>> mike at blabla:~/tmp$ wget
> >>>>> http://www.ci.uchicago.edu/~jonmon/log/coaster.log
> >>>>> --2011-08-25 15:40:31--
> >>>>> http://www.ci.uchicago.edu/~jonmon/log/coaster.log
> >>>>> Resolving www.ci.uchicago.edu... 192.5.86.67
> >>>>> Connecting to www.ci.uchicago.edu|192.5.86.67|:80... connected.
> >>>>> HTTP request sent, awaiting response... 404 Not Found
> >>>>> 2011-08-25 15:40:32 ERROR 404: Not Found.
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, 2011-08-25 at 16:43 -0500, Jonathan Monette wrote:
> >>>>>> Here is the link. The log is very long (it has entries from 2010),
> >>>>>> but toward the bottom it does mention that it could not
> >>>>>> register the coaster service.
> >>>>>> www.ci.uchicago.edu/~jonmon/log/coaster.log
> >>>>>>
> >>>>>> On Aug 25, 2011, at 2:11 PM, Mihael Hategan wrote:
> >>>>>>
> >>>>>>> Can you post the coaster log on the remote machine?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, 2011-08-25 at 14:07 -0500, Jonathan Monette wrote:
> >>>>>>>> Ok. I will check my configuration. I don't see very helpful
> >>>>>>>> messages in the log file but I will give it a closer look.
> >>>>>>>> More information to follow.
> >>>>>>>>
> >>>>>>>> On Aug 25, 2011, at 2:05 PM, Ketan Maheshwari wrote:
> >>>>>>>>
> >>>>>>>>> I am using 0.93 from Communicado to OSG using
> >>>>>>>>> persistent-coasters. I do not see such messages.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thu, Aug 25, 2011 at 1:42 PM, Jonathan Monette
> >>>>>>>>> <jonmon at mcs.anl.gov> wrote:
> >>>>>>>>>     This also seems to be happening in trunk. Is anyone
> >>>>>>>>>     seeing this issue with their code?
> >>>>>>>>>
> >>>>>>>>>     On Aug 25, 2011, at 12:44 PM, Jonathan Monette wrote:
> >>>>>>>>>
> >>>>>>>>>> I started a run of my SwiftMontage work and all the jobs
> >>>>>>>>>> keep failing. No progress is being made and the swift
> >>>>>>>>>> stdout will have the line "failed but can retry: 'some
> >>>>>>>>>> number'". The log is located at
> >>>>>>>>>> www.ci.uchicago.edu/~jonmon/logs/montage-20110825-1232-y104fa88.log.
> >>>>>>>>>> This is with the most recent version of 0.93.
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Swift-devel mailing list
> >>>>>>>>>> Swift-devel at ci.uchicago.edu
> >>>>>>>>>>
> >>>>>>>>>     https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Ketan

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



