[Swift-devel] persistent coasters on OSG

Mihael Hategan hategan at mcs.anl.gov
Tue Aug 23 14:27:54 CDT 2011


If you look through the service log, you'll see that all the "lost
connection to worker" messages come from workers on nemo. That suggests
something is wrong there, but I can't tell what it is.

Perhaps enabling worker logging for workers on nemo might shed some
light on the issue.
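
For anyone who wants to repeat the check, here is a rough sketch of the
tally in Python. It only assumes that each relevant log line contains the
phrase "lost connection to worker" along with some text identifying the
site (for example "nemo"). The actual layout of the service log lines is
not shown in this thread, so the function name, the keyword list and the
"(unmatched)" bucket are all illustrative, not part of Swift or coasters.

```python
from collections import Counter

# Message text as quoted in this thread; compared case-insensitively.
PHRASE = "lost connection to worker"


def count_lost_connections(lines, site_keywords):
    """Tally lines containing PHRASE by the first site keyword found on the line.

    `site_keywords` is a caller-supplied list of substrings (e.g. ["nemo"]).
    Lines that contain PHRASE but none of the keywords are counted under
    "(unmatched)". This is a plain substring match and does not parse any
    particular log format.
    """
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if PHRASE not in lowered:
            continue
        for keyword in site_keywords:
            if keyword.lower() in lowered:
                counts[keyword] += 1
                break
        else:
            # No keyword matched this line.
            counts["(unmatched)"] += 1
    return counts
```

Passing an open file object for the service log as `lines` and `["nemo"]`
as `site_keywords` would show how many of the messages mention nemo; a
non-zero "(unmatched)" count would mean some come from elsewhere.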

On Tue, 2011-08-23 at 14:17 -0500, Michael Wilde wrote:
> Can you describe what you are seeing on Nemo and what to look for there?
> 
> - Mike
> 
> ----- Original Message -----
> > From: "Mihael Hategan" <hategan at mcs.anl.gov>
> > To: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> > Cc: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Tuesday, August 23, 2011 2:15:03 PM
> > Subject: Re: [Swift-devel] persistent coasters on OSG
> > It looks like workers on nemo are somehow messed up. Can you find out
> > why?
> > 
> > On Mon, 2011-08-22 at 13:45 -0500, Ketan Maheshwari wrote:
> > > Hi Mihael, All,
> > >
> > >
> > > > I am trying to test the persistent coasters setup with OSG sites from
> > > > communicado and see some intermittent exceptions and failed-job
> > > > errors; the affected jobs eventually succeed on retries.
> > >
> > >
> > > The exceptions I see from the log are mostly low-level network
> > > exceptions: (Channel Exceptions, Broken Pipe SocketExceptions,
> > > Timeout, etc.).
> > >
> > >
> > > The runs that I tried were incremental catsn runs with n=1,10,50 and
> > > 100 and data.txt=100MB and 200MB.
> > >
> > >
> > > > The only runs that had the above-mentioned errors were the ones with
> > > > n=100 and data.txt=200MB.
> > >
> > >
> > > The other runs completed without any errors.
> > >
> > >
> > > I used just one OSG site for these runs.
> > >
> > >
> > > Attaching the sites, log files and a file that contains exception
> > > messages grepped from log files.
> > >
> > >
> > > > Any clues as to how to harden this? I had about 5 errors on today's
> > > > run and about 11 on a similar run last week.
> > >
> > >
> > >
> > >
> > > Regards,
> > > --
> > > Ketan
> > >
> > >
> > >
> > 
> > 
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> 




