[Swift-devel] Re: [VDL2-user] GridFTP timeout exception

Mihael Hategan hategan at mcs.anl.gov
Mon Feb 26 14:01:23 CST 2007


Well, post the log.

On Mon, 2007-02-26 at 14:00 -0600, Tiberiu Stef-Praun wrote:
> I fixed that, I am getting back some of the results.
> Apparently the wf is stuck at the point where it needs to delete the
> remote files, although that might not be the actual root of all evils:
> when running on a single site (teraport), several iterations of sets of
> jobs were sent out before the wf stopped completely.
> 
> 
> 
> On 2/26/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > IP address maybe?
> >
> > On Mon, 2007-02-26 at 13:53 -0600, Tiberiu Stef-Praun wrote:
> > > > How do I know when a GridFTP client timeout occurs?
> > > > It seems that my SIDGrid workflow has stopped performing.
> > > > Normally it should process 200 parallel tasks, each of which produces
> > > > 1GB in 28 files.
> > >
> > > > I am testing with 0.1 rc 1, but the same has happened to me before
> > > > (SVN checkout, v0.rc3, etc.).
> > >
> > > > The workflow freezes after a while (after processing the first round
> > > > of jobs submitted? I received 16G and that's it). The current
> > > > scenario is me using 3 teragrid sites (UC, Purdue, NCSA), but the same
> > > > behavior (workflow freeze) happened when I ran the workflow on
> > > > teraport only. Since it hung, I was always forced to terminate it, so
> > > > we never had a full SIDGrid run.
> > >
> > > > Any suggestions?
> > >
> > > BTW, the NCSA problem is a non-issue, I solved it.  The only other
> > > small issue is taking full advantage of all the sites in the
> > > sites.xml. And the big issue is what I listed above.
> > >
> > > Next I will try running the workflow fully at the UC teragrid site.
> > >
> > > Tibi
> > >
> > >
> > > On 2/26/07, Ben Clifford <benc at hawaga.org.uk> wrote:
> > > >
> > > > On Mon, 26 Feb 2007, Mihael Hategan wrote:
> > > >
> > > > > On Fri, 2007-02-23 at 16:58 +0000, Ben Clifford wrote:
> > > > > >
> > > > > > On Fri, 23 Feb 2007, Mihael Hategan wrote:
> > > > > >
> > > > > > > Since this is non-functional (failing to shut down a GridFTP client
> > > > > > > that's not in use any more), I think the message could be moved to info,
> > > > > > > and the stack trace to debug.
> > > > > >
> > > > > > sounds good.
> > > > >
> > > > > > Seems like it has been at info for about a month now. However, I
> > > > > > split it so that the exception is only logged at debug.
> > > >
> > > > > OK. I guess this error message came from something like 0rc3, as
> > > > > Chad reported it originally, I think.
> > > > --
> > > >
> > > > _______________________________________________
> > > > Swift-devel mailing list
> > > > Swift-devel at ci.uchicago.edu
> > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > >
> > >
> > >
> >
> >
> 
> 



