[Swift-devel] Coaster provider staging data xfer problem
Michael Wilde
wilde at mcs.anl.gov
Sun Oct 3 21:20:05 CDT 2010
Interesting: at 5,000 jobs the run completes normally; at 10,000 it fails again, as before. (I adjusted the script slightly to run 10,000 jobs instead of 9,999.)
- Mike
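
For anyone trying to reproduce this, the input set for the test described in the quoted message below (one input file per cat job, random sizes under roughly 500 KB) could be generated with something like the following. This is only a sketch, not the script used in run01: the function name, file-naming pattern, directory and seed are all made up for illustration.

```python
import os
import random


def make_inputs(directory, count, max_bytes=500_000, seed=0):
    """Create `count` files of random size (1..max_bytes bytes) in `directory`.

    Hypothetical helper: the names and layout are illustrative and are not
    taken from the original test setup.
    """
    rng = random.Random(seed)  # fixed seed so the size distribution is repeatable
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"in{i:05d}.dat")
        size = rng.randint(1, max_bytes)  # inclusive on both ends
        with open(path, "wb") as f:
            f.write(os.urandom(size))  # arbitrary bytes; content does not matter for cat
        paths.append(path)
    return paths
```

Calling make_inputs("input", 10000) would give the 10,000-file case; passing 5000 gives the size that completed normally.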
----- "Michael Wilde" <wilde at mcs.anl.gov> wrote:
> I just re-ran what I thought was my failing test, and it ran OK, but
> failed strangely in the swift cleanup process.
>
> Localhost coasters; provider staging.
>
> This run is at the moment on communicado in /tmp/wilde/run01 (I'm
> running in /tmp due to the nature of the I/O).
>
> The test is running 9999 cat jobs; one file in and one out per job.
> The file sizes are on the order of <500 KB each (random sizes).
>
> All 9999 files were produced, but then I got a lot of unlink messages
> and a strange exit code 11 error.
>
> The messages are in swift.stdouterr
>
> The script was executed using ./run.sh; the tc and sites files are in
> that run01 dir.
>
> This is worth looking at but low prio I think. I think the script
> terminated cleanly on smaller runs (-n=5, -n=100). So perhaps provider
> staging gets confused or has sync/mutex problems related to cleanup
> that occur at larger volumes of file copies???
>
> At any rate, this was *not* the error that I was referring to in the
> message below; in that test, staging died in the middle of a run. I
> will also try to test between two hosts.
>
> - Mike
>
>
> ----- "Mihael Hategan" <hategan at mcs.anl.gov> wrote:
>
> > On Sat, 2010-10-02 at 17:51 -0600, wilde at mcs.anl.gov wrote:
> > > ----- "Mihael Hategan" <hategan at mcs.anl.gov> wrote:
> > > >
> > > > Ok. I'll look at that. Just to be clear, you are talking about
> > > > gridftp=coaster rather than use.provider.staging, right?
> > >
> > > No, I don't *think* so!
> >
> > Ok.
> >
> > >
> > > What I meant above was provider staging via the coaster execution
> > > provider, which is the only coaster-based data transport technique
> > > I knew of.
> > >
> > > I'll try to replicate my test and send it.
> >
> > Ok. I tried 1024 jobs, 8 concurrent, 7MB files and I can't reproduce
> > it, so it may not be straightforward.
> >
> > >
> > > I didn't know there was such a thing as gridftp=coaster!
> > >
> > > Would that be done by saying <filesystem provider="coaster"> ?
> >
> > Yes.
> >
> > > I didn't know you could say either of those. Can you explain what
> > > that would do and how to say it? Is it a different data provider
> > > path than provider staging, but which still uses coasters?
> > > Independent of coaster execution? (I might be way off base here,
> > > sorry!)
> >
> > Yes, and yes. Though I recommend provider staging.
> >
> > Mihael
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory