[Swift-devel] Re: swift-falkon problem
Mihael Hategan
hategan at mcs.anl.gov
Fri Mar 21 09:02:30 CDT 2008
On Fri, 2008-03-21 at 07:12 -0500, Michael Wilde wrote:
> My latest tests on runs of 25, 100, and 1000 jobs seem to indicate that
> with a sync command at the end of the application script, all job status
> and data is returned ok every time.
Why not put it in the wrapper script at the end?
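Roughly what I have in mind, as a sketch only: the function name and the status file path below are made up for illustration and are not taken from the actual wrapper. The idea is just that the wrapper writes its status marker and then calls sync before it exits, so that every application gets the flush without having to add it to its own script.

```shell
#!/bin/sh
# Hypothetical tail end of a job wrapper script. The function name and
# the status file path are illustrative assumptions, not the real wrapper.

# finish_job STATUS_FILE
# Record that the job finished, then flush buffered writes.
finish_job() {
    status_file=$1
    # Create the status marker that the submit side looks for.
    touch "$status_file" || return 1
    # Ask the kernel to write out dirty buffers before the job exits,
    # so the files are more likely to be visible on the submit host.
    sync
}
```

The wrapper would call something like finish_job "$jobdir/status/success" as its last step, where $jobdir is again an assumed name. Whether sync alone is enough on this NFS setup is exactly what the extra runs would have to show.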
>
> (This is somewhat curious, as the info and success files for the current
> job would not yet be complete at the time, but the sync command affects all
> other activity on the host, and ensures that at least the currently
> existing dirs, files and data are synced, or that their sync has started).
>
> Without the sync, at the moment, virtually all jobs fail, and almost
> *no* data is being returned. Out of 3 runs of 1000 jobs, one run
> returned 2 data files, the other two returned no data files. One 100-job
> run without sync returned 11 of 100 files.
>
> It seems like the most fruitful testing to see if this sync is totally
> fixing the problem is to do lots more runs.
>
> I noted that the bblog host (from which I run Swift) has no special NFS
> mount flags, just rw. (I was wondering if they had something on that
> would affect coherence; seems not).
>
> I did not have a chance to capture the falkon logs in these tests; I
> will look for the ones Ioan mentioned, and try some runs with those logs
> captured.
>
> The swift logs I did capture are in the CI log dir, wilde/run{317-328}
>
> run317/comment:amps1 100 sico with sync - ran ok
> run318/comment:amps1 100 with no sync - died on first error
> run319/comment:amps1 without sync - 11 of 100 returned OK
> run320/comment:amps1 100 without sync - no data returned ok
> run321/comment:amps1 100 without sync - no data returned ok
> run322/comment:amps1 100 with sync - all data returned ok
> run323/comment:amps1 100 with sync - all data returned ok
> run324/comment:amps1 1000 with sync - all data returned ok
> run325/comment:amps1 1000 without sync - no data returned ok
> run326/comment:amps1 25 without sync - no data returned ok
> run327/comment:amps1 100 without sync - 2 data files returned ok
> run328/comment:amps1 1000 with sync - all data returned ok
>
> - Mike
>
> On 3/20/08 6:23 PM, Ben Clifford wrote:
> > There is a flag for NFS mounts, 'noac', which disables attribute caching on
> > clients, which I think may make the filesystem behave in the desired
> > fashion; however it sounds like it also massively reduces filesystem
> > performance and increases fileserver load.
> >
> > Mike, you might be able to persuade MCS systems to make such a filesystem
> > available.
> >
> > I suspect some multi-second delay after touching the status file and
> > before exiting in the wrapper script is probably the best workaround for
> > now, though.
> >
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>