[Swift-devel] Re: swift-falkon problem

Mihael Hategan hategan at mcs.anl.gov
Fri Mar 21 09:02:30 CDT 2008


On Fri, 2008-03-21 at 07:12 -0500, Michael Wilde wrote:
> My latest tests on runs of 25, 100, and 1000 jobs seem to indicate that
> with a sync command at the end of the application script, all job status
> and data are returned OK every time.

Why not put it in the wrapper script at the end?
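
A minimal sketch of that idea, written in Python for illustration only. The
real wrapper is a shell script; the function name run_with_sync and the
status-file layout here are hypothetical and not taken from Swift. It runs
the application, records the exit code, and only then asks the kernel to
flush dirty buffers, so the status file is included in the flush. Whether a
sync alone is enough on NFS is exactly what the runs quoted below are testing.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_with_sync(app_cmd, status_file):
    """Run app_cmd, write its exit code to status_file, then sync.

    Hypothetical helper: names and file layout are assumptions,
    not the actual Swift wrapper.
    """
    result = subprocess.run(app_cmd)
    # Write the status file first so it is among the buffers being flushed.
    Path(status_file).write_text(f"{result.returncode}\n")
    # os.sync() (Unix only) requests a write-out of all dirty buffers,
    # the same request the sync(1) command makes.
    os.sync()
    return result.returncode
```

For example, run_with_sync(["/bin/true"], "job.status") would leave "0" in
job.status and return 0 after the sync call has been issued.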

> 
> (This is somewhat curious, as the info and success files for the current
> job would not yet be complete at that time, but the sync command affects
> all other activity on the host, and ensures that at least the currently
> existing dirs, files and data are synced, or that their sync has started).
> 
> Without the sync, at the moment, virtually all jobs fail, and almost
> *no* data is being returned.  Out of 3 runs of 1000 jobs, one run
> returned 2 data files, the other two returned no data files. One 100-job
> run without sync returned 11 of 100 files.
> 
> It seems like the most fruitful way to test whether this sync is fully
> fixing the problem is to do lots more runs.
> 
> I noted that the bblog host (from which I run Swift) has no special NFS
> mount flags, just rw. (I was wondering if they had something on that
> would affect coherence; seems not).
> 
> I did not have a chance to capture the falkon logs in these tests; I
> will look for the ones Ioan mentioned, and try some runs with those logs
> captured.
> 
> The swift logs I did capture are in the CI log dir, wilde/run{317-328}
> 
> run317/comment:amps1 100 sico with sync - ran ok
> run318/comment:amps1 100 with no sync - died on first error
> run319/comment:amps1 without sync - 11 of 100 returned OK
> run320/comment:amps1 100 without sync - no data returned ok
> run321/comment:amps1 100 without sync - no data returned ok
> run322/comment:amps1 100 with sync - all data returned ok
> run323/comment:amps1 100 with sync - all data returned ok
> run324/comment:amps1 1000 with sync - all data returned ok
> run325/comment:amps1 1000 without sync - no data returned ok
> run326/comment:amps1 25 without sync - no data returned ok
> run327/comment:amps1 100 without sync - 2 data files returned ok
> run328/comment:amps1 1000 with sync - all data returned ok
> 
> - Mike
> 
> On 3/20/08 6:23 PM, Ben Clifford wrote:
> > There is a flag for NFS mounts, 'noac', which disables attribute caching on 
> > clients, which I think may make the filesystem behave in the desired 
> > fashion; however, it sounds like it also massively reduces filesystem 
> > performance and increases fileserver load.
> > 
> > Mike, you might be able to persuade MCS systems to make such a filesystem 
> > available.
> > 
> > I suspect some multi-second delay after touching the status file and 
> > before exiting in the wrapper script is probably the best workaround for 
> > now, though.
> > 
> 



