[Swift-devel] Re: swift-falkon problem

Wed Mar 19 16:22:46 CDT 2008

On Wed, 19 Mar 2008, Michael Wilde wrote:

> My (likely outdated) understanding of the NFS protocol was that it's supposed
> to guarantee close-to-open coherence.  Meaning that if two clients want to
> access a file sequentially, and the writing client closes the file before the
> reading client opens the file, then NFS was supposed to ensure that the reader
> correctly saw the existence and content of the file.

Right.
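
To make the ordering concrete, here is a minimal sketch in Python of the
access pattern that close-to-open coherence is meant to cover. The function
names and the "status" file are made up for illustration and are not taken
from the Swift or Falkon code. The guarantee only concerns a reader that
opens the file after the writer's close has returned.

```python
import os


def write_and_close(path, data):
    """Writer side: write the data, flush it, and close the file."""
    with open(path, "w") as f:
        f.write(data)
        f.flush()
        # Ask the OS to push the data out before the close.
        os.fsync(f.fileno())
    # Leaving the "with" block closes the file; this close is the
    # point the close-to-open guarantee is defined against.


def open_and_read(path):
    """Reader side: a fresh open made after the writer's close."""
    with open(path) as f:
        return f.read()
```

On a local filesystem this always works. Between two NFS clients it is
supposed to work too, provided the reader's open really happens after the
writer's close.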

Linux NFS had some problem there about half a decade ago (I think it caused 
problems for GRAM2 somewhere, for example), though I do not remember the 
details. That was long enough ago that the behaviour has a good chance of 
being different now.

A quick google did not find anything that immediately applied.

I've also still not entirely ruled out a race somewhere in the 
falkon->provider-deef->swift stack reporting this.

> If others agree that this should still be the case, then it's worth 
> looking at our code to make sure that this is the case.  If it wasn't, 
> you'd think that more things would break, but perhaps Falkon exacerbates 
> any problems in that area due to its low latency.

Indeed, the combination of falkon and local filesystem access is probably 
getting the time between touching the status file on one node and reading 
it on another down pretty low compared to other submission and file access 
protocols.
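
If the gap between the write and the read does turn out to matter, one
possible workaround is for the reading side to poll for the status file for
a bounded time instead of checking once. The following is only a sketch
under that assumption: wait_for_file and its timeout and interval parameters
are hypothetical, and this is not what Swift currently does.

```python
import os
import time


def wait_for_file(path, timeout=5.0, interval=0.1):
    """Poll until path exists or timeout seconds have passed.

    Returns True if the file was seen, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

This would only hide a short visibility delay. It would not help if the
underlying problem is a race elsewhere in the stack.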

> The race as far as I know is between the worker writing and moving result,
> info, and success status files, and the swift host seeing these, correct?

That's what your logs look like today. But yesterday had different timings 
that suggested a different problem.

More runs of the kind that failed would be useful, along with the 
corresponding falkon logs that Ioan listed in a mail in this thread.

--