Inconsistent results on bluegene (reproduce the same problem on ANL's BG/L)

Robert Latham robl at mcs.anl.gov
Fri Jun 2 14:24:42 CDT 2006


On Thu, May 18, 2006 at 07:40:08AM -0700, Yu-Heng Tseng wrote:
> The test case is run under home directory. The inconsistent problem 
> exists for both ANL's and NCAR's BG/L system. So I suspect this may 
> not be a single issue. Thank you so much if someone can solve this 
> problem.

Hi Yu-Heng,

Sorry for the delay in getting back to you, but I've had a chance to
look at this a little bit now.  I too see these non-zero results for
16 processes on both NFS and PVFS2 (when I change the file name in the
test program).

This is probably a filesystem issue.  NFS caching can cause problems
in some cases.  Further, both PVFS2 and NFS ignore fcntl lock
requests.  Additionally, PVFS2 on Argonne's BG/L is treated like a
regular Unix file system.

ROMIO's noncontiguous I/O optimizations on Unix-like file systems
perform a read-modify-write, and require file locking to eliminate the
possibility of false sharing among different processes.  ROMIO can try
to work around this for NFS, but I don't think IBM's ROMIO is
configured for that (nfs:/path/to/file results in "unsupported file
system").
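
To make the false-sharing point concrete, here is a toy Python sketch.
It is not ROMIO code: the function names and the 4-byte "file" are
made up for illustration, and the lock is modeled simply by running
one read-modify-write to completion before the next begins.  Two
processes own interleaved bytes of the same block; each reads the
whole block, patches its own bytes, and writes the whole block back.

```python
def sieve_unlocked(file_buf, updates_a, updates_b):
    """Both processes read the block before either writes it back."""
    copy_a = bytearray(file_buf)   # process A reads the block
    copy_b = bytearray(file_buf)   # process B reads the same block
    for offset, value in updates_a:
        copy_a[offset] = value
    for offset, value in updates_b:
        copy_b[offset] = value
    file_buf[:] = copy_a           # A writes the whole block back
    file_buf[:] = copy_b           # B writes stale bytes over A's data
    return file_buf


def sieve_locked(file_buf, updates_a, updates_b):
    """A lock makes each read-modify-write finish before the next starts."""
    for updates in (updates_a, updates_b):
        copy = bytearray(file_buf)     # read under the lock
        for offset, value in updates:
            copy[offset] = value
        file_buf[:] = copy             # write back, then release
    return file_buf


# Hypothetical access pattern: A owns offsets 0 and 2, B owns 1 and 3.
updates_a = [(0, 1), (2, 1)]
updates_b = [(1, 2), (3, 2)]

unlocked = sieve_unlocked(bytearray(4), updates_a, updates_b)
locked = sieve_locked(bytearray(4), updates_a, updates_b)

print(list(unlocked))  # [0, 2, 0, 2]: A's bytes are lost
print(list(locked))    # [1, 2, 1, 2]: both updates survive
```

If the file system silently ignores the lock request, the real code
can end up in the first case even though it asked for the second.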

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B



