Inconsistent results on bluegene (reproduce the same problem on ANL's BG/L)
Robert Latham
robl at mcs.anl.gov
Fri Jun 2 14:24:42 CDT 2006
On Thu, May 18, 2006 at 07:40:08AM -0700, Yu-Heng Tseng wrote:
> The test case is run under home directory. The inconsistent problem
> exists for both ANL's and NCAR's BG/L system. So I suspect this may
> not be a single issue. Thank you so much if someone can solve this
> problem.
Hi Yu-Heng,
Sorry for the delay in getting back to you, but I've had a chance to
look at this a little bit now. I too see these non-zero results for
16 processes on both NFS and PVFS2 (when I change the file name in the
test program).
This is probably a file system issue. NFS caching can cause problems
in some cases. Further, both PVFS2 and NFS ignore fcntl lock
requests. Additionally, PVFS2 on Argonne's BG/L is treated like a
regular Unix file system.
ROMIO's noncontiguous I/O optimizations on Unix-like file systems
perform a read-modify-write, and require file locking to eliminate the
possibility of false sharing among different processes. ROMIO can try
to work around this for NFS, but I don't think IBM's ROMIO is
configured for that (nfs:/path/to/file results in "unsupported file
system").
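
To illustrate why the read-modify-write needs locking, here is a minimal
Python sketch. It is a toy model, not ROMIO code: the "file" is an
in-memory bytearray, the two "processes" are simulated by a fixed
interleaving of steps, and all function names (sieve_read, sieve_modify,
sieve_write, run_unlocked, run_locked) are made up for this example.
Each process wants to update a few noncontiguous bytes, so it reads the
whole enclosing extent, patches its own bytes, and writes the whole
extent back.

```python
def sieve_read(file_bytes, start, end):
    """Read the whole extent [start, end) into a private buffer."""
    return bytearray(file_bytes[start:end])


def sieve_modify(buf, start, updates):
    """Patch only this process's own bytes inside the private buffer."""
    for offset, value in updates.items():
        buf[offset - start] = value


def sieve_write(file_bytes, start, buf):
    """Write the whole extent back, including bytes this process never touched."""
    file_bytes[start:start + len(buf)] = buf


def run_unlocked(size, updates_a, updates_b):
    """Both processes read before either writes (lock requests ignored)."""
    f = bytearray(size)
    buf_a = sieve_read(f, 0, size)
    buf_b = sieve_read(f, 0, size)
    sieve_modify(buf_a, 0, updates_a)
    sieve_modify(buf_b, 0, updates_b)
    sieve_write(f, 0, buf_a)
    sieve_write(f, 0, buf_b)  # overwrites A's bytes with stale data
    return f


def run_locked(size, updates_a, updates_b):
    """A working lock forces each read-modify-write to complete in turn."""
    f = bytearray(size)
    for updates in (updates_a, updates_b):
        buf = sieve_read(f, 0, size)
        sieve_modify(buf, 0, updates)
        sieve_write(f, 0, buf)
    return f


if __name__ == "__main__":
    a = {0: 1, 2: 1}  # process A owns the even offsets
    b = {1: 2, 3: 2}  # process B owns the odd offsets
    print("unlocked:", list(run_unlocked(4, a, b)))
    print("locked:  ", list(run_locked(4, a, b)))
```

In the unlocked run, process A's bytes are lost even though A and B never
wrote to the same offset. That is the false sharing described above, and
it is what a file system that silently ignores lock requests can produce.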
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA B29D F333 664A 4280 315B