pnetcdf and mvapich2 2.2

Wei-keng Liao wkliao at eecs.northwestern.edu
Fri Feb 3 17:21:37 CST 2017


Hi, Mark

The Lustre driver in mvapich2 appears to add O_CREAT to the open flags
(line 50 of src/mpi/romio/adio/ad_lustre/ad_lustre_open.c), even when
the file is opened read-only. Opening a nonexistent file then creates an
empty file instead of failing, which is the root cause of one of the
error messages you are seeing:
   "expect error code NC_ENOENT but got NC_ENOTNC"
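
The effect of the extra flag can be illustrated with plain POSIX calls.
This is only a minimal sketch, not the attached test program: it uses
open() directly rather than MPI-IO, and the function names try_open and
file_exists are made up for illustration. It assumes a POSIX system and
a path that does not exist beforehand.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only, optionally with extra
 * flags such as O_CREAT. Returns 0 on success (the descriptor is closed
 * again) or the errno value reported by open() on failure. */
int try_open(const char *path, int extra_flags)
{
    assert(path != NULL);
    /* The mode argument is only consulted when O_CREAT is set. */
    int fd = open(path, O_RDONLY | extra_flags, 0600);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/* Hypothetical helper: returns 1 if `path` exists, 0 otherwise. */
int file_exists(const char *path)
{
    struct stat sb;
    return stat(path, &sb) == 0;
}
```

For a missing file, try_open(path, 0) returns ENOENT and creates
nothing, whereas try_open(path, O_CREAT) succeeds and leaves an empty
file behind. An empty file is not a valid NetCDF file, which would
explain why NC_ENOTNC is reported instead of NC_ENOENT.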

Attached is a small MPI program that reproduces this error. Could you
please give it a try on your mvapich2 build and Lustre file system?
Compile command:
    mpicc test_open_no_such_file.c -o test_open_no_such_file
Run command:
    mpiexec -n 1 ./test_open_no_such_file /lustre/path/non-exist-file

If it is indeed an internal issue in mvapich2, I can file a bug report
with its developers.
Thanks.


Wei-keng

On Feb 3, 2017, at 12:25 PM, Wei-keng Liao wrote:

> Hi, Mark
> 
> For running "make check" on Lustre, could you please set the environment
> variable PNETCDF_HINTS to "nc_header_align_size=512;nc_var_align_size=1"
> and run "make check" again? I think it should pass make check. Do let me
> know. These errors only occur on file systems whose striping size is
> larger than 1, so ext4 is not affected. I am working on a fix for that
> test program. Please note this is a bug in the test program. The PnetCDF
> library itself is intact.
> 
> When running "make check", I suggest not setting the environment variable
> PNETCDF_VERBOSE_DEBUG_MODE, as many of the tests trigger errors on
> purpose. Those debugging messages can easily mask the true errors. That
> environment variable is designed for testing one program at a time.
> 
> As for the errors from mvapich2, I do not have access to a machine with
> InfiniBand and thus could not give it a try. However, the errors look
> similar to an issue discovered in OpenMPI recently: failing to
> return the correct MPI error codes. I will look into the mvapich2 source
> code to confirm.
> 
> Thanks for trying various compilers and reporting the problem!
> 
> Wei-keng
> 


