[mpich-discuss] MPI_File_open() fails on local + NFS file system

Audet, Martin Martin.Audet at imi.cnrc-nrc.gc.ca
Thu Apr 21 16:20:47 CDT 2011


Hi MPICH_Developers,

We are unable to use MPI_File_open() on a cluster where the first node (the master node) mounts a local file system and exports it via NFS to a few compute nodes, so that /home on both the master node and the compute nodes refers to the same directory.

When a job composed of one (or more) process on the master node and one (or more) process on a compute node is started, calling MPI_File_open() to create a new file either makes the program abort (if the process of rank 0 is on the master node, using the local file system) or freeze (if the process of rank 0 is on a compute node, accessing the file via NFS).
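Our program essentially does the equivalent of the following minimal test (the path and the error check below are only illustrative, not our actual application code):

/* Minimal sketch of the collective create/open that fails for us.
 * The file name is an example; /home is local on the master node and
 * an NFS mount on the compute nodes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    int rank, err;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Collective open creating a new file; this is where we see the
     * abort or the freeze depending on where rank 0 runs. */
    err = MPI_File_open(MPI_COMM_WORLD, "/home/user/testfile",
                        MPI_MODE_CREATE | MPI_MODE_WRONLY,
                        MPI_INFO_NULL, &fh);
    if (err != MPI_SUCCESS) {
        fprintf(stderr, "rank %d: MPI_File_open failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

We build it with mpicc and launch it with mpiexec, placing at least one rank on the master node and one on a compute node.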

When the program freezes, an inspection with gdb shows that the process of rank 0 is stuck in an MPI_Bcast() called by MPI_File_open().

Note that this happens with many mpich2 versions, from 1.0.7 to 1.4rc2.

So we would like to know whether the configuration we are using is supported or not.

Thanks,

Martin Audet

