[MPICH] MPI_File_read_all hanging

Wei-keng Liao wkliao at ece.northwestern.edu
Sat Feb 2 17:49:42 CST 2008


An update to this problem.

On the same test machine, I built mpich2 1.0.2, 1.0.3, 1.0.4, and 1.0.5, 
then compiled and ran the same subarray program against each version. 
Only mpich2-1.0.2 runs without a problem. The others hang in the same way.

Wei-keng


On Fri, 1 Feb 2008, Wei-keng Liao wrote:
> 
> I have an I/O program hanging on MPI_File_read_all. The code is the 
> attached C file. It writes 20 3D block-block-block partitioned arrays, 
> closes the file, re-opens it, and reads the 20 arrays back, also in the 
> same 3D block pattern. It is similar to the ROMIO 3D test code, 
> coll_test.c.
> 
> The error occurred only when I ran on 64 processes, not with fewer (the 
> machine I ran on has 2 processors per node). The 20 writes are OK, but the 
> program hangs at around the 10th read. Tracing into the source, it hangs on
>            MPI_Waitall(nprocs_recv, requests, statuses);
> in function ADIOI_R_Exchange_data(), in file ad_read_coll.c.
> 
> I am using mpich2-1.0.6p1 on a Linux cluster:
> 2.6.9-42.0.10.EL_lustre-1.4.10.1smp #1 SMP x86_64 x86_64 x86_64 GNU/Linux
> 
> gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)
> 
> 
> Wei-keng
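
For readers without the attachment, the 3D block-block-block partitioning
described in the quoted message can be sketched in plain C as below. This is
only an illustration, not the attached program: the block3d struct and
block3d_init() are hypothetical names, the row-major mapping of ranks to grid
coordinates is an assumption, and the array is assumed to divide evenly among
the processes. The three arrays it fills are the kind of arguments that
MPI_Type_create_subarray() takes when building a file view.

```c
/* Hypothetical helper, not taken from the attached program: describes
 * one rank's block in a 3D block-block-block decomposition. */
typedef struct {
    int sizes[3];     /* global array extent in each dimension      */
    int subsizes[3];  /* extent of this rank's block                */
    int starts[3];    /* global index of the block's first element  */
} block3d;

/* Fill in *b for `rank` on a psizes[0] x psizes[1] x psizes[2] process
 * grid. Assumptions: ranks map to grid coordinates in row-major (C)
 * order, and every gsizes[d] is divisible by psizes[d].
 * Returns 0 on success, -1 on invalid input. */
int block3d_init(block3d *b, int rank,
                 const int gsizes[3], const int psizes[3])
{
    int coords[3];
    int d;

    /* Reject empty grids and uneven splits before dividing. */
    for (d = 0; d < 3; d++) {
        if (psizes[d] <= 0 || gsizes[d] <= 0 || gsizes[d] % psizes[d] != 0)
            return -1;
    }
    if (rank < 0 || rank >= psizes[0] * psizes[1] * psizes[2])
        return -1;

    /* Row-major: the last dimension varies fastest with rank. */
    coords[2] = rank % psizes[2];
    coords[1] = (rank / psizes[2]) % psizes[1];
    coords[0] = rank / (psizes[2] * psizes[1]);

    for (d = 0; d < 3; d++) {
        b->sizes[d]    = gsizes[d];
        b->subsizes[d] = gsizes[d] / psizes[d];
        b->starts[d]   = coords[d] * b->subsizes[d];
    }
    return 0;
}

/* With MPI available, the result would typically be passed on as:
 *   MPI_Type_create_subarray(3, b.sizes, b.subsizes, b.starts,
 *                            MPI_ORDER_C, MPI_INT, &filetype);
 * where MPI_INT stands in for whatever element type the program uses. */
```

For example, with an assumed 256 x 256 x 256 global array on a 4 x 4 x 4
process grid (64 processes), each rank owns a 64 x 64 x 64 block, and rank 63
starts at global index (192, 192, 192).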



