[MPICH] MPI_File_read_all hanging
Wei-keng Liao
wkliao at ece.northwestern.edu
Sat Feb 2 17:49:42 CST 2008
An update to this problem.
On the same test machine, I built mpich2 1.0.2, 1.0.3, 1.0.4, and 1.0.5,
then compiled and ran the same subarray program against each version.
Only mpich2-1.0.2 runs without a problem; the others hang in the same way.
Wei-keng
On Fri, 1 Feb 2008, Wei-keng Liao wrote:
>
> I have an I/O program hanging on MPI_File_read_all. The code is in the
> attached C file. It writes 20 3D block-block-block partitioned arrays,
> closes the file, re-opens it, and reads the 20 arrays back, using the
> same 3D block pattern. It is similar to the ROMIO 3D test code,
> coll_test.c.
>
> The error occurred when I ran on 64 processes, but not on fewer (the machine
> has 2 processors per node). All 20 writes complete fine, but the program
> hangs at around the 10th read. Tracing into the source shows that it hangs on
> MPI_Waitall(nprocs_recv, requests, statuses);
> in the function ADIOI_R_Exchange_data(), in file ad_read_coll.c.
>
> I am using mpich2-1.0.6p1 on a Linux cluster
> 2.6.9-42.0.10.EL_lustre-1.4.10.1smp #1 SMP x86_64 x86_64 x86_64 GNU/Linux
>
> gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)
>
>
> Wei-keng
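
For readers without the attachment, the block-block-block partitioning described in the quoted message can be sketched roughly as below. This is only an illustration, not the attached program: the helper name block_decompose_3d, the row-major mapping of ranks to grid coordinates, and the requirement that each global dimension divide evenly by the process grid are all assumptions made for the sketch.

```c
#include <assert.h>

/* Hypothetical helper (not from the attached program): compute the block
 * owned by `rank` when a 3D array of global size gsizes[] is partitioned
 * block-block-block over a psizes[0] x psizes[1] x psizes[2] process grid.
 * Assumptions: ranks map to grid coordinates in row-major order (last
 * dimension varies fastest), and each gsizes[d] is divisible by psizes[d]. */
void block_decompose_3d(int rank, const int psizes[3], const int gsizes[3],
                        int subsizes[3], int starts[3])
{
    int coords[3];

    assert(rank >= 0 && rank < psizes[0] * psizes[1] * psizes[2]);

    /* Convert the linear rank into 3D grid coordinates. */
    coords[2] = rank % psizes[2];
    coords[1] = (rank / psizes[2]) % psizes[1];
    coords[0] = rank / (psizes[1] * psizes[2]);

    for (int d = 0; d < 3; d++) {
        assert(gsizes[d] % psizes[d] == 0);
        subsizes[d] = gsizes[d] / psizes[d];   /* local block extent */
        starts[d] = coords[d] * subsizes[d];   /* offset in the global array */
    }
}
```

With 64 processes arranged as a 4 x 4 x 4 grid (an assumed layout), each rank would pass gsizes, subsizes, and starts to MPI_Type_create_subarray with MPI_ORDER_C, use the resulting type as the filetype in MPI_File_set_view, and then call MPI_File_write_all or MPI_File_read_all once per array.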