[mpich-discuss] MPI_File_set_view Error: ADIOI_Count_contiguous_blocks

Ingo Bojak I.Bojak at science.ru.nl
Thu May 29 08:59:00 CDT 2008


I've been having trouble setting a file view made up of blocks of 
doubles of different lengths, with gaps in between and potentially gaps 
at the beginning and end. Since this is my first time using MPI-2 IO, 
I thought the errors came from mistakes in resizing to get holes at the 
beginning/end. But it seems that the resizing per se is the problem. 
A somewhat edited code snippet follows:

---
// lb: lower bound, ext: extent
MPI_Aint lb,ext;

// noblo: number of blocks, disp: list of displacements,
// blen: list of block lengths
int noblo,*disp,*blen;

// sdump: file handle, stat: file status
MPI_File sdump;
MPI_Status stat;

// datatype used for filetype
MPI_Datatype loc_double;

if (MPI_Type_indexed(noblo,blen,disp,MPI_DOUBLE,&loc_double)!=MPI_SUCCESS)
  sstop("setview: typing loc_double",ERR60);

if (MPI_Type_get_extent(loc_double,&lb,&ext)!=MPI_SUCCESS)
   sstop("setview: get extent double",ERR61);

// *** the following lines to be commented out ***

if (MPI_Type_create_resized(loc_double,lb,ext,&loc_double)!=MPI_SUCCESS)
   sstop("setview: resize loc_double",ERR62);

// *** the previous lines to be commented out ***

if (MPI_Type_commit(&loc_double)!=MPI_SUCCESS)
   sstop("setview: commit loc_double",ERR63);

if (MPI_File_open(MPI_COMM_WORLD,sdname,MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL,&sdump)!=MPI_SUCCESS)
   sstop("stdump: can't open file",ERR56);

// error occurs in the line below, *if* lines above are *not* commented out
if (MPI_File_set_view(sdump,0,MPI_DOUBLE,loc_double,"native",MPI_INFO_NULL)!=MPI_SUCCESS)
   sstop("stdump: can't set view",ERR57);
---

"sstop" is just my wrapper for MPI_Abort with some error output. If the 
indicated lines are commented out, the program runs fine (although the 
output is wrong). If the indicated lines are in, I get the following error:

Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks, combiner = 18

The program is then aborted, since I abort on anything other than 
MPI_SUCCESS. Note that the resize above leaves everything as it was 
before (not so in the original code, of course, but I was getting the 
same error when I was actually resizing with intent). A simplified, but 
otherwise typical, version of what may be found in noblo, blen, and disp:

Intended writing pattern for ranks 0,1,2:
0011010022122202

rank 0
------
noblo=4;
disp[4]={0, 4, 6,14};
blen[4]={2,1,2,1};

rank 1
------
noblo=3;
disp[3]={2,5,10};
blen[3]={2,1,1};

rank 2
------
noblo=2;
disp[2]={8,11};
blen[2]={2,3};

I would then have tried to resize the indexed datatype to lb=0,ext=16*8 or 
perhaps lb=current_lb,ext=16*8-current_lb. I'm not sure which of the 
two is correct so that multiple writes advance all ranks in such a way that 
they always write in the same intended block pattern. But anyhow, I couldn't 
make progress on that because of the above error, which seems to be 
independent of the actual resize made.

For now I usually run the whole thing under Cygwin on a WinXP box:
mpicc -Wall prog.c
mpiexec -n 16 ./a.exe
I use the single machine for rapid debugging. But I have also run 
this on one of the target clusters (Linux, gcc) and I get exactly the 
same problem!

Am I being stupid (likely...) or is there some problem with the MPI 
library function? Any ideas?

Thanks,
Ingo



