[mpich-discuss] MPI_File_set_view Error: ADIOI_Count_contiguous_blocks
Ingo Bojak
I.Bojak at science.ru.nl
Thu May 29 08:59:00 CDT 2008
I've been having trouble setting a file view that consists of blocks of
doubles of different lengths, with gaps in between and potentially gaps
at the beginning and end. Since this is my first time using MPI*2*-IO,
I thought the errors came from mistakes in resizing to get holes at the
beginning/end. But it seems that the resizing per se is the problem.
A somewhat edited code snippet follows:
---
// lb: lower bound, ext: extent
MPI_Aint lb,ext;
// noblo: number of blocks, disp: list of displacements,
// blen: list of block lengths
int noblo,*disp,*blen;
// sdump: file handle, stat: file status
MPI_File sdump;
MPI_Status stat;
// datatype used for filetype
MPI_Datatype loc_double;

if (MPI_Type_indexed(noblo,blen,disp,MPI_DOUBLE,&loc_double)!=MPI_SUCCESS)
  sstop("setview: typing loc_double",ERR60);
if (MPI_Type_get_extent(loc_double,&lb,&ext)!=MPI_SUCCESS)
  sstop("setview: get extent double",ERR61);
// *** the following lines to be commented out ***
if (MPI_Type_create_resized(loc_double,lb,ext,&loc_double)!=MPI_SUCCESS)
  sstop("setview: resize loc_double",ERR62);
// *** the previous lines to be commented out ***
if (MPI_Type_commit(&loc_double)!=MPI_SUCCESS)
  sstop("setview: commit loc_double",ERR63);
if (MPI_File_open(MPI_COMM_WORLD,sdname,MPI_MODE_CREATE |
                  MPI_MODE_WRONLY,MPI_INFO_NULL,&sdump)!=MPI_SUCCESS)
  sstop("stdump: can't open file",ERR56);
// error occurs in the line below, *if* lines above are *not* commented out
if (MPI_File_set_view(sdump,0,MPI_DOUBLE,loc_double,"native",
                      MPI_INFO_NULL)!=MPI_SUCCESS)
  sstop("stdump: can't set view",ERR57);
---
"sstop" is just my wrapper for MPI_Abort with some error output. If the
indicated lines are commented out, the program runs fine (although the
output is wrong). If the indicated lines are in, I get the following error:
Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks,
combiner = 18
The program is then aborted, since I trigger on anything other than
MPI_SUCCESS. Note that the resize in the snippet above leaves everything
as it was before (not so in the original code, of course, but I was
getting the same error when I was actually resizing with intent). A
simplified, but otherwise typical, version of what may be found in
noblo, blen, and disp:
Intended writing pattern for ranks 0,1,2:
0011010022122202
rank 0
------
noblo=4;
disp[4]={0, 4, 6,14};
blen[4]={2,1,2,1};
rank 1
------
noblo=3;
disp[3]={2,5,10};
blen[3]={2,1,1};
rank 2
------
noblo=2;
disp[2]={8,11};
blen[2]={2,3};
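Lists like these can be cross-checked against the intended pattern by painting them into a 16-slot array. The following is a minimal sketch in plain C (no MPI required); the names `NSLOTS`, `paint_layout` and `build_example_layout` are hypothetical helpers introduced only for this illustration, and the arrays are copied from the rank listings above.

```c
#include <assert.h>
#include <string.h>

#define NSLOTS 16  /* slots (doubles) in one repetition of the pattern */

/* Hypothetical helper: mark the slots owned by one rank in `layout`.
 * Returns 0 on success, -1 if a block runs out of range or touches
 * a slot that is already taken by another block. */
static int paint_layout(char *layout, char tag,
                        int noblo, const int *disp, const int *blen)
{
    assert(layout != NULL && disp != NULL && blen != NULL);
    for (int b = 0; b < noblo; b++) {
        for (int i = 0; i < blen[b]; i++) {
            int slot = disp[b] + i;
            if (slot < 0 || slot >= NSLOTS || layout[slot] != '.')
                return -1;
            layout[slot] = tag;
        }
    }
    return 0;
}

/* Build the combined layout for the three ranks as listed above.
 * `layout` must hold NSLOTS + 1 chars; unassigned slots stay '.'. */
static int build_example_layout(char *layout)
{
    static const int disp0[] = {0, 4, 6, 14}, blen0[] = {2, 1, 2, 1};
    static const int disp1[] = {2, 5, 10},    blen1[] = {2, 1, 1};
    static const int disp2[] = {8, 11},       blen2[] = {2, 3};

    memset(layout, '.', NSLOTS);
    layout[NSLOTS] = '\0';
    if (paint_layout(layout, '0', 4, disp0, blen0)) return -1;
    if (paint_layout(layout, '1', 3, disp1, blen1)) return -1;
    if (paint_layout(layout, '2', 2, disp2, blen2)) return -1;
    return 0;
}
```

With the lists exactly as given, the blocks do not overlap and slots 0 to 14 match the intended pattern, but slot 15 stays unassigned ('.'). The simplified lists therefore do not reproduce the final "2" of the pattern.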
I would then have tried to resize the indexed datatype to lb=0,ext=16*8 or
perhaps lb=current_lb,ext=16*8-current_lb. I'm not sure which of the
two is correct so that multiple writes advance all ranks such that they
always write in the same intended block pattern. But anyhow, I couldn't
make progress on that because of the above error, which seems to be
independent of the actual resize made.
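The two candidate resizes can be compared with plain arithmetic, without calling MPI at all. The sketch below assumes the standard datatype rule that repetition k of a type starts k*extent bytes after the first one while the displacements inside the type stay unchanged. It also assumes a view displacement of 0 and 8-byte doubles. `tile_offset` and `tile_slot` are hypothetical helper names, not MPI functions.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset in the file of the element at displacement `elem_disp`
 * (counted in doubles inside the filetype) in repetition `tile` of a
 * filetype whose extent is `extent_bytes`. Assumption: the view starts
 * at displacement 0 and repetitions are spaced by the extent. */
static long tile_offset(long tile, long extent_bytes, int elem_disp)
{
    assert(tile >= 0 && extent_bytes > 0 && elem_disp >= 0);
    return tile * extent_bytes + (long)elem_disp * (long)sizeof(double);
}

/* The same position expressed as a slot index (in doubles). */
static long tile_slot(long tile, long extent_bytes, int elem_disp)
{
    return tile_offset(tile, extent_bytes, elem_disp) / (long)sizeof(double);
}
```

For rank 1 (first displacement 2, so current_lb = 2*8 = 16 bytes), `tile_slot(1, 16*8, 2)` gives slot 18, which is slot 2 of the second 16-slot repetition. By contrast, `tile_slot(1, 16*8 - 16, 2)` gives slot 16, which is rank 0's slot 0 in the second repetition. Under the stated assumptions this suggests that lb=0,ext=16*8 is the variant that keeps all ranks on the same 16-double period.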
At the moment I usually run the whole thing under Cygwin on a WinXP box:
mpicc -Wall prog.c
mpiexec -n 16 ./a.exe
I use the single machine for rapid debugging. But I have also started
this on one of the target clusters (Linux, gcc) and I get exactly the
same problem!
Am I being stupid (likely...) or is there some problem with the MPI
library function? Any ideas?
Thanks,
Ingo