[mpich-discuss] filetype_is_contig in ad_write_coll.c
Wei-keng Liao
wkliao at ece.northwestern.edu
Thu Nov 4 12:24:19 CDT 2010
Hi, Rob,
It is a post from almost a year ago :)
I need to refresh my memory.
After rereading the code, I think using contig_access_count is more
accurate, since it is calculated from both the filetype and the buffer type.
Wei-keng
On Oct 29, 2010, at 4:19 PM, Rob Latham wrote:
> Hi Wei-keng. I'm reaching way back into the archives to answer your
> questions. Sorry for the ridiculously long latency of my reply: see
> below...
>
> On Mon, Dec 21, 2009 at 12:39:31AM -0600, Wei-keng Liao wrote:
>> In file mpich2-1.2.1/src/mpi/romio/adio/common/ad_write_coll.c,
>> function ADIOI_GEN_WriteStridedColl(), line 145, the call to
>> ADIOI_Datatype_iscontig() may not be necessary if cb_write is
>> not disabled, i.e. fd->hints->cb_write != ADIOI_HINT_DISABLE
>
> I agree that sometimes ADIOI_Datatype_iscontig() is a little
> conservative. I think it reports "contig" only if the type is a named
> type. Maybe the fix is to tune ADIOI_Datatype_iscontig()?
>
>> Whether the filetype is contiguous is actually already determined by
>> the call to ADIOI_Calc_my_off_len() at line 160 above, which returns
>> contig_access_count; if it is equal to 1, then filetype_is_contig
>> should be 1.
>>
>> However, I found that in some cases ADIOI_Datatype_iscontig() does
>> not report contiguity correctly. I tried a case that uses
>> MPI_Type_create_subarray() to create a contiguous filetype, but
>> ADIOI_Datatype_iscontig() returns false. After checking the
>> flattened type, I found that it contains 2 or 3 non-contiguous
>> segments, i.e. flat_file->count == 2 or 3.
>>
>> In the case of flat_file->count == 2, either flat_file->blocklens[0]
>> or flat_file->blocklens[1] is 0.
>>
>> In the case of flat_file->count == 3, both flat_file->blocklens[0]
>> and flat_file->blocklens[2] are 0.
>>
>> In both cases, only one element of flat_file->blocklens[] is
>> non-zero.
>
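The observation above (only one non-zero entry in blocklens[]) can be sketched in C as follows. This is only an illustration and not the attached patch: `flat_type_sketch` is a hypothetical stand-in for ROMIO's flattened-type node that models just `count`, `blocklens`, and `indices`, and both helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ROMIO's flattened-datatype node. Only the
 * fields discussed in this thread are modeled (assumption). */
typedef struct {
    int count;                  /* number of (offset, length) segments */
    const long long *blocklens; /* length of each segment, in bytes */
    const long long *indices;   /* byte offset of each segment */
} flat_type_sketch;

/* Count the segments that actually carry data (length > 0). */
int nonzero_segments(const flat_type_sketch *flat)
{
    assert(flat != NULL);
    int n = 0;
    for (int i = 0; i < flat->count; i++) {
        if (flat->blocklens[i] > 0)
            n++;
    }
    return n;
}

/* Return 1 if exactly one segment carries data, i.e. the zero-length
 * entries are only markers and a single tile is one contiguous block. */
int single_data_segment(const flat_type_sketch *flat)
{
    return nonzero_segments(flat) == 1;
}
```

Note that this only says that one tile of the filetype holds a single block of data. It says nothing about what happens when the filetype is tiled more than once.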
> I think your optimization might be a tad aggressive here. You've
> constructed a file datatype with an LB of 0 and an UB of 400, so even
> though there's just one 80-byte region to be written in *this* case,
> what if 'buf_size' was bigger? We'd tile the file type and end up
> with a non-contiguous access pattern, right?
>
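The tiling concern can be made concrete with a small C sketch. It assumes a filetype with exactly one data block of `data_len` bytes inside an extent of `extent` bytes (80 and 400 in the case discussed here), and the function name `regions_touched` is hypothetical. It counts how many separate file regions a request of `nbytes` bytes would touch.

```c
#include <assert.h>

/* Number of separate contiguous file regions touched when a filetype
 * with one data block of data_len bytes per extent is tiled to cover
 * nbytes of data. Assumes a single data block per tile. */
long long regions_touched(long long data_len, long long extent,
                          long long nbytes)
{
    assert(data_len > 0 && extent >= data_len && nbytes >= 0);
    if (nbytes == 0)
        return 0;
    if (data_len == extent)
        return 1; /* no holes: successive tiles abut in the file */
    /* Each tile contributes one block, separated from the next block
     * by a hole of (extent - data_len) bytes. */
    return (nbytes + data_len - 1) / data_len;
}
```

With data_len = 80 and extent = 400, an 80-byte request touches one region, while a 160-byte request touches two regions separated by a 320-byte hole, which is the non-contiguous pattern described above.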
> I know you are looking at this from the parallel-netcdf context, where
> the library sets up a file view big enough for the subsequent write.
> Maybe there's an optimization we can make for that case.
>
> We have the memory buffer, count, and datatype, and we can get the
> file type from fd->filetype: If the following conditions are true, we
> can override the filetype_is_contig check:
>
> - the size of fd->filetype (as reported by MPI_Type_size) is the same
> as count*buftype_size
> - the contig_access_count returned from ADIOI_Calc_my_off_len is 1
>
> How does that sound?
>
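The two conditions proposed above could be combined roughly as in the C sketch below. The function name and its plain-integer parameters are hypothetical: in ROMIO the filetype size would come from MPI_Type_size() on fd->filetype and contig_access_count from ADIOI_Calc_my_off_len(), but here they are passed in directly so the logic can be read in isolation.

```c
#include <assert.h>

/* Return 1 if the filetype_is_contig check could be overridden under
 * the two conditions from this thread:
 *   1. the filetype size equals count * buftype_size, so the request
 *      fits in exactly one tile of the filetype;
 *   2. contig_access_count is 1.
 * All arguments are assumed to be precomputed by the caller. */
int can_override_filetype_contig(long long filetype_size, int count,
                                 long long buftype_size,
                                 int contig_access_count)
{
    assert(filetype_size >= 0 && count >= 0 && buftype_size >= 0);
    long long request_size = (long long)count * buftype_size;
    return filetype_size == request_size && contig_access_count == 1;
}
```

For example, an 80-byte filetype written with count = 20 and a 4-byte buffer type passes the check when contig_access_count is 1, while doubling count to 40 fails it because the filetype would then be tiled twice.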
>> Attached is an example C program and a patch to ad_write_coll.c.
>> Running both can show the disagreement in filetype contiguity.
>> The patch also fixes this problem.
>
> Thanks for the test case. It is exceedingly helpful for me to use when I
> (eventually) look at the problem.
>
> ==rob
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>