[MPICH] slow IOR when using fileview
Wei-keng Liao
wkliao at ece.northwestern.edu
Tue Jul 10 18:26:29 CDT 2007
Here is a revised version of the fix I proposed earlier, in which I forgot
to consider the file displacement, fd->disp, defined in the file view.
    if (file_ptr_type == ADIO_INDIVIDUAL && buftype_is_contig &&
        bufsize + (offset - (fd->disp + flat_file->indices[st_index])) <=
        flat_file->blocklens[st_index]) {
        ADIO_WriteContig(fd, buf, bufsize, MPI_BYTE, ADIO_EXPLICIT_OFFSET,
                         offset, status, error_code);
        return;
    }
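
To illustrate why the fd->disp term matters, here is a minimal standalone
sketch of the third condition. It is not ROMIO code: flat_view and
fits_in_block() are hypothetical names, and the sketch assumes that offset
is an absolute byte offset that already includes the view displacement.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ROMIO's flattened filetype: block i starts
 * indices[i] bytes into the filetype and is blocklens[i] bytes long. */
typedef struct {
    const int64_t *indices;
    const int64_t *blocklens;
} flat_view;

/* Returns 1 if a write of bufsize bytes starting at the absolute file
 * offset `offset` stays inside block st_index of the view, else 0.
 * Assumes st_index is the block that contains `offset`. */
int fits_in_block(const flat_view *v, int64_t disp, int st_index,
                  int64_t offset, int64_t bufsize)
{
    /* Absolute start of the block: view displacement plus block index. */
    int64_t block_start = disp + v->indices[st_index];

    assert(offset >= block_start);
    return bufsize + (offset - block_start) <= v->blocklens[st_index];
}
```

For a 10 MB block that starts 10 MB into the filetype and a displacement of
4096 bytes, a 10 MB write at offset 4096 + 10 MB passes this check. Passing 0
for disp, which mimics the earlier version of the condition, rejects the
same request because the 4096 bytes of displacement are counted as part of
the distance into the block.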
Wei-keng
On Tue, 3 Jul 2007, Wei-keng Liao wrote:
>
> I came up with a very simple solution for detecting this case: a contiguous
> buftype and a non-contiguous filetype whose "intersection" is contiguous.
> In file ad_write_str.c, at line 265, insert the following:
>
>     if (file_ptr_type == ADIO_INDIVIDUAL && buftype_is_contig &&
>         bufsize + (offset - flat_file->indices[st_index]) <=
>         flat_file->blocklens[st_index]) {
>         ADIO_WriteContig(fd, buf, bufsize, MPI_BYTE, ADIO_EXPLICIT_OFFSET,
>                          offset, status, error_code);
>         return;
>     }
>
> The first condition, "file_ptr_type == ADIO_INDIVIDUAL", is there because I
> am not sure whether this applies to shared file pointers.
>
> The second condition "buftype_is_contig" is to ensure buffer is contiguous.
>
> The third condition is to ensure the requested data is within a single block
> of flat_file, i.e., the st_index block in flat_file.
>
> I tested it with my test code and it ran OK. Please let me know if this
> can cause any problem that I did not think of. I hope it can be incorporated
> into ROMIO in a future release.
>
> Wei-keng
>
>
>
> On Tue, 3 Jul 2007, Wei-keng Liao wrote:
>
>>
>> I checked the ROMIO source for this particular access pattern.
>> First, a few words about the access pattern.
>> 1) MPI_Type_create_subarray() creates the file access regions like
>>    file: |----------|----------|----------| .... |----------|
>>               P0         P1         P2                 P7
>> Each segment is of size 10MB.
>> 2) There are no overlapping, interleaved, or non-contiguous accesses across
>> processes. Every file access is a single contiguous write request.
>> 3) Write buffer is also contiguous. The write amount is 10 MB, same across
>> all MPI processes.
>> 4) The effect of using this file type should be the same as using
>> explicit file offset without file type.
>>
>> In ROMIO source file ad_write_coll.c, in function
>> ADIOI_GEN_WriteStridedColl(), ADIOI_Datatype_iscontig() is called in line
>> 141 to check if the file type is contiguous and it returns 0. That means
>> the file type is not contiguous. In general, this is true, since the file
>> type is applied to the entire file space repeatedly. Therefore, in line
>> 153, ADIO_WriteStrided() is called, instead of ADIO_WriteContig() in line
>> 150. So, data sieving is performed by default in ADIO_WriteStrided(), which
>> chops the 10 MB write into twenty 512 KB chunks. For each chunk, a
>> read-modify-write is carried out.
>>
>> In fact, this I/O pattern should trigger ADIO_WriteContig() for best
>> results. I suggest adding one more test here to check whether the
>> intersection of the buffer type and the file type is contiguous. If it is,
>> ADIO_WriteContig() is called. Here, the intersection operation will involve
>> the current file position. I don't know how complicated this
>> implementation would be.
>>
>> Wei-keng
>>
>>
>>
>>
>> On Mon, 2 Jul 2007, Yu, Weikuan wrote:
>>
>>>> If the independent
>>>> access is used instead, I don't know why each write is divided into
>>>> 512 KB chunks and why locking is needed at all to guarantee atomic
>>>> access to the 10 MB contiguous file range. For this particular access
>>>> pattern, ROMIO should not do read-modify-write at all.
>>>
>>> 512KB is the default buffer size for data sieving. So with a 512KB buffer,
>>> each process is only able to write out 512KB of data in each call of
>>> ADIOI_GEN_WriteStrided. For 10MB, this results in 20 iterations of
>>> write_all(), 40 fcntl() total. CrayPat indicates that fcntl() takes 88% of
>>> the total wall-clock time with fileview, 0% w/o fileview.
>>>
>>> --Weikuan
>>>
>>
>