[MPICH] question about MPI-IO Read_all
Wei-keng Liao
wkliao at ece.northwestern.edu
Thu Apr 26 00:06:43 CDT 2007
Hi,
There may be a little misunderstanding about the MPI fileview. A fileview
defines the file regions that are visible (readable/writable) to a
process. It has nothing to do with the buffer datatype and count used in
Read_all(). The count and datatype arguments of Read_all() describe the
memory layout of the buffer argument. This datatype provides a convenient
way to describe noncontiguous memory regions of an I/O buffer that are
written to (or read from) a file.
To see the difference, consider the write case: the MPI-IO library
"packs" the non-contiguous data in the write buffer into a contiguous
byte stream and fills it contiguously into the visible file regions
specified by the fileview. The flow is reversed for the read case.
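For example, a minimal sketch (the file handle fh is assumed to be an
already-opened MPI::File; all names and sizes are made up for
illustration):

    // File side (the fileview): this process sees two 400-byte
    // regions of the file, at offsets 0 and 1000.
    int blens[2] = { 400, 400 };
    MPI::Aint disps[2] = { 0, 1000 };
    MPI::Datatype ftype = MPI::BYTE.Create_hindexed(2, blens, disps);
    ftype.Commit();
    fh.Set_view(0, MPI::BYTE, ftype, "native", MPI::INFO_NULL);

    // Memory side (the buffer datatype): every 4th double of buf,
    // 100 doubles in all.
    double buf[400];
    MPI::Datatype mtype = MPI::DOUBLE.Create_vector(100, 1, 4);
    mtype.Commit();

    // Packs 100 doubles (800 bytes) from buf and fills the two
    // 400-byte file regions contiguously; a Read_all() here would
    // reverse the flow.
    fh.Write_all(buf, 1, mtype);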
As for the 2 GB limitation, I would first say it is not very common for
a single process to write more than 2 GB of data to a file (and
similarly for reads). That would take more than 2 GB of memory to hold
the buffer on a single compute node. Even in that case, one can still
use a properly defined buffer datatype to avoid the 32-bit limit of the
integer "count" argument. If the non-contiguity of the I/O buffer is
regularly strided, one can define a datatype using MPI_Type_vector(),
MPI_Type_hvector(), etc. For highly irregular non-contiguity, one can
use MPI_Type_struct(). If you want, you can describe your array's
memory layout and we may come up with a way to define a datatype for it.
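A sketch for the regularly strided case (the layout and sizes are
invented for illustration; fh is an open file with its fileview already
set, and buf points to an allocation spanning 6 GB of memory):

    // 3072 blocks of 1 MB each, spaced 2 MB apart in memory,
    // carry 3 GB of data in a single datatype.
    MPI::Datatype big = MPI::BYTE.Create_vector(3072, 1<<20, 1<<21);
    big.Commit();
    fh.Read_all(buf, 1, big); // count == 1, yet 3 GB are read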
The amount of data read/written by an MPI process in a
Read_all()/Write_all() is the "count" multiplied by the size of the
buffer datatype. Therefore, the read/write amount can still be larger
than 2 GB even when the integer "count" is less than 2^31. Note that
the amount of data one can read/write is determined by these two
arguments, not by the fileview.
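As a quick check of that arithmetic (sizes again invented, with fh and
buf as above): a contiguous 1 KB datatype with a count of 4194304 moves
4 GB, well past 2^31 bytes, while "count" stays far below its limit.

    MPI::Datatype kb = MPI::BYTE.Create_contiguous(1024);
    kb.Commit();
    fh.Read_all(buf, 4194304, kb); // 4194304 * 1024 bytes = 4 GB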
Hope this helps.
Wei-keng
On Wed, 25 Apr 2007, Peter Diamessis wrote:
> Hi folks,
>
> If I may just comment that this is a very interesting topic.
> I ran into a similar situation when using MPI_WRITE_ALL &
> MPI_READ_ALL to output/read-in non-contiguous 3-D data
> in my CFD solver. The "global" size of the binary file was
> approximately 10 GB, consisting of 20 3-D variables. I would encounter
> errors when trying to output all 20 fields in one file. I then broke
> the file down into 10 files with 2 fields each, each with an
> approximate file size of 1.2 GB. Then everything worked smoothly. I'm
> wondering if this is a similar issue to what Russell has been
> pointing out?
>
> Sincerely,
>
> Pete Diamessis
>
>
>
>
> ----- Original Message ----- From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> To: "'Russell L. Carter'" <rcarter at esturion.net>; <mpich-discuss at mcs.anl.gov>
> Sent: Wednesday, April 25, 2007 10:19 PM
> Subject: RE: [MPICH] question about MPI-IO Read_all
>
>
>> 2^31 is 2 Gbytes. If you are reading 2 GB per process with a single
>> Read_all, you are already doing quite well performance-wise. If you want
>> to read more than that, you can create a derived datatype of, say, 10
>> contiguous bytes and pass that as the datatype to Read_all. That would
>> give you 20 GB. You can read even more by using 100 or 1000 instead of
>> 10.
>>
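>> A sketch of that trick (reusing buf, bufsize, and f from the code
>> quoted below):
>>
>>     MPI::Datatype ten = MPI::BYTE.Create_contiguous(10);
>>     ten.Commit();
>>     f.Read_all(buf, bufsize, ten); // reads 10 * bufsize bytes
>>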
>> In practice, you might encounter some errors, because the MPI-IO
>> implementation may internally use some types that are 32-bit, not
>> expecting anyone to read more than that in a single call. So try it
>> once, and if it doesn't work, read in 2 GB chunks.
>>
>> Rajeev
>>
>>
>>> -----Original Message-----
>>> From: owner-mpich-discuss at mcs.anl.gov
>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Russell L. Carter
>>> Sent: Wednesday, April 25, 2007 6:33 PM
>>> To: mpich-discuss at mcs.anl.gov
>>> Subject: [MPICH] question about MPI-IO Read_all
>>>
>>> Hi,
>>> I have a question about the amount of data it is possible to read
>>> using MPI::Datatype::Create_hindexed with a fundamental type of
>>> MPI::BYTE, and MPI::File::Read_all.
>>>
>>> Following the discussion of irregularly distributed arrays beginning
>>> on p. 78 of "Using MPI-2", I want to read my data by doing this:
>>>
>>> double *buf = ...;
>>> int count, bufsize = ...;
>>> int *blocks = ...;
>>> MPI::Aint *displacements = ...;
>>> MPI::Offset offset = ...;
>>> MPI::File f = MPI::File::Open(...);
>>> // Create_hindexed is const: it returns the new datatype, which
>>> // must then be committed before use in Set_view.
>>> MPI::Datatype filetype =
>>>     MPI::BYTE.Create_hindexed(count, blocks, displacements);
>>> filetype.Commit();
>>> f.Set_view(offset, MPI::BYTE, filetype, "native", info_);
>>> f.Read_all(buf, bufsize, MPI::BYTE);
>>>
>>> What I am curious about is the amount of data that can be read with
>>> Read_all. Since bufsize is an int, that would seem to imply that the
>>> maximum Read_all (per node) is 2^31, which, in bytes, is not
>>> gigantic.
>>>
>>> Is there some other technique I can use to increase the amount of
>>> data I can Read_all at one time? I have different-sized data
>>> interspersed, so I can't offset by a larger fundamental type. My
>>> arrays are not contiguous in the Fortran calling program, and are of
>>> ints and 4- or 8-byte reals. If I use a Create_struct to make a
>>> filetype that I use in Set_view, doesn't this have the same read-size
>>> limitation? Only now it applies to all the arrays in the struct.
>>> Hopefully I am missing something.
>>>
>>> Thanks,
>>> Russell
>>>
>>>
>>
>