[mpich-discuss] ROMIO individual file pointer
Rajeev Thakur
thakur at mcs.anl.gov
Wed Jun 18 14:32:14 CDT 2008
But does it cause any problem? At that offset, if blocklens is 0 for that
process, nothing will be written. The next write will occur at the next
offset with non-zero blocklen. The fd->fp_ind value is internal to the
implementation. As long as the right I/O gets done, and the right value is
returned by MPI_File_get_position, it is OK.
Rajeev
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
> Sent: Tuesday, June 17, 2008 11:52 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] ROMIO individual file pointer
>
>
> In ROMIO's romio/adio/common/ad_set_view.c, line 60 states that the
> individual file pointer, fd->fp_ind, points to the first byte to be
> accessed. I can see that fd->fp_ind is set to the right value in this file.
>
> However, in romio/adio/common/ad_read_coll.c, function
> ADIOI_Calc_my_off_len(), line 443, fd->fp_ind is set to the value of the
> variable "off". The problem is how "off" is calculated. In the code
> between lines 428 and 438, "off" is moved up to the next flat_file
> segment, index j. Since flat_file may contain an empty-length element
> (either first or last) whose blocklens[] is equal to zero, once the user
> buffer size "bufsize" is filled, "off" will not be moved to the first
> byte to be accessed in the next collective I/O.
>
> This problem appears when I define a non-contiguous file view and do two
> consecutive collective I/O operations, each requesting a data size equal
> to one whole file view. At the beginning of the second collective I/O,
> fd->fp_ind has the same value on all processes, pointing to the beginning
> of the second file view instead of each process's individual starting
> offset.
>
> The attached code demonstrates this problem. If a printf statement for
> fd->fp_ind is inserted at the beginning of ADIOI_GEN_WriteStridedColl()
> in file ad_write_coll.c, the standard output is:
> 0: First collective write -----------------------
> 0: fd->fpind = 0
> 1: fd->fpind = 5
> 2: fd->fpind = 50
> 3: fd->fpind = 55
> 0: Second collective write -----------------------
> 0: fd->fpind = 100
> 1: fd->fpind = 100 <-- not right
> 2: fd->fpind = 100 <-- not right
> 3: fd->fpind = 100 <-- not right
>
> The correct results for the second collective write should be:
> 0: Second collective write -----------------------
> 0: fd->fpind = 100
> 1: fd->fpind = 105
> 2: fd->fpind = 150
> 3: fd->fpind = 155
>
>
> My fix for this problem is to insert the following code between
> lines 435 and 436 of file ad_read_coll.c.
>
>     while (flat_file->blocklens[j]==0) {
>         j++;
>         if (j == flat_file->count) {
>             j = 0;
>             n_filetypes++;
>         }
>     }
>
>
> Wei-keng
>