[mpich-discuss] ROMIO individual file pointer
Wei-keng Liao
wkliao at ece.northwestern.edu
Wed Jun 18 15:18:20 CDT 2008
True, it does not cause errors.
It does affect each process's offset_list[0] and start_offset in
ADIOI_GEN_WriteStridedColl() and ADIOI_GEN_ReadStridedColl(). These values
are used to calculate the file domains, my_req[], and others_req[]. This does
not cause a problem; it just makes those calculations cover file regions that
are not actually accessed. When I was reading this part, I thought it might be
a good idea to keep the variable fp_ind consistent across consecutive
collective I/O calls.
Wei-keng
On Wed, 18 Jun 2008, Rajeev Thakur wrote:
> But does it cause any problem? At that offset, if blocklens is 0 for that
> process, nothing will be written. The next write will occur at the next
> offset with non-zero blocklen. The fd->fp_ind value is internal to the
> implementation. As long as the right I/O gets done, and the right value is
> returned for MPI_File_get_position, it is ok.
>
> Rajeev
>
> > -----Original Message-----
> > From: owner-mpich-discuss at mcs.anl.gov
> > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
> > Sent: Tuesday, June 17, 2008 11:52 PM
> > To: mpich-discuss at mcs.anl.gov
> > Subject: [mpich-discuss] ROMIO individual file pointer
> >
> >
> > In ROMIO's romio/adio/common/ad_set_view.c, line 60 states that the
> > individual file pointer, fd->fp_ind, points to the first byte to be
> > accessed. I can see that fd->fp_ind is set to the right value in this file.
> >
> > However, in romio/adio/common/ad_read_coll.c, function
> > ADIOI_Calc_my_off_len(), line 443, fd->fp_ind is set to the value of the
> > variable "off". The problem is how "off" is calculated. In the code
> > between lines 428 and 438, "off" is moved up to the next flat_file
> > segment, index j. Since flat_file may contain a zero-length element
> > (either first or last) whose blocklens[] is equal to zero, when the user
> > buffer size "bufsize" is filled, "off" is not moved to the first byte to
> > be accessed in the next collective I/O.
> >
> > This problem appears when I define a non-contiguous file view and do two
> > consecutive collective I/O operations, each requesting a data size equal
> > to one whole file view. At the beginning of the second collective I/O,
> > fd->fp_ind has the same value on all processes, pointing to the beginning
> > of the second file view instead of each process's own starting offset.
> >
> > The attached code demonstrates this problem. If a printf statement for
> > fd->fp_ind is inserted at the beginning of ADIOI_GEN_WriteStridedColl() in
> > file ad_write_coll.c, the standard output is:
> > 0: First collective write -----------------------
> > 0: fd->fpind = 0
> > 1: fd->fpind = 5
> > 2: fd->fpind = 50
> > 3: fd->fpind = 55
> > 0: Second collective write -----------------------
> > 0: fd->fpind = 100
> > 1: fd->fpind = 100 <-- not right
> > 2: fd->fpind = 100 <-- not right
> > 3: fd->fpind = 100 <-- not right
> >
> > The correct results for the second collective write should be:
> > 0: Second collective write -----------------------
> > 0: fd->fpind = 100
> > 1: fd->fpind = 105
> > 2: fd->fpind = 150
> > 3: fd->fpind = 155
> >
> >
> > My fix for this problem is to insert the following code between
> > lines 435 and 436 of file ad_read_coll.c.
> >
> >     /* skip zero-length blocks, wrapping to the next copy of the filetype */
> >     while (flat_file->blocklens[j]==0) {
> >         j++;
> >         if (j == flat_file->count) {
> >             j = 0;
> >             n_filetypes++;
> >         }
> >     }
> >
> >
> > Wei-keng
> >
>