[mpich-discuss] ROMIO individual file pointer

Wei-keng Liao wkliao at ece.northwestern.edu
Wed Jun 18 15:18:20 CDT 2008


True, it does not cause errors.

It does affect each process's offset_list[0] and start_offset in
ADIOI_GEN_WriteStridedColl() and ADIOI_GEN_ReadStridedColl(). These values
are used to calculate the file domains, my_req[], and others_req[]. It will
not cause a problem, but the calculation covers more file regions than
necessary. When I was reading this part, I thought it might be a good idea
to keep the variable fp_ind consistent across multiple collective I/O
calls.


Wei-keng



On Wed, 18 Jun 2008, Rajeev Thakur wrote:

> But does it cause any problem? At that offset, if blocklens is 0 for that
> process, nothing will be written. The next write will occur at the next
> offset with non-zero blocklen. The fd->fp_ind value is internal to the
> implementation. As long as the right I/O gets done, and the right value is
> returned for MPI_File_get_position, it is ok.
> 
> Rajeev  
> 
> > -----Original Message-----
> > From: owner-mpich-discuss at mcs.anl.gov 
> > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
> > Sent: Tuesday, June 17, 2008 11:52 PM
> > To: mpich-discuss at mcs.anl.gov
> > Subject: [mpich-discuss] ROMIO individual file pointer
> > 
> > 
> > In ROMIO's romio/adio/common/ad_set_view.c, line 60 states that the
> > individual file pointer, fd->fp_ind, points to the first byte to be
> > accessed. I can see that fd->fp_ind is set to the right value in this
> > file.
> > 
> > However, in romio/adio/common/ad_read_coll.c, function
> > ADIOI_Calc_my_off_len(), line 443, fd->fp_ind is set to the value of
> > the variable "off". The problem is how "off" is calculated. In the
> > code between lines 428 and 438, "off" is moved up to the next flat_file
> > segment, index j. Since flat_file may contain a zero-length element
> > (either first or last) whose blocklens[] entry is zero, once the user
> > buffer size "bufsize" has been filled, "off" is not necessarily moved
> > to the first byte to be accessed in the next collective I/O.
> > 
> > This problem appeared when I defined a non-contiguous file view and
> > performed two consecutive collective I/O operations, each requesting a
> > data size equal to one whole file view. At the beginning of the second
> > collective I/O, fd->fp_ind has the same value on all processes,
> > pointing to the beginning of the second file view instead of each
> > process's individual starting offset.
> > 
> > The attached code demonstrates this problem. If a printf statement for
> > fd->fp_ind is inserted at the beginning of ADIOI_GEN_WriteStridedColl()
> > in file ad_write_coll.c, the standard output is:
> >   0: First collective write -----------------------
> >   0: fd->fpind = 0
> >   1: fd->fpind = 5
> >   2: fd->fpind = 50
> >   3: fd->fpind = 55
> >   0: Second collective write -----------------------
> >   0: fd->fpind = 100
> >   1: fd->fpind = 100     <-- not right
> >   2: fd->fpind = 100     <-- not right
> >   3: fd->fpind = 100     <-- not right
> > 
> > The correct results for the second collective write should be:
> >   0: Second collective write -----------------------
> >   0: fd->fpind = 100
> >   1: fd->fpind = 105
> >   2: fd->fpind = 150
> >   3: fd->fpind = 155
> > 
> > 
> > My fix for this problem is to insert the following code between
> > lines 435 and 436 of file ad_read_coll.c.
> > 
> >     while (flat_file->blocklens[j]==0) {
> >         j++;
> >         if (j == flat_file->count) {
> >             j = 0;
> >             n_filetypes++;
> >         }
> >     }
> > 
> > 
> > Wei-keng
> > 
> 



