[MPICH2-dev] RE: alignment hints for calculating the file domain.

Yu, Weikuan wyu at ornl.gov
Tue Jul 10 07:55:18 CDT 2007


Hi, All,

Wangdi has observed a lot of lock contention when the file domain
partitions in collective I/O are not aligned on Lustre stripe boundaries
(say 1MB). As a remedy, we are proposing a minor change to the
implementation of ADIOI_GEN_Calc_file_domains() so that it respects an
advisory alignment. The attached patch illustrates what we are
proposing.

We hope this can be incorporated upstream in a future release of ROMIO.
It would remove the need for a modified ADIOI_GEN_Calc_file_domains() in
the ADIO driver for file systems like Lustre, and allow us to feed
ADIOI_GEN_Calc_file_domains() a preferred alignment. Please
consider.
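
To make the idea concrete, here is a minimal sketch in C of the arithmetic
involved. It is not the attached patch and not ROMIO code: the function
names (even_fd_size, align_fd_size, unaligned_boundaries) are hypothetical
and exist only for illustration, and the sketch assumes min_st_offset
already sits on a stripe boundary, since rounding the size alone does not
align a range that starts mid-stripe.

```c
#include <assert.h>
#include <stdint.h>

/* Even split as in the generic code: ceil(range / nprocs_for_coll). */
int64_t even_fd_size(int64_t min_st_offset, int64_t max_end_offset,
                     int nprocs_for_coll)
{
    assert(nprocs_for_coll > 0 && max_end_offset >= min_st_offset);
    int64_t range = max_end_offset - min_st_offset + 1;
    return (range + nprocs_for_coll - 1) / nprocs_for_coll;
}

/* Round fd_size up to a multiple of alignment.
 * An alignment of 0 (or less) is treated as "no preference". */
int64_t align_fd_size(int64_t fd_size, int64_t alignment)
{
    if (alignment <= 0)
        return fd_size;
    return ((fd_size + alignment - 1) / alignment) * alignment;
}

/* Count interior file domain boundaries that do not fall on a stripe
 * boundary. Each such boundary means two processes share one stripe. */
int unaligned_boundaries(int64_t min_st_offset, int64_t max_end_offset,
                         int64_t fd_size, int64_t stripe)
{
    int count = 0;
    for (int64_t b = min_st_offset + fd_size; b <= max_end_offset; b += fd_size)
        if (b % stripe != 0)
            count++;
    return count;
}
```

For a 10MB range starting at offset 0, split among 3 processes with a 1MB
stripe, the even split gives a domain size of 3495254 bytes and both
interior boundaries land inside a stripe. Rounding up gives 4194304 bytes
(4MB) and both boundaries land on stripe boundaries, at the cost of a
smaller last domain.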

Thanks,
Weikuan

P.S.: We will use mpich-discuss at mcs.anl.gov for further discussions if
mpich2-dev is not for this purpose.

> -----Original Message-----
> From: wangdi [mailto:wangdi at clusterfs.com] 
> Sent: Tuesday, July 10, 2007 12:33 AM
> To: mpich2-dev at mcs.anl.gov; Yu, Weikuan
> Subject: alignment hints for calculating the file domain.
> 
> Hello,
> 
> We are currently trying to improve the Lustre ADIO driver for
> MPICH2. We found that all the file systems use
> ADIOI_GEN_WriteStridedColl as their collective write API in MPICH2.
> This API calculates the file domains and
> distributes the I/O segments evenly among all the clients
> according to the file domain size. When calculating the
> file domains, it just divides the whole I/O area (min, max) by
> the number of clients, and does not take into account the
> specifics of some file systems, which might be sensitive to
> this file_domain_size.
> For example, some file systems might need a special alignment
> for better performance instead of this evenly divided domain_size.
> 
> So could you export a new API (ADIOI_Calc_file_domains) here?
> Then those file systems that are not sensitive to the
> alignment would just call the current calc_file_domains API (which
> might be renamed ADIOI_GEN_Calc_file_domains), and those file
> systems that need the special alignment could get an
> alignment hint before the real calculation. Maybe the calc file
> domain API could be changed in this way?
> 
> 
> ADIOI_XXX_Calc_file_domains()
> {
>        int alignment = get_xxx_alignment();
>        *fd_size_ptr = alignment;
> 
>        ADIOI_GEN_Calc_file_domains(......, fd_size_ptr);
> 
>  }
> 
> ADIOI_GEN_Calc_file_domains(......, *fd_size_ptr) {
>      int alignment = *fd_size_ptr;
>  
>      ........
> 
> 
>      fd_size = ((max_end_offset - min_st_offset + 1) +
>                 nprocs_for_coll - 1) / nprocs_for_coll;
> 
>      if (alignment != 0) {
>             /* round fd_size up to a multiple of alignment */
>             fd_size = ((fd_size + alignment - 1) / alignment) * alignment;
>      }
> 
>      ........
> 
> 
>     *fd_size_ptr = fd_size;
>  
> }
> 
> 
> 
> Thanks
> 
> --
> Regards,
> Tom Wangdi    
> --
> Cluster File Systems, Inc
> Software Engineer
> http://www.clusterfs.com
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: align-file-domain.patch
Type: application/octet-stream
Size: 2299 bytes
Desc: align-file-domain.patch
URL: <https://lists.mcs.anl.gov/mailman/private/mpich2-dev/attachments/20070710/1c8d1ec0/attachment.obj>
