[mpich-discuss] ROMIO: Need information on File realms
Wei-keng Liao
wkliao at ece.northwestern.edu
Wed Sep 1 09:46:07 CDT 2010
These are different file realm assignments.
AAR: aggregate access region
FSZ: File Size Based File Domains
USR: User Specified Striped File Domains
Our very first paper on this work (see below) called it persistent file domain (PFD),
but it was renamed to persistent file realm (PFR) in our Cluster06 paper.
"Scalable High-level Caching for Parallel I/O" published in IPDPS 2004.
Wei-keng
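
The file realm hints named in Rob's message quoted below are ordinary MPI-IO hints, so they would be passed through an MPI_Info object at file-open time. The following is only a minimal sketch under stated assumptions, not ROMIO's documented usage: the hint names come from this thread, but the values ("enable" for romio_cb_pfr, a 1 MiB alignment) and the helper names pfr_hint_value and open_with_pfr_hints are hypothetical. romio_cb_fr_types is left out because, per Rob, it is not parsed and AAR is always used. The MPI part is guarded by a made-up USE_MPI macro so that the hint table can be compiled and inspected without an MPI installation.

```c
#include <stddef.h>
#include <string.h>

/* One hint as a key/value pair of strings, the form MPI_Info_set() takes. */
struct romio_hint {
    const char *key;
    const char *value;
};

/* Hint NAMES are from the thread; the VALUES are assumptions for
 * illustration only and should be checked against your ROMIO version. */
static const struct romio_hint pfr_hints[] = {
    { "romio_cb_pfr",          "enable"  },  /* assumed value: turn on file realms */
    { "romio_cb_fr_alignment", "1048576" },  /* assumed value: align realms to 1 MiB */
    { "romio_cb_alltoall",     "enable"  },  /* 'enable' or 'automatic' per the thread */
};
static const size_t n_pfr_hints = sizeof pfr_hints / sizeof pfr_hints[0];

/* Hypothetical helper: return the value configured for key, or NULL. */
const char *pfr_hint_value(const char *key)
{
    for (size_t i = 0; i < n_pfr_hints; i++) {
        if (strcmp(pfr_hints[i].key, key) == 0)
            return pfr_hints[i].value;
    }
    return NULL;
}

#ifdef USE_MPI   /* e.g. mpicc -DUSE_MPI ... */
#include <mpi.h>

/* Hypothetical helper: collectively open path for writing with the
 * hints above attached. Returns the MPI_File_open() error code. */
int open_with_pfr_hints(MPI_Comm comm, const char *path, MPI_File *fh)
{
    MPI_Info info;
    int rc;

    MPI_Info_create(&info);
    for (size_t i = 0; i < n_pfr_hints; i++)
        MPI_Info_set(info, pfr_hints[i].key, pfr_hints[i].value);

    rc = MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_WRONLY, info, fh);
    MPI_Info_free(&info);   /* the file handle keeps its own copy of the hints */
    return rc;
}
#endif
```

An MPI implementation is free to ignore hints it does not understand, so calling MPI_File_get_info on the opened file is one way to check which of these hints were actually recorded.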
On Sep 1, 2010, at 8:53 AM, Pascal Deveze wrote:
> Hi, Wei-keng,
>
> Very interesting paper. And nice bandwidth with Lustre on Jaguar!
>
> I cannot find any information on the meaning of AAR, FSZ and USR.
> I see that, depending on these values, ADIOI_Calc_file_realms_aar, ADIOI_Calc_file_realms_fsize, or ADIOI_Calc_file_realms_user_size
> will be called. The comments in the source do not explain the differences to me.
>
> Thanks
>
> Pascal
>
> Wei-keng Liao wrote:
>> Hi, Pascal,
>>
>> Whether PFR is better than the Lustre ADIO driver requires performance evaluation.
>> When we wrote that paper, we did not carry out such an evaluation. But PFR certainly can achieve
>> the same file access mapping (i.e. the one-to-one mapping between I/O aggregators and
>> Lustre OSTs) as the Lustre ADIO driver. Furthermore, users can also use the hints to
>> customize different mappings that may work well on other file systems.
>>
>> For MPI-IO optimizations on Lustre, we have another paper you might want to check out:
>> "Dynamically Adapting File Domain Partitioning Methods for Collective I/O Based on Underlying Parallel File System Locking Protocols", published in SC 2008.
>> This paper concludes that the group-based cyclic file domain partitioning method performs the best on Lustre.
>>
>> Wei-keng
>>
>> On Aug 27, 2010, at 3:32 AM, Pascal Deveze wrote:
>>
>>
>>> Wei-keng, I have begun reading your paper with interest!
>>> Rob, thanks a lot for your explanations!
>>>
>>> Of course, I will experiment with this code and will be happy to report back.
>>>
>>> As far as I understand this new method, using PFR on Lustre could be better than using the Lustre ADIO driver,
>>> because PFR does the stripe alignment but also brings a lot of optimizations. Am I right?
>>>
>>> Regards,
>>>
>>> Pascal
>>>
>>> Rob Latham wrote:
>>>
>>>> On Thu, Aug 26, 2010 at 11:04:17AM +0200, Pascal Deveze wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I recently saw that there are new files in adio/common
>>>>> (ad_aggregate_new.c, ad_io_coll.c, ...). They implement a "new
>>>>> 2 phase method" using "file realms".
>>>>> This is very interesting to me, but I do not have any information.
>>>>> Is there a paper (architecture, white paper or High Level
>>>>> Design) describing this new method?
>>>>>
>>>> The code is in ROMIO but only enabled if you set the "romio_cb_pfr"
>>>> hint: (PFR == persistent file realms)
>>>>
>>>> Here are all the hints relevant to file realms. There are quite a
>>>> few:
>>>>
>>>> - romio_cb_pfr: set this to use file realms. if not set, file domains
>>>> will continue to be calculated in the traditional manner.
>>>>
>>>> - romio_cb_fr_types: the "file realm types" can be "AAR", "FSZ" or
>>>> "USR". The paper Wei-keng mentioned explains this better.
>>>> I am slightly embarrassed to admit that while I added hint parsing
>>>> for the rest of these hints, I never did add the hint parsing for
>>>> this one. You will be stuck with AAR.
>>>>
>>>> - romio_cb_fr_alignment: easier hint to explain. Align file realms
>>>> to the given byte boundary. Certain file systems perform much
>>>> better when writes are aligned to block boundaries.
>>>>
>>>> - romio_cb_ds_threshold: normally, two-phase does data sieving if
>>>> the write request contains any holes. Set this hint (a
>>>> datatype's size-to-extent ratio), and datatypes below this
>>>> ratio will skip the data sieving optimization and instead
>>>> service the request piecewise.
>>>>
>>>> - romio_cb_alltoall: the communication phase of two-phase can involve
>>>> either point-to-point communication, or use MPI_Alltoall if this
>>>> hint is set to 'enable' or 'automatic'.
>>>>
>>>> If you experiment with this code, I would love to hear your results.
>>>> You have a knack for finding bugs :>
>>>>
>>>> ==rob
>>>>
>>>>
>>> _______________________________________________
>>> mpich-discuss mailing list
>>> mpich-discuss at mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>>