[mpich2-dev] Another romio performance question

Fri Aug 21 10:35:38 CDT 2009

I was looking at the "hole" detection and read-modify-write processing in 
ad_write_coll (and ad_bgl_wrcoll).

I notice that it very intentionally looks for front and back holes, and any 
hole causes read-modify-write.  But what if the only hole found is at the 
front or back?  Was there any effort to just skip front/back holes, i.e. 
avoid the read and adjust the offset and length so the write only covers 
the range where there's data?  Or is it not worth it on typical collective 
i/o patterns?
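
To make the question concrete, here is a minimal sketch of the trimming idea. It is not the ROMIO code: extent_t, trim_t and trim_holes are hypothetical names, and it assumes the aggregator already has a list of the pieces of data in its buffer, sorted by file offset. A gap between two pieces is an interior hole and still needs read-modify-write; a gap before the first piece or after the last one is simply left out of the write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one contiguous piece of user data that landed in the
 * aggregator's buffer, given as a file offset and a length in bytes. */
typedef struct { long long off; long long len; } extent_t;

/* Hypothetical result of the hole scan. */
typedef struct {
    int interior_hole;    /* 1 if there is a gap between two pieces */
    long long write_off;  /* offset of the first byte that holds data */
    long long write_len;  /* bytes from first data byte to last data byte */
} trim_t;

/* Assumes ext[] is non-empty and sorted by offset. Front and back holes
 * never show up here because the scan starts at the first piece and
 * stops at the end of the last one. */
trim_t trim_holes(const extent_t *ext, size_t n)
{
    assert(ext != NULL && n > 0);

    trim_t t = { 0, ext[0].off, 0 };
    long long end = ext[0].off + ext[0].len;   /* end of covered range */

    for (size_t i = 1; i < n; i++) {
        if (ext[i].off > end)
            t.interior_hole = 1;               /* gap in the middle */
        long long e = ext[i].off + ext[i].len;
        if (e > end)
            end = e;
    }
    t.write_len = end - t.write_off;
    return t;
}
```

If interior_hole comes back 0, the aggregator could skip the read and issue one write of write_len bytes at write_off; otherwise it would fall back to read-modify-write as it does today.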

I have a customer complaint that a particular testcase performs poorly 
with MPI_File_write_at_all.  For the most part, it's just a testcase that 
shouldn't be using collective i/o.  Each rank writes large contiguous 
blocks to its own range of the file.  So aggregation is just a waste of 
time.  Each write the aggregator does is a large contiguous write for a 
single rank.  So there's really no true aggregation.

What caught my eye was that, for example, using a 16MB cb_buffer_size and 
writing a contiguous 1MB block causes read-modify-write of the whole 16MB 
because of the single large (15MB) trailing (or leading) hole.  It just 
seems like we should do better, but is it worth doing anything for 
something that probably isn't a true collective i/o pattern?
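
A rough way to put numbers on that example, as a sketch only: io_cost_t and flush_cost are hypothetical names, and the model assumes read-modify-write reads the whole buffer once and writes the whole buffer once, while a trimmed write would touch only the bytes that hold data.

```c
#include <assert.h>

/* Hypothetical cost model, in bytes moved to or from the file system
 * for one flush of the collective buffer. */
typedef struct {
    long long rmw;      /* read whole buffer, then write whole buffer */
    long long trimmed;  /* write only the range that holds data */
} io_cost_t;

/* Assumes a single contiguous block of data_len bytes inside a buffer of
 * buf_len bytes, so the only holes are at the front and/or back. */
io_cost_t flush_cost(long long buf_len, long long data_len)
{
    assert(data_len > 0 && data_len <= buf_len);

    io_cost_t c;
    c.rmw = 2 * buf_len;
    c.trimmed = data_len;
    return c;
}
```

With buf_len = 16MB and data_len = 1MB, this model gives 32MB of traffic for read-modify-write against 1MB for a trimmed write.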

I can fix the testcase performance by hinting cb_buffer_size down to 1MB 
and then there's no hole.  This is a fine user circumvention, but I'm 
trying to decide if we should do more.
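
For reference, the circumvention looks roughly like this on the user side. This is a sketch, not the customer's code: the function name and the path argument are placeholders, and error handling is left to the caller. The MPI calls themselves are the standard MPI-IO ones.

```c
#include <mpi.h>

/* Open a file for collective writing with cb_buffer_size hinted down to
 * 1MB. The function name and "path" are placeholders for illustration. */
int open_with_small_cb_buffer(MPI_Comm comm, char *path, MPI_File *fh)
{
    MPI_Info info;
    int rc;

    MPI_Info_create(&info);
    /* Hint values are passed as strings; 1048576 bytes = 1MB. */
    MPI_Info_set(info, "cb_buffer_size", "1048576");

    rc = MPI_File_open(comm, path,
                       MPI_MODE_CREATE | MPI_MODE_WRONLY, info, fh);

    /* The info object can be freed once the file is open. */
    MPI_Info_free(&info);
    return rc;
}
```

Subsequent MPI_File_write_at_all calls on the returned handle then use the smaller collective buffer, so a 1MB contiguous block fills it with no hole.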

Thoughts?

Bob Cernohous:  (T/L 553) 507-253-6093

BobC at us.ibm.com
IBM Rochester, Building 030-2(C335), Department 61L
3605 Hwy 52 North, Rochester,  MN 55901-7829

> Chaos reigns within.
> Reflect, repent, and reboot.
> Order shall return.