[mpich2-dev] More ROMIO performance questions

Mon Sep 14 17:48:00 CDT 2009

On Mon, Sep 14, 2009 at 04:57:47PM -0500, Bob Cernohous wrote:
> 
> We tried to tune it with hints for cb_block_size and get ok performance 
> when we can avoid read/write data sieving.

I'm sure you must have meant "cb_buffer_size"?

> They customized the testcase to coordinate/flow-control the non-collective 
> i/o and they get great performance.   They only have N simultaneous 
> writers/readers active.  They pass a token around and take turns.  It's 
> almost like having N aggregators but without the collective i/o overhead 
> to pass the data around.  Instead they pass a small token and take turns 
> writing the large, non-interleaved contiguous data blocks.
> 
> I'm not aware of anything in MPIIO or ROMIO that would do this?   Has this 
> been explored by the experts (meaning you guys)? 

In the ordered mode routines, we pass a token around to ensure that
processes write/read in rank order.  (This is actually a pretty naive
way to implement ordered mode, but until very recently nobody seemed
too concerned about shared file pointer performance.)
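
To make the token idea concrete, here is a toy sketch.  It is a Python
thread simulation, not MPI and not ROMIO's actual ordered-mode code;
run_token_ring, nprocs, and ntokens are names made up for illustration,
and appending to a list stands in for the large contiguous write.  With
ntokens=1 it gives strict rank order, as in ordered mode; with ntokens=N
it gives the N-writers-at-a-time behavior described in the quoted text.

```python
import threading


def run_token_ring(nprocs, ntokens):
    """Simulate nprocs "ranks" sharing ntokens write tokens.

    Token k visits ranks k, k + ntokens, k + 2*ntokens, ... so at most
    ntokens ranks are writing at any moment.
    Returns (write_log, peak_concurrent_writers).
    """
    if nprocs < 1 or ntokens < 1:
        raise ValueError("nprocs and ntokens must both be >= 1")

    # One event per rank; setting it means "a token has reached this rank".
    have_token = [threading.Event() for _ in range(nprocs)]
    lock = threading.Lock()
    log = []      # order in which ranks performed their "write"
    active = 0    # ranks currently holding a token
    peak = 0      # highest value of active ever observed

    def rank_main(rank):
        nonlocal active, peak
        have_token[rank].wait()        # block until a token arrives
        with lock:
            active += 1
            peak = max(peak, active)
            log.append(rank)           # stand-in for the real write
        with lock:
            active -= 1
        nxt = rank + ntokens
        if nxt < nprocs:
            have_token[nxt].set()      # hand the token to the next rank

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(nprocs)]
    for t in threads:
        t.start()
    # Inject the initial tokens at ranks 0 .. ntokens-1.
    for k in range(min(ntokens, nprocs)):
        have_token[k].set()
    for t in threads:
        t.join()
    return log, peak
```

Calling run_token_ring(8, 1) yields the ranks strictly in order with
never more than one writer active; run_token_ring(8, 3) allows up to
three at a time, and the overall order is then no longer fixed.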

We don't do anything like this in ROMIO because, frankly, if a high
performance file system can't handle simultaneous non-interleaved
contiguous data blocks (what we would consider the best-case scenario
performance-wise), then a lot of ROMIO assumptions about how to
achieve peak performance kind of go out the window.

However, Kevin's suggestion that this is instead due to lock
contention makes a lot of sense, and I'm curious to hear what impact,
if any, that has on your customer's performance.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA