We have another I/O scenario with interesting performance issues.

Once again, it's large, non-interleaved contiguous blocks being
written/read (checkpointing software). We ran into the same problems
with data sieving and romio_cb_write/read = enable that we discussed a
couple of weeks ago.

We tried to tune it with hints for cb_block_size and get acceptable
performance when we can avoid read/write data sieving.

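For reference, this is roughly how the hints get set (a sketch, not our
exact settings: cb_buffer_size is the standard ROMIO hint name, the
cb_block_size above may be a platform variant, and the 16 MiB value is
just an illustration):

    #include <mpi.h>

    MPI_Info info;
    MPI_File fh;

    MPI_Info_create(&info);
    /* Force collective buffering, disable data sieving. */
    MPI_Info_set(info, "romio_cb_write", "enable");
    MPI_Info_set(info, "romio_cb_read",  "enable");
    MPI_Info_set(info, "romio_ds_write", "disable");
    MPI_Info_set(info, "romio_ds_read",  "disable");
    /* Collective buffering buffer size: 16 MiB (example value). */
    MPI_Info_set(info, "cb_buffer_size", "16777216");

    MPI_File_open(MPI_COMM_WORLD, "ckpt.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
    MPI_Info_free(&info);
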
Trying romio_cb_write/read = automatic gets very poor performance.
Similarly, pure non-collective writes get very poor performance. It
seems like having too many simultaneous writers/readers performs poorly
on their configuration ... so:

They customized the test case to coordinate/flow-control the
non-collective I/O, and they get great performance. They only have N
simultaneous writers/readers active. They pass a token around and take
turns. It's almost like having N aggregators, but without the
collective I/O overhead to pass the data around. Instead they pass a
small token and take turns writing the large, non-interleaved
contiguous data blocks.

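If it helps to make that concrete, here's a minimal sketch of the kind
of token passing I mean (the names NWRITERS, TOKEN_TAG, and
flow_controlled_write are mine, not theirs; it assumes every rank
writes one contiguous block at a known offset):

    #include <mpi.h>

    #define NWRITERS  8    /* at most N simultaneous writers */
    #define TOKEN_TAG 42

    /* Ranks form NWRITERS chains (rank r's predecessor is
       r - NWRITERS, its successor r + NWRITERS), so at most one
       rank per chain is writing at any time. */
    static void flow_controlled_write(MPI_File fh, MPI_Offset offset,
                                      const void *buf, int count)
    {
        int rank, size, token = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Wait for the token unless we head a chain. */
        if (rank - NWRITERS >= 0)
            MPI_Recv(&token, 1, MPI_INT, rank - NWRITERS, TOKEN_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Large contiguous independent write; no collective
           overhead. */
        MPI_File_write_at(fh, offset, (void *)buf, count, MPI_BYTE,
                          MPI_STATUS_IGNORE);

        /* Hand the token to the next rank in this chain. */
        if (rank + NWRITERS < size)
            MPI_Send(&token, 1, MPI_INT, rank + NWRITERS, TOKEN_TAG,
                     MPI_COMM_WORLD);
    }

The only coordination traffic is the tiny token message; the data
itself never moves between ranks the way it does with aggregators.
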
I'm not aware of anything in MPI-IO or ROMIO that would do this. Has
this been explored by the experts (meaning you guys)?

Bob Cernohous: (T/L 553) 507-253-6093

BobC@us.ibm.com
IBM Rochester, Building 030-2(C335), Department 61L
3605 Hwy 52 North, Rochester, MN 55901-7829

> Chaos reigns within.
> Reflect, repent, and reboot.
> Order shall return.