[mpich2-dev] More ROMIO performance questions
Bob Cernohous
bobc at us.ibm.com
Tue Sep 15 14:58:07 CDT 2009
Rob Latham wrote on 09/15/2009 02:39:30 PM:
> Isn't that what 'bgl_nodes_pset' is supposed to address? If you are
> i/o rich or i/o poor, aggregate down to 'bgl_nodes_pset' aggregators
> per io node. There are tons of things ROMIO can do in collective I/O.
Yes, exactly. Once we finally got the bgl_nodes_pset and cb_buffer_size
hints set right, they got reasonable performance, though still not as good
as their customized test case. I'm still looking into this a bit.
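For reference, these are ordinary MPI_Info hints passed at open time. A
minimal sketch in C (the values are illustrative, not our actual settings):

#include <mpi.h>

int open_with_hints(MPI_Comm comm, const char *path, MPI_File *fh)
{
    MPI_Info info;
    MPI_Info_create(&info);

    /* aggregators per pset (BG/L-specific ROMIO hint); "8" is just an
     * example value */
    MPI_Info_set(info, "bgl_nodes_pset", "8");
    /* collective buffering buffer size on each aggregator, in bytes */
    MPI_Info_set(info, "cb_buffer_size", "16777216");

    int rc = MPI_File_open(comm, (char *)path,
                           MPI_MODE_CREATE | MPI_MODE_WRONLY, info, fh);
    MPI_Info_free(&info);
    return rc;
}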
> If you pass around a token in the MPI layer, you can easily starve
> processes. Independent I/O means there's no guarantee when any
> process will be in that call. So, do you have rank 1 give up the
> token after exiting MPI_FILE_WRITE? Who does he pass to? Will they
> be in an MPI call and able to make progress on the receive? Do you
> have rank 4 take the token from someone when he's ready to do
> I/O?
>
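Agreed, and a naive token scheme over independent writes makes the
starvation easy to see. A hypothetical sketch (not anything in ROMIO;
token_write and the ring ordering are made up for illustration):

#include <mpi.h>

/* Rank r blocks until it receives the token, writes, then forwards to
 * r+1. If the next rank is never in an MPI call, or never does I/O at
 * all, every rank behind it in the ring waits forever. */
void token_write(MPI_File fh, int rank, int nprocs,
                 void *buf, int count, MPI_Offset off)
{
    int token = 0;

    if (rank != 0)   /* may block indefinitely on a slow/absent writer */
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    MPI_File_write_at(fh, off, buf, count, MPI_BYTE, MPI_STATUS_IGNORE);

    if (rank != nprocs - 1)  /* receiver may not be ready to progress */
        MPI_Send(&token, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
}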
Our thought was to do this within collective I/O. At some point, instead
of collecting/moving large contiguous buffers and writing at the
aggregator, pass around the token and write at each node in the set.
Either way, data is written cb_block_size at a time; it just saves
shipping cb_buffer_size worth of data around. This is different from
romio_cb_write=automatic because I don't want large contiguous buffers to
switch back completely to independent writes. Maybe
romio_cb_write=coordinated :)
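Something like this, purely as a hypothetical sketch (coordinated_write,
pset_comm, and my_block_offset are made-up names, not ROMIO internals):

/* Inside the collective write, nodes in a pset take turns writing
 * their own data in place of shipping it to the aggregator. Since
 * everyone is already inside the collective call, the token hand-off
 * cannot starve the way it can with independent I/O. */
void coordinated_write(MPI_File fh, MPI_Comm pset_comm,
                       void *buf, int count, MPI_Offset my_block_offset)
{
    int rank, size, token = 0;
    MPI_Comm_rank(pset_comm, &rank);
    MPI_Comm_size(pset_comm, &size);

    if (rank != 0)
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, pset_comm,
                 MPI_STATUS_IGNORE);

    /* write this node's block directly at its file offset */
    MPI_File_write_at(fh, my_block_offset, buf, count, MPI_BYTE,
                      MPI_STATUS_IGNORE);

    if (rank != size - 1)
        MPI_Send(&token, 1, MPI_INT, rank + 1, 0, pset_comm);
}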
Anyway, I think my question has been answered: it isn't possible in
MPI-IO today. Obviously customized apps can do whatever they like.
Meanwhile I need to pursue the configuration and look for the underlying
problem or limitation.