[mpich-discuss] Question on MPI-IO
Rob Latham
robl at mcs.anl.gov
Tue Jun 5 12:39:22 CDT 2012
On Tue, Jun 05, 2012 at 09:58:36AM -0500, Rob Latham wrote:
> On Mon, May 28, 2012 at 10:57:38AM -0300, Luiz Carlos da Costa Junior wrote:
> > Hello,
> >
> > I have an implementation in which I have to distribute the same data across
> > all processes. Because of concurrent access issues, a master process is
> > responsible for reading the file and then distributing its contents using
> > MPI_BCAST.
> > Now I have a new file in the same situation.
> >
> > I have read a little about MPI-IO and I am considering using it.
> > Is it suitable for situations like this? Can someone please tell me
> > what I should expect when comparing my current implementation with MPI-IO?
>
> You can certainly use MPI-IO to request that all processes read the
> same data. I caution you though that this request might not be
> handled particularly efficiently -- it's actually quite difficult for
> the MPI-IO layer to detect "everybody read the same region" workloads.
Talking it over with Rajeev, I have probably over-compensated for a
problem we've run into on BlueGene systems.
Go ahead and call "MPI_File_read_all()" from every process and
everything will likely work just fine.
==rob
> Your "read and broadcast" approach will work well in an I/O context,
> and probably deliver better performance.
>
> If you have situations where your data is decomposed across
> processors, such as each processor reading a different set of initial
> conditions, or each processor contributing data to an output file,
> then consider MPI-IO.
>
> Even better, maybe a library like HDF5 or Parallel-NetCDF will be
> better suited to your needs.
>
> ==rob
>
>
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA