[mpich-discuss] read the same file on all processes
Wei-keng Liao
wkliao at ece.northwestern.edu
Wed Oct 22 01:43:26 CDT 2008
You can use a collective read, MPI_File_read_all(). Internally, MPICH will
make each process read only 1/P of the file (assuming there are P
processes) and then redistribute the data to all the other processes. At the
end, every process has the entire file.
The code looks like this:
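MPI_File   fh;
MPI_Offset filesize;
MPI_Status status;
/* filename and buf (at least filesize bytes) are set up by the caller */
MPI_File_open(MPI_COMM_WORLD, filename, MPI_MODE_RDONLY,
              MPI_INFO_NULL, &fh);
MPI_File_get_size(fh, &filesize);
/* the count argument is an int, so this assumes a file under 2 GB */
MPI_File_read_all(fh, buf, (int)filesize, MPI_BYTE, &status);
MPI_File_close(&fh);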
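To make the 1/P split concrete, here is a minimal sketch of how a file could be divided into nearly equal contiguous byte ranges, one per process. This only illustrates the arithmetic; it is not the code MPICH uses internally, and the helper name block_range and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI or MPICH): compute the byte range
 * [*offset, *offset + *length) that process `rank` out of `nprocs` would
 * read if a file of `filesize` bytes were split into nearly equal
 * contiguous pieces. */
void block_range(long long filesize, int nprocs, int rank,
                 long long *offset, long long *length)
{
    assert(filesize >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    long long base  = filesize / nprocs;  /* bytes every process gets */
    long long extra = filesize % nprocs;  /* leftover bytes to spread out */

    /* The first `extra` ranks each take one additional byte. */
    *length = base + (rank < extra ? 1 : 0);
    *offset = rank * base + (rank < extra ? rank : extra);
}
```

For a 10-byte file and 4 processes, ranks 0 to 3 would get offsets 0, 3, 6, 8 with lengths 3, 3, 2, 2, so the pieces cover the file exactly once.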
Wei-keng
On Wed, 22 Oct 2008, Luiz Carlos da Costa Junior wrote:
> Hi all,
> Let me join this conversation. I also "suffer" from these doubts.
> In my case, I have an application in two versions, Windows (NTFS) and Linux
> (FAT32), and I first implemented the first approach (making a separate
> copy for each machine).
>
> But recently I started to deal with bigger files (200 MB to 1 GB) and this
> became very inefficient. The reason, I suspect, is that even though we
> have multiple processes, there is just one hard disk device responsible for
> managing all these reads. In other words, this operation is
> intrinsically sequential and became a bottleneck (am I right?).
>
> I haven't changed my implementation yet, but I was thinking of moving to the
> second approach (rank 0 reads and broadcasts the data with MPI_Bcast),
> expecting better results.
>
> Does anyone have any experience?
>
> Actually, I am not sure if this will be better. I understand that MPI uses
> sockets to pass all messages, and a natural question is whether this
> operation is faster than reading from files.
>
> Best regards,
> Luiz
>
> On Wed, Oct 22, 2008 at 12:10 AM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>
> > How big is the file? What kind of file system is it on?
> >
> > Rajeev
> >
> > > -----Original Message-----
> > > From: owner-mpich-discuss at mcs.anl.gov
> > > [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of
> > > Kamaraju Kusumanchi
> > > Sent: Tuesday, October 21, 2008 8:27 PM
> > > To: mpich-discuss at mcs.anl.gov
> > > Subject: [mpich-discuss] read the same file on all processes
> > >
> > > Hi all,
> > >
> > > I have a file which needs to be read on all the processes
> > > of an MPI job. If I read the same file simultaneously on all
> > > the processes, will it cause any problems?
> > >
> > > I can think of two other options such as
> > >
> > > - make multiple copies of the same file and read a separate
> > > file on different processes
> > > - read the file on rank 0 process, then use MPI_Bcast and
> > > transfer the contents across the remaining processes.
> > >
> > > > Which approach should be preferred? I am thinking this
> > > > must be something encountered by others. So, if there is a
> > > > book/web page which explains this kind of thing, a pointer
> > > > to it would be most appreciated.
> > >
> > > regards
> > > raju
> > >
> > >
> >
> >
>