[MPICH] RE: Romio status and mailing list
Rajeev Thakur
thakur at mcs.anl.gov
Thu Mar 16 09:56:46 CST 2006
Sylvain,
One issue is that the MPI Standard defines the default error handler
to be MPI_ERRORS_RETURN for I/O, whereas it is MPI_ERRORS_ARE_FATAL for the
rest of MPI. Are you checking the error returns from the MPI-IO functions?
Rajeev
PS: To post to mpich-discuss, you need to subscribe to the list. See
http://www-unix.mcs.anl.gov/mpi/mpich2/maillist.htm
> -----Original Message-----
> From: Sylvain Jeaugey [mailto:sylvain.jeaugey at bull.net]
> Sent: Thursday, March 16, 2006 2:28 AM
> To: Rajeev Thakur
> Cc: 'Sylvain Jeaugey'; mpich-discuss at mcs.anl.gov
> Subject: RE: Romio status and mailing list
>
> Rajeev,
>
> Thanks for your mail.
> It would seem fine to me if the other processes caused an abort. Still,
> it doesn't happen (did I misconfigure MPICH?), and it often causes a hang
> (process 0 being stuck in barrier, send/recv or finalize), which is much
> more of an issue from my point of view.
>
> I think that aborting would be enough to guarantee a "clean" job
> termination and still keep good performance (gather being often much
> slower than bcast).
>
> Comments welcome.
>
> Cheers,
> Sylvain
>
> PS: I added mpich-discuss to CCs and removed *@mcs.anl.gov.
>
> On Tue, 14 Mar 2006, Rajeev Thakur wrote:
>
> > Sylvain,
> > With MPI_Bcast, at least one process will detect the inconsistent
> > parameter and complain, not necessarily the root. It's not essential
> > that the root be the complainer, I think.
> >
> > > From: Sylvain Jeaugey [mailto:sylvain.jeaugey at bull.net]
> > > Sent: Tuesday, March 14, 2006 5:51 AM
> > > Subject: Romio status and mailing list
> > >
> > > We have a few concerns about the code behaviour in error cases,
> > > especially on the argument checks done with MPI_Bcast. Indeed, using
> > > this function, the root process won't be able to detect an
> > > inconsistency. Thus, we're wondering if MPI_Gather wouldn't do the
> > > job better than Bcast (though it would be worse in terms of
> > > performance).
> > >
> > > Sylvain