[mpich-discuss] Hydra, Runtime error

Dave Goodell goodell at mcs.anl.gov
Wed Jan 26 09:19:08 CST 2011


On Jan 26, 2011, at 7:44 AM CST, Paul Hart wrote:

> I have been using mpd as the process manager. I would like to change to hydra since mpd is being deprecated. I compiled MPICH2-1.3.1 and was able to run the cpi example program. I then attempted to run another program and received the following error (ran in verbose mode for more info). I am able to run the same program using mpd.
>  
> To my knowledge no one in the community that uses this program (Fire Dynamics Simulator, open source CFD tailored to fire, produced by a community led by the National Institute of Standards and Technology) has attempted to use hydra. They are still running on mpd.
[...]
> Fatal error in PMPI_Gatherv: Internal MPI error!, error stack:
> PMPI_Gatherv(376).....: MPI_Gatherv failed(sbuf=0x27a6c40, scount=1, MPI_DOUBLE_PRECISION, rbuf=0x27a6c40, rcnts=0x25b9670, displs=0x25b96f0, MPI_DOUBLE_PRECISION, root=0, MPI_COMM_WORLD) failed
> MPIR_Gatherv_impl(189): 
> MPIR_Gatherv(102).....: 
> MPIR_Localcopy(346)...: memcpy arguments alias each other, dst=0x27a6c40 src=0x27a6c40 len=8
> APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)

It doesn't look like the problem has anything to do with hydra.  Your program is passing the same address as both the send and receive buffer to MPI_Gatherv (sbuf and rbuf are both 0x27a6c40 in the error stack, so dst==src), but MPI does not permit the send and recv buffers to alias each other.  We usually try to check for these sorts of things at a high level, but occasionally we miss the upper-level check and this lower-level check in MPIR_Localcopy triggers instead.  This check was added sometime after 1.2, IIRC, so you hit it because you upgraded, not because of hydra.
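
To make the "memcpy arguments alias each other" message concrete, here is a minimal sketch of the kind of overlap test such a check performs. This is not the MPICH source; ranges_alias is a hypothetical helper name, and the real check in MPIR_Localcopy may be written differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT MPICH code: returns 1 if the byte ranges
 * [dst, dst+len) and [src, src+len) overlap, and 0 otherwise. */
static int ranges_alias(const void *dst, const void *src, size_t len)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (len == 0)
        return 0;   /* empty ranges cannot overlap */

    /* The ranges overlap when the distance between the two start
     * addresses is smaller than the length being copied. */
    return (d < s) ? (s - d < len) : (d - s < len);
}
```

With the values from the error stack (dst and src both 0x27a6c40, len=8) the distance is zero, so a test like this reports an alias.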

The correct fix is to pass MPI_IN_PLACE as the value of sendbuf at the root process.

I'll put an error check in MPI_Gatherv in order to make the error a bit easier to understand.

-Dave


