[mpich-discuss] Internal memory allocation error?

Rajeev Thakur thakur at mcs.anl.gov
Sat Oct 18 16:39:12 CDT 2008


Yes, you do need MPI_Wait or MPI_Waitall, but you can call the MPI_Irecv as
early as possible and delay the Wait until just before you need the data.
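For concreteness, something along these lines (a minimal sketch only, assuming
a master on rank 0 sending a fixed number of small messages to each worker;
nmsgs and the other names are placeholders, not taken from your code). The two
points are the ones above: post enough MPI_Irecvs up front to match the
incoming sends, and defer the MPI_Waitall until the buffers are actually read.

  program irecv_sketch
    use mpi
    implicit none
    integer, parameter :: nmsgs = 4          ! assumed number of expected messages
    integer :: bufs(2, nmsgs), sendbuf(2)    ! one small buffer per expected message
    integer :: reqs(nmsgs)
    integer :: stats(MPI_STATUS_SIZE, nmsgs)
    integer :: rank, nprocs, i, dest, ierr

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

    if (rank == 0) then
       ! Master sends nmsgs small messages to every worker.
       do dest = 1, nprocs - 1
          do i = 1, nmsgs
             sendbuf = (/ dest, i /)
             call MPI_Send(sendbuf, 2, MPI_INTEGER, dest, i, &
                           MPI_COMM_WORLD, ierr)
          end do
       end do
    else
       ! Post all the receives as early as possible, so the incoming sends
       ! land in user buffers instead of piling up internally.
       do i = 1, nmsgs
          call MPI_Irecv(bufs(1, i), 2, MPI_INTEGER, 0, MPI_ANY_TAG, &
                         MPI_COMM_WORLD, reqs(i), ierr)
       end do

       ! ... other computation that does not touch bufs ...

       ! Delay the wait until just before the data is needed.
       call MPI_Waitall(nmsgs, reqs, stats, ierr)
       ! bufs is now safe to read
    end if

    call MPI_Finalize(ierr)
  end program irecv_sketch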

Rajeev

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
> Sent: Saturday, October 18, 2008 11:38 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] Internal memory allocation error?
> 
> Thanks Rajeev...since MPI_Irecv is nonblocking, should I pair 
> it up with an MPI_Wait to make sure I'm not trying to access 
> a buffer that hasn't been written to yet?
> 
> On Sat, Oct 18, 2008 at 9:38 AM, Rajeev Thakur 
> <thakur at mcs.anl.gov> wrote:
> > This can happen if the sender does too many sends and the receiver 
> > doesn't post receives fast enough. Try using MPI_Irecv and posting 
> > enough of them to match the incoming sends.
> >
> > Rajeev
> >
> >> -----Original Message-----
> >> From: owner-mpich-discuss at mcs.anl.gov 
> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
> >> Sent: Friday, October 17, 2008 4:19 PM
> >> To: mpich-discuss at mcs.anl.gov
> >> Subject: [mpich-discuss] Internal memory allocation error?
> >>
> >> Hello list-
> >>
> >> I have a Fortran 90 program that loops over pixels in an image in
> >> parallel.  There are 211K total pixels in the field-of-view, and the
> >> code always crashes around the 160,000th pixel, give or take a hundred
> >> or so, with the following message:
> >>
> >> Fatal error in MPI_Recv: Other MPI error, error stack:
> >> MPI_Recv(186).............................:
> >> MPI_Recv(buf=0x82210d0, count=2, MPI_INTEGER, src=0, tag=MPI_ANY_TAG,
> >> MPI_COMM_WORLD, status=0x82210e0) failed
> >> MPIDI_CH3i_Progress_wait(214).............: an error occurred while
> >> handling an event returned by MPIDU_Sock_Wait()
> >> MPIDI_CH3I_Progress_handle_sock_event(436):
> >> MPIDI_EagerContigIsend(567)...............: failure occurred while
> >> allocating memory for a request object[cli_2]: aborting job:
> >>
> >> Now, I have no dynamically allocatable variables in the code, so does
> >> the error mean there is not enough memory in the buffer for all the
> >> communication at this step?  I have increased MP_BUFFER_MEM from the
> >> default 64M to 128M with no change in the error.  Is it possible that
> >> I'm just trying to do too much at once with my dual-core processor?
> >> I wouldn't think so; I'm only running the code with 6 processes...and
> >> I don't believe this is a data problem.
> >>
> >> Any ideas would be appreciated, and I can post any other information
> >> anyone wants.  Thanks!
> >>
> >>
> >>
> >> --
> >> Cheers,
> >> Brian
> >> brian.harker at gmail.com
> >>
> >>
> >> "In science, there is only physics; all the rest is 
> stamp-collecting."
> >>
> >> -Ernest Rutherford
> >>
> >>
> >
> >
> 
> 
> 
> --
> Cheers,
> Brian
> brian.harker at gmail.com
> 
> 
> "In science, there is only physics; all the rest is stamp-collecting."
> 
> -Ernest Rutherford
> 
> 



