[mpich-discuss] Internal memory allocation error?

Brian Harker brian.harker at gmail.com
Sun Oct 19 16:39:04 CDT 2008


Hello Rajeev and list-

Well, I've replaced MPI_Send with MPI_Isend and MPI_Recv with MPI_Irecv,
and I delay the corresponding MPI_Wait calls as late as I possibly can
while doing the intermediate calculations, but I still get the error.  The
error even shows up when I use only one slave process to do the
calculations (in essence, the serial version of the algorithm).
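
For what it's worth, here is a stripped-down sketch of the pattern as I
understand it (the buffer name, count, and tag below are invented for
illustration, not taken from my actual code): post the MPI_Irecv as early
as possible, do the intermediate work, and call MPI_Wait only when the
data is actually needed.

program nonblocking_sketch
  ! Sketch only: rank 0 sends a 2-integer pixel descriptor, rank 1 posts
  ! the Irecv early, overlaps some local work, and calls MPI_Wait just
  ! before the data is needed.  Run with 2 (or more) processes.
  use mpi
  implicit none
  integer :: rank, ierr, request
  integer :: status(MPI_STATUS_SIZE)
  integer :: pixel_info(2)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  if (rank == 0) then
     pixel_info = (/ 42, 7 /)   ! hypothetical pixel index and flag
     call MPI_Send(pixel_info, 2, MPI_INTEGER, 1, 1, MPI_COMM_WORLD, ierr)
  else if (rank == 1) then
     ! post the receive as early as possible
     call MPI_Irecv(pixel_info, 2, MPI_INTEGER, 0, MPI_ANY_TAG, &
                    MPI_COMM_WORLD, request, ierr)

     ! ... intermediate calculations that do not touch pixel_info ...

     ! complete the receive just before the data is needed
     call MPI_Wait(request, status, ierr)
     print *, 'rank 1 got pixel_info =', pixel_info
  end if

  call MPI_Finalize(ierr)
end program nonblocking_sketch

The important part, as I understand it, is that nothing reads pixel_info
between the MPI_Irecv and the MPI_Wait.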

Is there a limit on the tag value that accompanies MPI_Send?
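
(From the man pages it looks like the standard only guarantees tags up to
32767, and an implementation's actual upper bound is exposed through the
MPI_TAG_UB attribute.  Is something like the sketch below the right way to
query it?  MPI_Attr_get is the older MPI-1 spelling of the same query.

program check_tag_ub
  ! Sketch: query the largest tag value this MPI implementation accepts.
  ! MPI_TAG_UB is a predefined attribute of MPI_COMM_WORLD; the standard
  ! only guarantees tags up to 32767.
  use mpi
  implicit none
  integer :: rank, ierr
  integer(kind=MPI_ADDRESS_KIND) :: tag_ub
  logical :: flag

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  call MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, tag_ub, flag, ierr)
  if (rank == 0 .and. flag) print *, 'largest usable tag:', tag_ub

  call MPI_Finalize(ierr)
end program check_tag_ub
)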



On Sat, Oct 18, 2008 at 3:39 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> Yes, you do need MPI_Wait or MPI_Waitall but you can call the Irecv as early
> as possible and delay the Wait until just before you need the data.
>
> Rajeev
>
>> -----Original Message-----
>> From: owner-mpich-discuss at mcs.anl.gov
>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>> Sent: Saturday, October 18, 2008 11:38 AM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] Internal memory allocation error?
>>
>> Thanks Rajeev...since MPI_Irecv is nonblocking, should I pair
>> it up with an MPI_Wait to make sure I'm not trying to access
>> a buffer that hasn't been written to yet?
>>
>> On Sat, Oct 18, 2008 at 9:38 AM, Rajeev Thakur
>> <thakur at mcs.anl.gov> wrote:
>> > This can happen if the sender does too many sends and the receiver
>> > doesn't post receives fast enough. Try using MPI_Irecv and posting
>> > enough of them to match the incoming sends.
>> >
>> > Rajeev
>> >
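
[Just to check my understanding of this suggestion: is the idea something
like the sketch below, where the receiver already has every MPI_Irecv
posted before the sends start arriving and completes them all with
MPI_Waitall?  The message count, sizes, and tags are invented for
illustration, not taken from my code.]

program prepost_sketch
  ! Sketch: rank 0 fires off nsends small messages; rank 1 has a matching
  ! MPI_Irecv posted for every one of them before completing any, then
  ! finishes with MPI_Waitall.  Run with 2 (or more) processes.
  use mpi
  implicit none
  integer, parameter :: nsends = 100
  integer :: rank, ierr, i
  integer :: requests(nsends)
  integer :: statuses(MPI_STATUS_SIZE, nsends)
  integer :: work(2, nsends)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  if (rank == 0) then
     do i = 1, nsends
        work(1, i) = i
        work(2, i) = 2 * i
        call MPI_Send(work(1, i), 2, MPI_INTEGER, 1, i, MPI_COMM_WORLD, ierr)
     end do
  else if (rank == 1) then
     ! post every receive up front so incoming sends always find a match
     do i = 1, nsends
        call MPI_Irecv(work(1, i), 2, MPI_INTEGER, 0, i, &
                       MPI_COMM_WORLD, requests(i), ierr)
     end do

     ! ... intermediate work could go here ...

     call MPI_Waitall(nsends, requests, statuses, ierr)
     print *, 'rank 1 completed', nsends, 'receives'
  end if

  call MPI_Finalize(ierr)
end program prepost_sketch
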
>> >> -----Original Message-----
>> >> From: owner-mpich-discuss at mcs.anl.gov
>> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>> >> Sent: Friday, October 17, 2008 4:19 PM
>> >> To: mpich-discuss at mcs.anl.gov
>> >> Subject: [mpich-discuss] Internal memory allocation error?
>> >>
>> >> Hello list-
>> >>
>> >> I have a Fortran 90 program that loops over pixels in an image in
>> >> parallel.  There are 211K total pixels in the field-of-view, and the
>> >> code always crashes around the 160,000th pixel, give or take a
>> >> hundred or so, with the following message:
>> >>
>> >> Fatal error in MPI_Recv: Other MPI error, error stack:
>> >> MPI_Recv(186).............................:
>> >> MPI_Recv(buf=0x82210d0, count=2, MPI_INTEGER, src=0, tag=MPI_ANY_TAG,
>> >> MPI_COMM_WORLD, status=0x82210e0) failed
>> >> MPIDI_CH3i_Progress_wait(214).............: an error occurred while
>> >> handling an event returned by MPIDU_Sock_Wait()
>> >> MPIDI_CH3I_Progress_handle_sock_event(436):
>> >> MPIDI_EagerContigIsend(567)...............: failure occurred while
>> >> allocating memory for a request object[cli_2]: aborting job:
>> >>
>> >> Now, I have no dynamically allocated variables in the code, so does
>> >> the error mean there is not enough memory in the buffer for all the
>> >> communication at this step?  I have increased MP_BUFFER_MEM from the
>> >> default 64M to 128M with no change in the error.  Is it possible that
>> >> I'm just trying to do too much at once with my dual-core processor?
>> >> I wouldn't think so, since I'm only running the code with 6
>> >> processes...and I don't believe this is a data problem.
>> >>
>> >> Any ideas would be appreciated, and I can post any other information
>> >> anyone wants.  Thanks!
>> >>
>> >>
>> >>
>> >
>> >
>>
>>
>>
>
>



-- 
Cheers,
Brian
brian.harker at gmail.com


"In science, there is only physics; all the rest is stamp-collecting."

-Ernest Rutherford



