[mpich-discuss] Internal memory allocation error?
Rob Ross
rross at mcs.anl.gov
Sat Oct 18 11:48:50 CDT 2008
Hi Brian,
Yes, you definitely need to wait for the request to complete before
accessing the buffer.
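For example, a minimal sketch of the Irecv/Wait pairing (the 2-integer
buffer, ranks, and tag below are placeholders, not taken from your code):

   program irecv_wait_sketch
     use mpi
     implicit none
     integer :: rank, ierr, request
     integer :: status(MPI_STATUS_SIZE)
     integer :: buf(2)

     call MPI_Init(ierr)
     call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

     if (rank == 0) then
        buf = (/ 1, 2 /)
        call MPI_Send(buf, 2, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr)
     else if (rank == 1) then
        call MPI_Irecv(buf, 2, MPI_INTEGER, 0, MPI_ANY_TAG, &
                       MPI_COMM_WORLD, request, ierr)
        ! ... do other work that does not touch buf ...
        call MPI_Wait(request, status, ierr)
        ! only now is buf guaranteed to hold the received data
        print *, 'rank 1 received', buf
     end if

     call MPI_Finalize(ierr)
   end program irecv_wait_sketch

The same rule applies with MPI_Waitall or MPI_Test if you have several
requests in flight.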
Rob
On Oct 18, 2008, at 11:37 AM, Brian Harker wrote:
> Thanks, Rajeev. Since MPI_Irecv is nonblocking, should I pair it with
> an MPI_Wait to make sure I'm not trying to access a buffer that
> hasn't been written to yet?
>
> On Sat, Oct 18, 2008 at 9:38 AM, Rajeev Thakur <thakur at mcs.anl.gov>
> wrote:
>> This can happen if the sender does too many sends and the receiver
>> doesn't
>> post receives fast enough. Try using MPI_Irecv and posting enough
>> of them to
>> match the incoming sends.
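>> Something along these lines (just a sketch; nmsgs and the 2-integer
>> payload are placeholders for your actual message count and layout):
>>
>>    integer, parameter :: nmsgs = 100
>>    integer :: bufs(2, nmsgs), requests(nmsgs)
>>    integer :: statuses(MPI_STATUS_SIZE, nmsgs)
>>    integer :: i, ierr
>>
>>    ! pre-post all the receives so the incoming sends have a place to land
>>    do i = 1, nmsgs
>>       call MPI_Irecv(bufs(1,i), 2, MPI_INTEGER, 0, MPI_ANY_TAG, &
>>                      MPI_COMM_WORLD, requests(i), ierr)
>>    end do
>>
>>    ! ... the sender issues its matching MPI_Sends ...
>>
>>    ! wait for all of them to complete before touching bufs
>>    call MPI_Waitall(nmsgs, requests, statuses, ierr)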
>>
>> Rajeev
>>
>>> -----Original Message-----
>>> From: owner-mpich-discuss at mcs.anl.gov
>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>>> Sent: Friday, October 17, 2008 4:19 PM
>>> To: mpich-discuss at mcs.anl.gov
>>> Subject: [mpich-discuss] Internal memory allocation error?
>>>
>>> Hello list-
>>>
>>> I have a Fortran 90 program that loops over the pixels of an
>>> image in parallel. There are 211K total pixels in the
>>> field-of-view, and the code always crashes around the 160,000th
>>> pixel, give or take a hundred or so, with the following message:
>>>
>>> Fatal error in MPI_Recv: Other MPI error, error stack:
>>> MPI_Recv(186).............................:
>>> MPI_Recv(buf=0x82210d0, count=2, MPI_INTEGER, src=0,
>>> tag=MPI_ANY_TAG, MPI_COMM_WORLD,
>>> status=0x82210e0) failed
>>> MPIDI_CH3i_Progress_wait(214).............: an error occurred
>>> while handling an event returned by MPIDU_Sock_Wait()
>>> MPIDI_CH3I_Progress_handle_sock_event(436):
>>> MPIDI_EagerContigIsend(567)...............: failure occurred
>>> while allocating memory for a request object[cli_2]: aborting job:
>>>
>>> Now, I have no dynamically allocated variables in the code, so
>>> does the error mean there is not enough memory in the buffer
>>> for all the communication at this step? I have increased
>>> MP_BUFFER_MEM from the default 64M to 128M with no change in
>>> the error. Is it possible that I'm just trying to do too
>>> much at once on my dual-core processor? I wouldn't think
>>> so, since I'm only running the code with 6 processes, and I
>>> don't believe this is a data problem.
>>>
>>> Any ideas would be appreciated, and I can post any other
>>> information anyone wants. Thanks!
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Brian
>>> brian.harker at gmail.com
>>>
>>>
>>> "In science, there is only physics; all the rest is stamp-
>>> collecting."
>>>
>>> -Ernest Rutherford
>>>
>>>
>>
>>
>
>
>
> --
> Cheers,
> Brian
> brian.harker at gmail.com
>
>
> "In science, there is only physics; all the rest is stamp-collecting."
>
> -Ernest Rutherford
>