[mpich-discuss] Internal memory allocation error?

Brian Harker brian.harker at gmail.com
Sun Oct 19 18:47:36 CDT 2008


Hi Gib and list-

nx and ny are declared as constant parameters in module "subroutines"
before the "contains" line, so they should be accessible to any scope
that "use"s the "subroutines" module.  The variable "numsent" is
updated every time the master process sends new pixel coordinates to a
slave process.  I have extensively checked the distribution of pixel
coordinates, and it is as it should be.  Thanks!
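
For reference, a minimal sketch of the layout I'm describing (the real
module is larger, of course; only the relevant declarations are shown):

  module subroutines
    implicit none
    ! constant parameters, visible to any scope that use-associates
    ! this module
    integer, parameter :: nx = 415
    integer, parameter :: ny = 509
  contains
    subroutine invert_pixel(pxl)
      integer, intent(in) :: pxl(2)
      ! ... per-pixel calculation ...
    end subroutine invert_pixel
  end module subroutines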

On Sun, Oct 19, 2008 at 5:31 PM, Gib Bogle <g.bogle at auckland.ac.nz> wrote:
> Hi Brian,
>
> I've just looked quickly at the first few lines of your main program, and I
> see a couple of odd things.
>
> (1) You say that nx and ny are set in the module subroutines, but I don't
> see any call to a subroutine to set nx and ny before they are used.
>
> (2) Assuming that somehow you initialize nx to 415 and ny to 509, I don't
> understand these lines:
>
>  pxl(1) = INT(numsent/ny) + 1
>  pxl(2) = MOD(numsent,ny) + 1
>
> since numsent < proc_num, which makes pxl(1) = 1 and pxl(2) = numsent+1.
> Is this what you want?
>
> Gib
>
> Brian Harker wrote:
>>
>> Hi Rajeev and list-
>>
>> Here's a code sample.  I'm assuming you could replace my subroutine
>> "invert_pixel" with a dummy subroutine, and the integer parameters nx and
>> ny (415 and 509 in my code) with something else.  BTW, I am using
>> MPICH2 1.0.7 with the Intel icc/icpc/ifort compiler suite.  Thanks a
>> lot!
>>
>> On Sun, Oct 19, 2008 at 3:59 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>>>
>>> Can you send us a code fragment that shows exactly what you are doing and
>>> how many sends/recvs are being issued? You don't need to change sends to
>>> isends, just the recvs.
>>>
>>> Rajeev
>>>
>>>> -----Original Message-----
>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>>>> Sent: Sunday, October 19, 2008 4:39 PM
>>>> To: mpich-discuss at mcs.anl.gov
>>>> Subject: Re: [mpich-discuss] Internal memory allocation error?
>>>>
>>>> Hello Rajeev and list-
>>>>
>>>> Well, I've replaced MPI_Send with MPI_Isend and MPI_Recv with
>>>> MPI_Irecv, placing the corresponding MPI_Wait calls as late as I
>>>> possibly can while doing the intermediate calculations, and I
>>>> still get the error.  The error even comes up when I use only
>>>> one slave process to do the calculations (in essence, the
>>>> serial version of the algorithm).
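>>>>
>>>> Schematically, the slave side now looks something like this (a
>>>> condensed sketch, not my actual code; assumes the usual "use mpi",
>>>> with buffers and tags trimmed down):
>>>>
>>>>   integer :: req, ierr, pxl(2)
>>>>   integer :: stat(MPI_STATUS_SIZE)
>>>>
>>>>   ! post the receive for the next pixel coordinates as early
>>>>   ! as possible...
>>>>   call MPI_Irecv(pxl, 2, MPI_INTEGER, 0, MPI_ANY_TAG, &
>>>>                  MPI_COMM_WORLD, req, ierr)
>>>>   ! ... intermediate calculations that don't need pxl ...
>>>>   ! ...and wait only when the coordinates are actually needed
>>>>   call MPI_Wait(req, stat, ierr)
>>>>   call invert_pixel(pxl)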
>>>>
>>>> Is there a limit on the tag value that accompanies the MPI_Send?
>>>>
>>>>
>>>>
>>>> On Sat, Oct 18, 2008 at 3:39 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>>>>>
>>>>> Yes, you do need MPI_Wait or MPI_Waitall, but you can call the Irecv as
>>>>> early as possible and delay the Wait until just before you need the data.
>>>>>
>>>>> Rajeev
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>>>>>> Sent: Saturday, October 18, 2008 11:38 AM
>>>>>> To: mpich-discuss at mcs.anl.gov
>>>>>> Subject: Re: [mpich-discuss] Internal memory allocation error?
>>>>>>
>>>>>> Thanks Rajeev...since MPI_Irecv is nonblocking, should I pair it up
>>>>>> with an MPI_Wait to make sure I'm not trying to access a buffer that
>>>>>> hasn't been written to yet?
>>>>>>
>>>>>> On Sat, Oct 18, 2008 at 9:38 AM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>> This can happen if the sender does too many sends and the receiver
>>>>>>> doesn't post receives fast enough. Try using MPI_Irecv and posting
>>>>>>> enough of them to match the incoming sends.
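>>>>>>>
>>>>>>> For instance, something along these lines (just a sketch; NPENDING,
>>>>>>> the number of receives kept posted, is a made-up name):
>>>>>>>
>>>>>>>   integer, parameter :: NPENDING = 16
>>>>>>>   integer :: reqs(NPENDING), stats(MPI_STATUS_SIZE, NPENDING)
>>>>>>>   integer :: bufs(2, NPENDING), i, ierr
>>>>>>>
>>>>>>>   ! pre-post a batch of receives so each incoming send finds a
>>>>>>>   ! matching receive already waiting
>>>>>>>   do i = 1, NPENDING
>>>>>>>      call MPI_Irecv(bufs(1,i), 2, MPI_INTEGER, 0, MPI_ANY_TAG, &
>>>>>>>                     MPI_COMM_WORLD, reqs(i), ierr)
>>>>>>>   end do
>>>>>>>
>>>>>>>   ! complete them later (re-posting as they finish)
>>>>>>>   call MPI_Waitall(NPENDING, reqs, stats, ierr)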
>>>>>>>
>>>>>>> Rajeev
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>>>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>>>>>>>> Sent: Friday, October 17, 2008 4:19 PM
>>>>>>>> To: mpich-discuss at mcs.anl.gov
>>>>>>>> Subject: [mpich-discuss] Internal memory allocation error?
>>>>>>>>
>>>>>>>> Hello list-
>>>>>>>>
>>>>>>>> I have a Fortran 90 program that loops over pixels in an image in
>>>>>>>> parallel.  There are 211K total pixels in the field-of-view, and the
>>>>>>>> code always crashes around the 160K-th pixel, give or take a hundred
>>>>>>>> or so, with the following message:
>>>>>>>>
>>>>>>>> Fatal error in MPI_Recv: Other MPI error, error stack:
>>>>>>>> MPI_Recv(186).............................:
>>>>>>>> MPI_Recv(buf=0x82210d0, count=2, MPI_INTEGER, src=0, tag=MPI_ANY_TAG,
>>>>>>>> MPI_COMM_WORLD, status=0x82210e0) failed
>>>>>>>> MPIDI_CH3i_Progress_wait(214).............: an error occurred while
>>>>>>>> handling an event returned by MPIDU_Sock_Wait()
>>>>>>>> MPIDI_CH3I_Progress_handle_sock_event(436):
>>>>>>>> MPIDI_EagerContigIsend(567)...............: failure occurred while
>>>>>>>> allocating memory for a request object[cli_2]: aborting job:
>>>>>>>>
>>>>>>>> Now, I have no dynamically allocatable variables in the code, so the
>>>>>>>> error means there is not enough memory in the buffer for all the
>>>>>>>> communication at this step?  I have increased MP_BUFFER_MEM from the
>>>>>>>> default 64M to 128M with no change in the error.  Is it possible that
>>>>>>>> I'm just trying to do too much at once with my dual-core processor?
>>>>>>>> I wouldn't think so; I'm only running the code with 6 processes...and
>>>>>>>> I don't believe this is a data problem.
>>>>>>>>
>>>>>>>> Any ideas would be appreciated, and I can post any other information
>>>>>>>> anyone wants.  Thanks!
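>>>>>>>>
>>>>>>>> For context, the master loop is organized roughly like this (a
>>>>>>>> condensed sketch, not the actual code; "result", "nres", and "tag"
>>>>>>>> are stand-in names):
>>>>>>>>
>>>>>>>>   integer :: status(MPI_STATUS_SIZE), pxl(2)
>>>>>>>>   integer :: numsent, tag, ierr
>>>>>>>>   double precision :: result(nres)   ! per-pixel result buffer
>>>>>>>>
>>>>>>>>   ! hand each returning slave the coordinates of the next pixel
>>>>>>>>   do while (numsent < nx*ny)
>>>>>>>>      call MPI_Recv(result, nres, MPI_DOUBLE_PRECISION, &
>>>>>>>>                    MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &
>>>>>>>>                    status, ierr)
>>>>>>>>      pxl(1) = INT(numsent/ny) + 1
>>>>>>>>      pxl(2) = MOD(numsent,ny) + 1
>>>>>>>>      call MPI_Send(pxl, 2, MPI_INTEGER, status(MPI_SOURCE), &
>>>>>>>>                    tag, MPI_COMM_WORLD, ierr)
>>>>>>>>      numsent = numsent + 1
>>>>>>>>   end do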
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cheers,
>>>>>>>> Brian
>>>>>>>> brian.harker at gmail.com
>>>>>>>>
>>>>>>>>
>>>>>>>> "In science, there is only physics; all the rest is
>>>>>>
>>>>>> stamp-collecting."
>>>>>>>>
>>>>>>>> -Ernest Rutherford
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Cheers,
>>>>>> Brian
>>>>>> brian.harker at gmail.com
>>>>>>
>>>>>>
>>>>>> "In science, there is only physics; all the rest is
>>>>
>>>> stamp-collecting."
>>>>>>
>>>>>> -Ernest Rutherford
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Cheers,
>>>> Brian
>>>> brian.harker at gmail.com
>>>>
>>>>
>>>> "In science, there is only physics; all the rest is stamp-collecting."
>>>>
>>>> -Ernest Rutherford
>>>>
>>>>
>>>
>>
>>
>>
>
>



-- 
Cheers,
Brian
brian.harker at gmail.com


"In science, there is only physics; all the rest is stamp-collecting."

-Ernest Rutherford