[mpich-discuss] Internal memory allocation error?
Brian Harker
brian.harker at gmail.com
Mon Oct 20 11:16:20 CDT 2008
Hi Rajeev-
Yes, I am using 1.0.7, but I'm now considering installing 1.0.8rc1
just to see if it makes a difference.
On Mon, Oct 20, 2008 at 9:48 AM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> Are you using the latest release, 1.0.7?
>
> Rajeev
>
>> -----Original Message-----
>> From: owner-mpich-discuss at mcs.anl.gov
>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>> Sent: Monday, October 20, 2008 9:07 AM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] Internal memory allocation error?
>>
>> Hi Rajeev-
>>
>> After the master process sends the coordinates for around the
>> 162,000th pixel, the targeted slave process fails to receive
>> them, and I get the error from my original post. Thanks!
>>
>> On Sun, Oct 19, 2008 at 7:11 PM, Rajeev Thakur
>> <thakur at mcs.anl.gov> wrote:
>> > After how many sends and receives does it fail?
>> >
>> > Rajeev
>> >
>> >> -----Original Message-----
>> >> From: owner-mpich-discuss at mcs.anl.gov
>> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>> >> Sent: Sunday, October 19, 2008 7:28 PM
>> >> To: mpich-discuss at mcs.anl.gov
>> >> Subject: Re: [mpich-discuss] Internal memory allocation error?
>> >>
>> >> Hi Rajeev-
>> >>
>> >> The code I have posted is pared down quite a bit.  I print the
>> >> "numsent" variable after each send and receive to make sure the
>> >> correct pixel has been dispatched.  I guess I should mention that
>> >> if I run the same program on some other data that I have, which
>> >> has only 68K pixels, it works flawlessly.
>> >>
>> >> On Sun, Oct 19, 2008 at 5:57 PM, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
>> >> > Brian,
>> >> > This code should not even need an Irecv because the master and
>> >> > slave operate in lock step. And if you call Irecv followed
>> >> > immediately by a Wait, you could just as well call Recv.
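>> >> >
>> >> > For instance, the Irecv+Wait pair in your slave loop could be
>> >> > collapsed into a single blocking receive, something like:
>> >> >
>> >> >    ! blocking receive; same arguments, minus the request handle...
>> >> >    CALL MPI_Recv( pxl, 2, MPI_INTEGER, master, MPI_ANY_TAG, &
>> >> >                   MPI_COMM_WORLD, status, ierr )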
>> >> >
>> >> > Since it fails with even one slave, run the 1-master, 1-slave
>> >> > version and add print statements after the sends and receives
>> >> > to see how many sends and receives are happening.
>> >> >
>> >> > Rajeev
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: owner-mpich-discuss at mcs.anl.gov
>> >> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Brian Harker
>> >> >> Sent: Sunday, October 19, 2008 6:05 PM
>> >> >> To: mpich-discuss at mcs.anl.gov
>> >> >> Subject: Re: [mpich-discuss] Internal memory allocation error?
>> >> >>
>> >> >> Hi list-
>> >> >>
>> >> >> Here's my sample code for those who are wary of opening email
>> >> >> attachments :)  Didn't think about that until just now.
>> >> >>
>> >> >> PROGRAM mpi_main
>> >> >>
>> >> >>   USE subroutines  ! contains subroutine "invert_pixel"
>> >> >>   IMPLICIT NONE
>> >> >>
>> >> >>   INCLUDE "mpif.h"
>> >> >>
>> >> >>   ! total number of pixels
>> >> >>   INTEGER, PARAMETER :: np = nx * ny  ! nx, ny specified in module "subroutines"
>> >> >>
>> >> >>   ! parameter at each pixel
>> >> >>   DOUBLE PRECISION :: par
>> >> >>
>> >> >>   ! full-size parameter array
>> >> >>   DOUBLE PRECISION, DIMENSION(nx,ny) :: map
>> >> >>
>> >> >>   ! pixel coordinates
>> >> >>   INTEGER, DIMENSION(2) :: pxl
>> >> >>
>> >> >>   ! dummy termination message
>> >> >>   INTEGER, PARAMETER :: term = 0
>> >> >>
>> >> >>   ! MPI-related
>> >> >>   INTEGER :: proc_id
>> >> >>   INTEGER :: proc_num
>> >> >>   INTEGER :: master
>> >> >>   INTEGER :: ierr
>> >> >>   INTEGER :: sender
>> >> >>   INTEGER :: tag
>> >> >>   INTEGER :: numsent
>> >> >>   INTEGER :: request
>> >> >>   INTEGER, DIMENSION(MPI_STATUS_SIZE) :: status
>> >> >>
>> >> >>   ! local indices
>> >> >>   INTEGER :: i, j, x, y
>> >> >>
>> >> >> !********************************************************************!
>> >> >> !************************** BEGIN PROGRAM ***************************!
>> >> >> !********************************************************************!
>> >> >>
>> >> >>   master = 0
>> >> >>
>> >> >>   ! initialize MPI libraries...
>> >> >>   CALL MPI_Init( ierr )
>> >> >>   CALL MPI_Comm_size( MPI_COMM_WORLD, proc_num, ierr )
>> >> >>   CALL MPI_Comm_rank( MPI_COMM_WORLD, proc_id, ierr )
>> >> >>
>> >> >>   ! --------------------------------------------- !
>> >> >>   !              master process code              !
>> >> >>   ! --------------------------------------------- !
>> >> >>   IF ( proc_id == master ) THEN
>> >> >>
>> >> >>     ! dispatch first group of pixels to slave processes;
>> >> >>     ! pixel k (0-based) is sent with tag k+1...
>> >> >>     numsent = 0
>> >> >>     DO i = 1, proc_num - 1
>> >> >>       pxl(1) = INT(numsent/ny) + 1
>> >> >>       pxl(2) = MOD(numsent,ny) + 1
>> >> >>       CALL MPI_Send( pxl, 2, MPI_INTEGER, i, numsent+1, &
>> >> >>                      MPI_COMM_WORLD, ierr )
>> >> >>       numsent = numsent + 1
>> >> >>     END DO
>> >> >>
>> >> >>     mstr: DO
>> >> >>
>> >> >>       ! receive inferred parameter back from a slave...
>> >> >>       CALL MPI_Recv( par, 1, MPI_DOUBLE_PRECISION, MPI_ANY_SOURCE, &
>> >> >>                      MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr )
>> >> >>       sender = status(MPI_SOURCE)
>> >> >>       tag    = status(MPI_TAG)
>> >> >>
>> >> >>       ! assemble into full-size array (tag-1 is the 0-based pixel index)...
>> >> >>       x = INT((tag-1)/ny) + 1
>> >> >>       y = MOD(tag-1,ny) + 1
>> >> >>       map(x,y) = par
>> >> >>
>> >> >>       IF ( tag == np ) THEN
>> >> >>         ! all done, send termination message...
>> >> >>         DO j = 1, proc_num - 1
>> >> >>           CALL MPI_Send( term, 1, MPI_INTEGER, j, 0, &
>> >> >>                          MPI_COMM_WORLD, ierr )
>> >> >>         END DO
>> >> >>         EXIT mstr
>> >> >>       END IF
>> >> >>
>> >> >>       ! send the next available pixel to the same slave,
>> >> >>       ! unless all np pixels have already been dispatched...
>> >> >>       IF ( numsent < np ) THEN
>> >> >>         pxl(1) = INT(numsent/ny) + 1
>> >> >>         pxl(2) = MOD(numsent,ny) + 1
>> >> >>         CALL MPI_Send( pxl, 2, MPI_INTEGER, sender, numsent+1, &
>> >> >>                        MPI_COMM_WORLD, ierr )
>> >> >>         numsent = numsent + 1
>> >> >>       END IF
>> >> >>
>> >> >>     END DO mstr
>> >> >>
>> >> >>   ! --------------------------------------------- !
>> >> >>   !              slave process code               !
>> >> >>   ! --------------------------------------------- !
>> >> >>   ELSE
>> >> >>
>> >> >>     slv: DO
>> >> >>
>> >> >>       ! receive pixel coordinates from master...
>> >> >>       CALL MPI_Irecv( pxl, 2, MPI_INTEGER, master, MPI_ANY_TAG, &
>> >> >>                       MPI_COMM_WORLD, request, ierr )
>> >> >>       CALL MPI_Wait( request, status, ierr )
>> >> >>       tag = status(MPI_TAG)
>> >> >>       IF ( tag == 0 ) THEN
>> >> >>         ! all done, exit program...
>> >> >>         EXIT slv
>> >> >>       END IF
>> >> >>
>> >> >>       ! call genetic algorithm for inversion...
>> >> >>       CALL invert_pixel( pxl, par )
>> >> >>
>> >> >>       ! send parameter result back to master...
>> >> >>       CALL MPI_Send( par, 1, MPI_DOUBLE_PRECISION, master, tag, &
>> >> >>                      MPI_COMM_WORLD, ierr )
>> >> >>
>> >> >>     END DO slv
>> >> >>
>> >> >>   END IF
>> >> >>
>> >> >>   ! shut down MPI...
>> >> >>   CALL MPI_Finalize( ierr )
>> >> >>   WRITE(*,*) ""
>> >> >>   IF ( ierr == 0 ) WRITE(*,*) "Goodbye from process", proc_id
>> >> >>
>> >> >> !********************************************************************!
>> >> >> !*************************** END PROGRAM ****************************!
>> >> >> !********************************************************************!
>> >> >>
>> >> >>   STOP
>> >> >> END PROGRAM mpi_main
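>> >> >>
>> >> >> In case it matters, I build and run it with the usual MPICH
>> >> >> wrappers, roughly like this (the file names here are just
>> >> >> placeholders):
>> >> >>
>> >> >>    mpif90 -o mpi_main subroutines.f90 mpi_main.f90
>> >> >>    mpiexec -n 4 ./mpi_main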
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Cheers,
>> >> >> Brian
>> >> >> brian.harker at gmail.com
>> >> >>
>> >> >>
>> >> >> "In science, there is only physics; all the rest is
>> >> stamp-collecting."
>> >> >>
>> >> >> -Ernest Rutherford
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Cheers,
>> >> Brian
>> >> brian.harker at gmail.com
>> >>
>> >>
>> >> "In science, there is only physics; all the rest is
>> stamp-collecting."
>> >>
>> >> -Ernest Rutherford
>> >>
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Cheers,
>> Brian
>> brian.harker at gmail.com
>>
>>
>> "In science, there is only physics; all the rest is stamp-collecting."
>>
>> -Ernest Rutherford
>>
>>
>
>
--
Cheers,
Brian
brian.harker at gmail.com
"In science, there is only physics; all the rest is stamp-collecting."
-Ernest Rutherford