[mpich-discuss] MPICH2 hangs during debug
Prashanth
prashanth.dumpuri at gmail.com
Thu Dec 9 16:43:22 CST 2010
Thanks for all your help. My code works, but part of it still has memory
leaks. Hopefully the solution to that lies in the answers to these
questions:
Currently I let all the processors create and populate the same sequential
array. Is this thread-safe? Or should I let the root node create the array
and then broadcast it? If I let the root node create that array, how do I
ensure the other nodes/processors are not going to jump ahead and execute
the statements below the array creation?
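If I understand MPI_Bcast() correctly, the collective itself would keep the
other ranks from running ahead, since every rank has to call it before any
of them can continue. The pattern I have in mind would look roughly like the
sketch below, with a plain std::vector standing in for my actual array
(sizes and values are made up):

    #include <mpi.h>
    #include <vector>

    int main(int argc, char* argv[])
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        std::vector<double> data;
        int count = 0;
        if (rank == 0)
        {
            data.assign(1000, 1.0);              // only the root builds the array
            count = static_cast<int>(data.size());
        }

        // every rank calls both collectives: first the size, then the contents
        MPI_Bcast(&count, 1, MPI_INT, 0, MPI_COMM_WORLD);
        data.resize(count);                      // non-root ranks allocate space
        MPI_Bcast(data.data(), count, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        // all ranks now hold identical copies of 'data' and continue from here
        MPI_Finalize();
        return 0;
    }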
Thanks again
On Thu, Dec 9, 2010 at 4:30 PM, Jayesh Krishna <jayesh at mcs.anl.gov> wrote:
> Hi,
>   Yes, you can execute the non-parallel version of class 1 from the root MPI
> process (rank = 0) and broadcast the result to the other MPI processes.
> Typically you divide work (computation) among MPI processes if the work is
> scalable and the cost of computing the result with a single worker is
> greater than the cost of computing the divided work in a single worker plus
> the cost of communicating the result among the MPI processes. If the work
> is sufficiently small you may be better off doing it in a single process.
> Make sure that you call MPI_Bcast() from all the MPI processes (strictly
> speaking, all MPI processes belonging to the same communicator must call
> MPI_Bcast(); in your case the communicator, MPI_COMM_WORLD, consists of all
> the MPI processes).
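> A rough way to check that trade-off in practice is to time both paths with
> MPI_Wtime(); the sketch below is purely illustrative (the work() loop and
> the problem size are placeholders, not anything from your application):
>
>     #include <mpi.h>
>     #include <cstdio>
>
>     // placeholder computation: 'n' units of work
>     static double work(long n)
>     {
>         double s = 0.0;
>         for (long i = 0; i < n; ++i) s += 1.0 / (i + 1);
>         return s;
>     }
>
>     int main(int argc, char* argv[])
>     {
>         MPI_Init(&argc, &argv);
>         int rank, size;
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         MPI_Comm_size(MPI_COMM_WORLD, &size);
>         const long N = 10000000;
>
>         // cost of doing the whole job in a single process
>         double t0 = MPI_Wtime();
>         if (rank == 0) work(N);
>         double serial = MPI_Wtime() - t0;
>
>         // cost of the divided work plus the communication of the result
>         MPI_Barrier(MPI_COMM_WORLD);
>         t0 = MPI_Wtime();
>         double partial = work(N / size), total = 0.0;
>         MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
>         double parallel = MPI_Wtime() - t0;
>
>         if (rank == 0)
>             std::printf("serial %.3f s, parallel %.3f s\n", serial, parallel);
>
>         MPI_Finalize();
>         return 0;
>     }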
> From your email I am guessing that you no longer have any problems
> executing your MPI program with MPICH2 (am I right?).
>
> Regards,
> Jayesh
> ----- Original Message -----
> From: Prashanth <prashanth.dumpuri at gmail.com>
> To: Jayesh Krishna <jayesh at mcs.anl.gov>
> Cc: mpich-discuss at mcs.anl.gov
> Sent: Thu, 09 Dec 2010 15:38:55 -0600 (CST)
> Subject: Re: [mpich-discuss] MPICH2 hangs during debug
>
> Jayesh,
>    Just playing catch-up in the next two lines: I have 3 classes, 2 of
> which have been parallelized using MPI, and all 3 are being instantiated
> and executed in my main code. Since one of the classes was not
> parallelized, I had it inside an "if" condition: if rank == 0, execute the
> non-parallel class1. I then used MPI_Bcast to broadcast the results from
> class1 to all the processors.
>      Turns out this "if" condition caused the child processes (those with
> rank not equal to zero) to jump directly to the functions below the
> MPI_Bcast command while the root node was still stuck at MPI_Bcast. This
> caused my code/debugger to hang.
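>      To make the difference concrete, the broken and the working placement
> look roughly like the sketch below (run_class1() is just a made-up stand-in
> for my class1, not the actual code):
>
>     #include <mpi.h>
>
>     static double run_class1() { return 42.0; }   // stand-in for the serial class1
>
>     int main(int argc, char* argv[])
>     {
>         MPI_Init(&argc, &argv);
>         int rank;
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
>         // Hangs: with the broadcast inside the "if", only rank 0 reaches the
>         // collective and the other ranks run straight past it.
>         //
>         // if (rank == 0)
>         // {
>         //     double result = run_class1();
>         //     MPI_Bcast(&result, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
>         // }
>
>         // Works: only rank 0 does the serial work, but every rank calls the
>         // same MPI_Bcast with matching arguments.
>         double result = 0.0;
>         if (rank == 0)
>             result = run_class1();
>         MPI_Bcast(&result, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
>
>         MPI_Finalize();
>         return 0;
>     }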
>      It takes me less than 2 seconds to execute class1, so I can get away
> with running class1 on all the processors, but I want to learn the "proper"
> way to write an MPI application. Any thoughts?
> Please let me know if the above email was confusing.
> Thanks
> Prashanth
>
>
> On Thu, Dec 9, 2010 at 1:48 PM, Prashanth <prashanth.dumpuri at gmail.com> wrote:
>
> > Jayesh,
> >    Thanks for the immediate response (again). You need the VTK
> > (Visualization Toolkit) and PETSc (parallel sparse matrix) libraries to
> > compile my code. Can I include these toolkits in the zip file I'm going
> > to send you?
> > Thanks
> > Prashanth
> >
> > On Thu, Dec 9, 2010 at 1:44 PM, Jayesh Krishna <jayesh at mcs.anl.gov> wrote:
> >
> >> Hi,
> >>   The parameters to MPI_Bcast() look alright. It might be easier for me
> >> to debug if you can send me the complete code. Can you send a zip of the
> >> complete code to jayesh at mcs.anl.gov ?
> >>
> >> Regards,
> >> Jayesh
> >> ----- Original Message -----
> >> From: Prashanth <prashanth.dumpuri at gmail.com>
> >> To: Jayesh Krishna <jayesh at mcs.anl.gov>
> >> Cc: mpich-discuss at mcs.anl.gov
> >> Sent: Thu, 09 Dec 2010 13:33:08 -0600 (CST)
> >> Subject: Re: [mpich-discuss] MPICH2 hangs during debug
> >>
> >> Jayesh,
> >>   Thanks for the immediate response. The example MPI application ran
> >> fine in the Debug mode; MPI_Bcast did not hang for the example
> >> application. To answer your question as to why I'm reading the same file
> >> from all MPI processes: I couldn't figure out how to use MPI_BYTE /
> >> MPI_Send() to send data from the root node to all the child nodes. Once I
> >> figure that out I'll have my code read the data on the root node and then
> >> broadcast it to the child nodes.
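> >> The pattern I'm aiming for would look roughly like the sketch below,
> >> where rank 0 reads the whole file into a byte buffer and broadcasts it
> >> with MPI_BYTE (purely illustrative; the file name and details are made
> >> up, not from my actual code):
> >>
> >>     #include <mpi.h>
> >>     #include <fstream>
> >>     #include <iterator>
> >>     #include <vector>
> >>
> >>     int main(int argc, char* argv[])
> >>     {
> >>         MPI_Init(&argc, &argv);
> >>         int rank;
> >>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>
> >>         std::vector<char> bytes;
> >>         int size = 0;
> >>         if (rank == 0)
> >>         {
> >>             // root reads the whole file into memory (path is illustrative)
> >>             std::ifstream in("input.vtk", std::ios::binary);
> >>             bytes.assign(std::istreambuf_iterator<char>(in),
> >>                          std::istreambuf_iterator<char>());
> >>             size = static_cast<int>(bytes.size());
> >>         }
> >>
> >>         // broadcast the size first so the other ranks can allocate,
> >>         // then broadcast the raw bytes
> >>         MPI_Bcast(&size, 1, MPI_INT, 0, MPI_COMM_WORLD);
> >>         bytes.resize(size);
> >>         MPI_Bcast(bytes.data(), size, MPI_BYTE, 0, MPI_COMM_WORLD);
> >>
> >>         // every rank can now parse 'bytes' as if it had read the file itself
> >>         MPI_Finalize();
> >>         return 0;
> >>     }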
> >>    I'm not sure if I can attach documents to this mailing list, and
> >> hence am just pasting the snippet of my code containing the MPI call. If
> >> you need the entire code, please let me know how to send it to you and
> >> I'll send it to you. The code hangs at the MPI_Bcast command in the Debug
> >> mode. FYI, I'm using another toolkit, VTK (the Visualization Toolkit), to
> >> format my data. Output1 from class1 is in the VTK data format and
> >> translates to a double array in C++.
> >>
> >> int main( int argc, char * argv[] )
> >> {
> >>     MPI_Init( &argc, &argv );
> >>     int rank, number_of_processors;
> >>     MPI_Comm_size( MPI_COMM_WORLD, &number_of_processors );
> >>     MPI_Comm_rank( MPI_COMM_WORLD, &rank );
> >>
> >>     // read inputs on all processors
> >>     // snipped for brevity
> >>
> >>     // class1 - read the inputs and generate output1: need not be parallelized
> >>     vtkSmartPointer<vtkGetSignedClosestPointDistances> ComputeSignedClosestPointDistances =
> >>         vtkSmartPointer<vtkGetSignedClosestPointDistances>::New();
> >>     if ( rank == 0 )
> >>     {
> >>         ComputeSignedClosestPointDistances->SetInput( inputs i just read );
> >>         ComputeSignedClosestPointDistances->Update();
> >>     }
> >>
> >>     // get output1 from class1
> >>     vtkSmartPointer<vtkDoubleArray> signedclosestpointdistancesbefore =
> >>         vtkSmartPointer<vtkDoubleArray>::New();
> >>     signedclosestpointdistancesbefore->DeepCopy(
> >>         ComputeSignedClosestPointDistances->GetSignedClosestPointDistances() );
> >>
> >>     // GetVoidPointer(0) returns the void pointer at the 0th element
> >>     // GetNumberOfTuples() - size of the array
> >>     MPI_Bcast( signedclosestpointdistancesbefore->GetVoidPointer(0),
> >>                signedclosestpointdistancesbefore->GetNumberOfTuples(),
> >>                MPI_DOUBLE, 0, MPI_COMM_WORLD );
> >>
> >>     // code snipped for brevity
> >>
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >> Thanks again for all your help
> >> Prashanth
> >>
> >>
> >
>
>