[mpich-discuss] MPICH2 hangs during debug
Jayesh Krishna
jayesh at mcs.anl.gov
Thu Dec 9 16:30:19 CST 2010
Hi,
Yes, you can execute the non-parallel version of class1 on the root MPI process (rank 0) and broadcast the result to the other MPI processes. Typically you divide work (computation) among MPI processes when the work is scalable and the cost of computing the result in a single process is greater than the cost of computing the divided work in a single process plus the cost of communicating the result among the MPI processes. If the work is sufficiently small, you may be better off doing it in a single process.
Make sure that you call MPI_Bcast() from all the MPI processes (strictly speaking, all MPI processes belonging to the same communicator must call MPI_Bcast(); in your case the communicator, MPI_COMM_WORLD, contains all the MPI processes).
From your email I am guessing that you no longer have any problems executing your MPI program with MPICH2 (am I right?).
Regards,
Jayesh
----- Original Message -----
From: Prashanth <prashanth.dumpuri at gmail.com>
To: Jayesh Krishna <jayesh at mcs.anl.gov>
Cc: mpich-discuss at mcs.anl.gov
Sent: Thu, 09 Dec 2010 15:38:55 -0600 (CST)
Subject: Re: [mpich-discuss] MPICH2 hangs during debug
Jayesh,
Just playing catch-up in the next two lines - I have 3 classes, 2 of
which have been parallelized using MPI, and all 3 are instantiated and
executed in my main code. Since one of the classes was not parallelized, I
had it inside an "if" condition: if rank == 0, execute the non-parallel
class1. I then used MPI_Bcast to broadcast the results from class1 to all
the processes.
Turns out this "if" condition caused the child processes (those
with rank not equal to zero) to jump directly to the functions below the
MPI_Bcast call while the root process was still stuck at MPI_Bcast. This
caused my code/debugger to hang.
It takes me less than 2 seconds to execute class1, so I could get away
with running class1 on all the processes, but I want to learn the "proper"
way to write an MPI application. Any thoughts?
Please let me know if the above email was confusing.
Thanks
Prashanth
On Thu, Dec 9, 2010 at 1:48 PM, Prashanth <prashanth.dumpuri at gmail.com> wrote:
> Jayesh,
> Thanks for the immediate response (again). You need the VTK (Visualization
> Toolkit) and PETSc (parallel sparse matrix) libraries to compile my code.
> Can I include these toolkits in the zip file I'm going to send you?
> Thanks
> Prashanth
>
> On Thu, Dec 9, 2010 at 1:44 PM, Jayesh Krishna <jayesh at mcs.anl.gov> wrote:
>
>> Hi,
>> The parameters to MPI_Bcast() look alright. It might be easier for me to
>> debug if you can send me the complete code. Can you send a zip of the
>> complete code to jayesh at mcs.anl.gov ?
>>
>> Regards,
>> Jayesh
>> ----- Original Message -----
>> From: Prashanth <prashanth.dumpuri at gmail.com>
>> To: Jayesh Krishna <jayesh at mcs.anl.gov>
>> Cc: mpich-discuss at mcs.anl.gov
>> Sent: Thu, 09 Dec 2010 13:33:08 -0600 (CST)
>> Subject: Re: [mpich-discuss] MPICH2 hangs during debug
>>
>> Jayesh,
>> Thanks for the immediate response. The example MPI application ran fine
>> in Debug mode; MPI_Bcast did not hang for the example application. To
>> answer your question as to why I'm reading the same file from all MPI
>> processes: I couldn't figure out how to use MPI_BYTE / MPI_Send() to send
>> data from the root process to the child processes. Once I figure that out
>> I'll have my code read the data on the root process and then broadcast it
>> to the child processes.
>> I'm not sure if I can attach documents to this mailing list, so I'm just
>> pasting the snippet of my code containing the MPI call. If you need the
>> entire code, please let me know how to send it to you. The code hangs at
>> the MPI_Bcast call in Debug mode. FYI, I'm using another toolkit, VTK
>> (Visualization Toolkit), to format my data. Output1 from class1 is in the
>> VTK data format and translates to a double array in C++.
>>
>> int main( int argc, char * argv[] )
>> {
>>   MPI_Init( &argc, &argv );
>>   int rank, number_of_processors;
>>   MPI_Comm_size( MPI_COMM_WORLD, &number_of_processors );
>>   MPI_Comm_rank( MPI_COMM_WORLD, &rank );
>>
>>   // read inputs on all processors
>>   // snipped for brevity
>>
>>   // class1 - read the inputs and generate output1: need not be parallelized
>>   vtkSmartPointer<vtkGetSignedClosestPointDistances>
>>       ComputeSignedClosestPointDistances =
>>           vtkSmartPointer<vtkGetSignedClosestPointDistances>::New();
>>   if ( rank == 0 )
>>   {
>>     ComputeSignedClosestPointDistances->SetInput( /* inputs I just read */ );
>>     ComputeSignedClosestPointDistances->Update();
>>   }
>>
>>   // get output1 from class1
>>   vtkSmartPointer<vtkDoubleArray> signedclosestpointdistancesbefore =
>>       vtkSmartPointer<vtkDoubleArray>::New();
>>   signedclosestpointdistancesbefore->DeepCopy(
>>       ComputeSignedClosestPointDistances->GetSignedClosestPointDistances() );
>>
>>   // GetVoidPointer(0) returns a void pointer to the 0th element
>>   // GetNumberOfTuples() - size of the array
>>   MPI_Bcast( signedclosestpointdistancesbefore->GetVoidPointer(0),
>>              signedclosestpointdistancesbefore->GetNumberOfTuples(),
>>              MPI_DOUBLE, 0, MPI_COMM_WORLD );
>>
>>   // code snipped for brevity
>>
>>   MPI_Finalize();
>>   return 0;
>> }
>>
>> Thanks again for all your help
>> Prashanth
>>
>>
>