[MPICH] exception hangs app

Tom Hilinski tom.hilinski at comcast.net
Wed Aug 30 20:26:35 CDT 2006


This is the shortest example I could produce that has a similar 
structure. Essentially, there are three or more tasks. Task 0 throws the 
exception after sending a message to task 1. Task 1 receives messages. 
Tasks 2 and up do the work.

Running on a uniprocessor machine, with mpich configured without threads, 
I get the following output:

   $ ./mpiexec -n 3 ./test1.exe
   Task::ThrowException: Send
   DoTask rank 2

The task never stops. mpdlistjobs shows 3 processes running.
Here's the example code:

// mpi test of throwing std::exception

#include <string>
#include <stdexcept>
#include <memory>
#include <iostream>
using std::cout;
using std::endl;
#define MPICH_IGNORE_CXX_SEEK
#include "mpi.h"

class LogsMessages
{
  public:

    LogsMessages ()
    {
        cout << "LogsMessages: Probe/Recv" << endl;
        MPI::Status status;
        MPI::COMM_WORLD.Probe ( MPI_ANY_SOURCE, MPI_ANY_TAG, status );
        int const count = status.Get_count (MPI::CHAR);
        char * buffer = new char [count + 1];
        MPI::COMM_WORLD.Recv (
                buffer, count, MPI::CHAR,
                status.Get_source(), status.Get_tag() );
        std::string msg (buffer, buffer + count);
        cout << "LogsMessages msg: " << msg << endl;
        delete [] buffer;
    }
};

class Task
{
  public:

    Task (int const rank)
    {
      if ( rank == 0 )
        ThrowException (rank);
      else if ( rank == 1 )
        LogsMessages msgs ();
      else
        DoTask (rank);
    }

  private:

    void ThrowException (int const rank)
    {
      cout << "Task::ThrowException: Send" << endl;
      std::string msg = "thrown by Task::ThrowException";
      MPI::COMM_WORLD.Send ( msg.data(), msg.size(), MPI::CHAR, 1, 0 );
      throw std::runtime_error (msg);
    }

    void DoTask (int const rank)
    {
      cout << "DoTask rank " << rank << endl;
    }
};

int main (int argc, char** argv)
{
    std::auto_ptr<Task> pTask;
    int rank = -1;
    try
    {
        MPI::Init (argc, argv);
        MPI::COMM_WORLD.Set_errhandler (MPI::ERRORS_THROW_EXCEPTIONS);
        rank = MPI::COMM_WORLD.Get_rank();
        pTask.reset ( new Task ( rank ) );
    }
    catch (MPI::Exception & e)
    {
        cout << rank << ": MPI::Exception: " << e.Get_error_string() << endl;
        MPI::COMM_WORLD.Abort (1);
    }
    catch (std::exception const & e)
    {
        cout << rank << ": std::exception: " << e.what() << endl;
        MPI::COMM_WORLD.Abort (2);
    }
    MPI::Finalize ();
    return 0;
}
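
One thing worth checking in the example above: the statement
"LogsMessages msgs ();" is parsed by C++ as the declaration of a function
named msgs that returns a LogsMessages, not as the construction of an
object. If that reading applies here, rank 1 never reaches Probe/Recv,
which would be consistent with the output showing no
"LogsMessages: Probe/Recv" line. The sketch below shows the difference
without MPI. Logger, count_with_parens, and count_without_parens are
hypothetical names used only for illustration; they are not part of the
original program.

```cpp
#include <iostream>

// Hypothetical stand-in for LogsMessages: counts constructor calls.
struct Logger
{
    static int constructed;
    Logger () { ++constructed; }
};
int Logger::constructed = 0;

// Uses the spelling from the example: empty parentheses after the name.
int count_with_parens ()
{
    Logger::constructed = 0;
    Logger msgs ();      // declares a function named msgs; no object is built
    return Logger::constructed;
}

// Uses the spelling without parentheses.
int count_without_parens ()
{
    Logger::constructed = 0;
    Logger msgs;         // defines an object; the constructor runs
    (void) msgs;         // silence the unused-variable warning
    return Logger::constructed;
}
```

Calling count_with_parens() returns 0 and count_without_parens() returns 1.
Writing "LogsMessages msgs;" in Task would make rank 1 run the
constructor; whether that alone changes the hang is not something this
sketch can show.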



William Gropp wrote:
> Can you send a short program that illustrates the problem?  What you 
> describe should work.
>
> Bill
>
> On Aug 30, 2006, at 4:00 PM, Tom Hilinski wrote:
>
>> For mpi/c++ gurus: I have an app with classes that can throw an 
>> exception, specifically std::runtime_error. main() has a try-catch 
>> for these as well as MPI::Exception. When a process (rank == 0) 
>> throws an exception, the process seems to go off into space - I can't 
>> trace it in a debugger, and the exception is not caught by the catch 
>> (which immediately writes e.what() to cout).  I can kill it with 
>> mpdkilljob.
>>
>> I've reproduced this behavior on both suse linux 10.0/x86_64 and 
>> Win2K/cygwin. MPICH version is mpich2-1.0.4p1, which I've built with 
>> debugging enabled (--enable-debuginfo --enable-g=dbg,handle) and no 
>> threads.
>>
>> Any suggestions you have would be great.
>>
>
>

-- 


Tom Hilinski
-----------------------------------------------
e-mail    Personal: Tom.Hilinski at comcast.net
          CSU:      Tom.Hilinski at ColoState.edu
-----------------------------------------------



