[MPICH] exception hangs app
Tom Hilinski
tom.hilinski at comcast.net
Wed Aug 30 20:26:35 CDT 2006
This is the shortest example I could produce with a similar structure.
Essentially, there are 3+ tasks. Task 0 throws the exception after
sending a message to task 1. Task 1 receives messages. Task 2+ does the
work.
Running on a uniprocessor machine, with mpich configured without threads,
I get the following output:
$ ./mpiexec -n 3 ./test1.exe
Task::ThrowException: Send
DoTask rank 2
The task never stops. mpdlistjobs shows 3 processes running.
Here's the example code:
// mpi test of throwing std::exception
#include <string>
#include <stdexcept>
#include <memory>
#include <iostream>
using std::cout;
using std::endl;
#define MPICH_IGNORE_CXX_SEEK
#include "mpi.h"

class LogsMessages
{
public:
    LogsMessages ()
    {
        cout << "LogsMessages: Probe/Recv" << endl;
        MPI::Status status;
        MPI::COMM_WORLD.Probe ( MPI_ANY_SOURCE, MPI_ANY_TAG, status );
        int const count = status.Get_count (MPI::CHAR);
        char * buffer = new char [count + 1];
        MPI::COMM_WORLD.Recv (
            buffer, count, MPI::CHAR,
            status.Get_source(), status.Get_tag() );
        std::string msg (buffer, buffer + count);
        cout << "LogsMessages msg: " << msg << endl;
        delete [] buffer;
    }
};

class Task
{
public:
    Task (int const rank)
    {
        if ( rank == 0 )
            ThrowException (rank);
        else if ( rank == 1 )
            LogsMessages msgs ();
        else
            DoTask (rank);
    }
private:
    void ThrowException (int const rank)
    {
        cout << "Task::ThrowException: Send" << endl;
        std::string msg = "thrown by Task::ThrowException";
        MPI::COMM_WORLD.Send ( msg.data(), msg.size(), MPI::CHAR, 1, 0 );
        throw std::runtime_error (msg);
    }
    void DoTask (int const rank)
    {
        cout << "DoTask rank " << rank << endl;
    }
};

int main (int argc, char** argv)
{
    std::auto_ptr<Task> pTask;
    int rank = -1;
    try
    {
        MPI::Init (argc, argv);
        MPI::COMM_WORLD.Set_errhandler (MPI::ERRORS_THROW_EXCEPTIONS);
        rank = MPI::COMM_WORLD.Get_rank();
        pTask.reset ( new Task ( rank ) );
    }
    catch (MPI::Exception & e)
    {
        cout << rank << ": MPI::Exception: " << e.Get_error_string() << endl;
        MPI::COMM_WORLD.Abort (1);
    }
    catch (std::exception const & e)
    {
        cout << rank << ": std::exception: " << e.what() << endl;
        MPI::COMM_WORLD.Abort (2);
    }
    MPI::Finalize ();
    return 0;
}
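
One detail in the example may be worth checking separately from the hang.
In C++, the statement "LogsMessages msgs ();" is parsed as the declaration
of a function named msgs, not as the definition of an object. If so, rank 1
never constructs a LogsMessages and never calls Probe/Recv, which would be
consistent with the output above containing no "LogsMessages: Probe/Recv"
line. The sketch below shows the parse without MPI. Receiver,
CountWithParens, CountWithoutParens and PrintCounts are hypothetical names
used only for illustration; they are not part of the test program.

```cpp
// Illustration of the "most vexing parse"; no MPI required.
#include <iostream>

// Hypothetical stand-in for LogsMessages: counts constructor calls.
struct Receiver
{
    static int constructed;
    Receiver () { ++constructed; }
};
int Receiver::constructed = 0;

// Number of Receiver objects constructed by "Receiver r ();".
int CountWithParens ()
{
    int const before = Receiver::constructed;
    Receiver r ();      // declares a function r returning Receiver
    return Receiver::constructed - before;
}

// Number of Receiver objects constructed by "Receiver r;".
int CountWithoutParens ()
{
    int const before = Receiver::constructed;
    Receiver r;         // defines an object; the constructor runs
    (void) r;
    return Receiver::constructed - before;
}

// Prints both counts so the difference is visible.
void PrintCounts ()
{
    std::cout << "with parens: " << CountWithParens () << std::endl;
    std::cout << "without parens: " << CountWithoutParens () << std::endl;
}
```

Calling PrintCounts() from a small main() shows the two counts. In the test
program, writing "LogsMessages msgs;" would construct the object on rank 1.
Whether that has any bearing on the uncaught exception on rank 0 is not
something this sketch can show.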
William Gropp wrote:
> Can you send a short program that illustrates the problem? What you
> describe should work.
>
> Bill
>
> On Aug 30, 2006, at 4:00 PM, Tom Hilinski wrote:
>
>> For mpi/c++ gurus: I have an app whose classes can throw an
>> exception, specifically std::runtime_error. main() has a try-catch
>> for these as well as MPI::Exception. When a process (rank == 0)
>> throws an exception, the process seems to go off into space - I can't
>> trace it in a debugger, and the exception is not caught by the catch
>> (which immediately writes e.what() to cout). I can kill it with
>> mpdkilljob.
>>
>> I've reproduced this behavior on both suse linux 10.0/x86_64 and
>> Win2K/cygwin. MPICH version is mpich2-1.0.4p1, which I've built with
>> debugging enabled (--enable-debuginfo --enable-g=dbg,handle) and no
>> threads.
>>
>> Any suggestions you have would be great.
>>
>
>
--
Tom Hilinski
-----------------------------------------------
e-mail Personal: Tom.Hilinski at comcast.net
CSU: Tom.Hilinski at ColoState.edu
-----------------------------------------------