<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7226.0">
<TITLE>RE: [MPICH] Deadlock with multiple threads</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>Here is a test program I wrote to demonstrate this.<BR>
I tested it under Red Hat Linux 9 with the g++ 3.4.3 compiler<BR>
on three dual-Xeon nodes,<BR>
with LD_ASSUME_KERNEL=2.4.1 set (to use the classic, less buggy thread library).<BR>
<BR>
runmpi -n 2 Pthread //works<BR>
runmpi -n 3 Pthread //deadlocks<BR>
<BR>
If I may hazard a guess, the spinlock in Recv is blocking the broadcast.<BR>
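<BR>
To make that guess concrete, here is a toy model of the hazard in C++ with pthreads.<BR>
It is not MPICH code and says nothing about how MPICH is actually implemented:<BR>
the library-wide lock progress_lock and the functions toy_recv, toy_bcast_try_step<BR>
and run_toy_model are all hypothetical. It only shows that a receive which spins<BR>
while holding a shared lock would stall any other thread that needs the same lock.<BR>
<BR>

```cpp
// Toy model only: this is NOT MPICH code. Every name below is hypothetical.
#include <pthread.h>
#include <sched.h>
#include <atomic>

// Stand-in for a single lock shared by all communication calls (an assumption).
static pthread_mutex_t progress_lock = PTHREAD_MUTEX_INITIALIZER;
static std::atomic<bool> recv_holds_lock(false);
static std::atomic<bool> message_arrived(false);

// Models a blocking receive that spins for a message while holding the lock.
static void *toy_recv(void *) {
    pthread_mutex_lock(&progress_lock);
    recv_holds_lock.store(true);
    while (!message_arrived.load()) {
        sched_yield();  // spin until the "message" shows up
    }
    pthread_mutex_unlock(&progress_lock);
    return nullptr;
}

// Models one step of a broadcast that needs the same lock.
// Returns true if the lock could be taken, false if it was busy.
static bool toy_bcast_try_step() {
    if (pthread_mutex_trylock(&progress_lock) != 0) {
        return false;  // lock is held by the spinning receive
    }
    pthread_mutex_unlock(&progress_lock);
    return true;
}

struct ToyResult {
    bool while_recv_waits;  // could the broadcast step run during the receive?
    bool after_message;     // could it run once the receive has completed?
};

ToyResult run_toy_model() {
    recv_holds_lock.store(false);
    message_arrived.store(false);

    pthread_t t2;
    pthread_create(&t2, nullptr, toy_recv, nullptr);
    while (!recv_holds_lock.load()) {
        sched_yield();  // wait until the receive really holds the lock
    }

    ToyResult result;
    result.while_recv_waits = toy_bcast_try_step();

    message_arrived.store(true);  // the matching send finally "arrives"
    pthread_join(t2, nullptr);
    result.after_message = toy_bcast_try_step();
    return result;
}
```

In this model run_toy_model() reports that the broadcast step cannot take the lock<BR>
while the receive is waiting, and can take it once the message has arrived.<BR>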
<BR>
David<BR>
<BR>
<BR>
<BR>
-----Original Message-----<BR>
From: Rusty Lusk [<A HREF="mailto:lusk@mcs.anl.gov">mailto:lusk@mcs.anl.gov</A>]<BR>
Sent: Thu 08-12-2005 6:18 PM<BR>
To: David Minor<BR>
Cc: mpich-discuss@mcs.anl.gov<BR>
Subject: Re: [MPICH] Deadlock with multiple threads<BR>
<BR>
Can you post the code? It sounds like a good test program.<BR>
<BR>
From: "David Minor" &lt;david-m@orbotech.com&gt;<BR>
Subject: [MPICH] Deadlock with multiple threads<BR>
Date: Thu, 8 Dec 2005 17:02:56 +0200<BR>
<BR>
> Hi Mpiers,<BR>
><BR>
> The following situation leads to a deadlock. Why?<BR>
><BR>
> I have three nodes and two threads on each node.<BR>
> There are two communicators: on each node, thread 1 (t1) uses communicator 1 (c1)<BR>
> and thread 2 (t2) uses communicator 2 (c2).<BR>
><BR>
> On all three nodes, t1 is in a bcast on c1.<BR>
> On all three nodes, t2 is in a recv from any source on c2, for which no send has been issued yet.<BR>
><BR>
> Why does the bcast on c1 deadlock? This seems to violate one of the rules for multi-threaded MPI.<BR>
><BR>
> When I run the above scenario on two nodes, there is no deadlock.<BR>
><BR>
> Regards,<BR>
> David<BR>
<BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>