<div class="gmail_quote">On Fri, Jan 20, 2012 at 11:31, Wen Jiang <span dir="ltr">&lt;<a href="mailto:jiangwen84@gmail.com">jiangwen84@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div id=":460">The serial job runs without any problems and never stalls. The parallel jobs also run successfully on a distributed-memory desktop or on a single node of the cluster. The job gets stuck only when it runs on more than one compute node (it is currently running on two nodes). The serial job and the parallel jobs (on the desktop or the cluster) that I mentioned before all have the same size (dofs). However, if I run a smaller job on the cluster with two nodes, it may not get stuck and works fine. <br>
<br>As you said before, I added MAT_FLUSH_ASSEMBLY after every element stiffness matrix is inserted.</div></blockquote><div><br></div><div>This will deadlock unless the number of elements is *exactly* the same on every process.</div>
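<div>To illustrate why the element count matters: MatAssemblyBegin() and MatAssemblyEnd() are collective, so every process has to call them the same number of times. The sketch below is only a counting model in plain C, with no MPI or PETSc calls; the function names collective_calls and calls_match are hypothetical and exist only for this illustration.</div>

```c
#include <assert.h>

/* Hypothetical model, not PETSc code: a rank that flushes after each local
   element and then does one final assembly makes this many collective calls. */
int collective_calls(int local_elements) {
    return local_elements + 1; /* one flush per element + one final assembly */
}

/* Returns 1 when all ranks make the same number of collective calls, and 0
   when at least one rank would block waiting for a call that never comes. */
int calls_match(const int *local_elements, int nranks) {
    assert(nranks > 0);
    for (int r = 1; r < nranks; r++) {
        if (collective_calls(local_elements[r]) !=
            collective_calls(local_elements[0]))
            return 0;
    }
    return 1;
}
```

<div>With 500 elements on each of two ranks, calls_match returns 1. With 500 and 498 elements it returns 0, which corresponds to the hang described here. If intermediate flushes are really needed, one possible approach is to have every rank perform the same number of them, for example the maximum local element count agreed on through a reduction.</div>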
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div id=":460">I got the output below, and it gets stuck too.</div></blockquote></div><br><div>When it "gets stuck", attach a debugger to the hung processes and get stack traces.</div>