On Wed, Jan 25, 2012 at 8:36 AM, Dominik Szczerba <span dir="ltr">&lt;<a href="mailto:dominik@itis.ethz.ch">dominik@itis.ethz.ch</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
&gt; asserts are a terrible debugging tool. You need to either use a debugger, or<br>
&gt; output the matrix in a form that the ParMetis people can use and debug with.<br>
<br>
After a lot of fun running the program on a quad-core machine with 64 processes<br>
and as many gdb windows, typing &#39;c&#39; into all of them without closing<br>
them accidentally, then finding the ones that had exited, I found<br>
the trace pasted below. Does it help to locate the problem?<br></blockquote><div><br></div><div>That should definitely be sent to the ParMetis team.</div><div><br></div><div>    Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Many thanks<br>
Dominik<br>
<br>
<br>
#0  0x00007fd4232433a5 in __GI_raise (sig=6)<br>
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64<br>
#1  0x00007fd423246b0b in __GI_abort () at abort.c:92<br>
#2  0x000000000109baff in __FM_2WayEdgeRefine (ctrl=0x7fff8f7d7b30,<br>
    graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, npasses=4) at fm.c:65<br>
#3  0x000000000109d483 in __GrowBisection (ctrl=0x7fff8f7d7b30,<br>
    graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, ubfactor=1) at initpart.c:188<br>
#4  0x000000000109ccd4 in __Init2WayPartition (ctrl=0x7fff8f7d7b30,<br>
    graph=0x341c7d0, tpwgts=0x7fff8f7d7a70, ubfactor=1) at initpart.c:36<br>
#5  0x0000000001084dc2 in __MlevelEdgeBisection (ctrl=0x7fff8f7d7b30,<br>
    graph=0x7fff8f7d7c20, tpwgts=0x7fff8f7d7a70, ubfactor=1) at pmetis.c:173<br>
#6  0x0000000001084a30 in __MlevelRecursiveBisection (ctrl=0x7fff8f7d7b30,<br>
    graph=0x7fff8f7d7c20, nparts=2, part=0xe0c4bc8, tpwgts=0x73871e0,<br>
    ubfactor=1, fpart=0) at pmetis.c:120<br>
#7  0x000000000108488f in METIS_WPartGraphRecursive (nvtxs=0x341b030,<br>
    xadj=0x7f60ef0, adjncy=0x7f61124, vwgt=0x7f60f80, adjwgt=0x7f621c4,<br>
    wgtflag=0x7fff8f7d7dd4, numflag=0x7fff8f7d7dd8, nparts=0x7fff8f7d7d7c,<br>
    tpwgts=0x7fff8f7d8110, options=0x7fff8f7d7d90, edgecut=0x7fff8f7d7ddc,<br>
    part=0xe0c4bc8) at pmetis.c:85<br>
#8  0x000000000105a520 in __MlevelKWayPartitioning (ctrl=0x7fff8f7d7e60,<br>
    graph=0x7fff8f7d7f50, nparts=2, part=0x7c7b9e0, tpwgts=0x7fff8f7d8110,<br>
    ubfactor=1) at kmetis.c:110<br>
#9  0x000000000105fcb3 in METIS_WPartGraphKway2 (nvtxs=0x33f6174,<br>
    xadj=0x8bf81b0, adjncy=0x7c41850, vwgt=0x8bf1b40, adjwgt=0x7b3b450,<br>
    wgtflag=0x7fff8f7d81c8, numflag=0x7fff8f7d81c4, nparts=0x7fff8f7d81c0,<br>
    tpwgts=0x7fff8f7d8110, options=0x7fff8f7d80e0, edgecut=0x7fff8f7d81cc,<br>
    part=0x7c7b9e0) at parmetis.c:79<br>
#10 0x0000000001031d20 in Mc_InitPartition_RB__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x3a5a060, wspace=0x7fff8f8109f0) at initpart.c:95<br>
#11 0x0000000001031348 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x3a5a060, wspace=0x7fff8f8109f0) at kmetis.c:219<br>
#12 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x3a00cc0, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#13 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0xe145c20, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#14 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x3abdb60, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#15 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x3a016e0, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#16 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x7377a70, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#17 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x736d510, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#18 0x0000000001031475 in Mc_Global_Partition__ (ctrl=0x7fff8f7d8860,<br>
    graph=0x738b670, wspace=0x7fff8f8109f0) at kmetis.c:238<br>
#19 0x0000000001030d5f in ParMETIS_V3_PartKway (vtxdist=0x9f15090,<br>
    xadj=0x7cf4e90, adjncy=0x7a57b60, vwgt=0x0, adjwgt=0xb26ebc0,<br>
    wgtflag=0x7fff8f810c24, numflag=0x7fff8f810c28, ncon=0x7fff8f810c2c,<br>
    nparts=0x7fff8f810c30, tpwgts=0x91b5c70, ubvec=0x91b4d50,<br>
    options=0x7fff8f810bb0, edgecut=0x91b54f0, part=0x3349f20, comm=0x91b5504)<br>
    at kmetis.c:146<br>
#20 0x0000000000a9e6c5 in MatPartitioningApply_Parmetis (part=0x91b34d0,<br>
    partitioning=0x7fff8f811008)<br>
    at /home/dsz/pack/petsc-3.2-p5/src/mat/partition/impls/pmetis/pmetis.c:96<br>
#21 0x0000000000695ecd in MatPartitioningApply (matp=0x91b34d0,<br>
    partitioning=0x7fff8f811008)<br>
    at /home/dsz/pack/petsc-3.2-p5/src/mat/partition/partition.c:226<br>
#22 0x00000000004d31d6 in FluidSolver::CreateSolverContexts (this=0x30eb400)<br>
    at /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolver.cxx:3104<br>
#23 0x00000000004c697f in FluidSolver::Solve (this=0x30eb400)<br>
    at /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolver.cxx:1925<br>
#24 0x00000000005177f9 in main (argc=3, argv=0x7fff8f812c78)<br>
    at /home/dsz/src/framework/sandbox/dsz/solvers/solve/FluidSolverMain.cxx:319<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener<br>