[petsc-users] Hang while attempting to run EPSSolve()

Andrew Spott andrew at spott.us
Fri Feb 13 15:23:37 CST 2015


Thanks! You just saved me hours of debugging.

I’ll look into linking against an earlier version of OpenMPI.

-Andrew

On Fri, Feb 13, 2015 at 2:21 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>   Andrew,
>     This is a bug in the OpenMPI 1.8.2 implementation that they recently introduced. Can you link against an earlier OpenMPI version on the machine? Or do they have MPICH installed that you could use?
>   Barry
>> On Feb 13, 2015, at 3:17 PM, Andrew Spott <andrew at spott.us> wrote:
>> 
>> I can’t reproduce this in local tests on OS X, but production runs on our local supercomputer always hang while waiting for a lock.
>> 
>> The back trace:
>> 
>> #0  0x00002ba2980df054 in __lll_lock_wait () from /lib64/libpthread.so.0
>> #1  0x00002ba2980da388 in _L_lock_854 () from /lib64/libpthread.so.0
>> #2  0x00002ba2980da257 in pthread_mutex_lock () from /lib64/libpthread.so.0
>> #3  0x00002ba29a1d9e2c in ompi_attr_get_c () from /curc/tools/x_86_64/rh6/openmpi/1.8.2/gcc/4.9.1/lib/libmpi.so.1
>> #4  0x00002ba29a207f8e in PMPI_Attr_get () from /curc/tools/x_86_64/rh6/openmpi/1.8.2/gcc/4.9.1/lib/libmpi.so.1
>> #5  0x00002ba294aa111e in Petsc_DelComm_Outer () at /home/ansp6066/local/src/petsc-3.5.3/src/sys/objects/pinit.c:409
>> #6  0x00002ba29a1dae02 in ompi_attr_delete_all () from /curc/tools/x_86_64/rh6/openmpi/1.8.2/gcc/4.9.1/lib/libmpi.so.1
>> #7  0x00002ba29a1dcb6c in ompi_comm_free () from /curc/tools/x_86_64/rh6/openmpi/1.8.2/gcc/4.9.1/lib/libmpi.so.1
>> #8  0x00002ba29a20c713 in PMPI_Comm_free () from /curc/tools/x_86_64/rh6/openmpi/1.8.2/gcc/4.9.1/lib/libmpi.so.1
>> #9  0x00002ba294aba7cf in PetscSubcommCreate_contiguous(_n_PetscSubcomm*) () from /home/ansp6066/local/petsc-3.5.3-debug/lib/libpetsc.so.3.5
>> #10 0x00002ba294ab89d5 in PetscSubcommSetType () from /home/ansp6066/local/petsc-3.5.3-debug/lib/libpetsc.so.3.5
>> #11 0x00002ba2958ce437 in PCSetUp_Redundant(_p_PC*) () from /home/ansp6066/local/petsc-3.5.3-debug/lib/libpetsc.so.3.5
>> #12 0x00002ba2957a243d in PCSetUp () at /home/ansp6066/local/src/petsc-3.5.3/src/ksp/pc/interface/precon.c:902
>> #13 0x00002ba2958dea31 in KSPSetUp () at /home/ansp6066/local/src/petsc-3.5.3/src/ksp/ksp/interface/itfunc.c:306
>> #14 0x00002ba29a7f8e70 in STSetUp_Sinvert(_p_ST*) () at /home/ansp6066/local/src/slepc-3.5.3/src/sys/classes/st/impls/sinvert/sinvert.c:145
>> #15 0x00002ba29a7e92cf in STSetUp () at /home/ansp6066/local/src/slepc-3.5.3/src/sys/classes/st/interface/stsolve.c:301
>> #16 0x00002ba29a845ea6 in EPSSetUp () at /home/ansp6066/local/src/slepc-3.5.3/src/eps/interface/epssetup.c:207
>> #17 0x00002ba29a849f91 in EPSSolve () at /home/ansp6066/local/src/slepc-3.5.3/src/eps/interface/epssolve.c:88
>> #18 0x0000000000410de5 in petsc::EigenvalueSolver::solve() () at /home/ansp6066/code/petsc_cpp_wrapper/src/petsc_cpp/EigenvalueSolver.cpp:40
>> #19 0x00000000004065c7 in main () at /home/ansp6066/code/new_work_project/src/main.cpp:165
>> 
>> This happens for both multi-process MPI runs and single-process runs.  Does anyone have any hints on how I can debug this?  I honestly have no idea.
>> 
>> -Andrew
>> 
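
One plausible reading of the back trace: frame #6 (ompi_attr_delete_all) runs the attribute delete callback, and the callback in frame #5 (Petsc_DelComm_Outer) calls back into the attribute code (PMPI_Attr_get), which then waits on a mutex. If that mutex is already held by the same thread and is not re-entrant, the wait never ends. The sketch below is a toy model of that pattern in Python, not OpenMPI or PETSc code; the class AttributeTable, the helper run, and the short timeout used in place of a true hang are all assumptions made for illustration.

```python
import threading


class AttributeTable:
    """Toy stand-in for an attribute table guarded by a single lock."""

    def __init__(self, reentrant):
        # A plain Lock cannot be re-acquired by its owner; an RLock can.
        self._lock = threading.RLock() if reentrant else threading.Lock()
        self._attrs = {}
        self._delete_callbacks = {}

    def set(self, key, value, on_delete):
        with self._lock:
            self._attrs[key] = value
            self._delete_callbacks[key] = on_delete

    def get(self, key, timeout=0.1):
        # A blocking acquire would hang forever here; the timeout lets the
        # sketch report the problem instead of reproducing the hang.
        if not self._lock.acquire(timeout=timeout):
            raise RuntimeError("would deadlock: attribute lock is already held")
        try:
            return self._attrs.get(key)
        finally:
            self._lock.release()

    def delete_all(self):
        # The lock stays held while user callbacks run (the risky part).
        with self._lock:
            for key in list(self._attrs):
                self._delete_callbacks[key](self, key)
                del self._attrs[key]
                del self._delete_callbacks[key]


def delete_callback(table, key):
    # Like the callback in frame #5, read an attribute back during deletion.
    table.get(key)


def run(reentrant):
    """Return "ok" if delete_all completes, else the error message."""
    table = AttributeTable(reentrant)
    table.set("inner_comm", 42, delete_callback)
    try:
        table.delete_all()
        return "ok"
    except RuntimeError as err:
        return str(err)


if __name__ == "__main__":
    print("non-reentrant lock:", run(reentrant=False))
    print("re-entrant lock:   ", run(reentrant=True))
```

In this model the non-reentrant case fails on a single thread, which is at least consistent with the report that the hang also occurs in single-process runs.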

