[petsc-dev] cusparse solve with ex19 crashes

Karl Rupp rupp at iue.tuwien.ac.at
Sat Nov 8 05:31:04 CST 2014


Hi,

I can reproduce this on my desktop machine, so it's not due to 
insufficient memory of the mobile GPU (with 4K displays this is actually 
a fairly common problem).

However, I don't know enough about the CUSPARSE bindings to say what is 
going wrong here. Maybe Dominic has an idea?

Best regards,
Karli


>    It seems not to be able to handle multiple copies of the matrix to the GPU?
>
>    Barry
>
> $ ./ex19 -ksp_monitor -pc_type lu -pc_factor_mat_solver_package cusparse -dm_mat_type seqaijcusparse -start_in_debugger noxterm,lldb
> [0]PETSC ERROR: PETSC: Attaching lldb to ./ex19 of pid 77751 on Barrys-MacBook-Pro-6.local
> (lldb) process attach --pid 77751
> Process 77751 stopped
> Executable module set to "/Users/barrysmith/Src/PETSc/src/snes/examples/tutorials/./ex19".
> Architecture set to: x86_64-apple-macosx.
> (lldb) b MatSolve
> Breakpoint 1: where = libpetsc.3.05.dylib`MatSolve + 24 at matrix.c:3140, address = 0x000000010aa89ff8
> (lldb) c
> Process 77751 resuming
> (lldb) lid velocity = 0.0625, prandtl # = 1, grashof # = 1
> [0]PETSC ERROR: MatSeqAIJCUSPARSECopyToGPU() line 1228 in /Users/barrysmith/Src/PETSc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu Data structure should not be initialized.
> Process 77751 stopped
> * thread #1: tid = 0xf4f338, 0x00007fff84917282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
>      frame #0: 0x00007fff84917282 libsystem_kernel.dylib`__pthread_kill + 10
> libsystem_kernel.dylib`__pthread_kill + 10:
> -> 0x7fff84917282:  jae    0x7fff8491728c            ; __pthread_kill + 20
>     0x7fff84917284:  movq   %rax, %rdi
>     0x7fff84917287:  jmp    0x7fff84912ca3            ; cerror_nocancel
>     0x7fff8491728c:  retq
> (lldb) bt
> * thread #1: tid = 0xf4f338, 0x00007fff84917282 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
>    * frame #0: 0x00007fff84917282 libsystem_kernel.dylib`__pthread_kill + 10
>      frame #1: 0x00007fff8da684c3 libsystem_pthread.dylib`pthread_kill + 90
>      frame #2: 0x00007fff811f7b73 libsystem_c.dylib`abort + 129
>      frame #3: 0x000000010a677e95 libpetsc.3.05.dylib`PetscAbortErrorHandler(comm=1, line=1228, fun=0x000000010bcb104e, file=0x000000010bcb00c8, n=76, p=PETSC_ERROR_INITIAL, mess=0x00007fff5560d840, ctx=0x0000000000000000) + 485 at errabort.c:58
>      frame #4: 0x000000010a670ea0 libpetsc.3.05.dylib`PetscError(comm=1, line=1228, func=0x000000010bcb104e, file=0x000000010bcb00c8, n=76, p=PETSC_ERROR_INITIAL, mess=0x000000010bcb1069) + 1248 at err.c:378
>      frame #5: 0x000000010b062882 libpetsc.3.05.dylib`MatSeqAIJCUSPARSECopyToGPU(A=0x00007fe3d0ca9660) + 914 at aijcusparse.cu:1228
>      frame #6: 0x000000010b058b17 libpetsc.3.05.dylib`MatAssemblyEnd_SeqAIJCUSPARSE(A=0x00007fe3d0ca9660, mode=MAT_FINAL_ASSEMBLY) + 551 at aijcusparse.cu:1607
>      frame #7: 0x000000010aa97d9e libpetsc.3.05.dylib`MatAssemblyEnd(mat=0x00007fe3d0ca9660, type=MAT_FINAL_ASSEMBLY) + 1310 at matrix.c:5074
>      frame #8: 0x000000010afe3e0b libpetsc.3.05.dylib`MatFDColoringApply_AIJ(J=0x00007fe3d0ca9660, coloring=0x00007fe3d5451860, x1=0x00007fe3d0c89a60, sctx=0x00007fe3d4263a60) + 9163 at fdmpiaij.c:351
>      frame #9: 0x000000010aa4523a libpetsc.3.05.dylib`MatFDColoringApply(J=0x00007fe3d0ca9660, coloring=0x00007fe3d5451860, x1=0x00007fe3d0c89a60, sctx=0x00007fe3d4263a60) + 3002 at fdmatrix.c:609
>      frame #10: 0x000000010b8e4078 libpetsc.3.05.dylib`SNESComputeJacobian_DMDA(snes=0x00007fe3d4263a60, X=0x00007fe3d0c89a60, A=0x00007fe3d0ca9660, B=0x00007fe3d0ca9660, ctx=0x00007fe3d0c48c60) + 3880 at dmdasnes.c:201
>      frame #11: 0x000000010b93cded libpetsc.3.05.dylib`SNESComputeJacobian(snes=0x00007fe3d4263a60, X=0x00007fe3d0c89a60, A=0x00007fe3d0ca9660, B=0x00007fe3d0ca9660) + 5533 at snes.c:2190
>      frame #12: 0x000000010b9d8768 libpetsc.3.05.dylib`SNESSolve_NEWTONLS(snes=0x00007fe3d4263a60) + 7000 at ls.c:230
>      frame #13: 0x000000010b94a4b0 libpetsc.3.05.dylib`SNESSolve(snes=0x00007fe3d4263a60, b=0x0000000000000000, x=0x00007fe3d0c89a60) + 5120 at snes.c:3740
>      frame #14: 0x000000010a5f1a73 ex19`main + 3443
>      frame #15: 0x00007fff8f7155c9 libdyld.dylib`start + 1
> (lldb) ^D
> Abort trap: 6
>
>
>
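
A minimal sketch of the failure mode Barry suggests, assuming the abort comes from a guard that rejects a second host-to-device copy once the GPU-side data structure already exists. The names `ToyMat`, `toy_copy_to_gpu_strict` and `toy_copy_to_gpu_tolerant` are hypothetical and exist only for illustration; this is not the code in aijcusparse.cu, and the real cause may be different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a matrix with a device-side mirror.
   This is NOT the PETSc Mat structure. */
typedef struct {
    bool gpu_initialized; /* true once the device-side data has been built */
    int  copies;          /* number of successful host-to-device copies */
} ToyMat;

/* Strict variant: refuses to copy when device data already exists.
   Returns 0 on success and a nonzero (made-up) error code otherwise. */
int toy_copy_to_gpu_strict(ToyMat *m)
{
    assert(m != NULL);
    if (m->gpu_initialized) {
        fprintf(stderr, "Data structure should not be initialized.\n");
        return 1;
    }
    m->gpu_initialized = true;
    m->copies++;
    return 0;
}

/* Tolerant variant: discards the old device data, then rebuilds it.
   Resetting the flag stands in for freeing the device buffers. */
int toy_copy_to_gpu_tolerant(ToyMat *m)
{
    assert(m != NULL);
    if (m->gpu_initialized) {
        m->gpu_initialized = false;
    }
    m->gpu_initialized = true;
    m->copies++;
    return 0;
}
```

Calling the strict variant twice on the same `ToyMat` succeeds the first time and fails the second. That would match a Jacobian that is reassembled (MatAssemblyEnd in frame #7) after its first copy to the GPU. The tolerant variant accepts repeated calls.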



