[petsc-dev] Error running on Titan with GPUs & GNU

Smith, Barry F. bsmith at mcs.anl.gov
Mon Oct 29 17:53:18 CDT 2018


   Here is the code

  PetscStackCallBLAS("LAPACKgeev",LAPACKgeev_("N","N",&bn,R,&bN,realpart,imagpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));
  if (lierr) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error in LAPACK routine %d",(int)lierr);

   What is unfathomable is that it prints (int) lierr of 0 but then the if () test should not be satisfied. 

   Do a ./configure with debugging turned on, could be an optimizing compiler error.

   Barry


> On Oct 29, 2018, at 3:56 PM, Mark Adams via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
> 
> I get this error running the tests using GPUs. An error in an LAPACK routine.
> 
> 16:50 master= /lustre/atlas/proj-shared/geo127/petsc$ make PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda PETSC_ARCH="" test
> Running test examples to verify correct installation
> Using PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda and PETSC_ARCH=
> *******************Error detected during compile or link!*******************
> See http://www.mcs.anl.gov/petsc/documentation/faq.html
> /lustre/atlas/proj-shared/geo127/petsc/src/snes/examples/tutorials ex19
> *********************************************************************************
> cc -o ex19.o -c -O   -I/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/include    `pwd`/ex19.c
> cc -O  -o ex19 ex19.o  -L/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib -Wl,-rpath,/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib -L/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib -lpetsc -lHYPRE -lparmetis -lmetis -ldl
> /lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib/libpetsc.a(dlimpl.o): In function `PetscDLOpen':
> dlimpl.c:(.text+0x3b): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
> /lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib/libpetsc.a(send.o): In function `PetscOpenSocket':
> send.c:(.text+0x3be): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
> true ex19
> rm ex19.o
> Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process
> See http://www.mcs.anl.gov/petsc/documentation/faq.html
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Application 19079964 resources: utime ~1s, stime ~1s, Rss ~29412, inblocks ~37563, outblocks ~131654
> Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes
> See http://www.mcs.anl.gov/petsc/documentation/faq.html
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> --------------------- Error Message --------------------------------------------------------------
> [1]PETSC ERROR: [0]PETSC ERROR: Error in external library
> Error in external library
> [1]PETSC ERROR: [0]PETSC ERROR: Error in LAPACK routine 0
> Error in LAPACK routine 0
> [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [1]PETSC ERROR: [0]PETSC ERROR: Petsc Development GIT revision: v3.10.2-461-g0ed19bb123  GIT Date: 2018-10-29 13:43:53 +0100
> Petsc Development GIT revision: v3.10.2-461-g0ed19bb123  GIT Date: 2018-10-29 13:43:53 +0100
> [1]PETSC ERROR: [0]PETSC ERROR: ./ex19 on a  named nid16438 by adams Mon Oct 29 16:52:05 2018
> ./ex19 on a  named nid16438 by adams Mon Oct 29 16:52:05 2018
> [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cudac=1 --with-batch=0 --prefix=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda --download-hypre --download-metis --download-parmetis --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-shared-libraries=0 --known-mpi-shared-libraries=1 --with-mpiexec=aprun --with-x=0 --with-64-bit-indices --with-debugging=0 PETSC_ARCH=arch-titan-opt64idx-gnu-cuda PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc
> Configure options --with-cudac=1 --with-batch=0 --prefix=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda --download-hypre --download-metis --download-parmetis --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-shared-libraries=0 --known-mpi-shared-libraries=1 --with-mpiexec=aprun --with-x=0 --with-64-bit-indices --with-debugging=0 PETSC_ARCH=arch-titan-opt64idx-gnu-cuda PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc
> [1]PETSC ERROR: [0]PETSC ERROR: #1 KSPComputeEigenvalues_GMRES() line 144 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/gmreig.c
> #1 KSPComputeEigenvalues_GMRES() line 144 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/gmreig.c
> [1]PETSC ERROR: [0]PETSC ERROR: #2 KSPComputeEigenvalues() line 132 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> #2 KSPComputeEigenvalues() line 132 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> [1]PETSC ERROR: [0]PETSC ERROR: #3 KSPChebyshevComputeExtremeEigenvalues_Private() line 288 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> #3 KSPChebyshevComputeExtremeEigenvalues_Private() line 288 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> [1]PETSC ERROR: [0]PETSC ERROR: #4 KSPSolve_Chebyshev() line 390 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> #4 KSPSolve_Chebyshev() line 390 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> [1]PETSC ERROR: [0]PETSC ERROR: #5 KSPSolve() line 780 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> #5 KSPSolve() line 780 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> [1]PETSC ERROR: [0]PETSC ERROR: #6 PCMGMCycle_Private() line 20 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> #6 PCMGMCycle_Private() line 20 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> [1]PETSC ERROR: [0]PETSC ERROR: #7 PCApply_MG() line 377 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> #7 PCApply_MG() line 377 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> [1]PETSC ERROR: [0]PETSC ERROR: #8 PCApply() line 462 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/interface/precon.c
> #8 PCApply() line 462 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/interface/precon.c
> [1]PETSC ERROR: [0]PETSC ERROR: #9 KSP_PCApply() line 281 in /lustre/atlas/proj-shared/geo127/petsc/include/petsc/private/kspimpl.h
> #9 KSP_PCApply() line 281 in /lustre/atlas/proj-shared/geo127/petsc/include/petsc/private/kspimpl.h
> [1]PETSC ERROR: [0]PETSC ERROR: #10 KSPFGMRESCycle() line 166 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> #10 KSPFGMRESCycle() line 166 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> [1]PETSC ERROR: [0]PETSC ERROR: #11 KSPSolve_FGMRES() line 291 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> #11 KSPSolve_FGMRES() line 291 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> [1]PETSC ERROR: [0]PETSC ERROR: #12 KSPSolve() line 780 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> #12 KSPSolve() line 780 in /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> [1]PETSC ERROR: [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() line 224 in /lustre/atlas1/geo127/proj-shared/petsc/src/snes/impls/ls/ls.c
> #13 SNESSolve_NEWTONLS() line 224 in /lustre/atlas1/geo127/proj-shared/petsc/src/snes/impls/ls/ls.c
> [1]PETSC ERROR: [0]PETSC ERROR: #14 SNESSolve() line 4396 in /lustre/atlas1/geo127/proj-shared/petsc/src/snes/interface/snes.c
> #14 SNESSolve() line 4396 in /lustre/atlas1/geo127/proj-shared/petsc/src/snes/interface/snes.c
> [1]PETSC ERROR: [0]PETSC ERROR: #15 main() line 161 in /lustre/atlas/proj-shared/geo127/petsc/src/snes/examples/tutorials/ex19.c
> #15 main() line 161 in /lustre/atlas/proj-shared/geo127/petsc/src/snes/examples/tutorials/ex19.c
> [1]PETSC ERROR: [0]PETSC ERROR: PETSc Option Table entries:
> PETSc Option Table entries:
> [1]PETSC ERROR: [0]PETSC ERROR: -da_refine 3
> -da_refine 3
> [1]PETSC ERROR: [0]PETSC ERROR: -ksp_type fgmres
> -ksp_type fgmres
> [1]PETSC ERROR: [0]PETSC ERROR: -pc_type mg
> -pc_type mg
> [1]PETSC ERROR: [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
> ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
> Rank 0 [Mon Oct 29 16:52:05 2018] [c7-7c1s4n2] application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0
> Rank 1 [Mon Oct 29 16:52:05 2018] [c7-7c1s4n2] application called MPI_Abort(MPI_COMM_WORLD, 76) - process 1
> _pmiu_daemon(SIGCHLD): [NID 16438] [c7-7c1s4n2] [Mon Oct 29 16:52:05 2018] PE RANK 1 exit signal Aborted
> Application 19079965 exit codes: 134
> Application 19079965 resources: utime ~1s, stime ~1s, Rss ~29412, inblocks ~37571, outblocks ~131660
> 5a6
> > Application 19079968 resources: utime ~1s, stime ~1s, Rss ~29412, inblocks ~37586, outblocks ~131654
> 



More information about the petsc-dev mailing list