[petsc-dev] Error running on Titan with GPUs & GNU

Mark Adams mfadams at lbl.gov
Mon Oct 29 19:47:37 CDT 2018


On Mon, Oct 29, 2018 at 6:55 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>    Here is the code
>
>
> PetscStackCallBLAS("LAPACKgeev",LAPACKgeev_("N","N",&bn,R,&bN,realpart,imagpart,&sdummy,&idummy,&sdummy,&idummy,work,&lwork,&lierr));
>   if (lierr) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error in LAPACK routine %d",(int)lierr);
>
>    What is unfathomable is that it prints (int)lierr as 0, yet if lierr
> were really 0 the if () test would not be satisfied.
>
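
One way to get exactly that symptom (speculation on my part, not
confirmed here) is an integer-width mismatch between PetscBLASInt and
the INTEGERs in the LAPACK we link: if lierr is 8 bytes but LAPACK
stores a 4-byte INFO, the high bytes keep stale stack garbage, so the
if () fires while (int)lierr prints 0. A minimal sketch of the effect
(hypothetical, assumes a little-endian machine):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
      int64_t lierr = (int64_t)1 << 32;   /* pretend: stale garbage in high bytes */
      int32_t info  = 0;                  /* a 32-bit LAPACK would store INFO = 0 */
      memcpy(&lierr, &info, sizeof info); /* only the low 4 bytes get written     */
      if (lierr)                          /* true: lierr == 0x100000000           */
        printf("Error in LAPACK routine %d\n",(int)lierr); /* yet this prints 0   */
      return 0;
    }
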
>    Do a ./configure with debugging turned on; this could be an
> optimizing-compiler error.
>

Configuring debug now.
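
For reference, the debug configure is presumably just the line from the
log below with the debugging flag flipped (untested sketch; a real debug
build would likely also get its own prefix and PETSC_ARCH):

    ./configure --with-cudac=1 --with-batch=0 \
      --prefix=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda \
      --download-hypre --download-metis --download-parmetis \
      --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC \
      --with-cxxlib-autodetect=0 --with-fc=ftn \
      --with-fortranlib-autodetect=0 --with-shared-libraries=0 \
      --known-mpi-shared-libraries=1 --with-mpiexec=aprun --with-x=0 \
      --with-64-bit-indices --with-debugging=1 \
      PETSC_ARCH=arch-titan-opt64idx-gnu-cuda \
      PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc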

Note: I was able to run ex56 (KSP), which does not use GMRES. This error
came from the GMRES eigenvalue-estimation code (called by the Chebyshev
smoother, per the trace below), so it may be an isolated problem.
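
The failing case can presumably be reproduced directly with the options
from the option table at the bottom (aprun per the configure line;
untested):

    cd src/snes/examples/tutorials
    aprun -n 2 ./ex19 -da_refine 3 -ksp_type fgmres -pc_type mg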


>
>    Barry
>
>
> > On Oct 29, 2018, at 3:56 PM, Mark Adams via petsc-dev <
> petsc-dev at mcs.anl.gov> wrote:
> >
> > I get this error when running the tests using GPUs. It is an error in a
> LAPACK routine.
> >
> > 16:50 master= /lustre/atlas/proj-shared/geo127/petsc$ make
> PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda
> PETSC_ARCH="" test
> > Running test examples to verify correct installation
> > Using
> PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda
> and PETSC_ARCH=
> > *******************Error detected during compile or
> link!*******************
> > See http://www.mcs.anl.gov/petsc/documentation/faq.html
> > /lustre/atlas/proj-shared/geo127/petsc/src/snes/examples/tutorials ex19
> >
> *********************************************************************************
> > cc -o ex19.o -c -O
>  -I/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/include
>   `pwd`/ex19.c
> > cc -O  -o ex19 ex19.o
> -L/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib
> -Wl,-rpath,/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib
> -L/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib
> -lpetsc -lHYPRE -lparmetis -lmetis -ldl
> >
> /lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib/libpetsc.a(dlimpl.o):
> In function `PetscDLOpen':
> > dlimpl.c:(.text+0x3b): warning: Using 'dlopen' in statically linked
> applications requires at runtime the shared libraries from the glibc
> version used for linking
> >
> /lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda/lib/libpetsc.a(send.o):
> In function `PetscOpenSocket':
> > send.c:(.text+0x3be): warning: Using 'gethostbyname' in statically
> linked applications requires at runtime the shared libraries from the glibc
> version used for linking
> > true ex19
> > rm ex19.o
> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI
> process
> > See http://www.mcs.anl.gov/petsc/documentation/faq.html
> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> > Number of SNES iterations = 2
> > Application 19079964 resources: utime ~1s, stime ~1s, Rss ~29412,
> inblocks ~37563, outblocks ~131654
> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI
> processes
> > See http://www.mcs.anl.gov/petsc/documentation/faq.html
> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> > [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > --------------------- Error Message
> --------------------------------------------------------------
> > [1]PETSC ERROR: [0]PETSC ERROR: Error in external library
> > Error in external library
> > [1]PETSC ERROR: [0]PETSC ERROR: Error in LAPACK routine 0
> > Error in LAPACK routine 0
> > [1]PETSC ERROR: [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
> shooting.
> > [1]PETSC ERROR: [0]PETSC ERROR: Petsc Development GIT revision:
> v3.10.2-461-g0ed19bb123  GIT Date: 2018-10-29 13:43:53 +0100
> > Petsc Development GIT revision: v3.10.2-461-g0ed19bb123  GIT Date:
> 2018-10-29 13:43:53 +0100
> > [1]PETSC ERROR: [0]PETSC ERROR: ./ex19 on a  named nid16438 by adams Mon
> Oct 29 16:52:05 2018
> > ./ex19 on a  named nid16438 by adams Mon Oct 29 16:52:05 2018
> > [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cudac=1
> --with-batch=0
> --prefix=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda
> --download-hypre --download-metis --download-parmetis --with-cc=cc
> --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0
> --with-fc=ftn --with-fortranlib-autodetect=0 --with-shared-libraries=0
> --known-mpi-shared-libraries=1 --with-mpiexec=aprun --with-x=0
> --with-64-bit-indices --with-debugging=0
> PETSC_ARCH=arch-titan-opt64idx-gnu-cuda
> PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc
> > Configure options --with-cudac=1 --with-batch=0
> --prefix=/lustre/atlas/proj-shared/geo127/petsc_titan_opt64idx_gnu_cuda
> --download-hypre --download-metis --download-parmetis --with-cc=cc
> --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0
> --with-fc=ftn --with-fortranlib-autodetect=0 --with-shared-libraries=0
> --known-mpi-shared-libraries=1 --with-mpiexec=aprun --with-x=0
> --with-64-bit-indices --with-debugging=0
> PETSC_ARCH=arch-titan-opt64idx-gnu-cuda
> PETSC_DIR=/lustre/atlas/proj-shared/geo127/petsc
> > [1]PETSC ERROR: [0]PETSC ERROR: #1 KSPComputeEigenvalues_GMRES() line
> 144 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/gmreig.c
> > #1 KSPComputeEigenvalues_GMRES() line 144 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/gmreig.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #2 KSPComputeEigenvalues() line 132 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> > #2 KSPComputeEigenvalues() line 132 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #3
> KSPChebyshevComputeExtremeEigenvalues_Private() line 288 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> > #3 KSPChebyshevComputeExtremeEigenvalues_Private() line 288 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #4 KSPSolve_Chebyshev() line 390 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> > #4 KSPSolve_Chebyshev() line 390 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/cheby/cheby.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #5 KSPSolve() line 780 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> > #5 KSPSolve() line 780 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #6 PCMGMCycle_Private() line 20 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> > #6 PCMGMCycle_Private() line 20 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #7 PCApply_MG() line 377 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> > #7 PCApply_MG() line 377 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/impls/mg/mg.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #8 PCApply() line 462 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/interface/precon.c
> > #8 PCApply() line 462 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/pc/interface/precon.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #9 KSP_PCApply() line 281 in
> /lustre/atlas/proj-shared/geo127/petsc/include/petsc/private/kspimpl.h
> > #9 KSP_PCApply() line 281 in
> /lustre/atlas/proj-shared/geo127/petsc/include/petsc/private/kspimpl.h
> > [1]PETSC ERROR: [0]PETSC ERROR: #10 KSPFGMRESCycle() line 166 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> > #10 KSPFGMRESCycle() line 166 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #11 KSPSolve_FGMRES() line 291 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> > #11 KSPSolve_FGMRES() line 291 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #12 KSPSolve() line 780 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> > #12 KSPSolve() line 780 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/ksp/ksp/interface/itfunc.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #13 SNESSolve_NEWTONLS() line 224 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/snes/impls/ls/ls.c
> > #13 SNESSolve_NEWTONLS() line 224 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/snes/impls/ls/ls.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #14 SNESSolve() line 4396 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/snes/interface/snes.c
> > #14 SNESSolve() line 4396 in
> /lustre/atlas1/geo127/proj-shared/petsc/src/snes/interface/snes.c
> > [1]PETSC ERROR: [0]PETSC ERROR: #15 main() line 161 in
> /lustre/atlas/proj-shared/geo127/petsc/src/snes/examples/tutorials/ex19.c
> > #15 main() line 161 in
> /lustre/atlas/proj-shared/geo127/petsc/src/snes/examples/tutorials/ex19.c
> > [1]PETSC ERROR: [0]PETSC ERROR: PETSc Option Table entries:
> > PETSc Option Table entries:
> > [1]PETSC ERROR: [0]PETSC ERROR: -da_refine 3
> > -da_refine 3
> > [1]PETSC ERROR: [0]PETSC ERROR: -ksp_type fgmres
> > -ksp_type fgmres
> > [1]PETSC ERROR: [0]PETSC ERROR: -pc_type mg
> > -pc_type mg
> > [1]PETSC ERROR: [0]PETSC ERROR: ----------------End of Error Message
> -------send entire error message to petsc-maint at mcs.anl.gov----------
> > ----------------End of Error Message -------send entire error message to
> petsc-maint at mcs.anl.gov----------
> > Rank 0 [Mon Oct 29 16:52:05 2018] [c7-7c1s4n2] application called
> MPI_Abort(MPI_COMM_WORLD, 76) - process 0
> > Rank 1 [Mon Oct 29 16:52:05 2018] [c7-7c1s4n2] application called
> MPI_Abort(MPI_COMM_WORLD, 76) - process 1
> > _pmiu_daemon(SIGCHLD): [NID 16438] [c7-7c1s4n2] [Mon Oct 29 16:52:05
> 2018] PE RANK 1 exit signal Aborted
> > Application 19079965 exit codes: 134
> > Application 19079965 resources: utime ~1s, stime ~1s, Rss ~29412,
> inblocks ~37571, outblocks ~131660
> > 5a6
> > > Application 19079968 resources: utime ~1s, stime ~1s, Rss ~29412,
> inblocks ~37586, outblocks ~131654
> >
>
>