[petsc-dev] TS error with optimized build

Barry Smith bsmith at mcs.anl.gov
Thu Dec 15 17:43:33 CST 2016


   Mark,

    My records indicate you have accounts at MCS, at least we are paying for them :-). You should be able to access them
with 

ssh adams@login.mcs.anl.gov

then 

ssh cg

then 

cd /sandbox/
mkdir  adams
cd adams 
git clone git@bitbucket.org:petsc/petsc.git
cd petsc
./configure --download-mpich

If you have never set up an ssh key for logging in, you need to do so at accounts.mcs.anl.gov (note that you cannot ssh in with a password; you must set the ssh key).
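
Once PETSc and the application are built on that machine, the valgrind run suggested earlier in this thread might look roughly like the sketch below. The executable name ./landaufem and the three solver options come from the quoted messages; the valgrind flags are a common memcheck choice and an assumption here, not something prescribed for this particular problem. The helper name valgrind_cmd is hypothetical.

```shell
# Sketch: build a valgrind memcheck command line for a PETSc application.
# --track-origins=yes asks valgrind to report where an uninitialized value
# came from; --log-file=valgrind.log.%p writes one log per process id.
valgrind_cmd() {
  exe=$1   # path to the application binary (assumed here: ./landaufem)
  shift    # remaining arguments are passed through as application options
  echo "valgrind --tool=memcheck -q --num-callers=20 --track-origins=yes --log-file=valgrind.log.%p $exe $*"
}

# Print the command so it can be inspected before it is run.
valgrind_cmd ./landaufem -snes_monitor -snes_error_if_not_converged -ksp_error_if_not_converged
```

The helper only prints the command line; paste the printed line into the shell to run it, then look in the valgrind.log.* files for reports of uninitialized values or invalid reads and writes.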

Barry

> On Dec 15, 2016, at 5:32 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> OK, that was useful. I don't have access to a Linux machine right now. I must have an uninitialized variable (that the compiler did not catch).
> Thanks,
> 
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR:
> [0]PETSC ERROR: SNESSolve has not converged due to Nan or Inf norm
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.4-2475-g4fbd04f  GIT Date: 2016-12-11 10:34:33 -0500
> [0]PETSC ERROR: /global/u2/m/madams/landaufem/Plex/./landaufem on a arch-xc30-opt64-intel named nid00012 by madams Thu Dec 15 15:31:15 2016
> [0]PETSC ERROR: Configure options COPTFLAGS="-fast -no-ipo -g" CXXOPTFLAGS="-fast -no-ipo -g" FOPTFLAGS="-fast -no-ipo -g" --download-hypre --download-parmetis --download-metis --download-p4est --with-hdf5-dir=/opt/cray/hdf5-parallel/1.8.16/INTEL/15.0 --with-ssl=0 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-shared-libraries=0 --with-x=0 --with-mpiexec=srun LIBS=-lstdc++ --with-64-bit-indices PETSC_ARCH=arch-xc30-opt64-intel
> [0]PETSC ERROR: #1 SNESSolve_NEWTONLS() line 186 in /global/u2/m/madams/petsc/src/snes/impls/ls/ls.c
> TSSolve failed
> 0 TS time steps, 528 cells, Nq=7 (3696 IPs), T=0.
> [0]PETSC ERROR: #2 SNESSolve() line 4128 in /global/u2/m/madams/petsc/src/snes/interface/snes.c
> [0]PETSC ERROR: #3 TS_SNESSolve() line 189 in /global/u2/m/madams/petsc/src/ts/impls/implicit/theta/theta.c
> 
> 
> On Thu, Dec 15, 2016 at 6:24 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>    It is useful to run with valgrind when you have errors, even on a completely different machine, because it will detect any memory corruption. So currently I do my valgrind runs on a Linux machine.
> 
>    Assuming the prefix is correct, -snes_monitor should print an initial residual norm before doing any solves, so it is curious that you got no output. You can run with -snes_error_if_not_converged -ksp_error_if_not_converged to try to get an error reported as soon as a problem is detected.
> 
>    Barry
> 
> 
> > On Dec 15, 2016, at 4:56 PM, Mark Adams <mfadams at lbl.gov> wrote:
> >
> > I have a code that works on my Mac but fails on both a Cray XC30 and a KNL unless the code is built with debugging. I get this error message + -info output. I am using -snes_monitor but get no output. This code was working and I added a new feature. It does seem to fail when this new feature is used.
> >
> > And, alas, I do not have a functioning valgrind right now.
> >
> > I will start toggling optimization flags but any thoughts would be welcome.
> >
> > Mark
> >
> > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4290 X 4290; storage space: 0 unneeded,132612 used
> > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 50
> > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 4290) < 0.6. Do not use CompressedRow routines.
> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374784
> > [0] DMGetDMKSP(): Creating new DMKSP
> > [0] TSAdaptCheckStage(): Step=0, nonlinear solve failures 1 greater than current TS allowed, stopping solve
> > [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > [0]PETSC ERROR:
> > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, increase -ts_max_snes_failures or make negative to attempt recovery
> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > [0]PETSC ERROR: Petsc Development GIT revision: unknown  GIT Date: unknown
> > [0]PETSC ERROR: /global/u2/m/madams/landaufem/Plex/./landaufem on a arch-cori-knl-opt64-novector-intel named nid09355 by madams Thu Dec 15 14:25:00 2016
> > [0]PETSC ERROR: TSSolve failed
> > 1 TS time steps, 512 cells, Nq=9 (4608 IPs), T=0.
> > Configure options COPTFLAGS="  -g -O1 -fp-model fast -qopt-report=5 -hcpu=mic-knl -no-simd" CXXOPTFLAGS="-g -O1 -fp-model fast -qopt-report=5 -hcpu=mic-knl -no-simd" FOPTFLAGS="  -g -O1 -fp-model fast -qopt-report=5 -hcpu=mic-knl -no-simd" --download-metis=1 --download-parmetis=1 --with-blas-lapack-dir=/usr/common/software/intel/compilers_and_libraries_2016.3.210/linux/mkl --with-cc=mpiicc --with-cxx=mpiicpc --with-debugging=0 --with-fc=mpiifort --with-mpiexec=srun --with-batch=0 --with-memalign=64 --with-64-bit-indices PETSC_ARCH=arch-cori-knl-opt64-novector-intel --with-openmp=0 --download-p4est=0
> > [0]PETSC ERROR: #1 TSStep() line 3972 in /global/u2/m/madams/petsc/src/ts/interface/ts.c
> > [0]PETSC ERROR: #2 TSSolve() line 4218 in /global/u2/m/madams/petsc/src/ts/interface/ts.c
> >
> 
> 
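
The plan in the original message to toggle optimization flags could be organised as one PETSC_ARCH per optimization level, so that the builds sit side by side and the first failing level is easy to spot. The sketch below only prints candidate configure lines. The option names (PETSC_ARCH, --with-debugging, COPTFLAGS, CXXOPTFLAGS, FOPTFLAGS) appear in the configure lines quoted above; the arch-test-* names, the chosen levels, and the helper name configure_line are assumptions for illustration, and the compiler and package options from the original configure lines would still need to be appended.

```shell
# Sketch: print one PETSc configure line per optimization level.
# Each level gets its own PETSC_ARCH (hypothetical arch-test-* names),
# so the resulting builds do not overwrite one another.
configure_line() {
  level=$1   # optimization level without the leading dash, e.g. O0, O1, O2
  echo "./configure PETSC_ARCH=arch-test-$level --with-debugging=0 COPTFLAGS=\"-g -$level\" CXXOPTFLAGS=\"-g -$level\" FOPTFLAGS=\"-g -$level\""
}

# List the candidate builds, from least to most optimized.
for level in O0 O1 O2; do
  configure_line "$level"
done
```

Running the application against each build in turn narrows down the level at which the NaN first appears, which complements the valgrind run rather than replacing it.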



