[petsc-dev] Bug in MF hessian and BNK?

Dener, Alp adener at anl.gov
Thu Oct 14 11:00:35 CDT 2021


Hi Stefano,

It’s odd that the matmult works for several iterations before crashing. I will dig into the code and see if we do something to the matrix that disrupts its state along the way. The only thing I can think of is that BNLS does attempt diagonal shifts when it detects ill-conditioning, and maybe that’s causing a bug with MFFD. The other Newton methods don’t do this because they don’t have a rigid descent direction requirement like the line search version does.
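
For readers unfamiliar with the terms, here is a minimal sketch of what a matrix-free finite-difference Hessian with a diagonal shift amounts to. It is a toy in plain Python, not PETSc's MatMFFD implementation: the class MatrixFreeHessian, its methods set_base and mult, and the quadratic test gradient are all hypothetical names chosen for illustration. The product is approximated as H(x) v ≈ (g(x + h v) - g(x)) / h, which only makes sense once a base point x has been stored; that is the kind of state the "MatMFFDSetBase() has not been called" check guards.

```python
class MatrixFreeHessian:
    """Toy matrix-free operator (hypothetical; not PETSc's MatMFFD)."""

    def __init__(self, grad, h=1e-6):
        self.grad = grad      # callable returning the gradient at a point
        self.h = h            # finite-difference step (fixed here for simplicity)
        self.base = None      # base point x; must be set before mult()
        self.g_base = None    # cached gradient at the base point
        self.shift = 0.0      # diagonal shift sigma, giving (H + sigma*I)

    def set_base(self, x):
        """Store the point at which the Hessian is approximated."""
        self.base = list(x)
        self.g_base = self.grad(self.base)

    def mult(self, v):
        """Return an approximation of (H(base) + shift*I) @ v."""
        if self.base is None:
            raise RuntimeError("wrong state: base point has not been set")
        x_pert = [xi + self.h * vi for xi, vi in zip(self.base, v)]
        g_pert = self.grad(x_pert)
        return [(gp - gb) / self.h + self.shift * vi
                for gp, gb, vi in zip(g_pert, self.g_base, v)]


def grad(x):
    # Gradient of f(x) = 2*x0^2 + x0*x1 + 1.5*x1^2, whose Hessian is [[4, 1], [1, 3]].
    return [4.0 * x[0] + x[1], x[0] + 3.0 * x[1]]


op = MatrixFreeHessian(grad)
op.set_base([1.0, 2.0])
v = [1.0, -1.0]
hv = op.mult(v)            # approximates H @ v
op.shift = 0.5
hv_shifted = op.mult(v)    # approximates (H + 0.5*I) @ v
```

Calling mult on an operator whose base was never set (or was dropped along the way) raises the error instead of returning a product. Whether the shift path in BNK actually loses the base in this manner is only a guess at this point.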

Anyway, thanks for pointing this out. And yes, we should definitely add this to the tests.

Alp Dener
Argonne National Laboratory
Mathematics and Computer Science
https://alp.dener.me

On Oct 14, 2021, at 10:19 AM, Stefano Zampini <stefano.zampini at gmail.com> wrote:


Alp

I have found this problem, reproducible with release (see below using src/tao/unconstrained/tutorials/minsurf1)
I was trying MFFD hessian with bntr.

Using assembled hessian is fine

(tf-oneapi) [szampini at localhost tutorials]$ ./minsurf1 -tao_smonitor -tao_type bntr -mx 10 -my 8 -tao_bnk_max_cg_its 3 -tao_gatol 1.e-4

---- Minimum Surface Area Problem -----
mx: 10     my: 8

iter =   0, Function value 1.45591, Residual: 0.21372
iter =   0, Function value 1.43469, Residual: 0.205909
  iter =   0,   Function value 1.43469,   Residual: 0.205909
  iter =   1,   Function value 1.42058,   Residual: 0.0881207
  iter =   2,   Function value 1.41987,   Residual: 0.10019
  iter =   3,   Function value 1.41797,   Residual: 0.0209656
iter =   1, Function value 1.41775, Residual: 0.000288154
  iter =   0,   Function value 1.41775,   Residual: 0.000288154
  iter =   1,   Function value 1.41775,   Residual: 0.000207788
  iter =   2,   Function value 1.41775,   Residual: 0.000123638
  iter =   3,   Function value 1.41775,   Residual: 0.000117563
iter =   2, Function value 1.41775, Residual: < 1.0e-6

Using MFFD hessian errors out (using bnls segfaults; should we try to check all possible Tao solvers affected by this bug?)
(tf-oneapi) [szampini at localhost tutorials]$ ./minsurf1 -tao_smonitor -tao_type bntr -mx 10 -my 8 -tao_bnk_max_cg_its 3 -tao_gatol 1.e-4 -tao_mf_hessian

---- Minimum Surface Area Problem -----
mx: 10     my: 8

iter =   0, Function value 1.45591, Residual: 0.21372
iter =   0, Function value 1.43469, Residual: 0.205909
  iter =   0,   Function value 1.43469,   Residual: 0.205909
  iter =   1,   Function value 1.42058,   Residual: 0.0881207
  iter =   2,   Function value 1.41987,   Residual: 0.10019
  iter =   3,   Function value 1.41797,   Residual: 0.0209656
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Object is in wrong state
[0]PETSC ERROR: MatMFFDSetBase() has not been called, this is often caused by forgetting to call
MatAssemblyBegin/End on the first Mat in the SNES compute function
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.16.0, unknown
[0]PETSC ERROR: ./minsurf1 on a arch-conda-tf-oneapi-single-real named localhost.localdomain by szampini Thu Oct 14 18:11:23 2021
[0]PETSC ERROR: Configure options --with-blaslapack-dir=/home/szampini/Devel/miniforge/envs/tf-oneapi --download-thrust --LDFLAGS=-liomp5 --with-debugging=0 --with-openmp --with-precision=single --with-fc=0 --download-h2opus --download-slepc --download-slepc-commit=origin/release PETSC_ARCH=arch-conda-tf-oneapi-single-real PETSC_DIR=/home/szampini/Devel/miniforge/Devel/petsc
[0]PETSC ERROR: #1 MatMult_MFFD() at /home/szampini/Devel/miniforge/Devel/petsc/src/mat/impls/mffd/mffd.c:333
[0]PETSC ERROR: #2 MatMult_Shell() at /home/szampini/Devel/miniforge/Devel/petsc/src/mat/impls/shell/shell.c:1066
[0]PETSC ERROR: #3 MatMult() at /home/szampini/Devel/miniforge/Devel/petsc/src/mat/interface/matrix.c:2439
[0]PETSC ERROR: #4 KSP_MatMult() at /home/szampini/Devel/miniforge/Devel/petsc/include/petsc/private/kspimpl.h:346
[0]PETSC ERROR: #5 KSPCGSolve_STCG() at /home/szampini/Devel/miniforge/Devel/petsc/src/ksp/ksp/impls/cg/stcg/stcg.c:183
[0]PETSC ERROR: #6 KSPSolve_Private() at /home/szampini/Devel/miniforge/Devel/petsc/src/ksp/ksp/interface/itfunc.c:914
[0]PETSC ERROR: #7 KSPSolve() at /home/szampini/Devel/miniforge/Devel/petsc/src/ksp/ksp/interface/itfunc.c:1086
[0]PETSC ERROR: #8 TaoBNKComputeStep() at /home/szampini/Devel/miniforge/Devel/petsc/src/tao/bound/impls/bnk/bnk.c:477
[0]PETSC ERROR: #9 TaoSolve_BNTR() at /home/szampini/Devel/miniforge/Devel/petsc/src/tao/bound/impls/bnk/bntr.c:139
[0]PETSC ERROR: #10 TaoSolve() at /home/szampini/Devel/miniforge/Devel/petsc/src/tao/interface/taosolver.c:227
[0]PETSC ERROR: #11 main() at minsurf1.c:110
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -check_pointer_intensity 0
[0]PETSC ERROR: -mx 10
[0]PETSC ERROR: -my 8
[0]PETSC ERROR: -tao_bnk_max_cg_its 3
[0]PETSC ERROR: -tao_gatol 1.e-4
[0]PETSC ERROR: -tao_mf_hessian
[0]PETSC ERROR: -tao_smonitor
[0]PETSC ERROR: -tao_type bntr
[0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
Abort(73) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0

Using fd is fine

(tf-oneapi) [szampini at localhost tutorials]$ ./minsurf1 -tao_smonitor -tao_type bntr -mx 10 -my 8 -tao_bnk_max_cg_its 3 -tao_gatol 1.e-4 -tao_fd_hessian

---- Minimum Surface Area Problem -----
mx: 10     my: 8

iter =   0, Function value 1.45591, Residual: 0.21372
iter =   0, Function value 1.43469, Residual: 0.205909
  iter =   0,   Function value 1.43469,   Residual: 0.205909
  iter =   1,   Function value 1.42058,   Residual: 0.0881207
  iter =   2,   Function value 1.41987,   Residual: 0.10019
  iter =   3,   Function value 1.41797,   Residual: 0.0209656
iter =   1, Function value 1.41775, Residual: 0.000342734
  iter =   0,   Function value 1.41775,   Residual: 0.000342734
  iter =   1,   Function value 1.41775,   Residual: 0.000275372
  iter =   2,   Function value 1.41775,   Residual: 0.000148513
  iter =   3,   Function value 1.41775,   Residual: 0.000150663
iter =   2, Function value 1.41775, Residual: < 1.0e-6



--
Stefano