[petsc-users] Use block Jacobi preconditioner with SNES
Ali Reza Khaz'ali
arkhazali at cc.iut.ac.ir
Mon Aug 27 16:04:59 CDT 2018
> Normal AIJ.
Can I use block preconditioners (like block ILU or block Jacobi) with
this matrix format, i.e. for a matrix having variable-sized blocks?
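For example, something along these lines is what I have in mind (a minimal sketch with a made-up 9x9 tridiagonal AIJ matrix and block sizes {2,3,4}, not my actual simulator code). Would PCBJACOBI with PCBJacobiSetTotalBlocks() handle variable-sized blocks on a plain AIJ matrix this way?

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PC             pc;
  PetscInt       lens[3] = {2, 3, 4};   /* three Jacobi blocks of different sizes, 2+3+4 = n */
  PetscInt       n = 9, i;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* ordinary AIJ matrix (toy tridiagonal problem); no MatSetBlockSize() involved */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 3, NULL, &A);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    ierr = MatSetValue(A, i, i, 4.0, INSERT_VALUES);CHKERRQ(ierr);
    if (i > 0)     { ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    if (i < n - 1) { ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_SELF, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);              /* outer Krylov solver */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);
  ierr = PCBJacobiSetTotalBlocks(pc, 3, lens);CHKERRQ(ierr);   /* variable-sized blocks */

  /* the per-block factorization would then be picked at run time, e.g.
     -sub_ksp_type preonly -sub_pc_type lu -sub_pc_factor_mat_solver_type mkl_pardiso */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}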
> This should take on the order of a second to solve with a sparse direct
> solver. What shape is your domain? Can you send a solver profile and
> output from running with -snes_view -log_view?
>
I have to admit that I made a mistake. The runtime and memory usage I
reported earlier are for a block Jacobi preconditioner with an MKL PARDISO
sub-preconditioner and a GMRES outer solver. I apologize. If PARDISO alone
is used to solve the system (with KSPPREONLY), the runtime is much
better, but memory usage is still high.
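For reference, the direct-solve setup that produced the log below amounts to the following (a sketch only; UseDirectPardiso is just an illustrative helper name, and 'snes' is the already-configured SNES from my application):

#include <petscsnes.h>

/* Sketch: make an existing SNES do one MKL PARDISO direct solve per Newton step,
   i.e. the -ksp_type preonly / -pc_type lu / -pc_factor_mat_solver_type mkl_pardiso
   combination shown in the option table of the log. */
static PetscErrorCode UseDirectPardiso(SNES snes)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);    /* no Krylov iterations, one PC application */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);            /* full LU factorization */
  ierr = PCFactorSetMatSolverType(pc, MATSOLVERMKL_PARDISO);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}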
The following log is for a 30x30x10 cubic system with 3 hydrocarbon components (108,000 unknowns in total):
type: newtonls
maximum iterations=2000, maximum function evaluations=2000
tolerances: relative=0.0001, absolute=1e-05, solution=1e-05
total number of linear solver iterations=1
total number of function evaluations=2
norm schedule ALWAYS
SNESLineSearch Object: 1 MPI processes
type: bt
interpolation: cubic
alpha=1.000000e-04
maxstep=1.000000e+08, minlambda=1.000000e-12
tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
maximum iterations=40
KSP Object: 1 MPI processes
type: preonly
maximum iterations=5000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-08, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: 1 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 0., needed 0.
Factored matrix follows:
Mat Object: 1 MPI processes
type: mkl_pardiso
rows=108000, cols=108000
package used to perform factorization: mkl_pardiso
total: nonzeros=2868000, allocated nonzeros=2868000
total number of mallocs used during MatSetValues calls =0
MKL_PARDISO run parameters:
MKL_PARDISO phase: 33
MKL_PARDISO iparm[1]: 1
MKL_PARDISO iparm[2]: 2
MKL_PARDISO iparm[3]: 2
MKL_PARDISO iparm[4]: 0
MKL_PARDISO iparm[5]: 0
MKL_PARDISO iparm[6]: 0
MKL_PARDISO iparm[7]: 0
MKL_PARDISO iparm[8]: 0
MKL_PARDISO iparm[9]: 0
MKL_PARDISO iparm[10]: 13
MKL_PARDISO iparm[11]: 1
MKL_PARDISO iparm[12]: 0
MKL_PARDISO iparm[13]: 1
MKL_PARDISO iparm[14]: 0
MKL_PARDISO iparm[15]: 232829
MKL_PARDISO iparm[16]: 150886
MKL_PARDISO iparm[17]: 2171171
MKL_PARDISO iparm[18]: 272581106
MKL_PARDISO iparm[19]: 892866
MKL_PARDISO iparm[20]: 0
MKL_PARDISO iparm[21]: 0
MKL_PARDISO iparm[22]: 0
MKL_PARDISO iparm[23]: 0
MKL_PARDISO iparm[24]: 0
MKL_PARDISO iparm[25]: 0
MKL_PARDISO iparm[26]: 0
MKL_PARDISO iparm[27]: 0
MKL_PARDISO iparm[28]: 0
MKL_PARDISO iparm[29]: 0
MKL_PARDISO iparm[30]: 0
MKL_PARDISO iparm[31]: 0
MKL_PARDISO iparm[32]: 0
MKL_PARDISO iparm[33]: 0
MKL_PARDISO iparm[34]: -1
MKL_PARDISO iparm[35]: 1
MKL_PARDISO iparm[36]: 0
MKL_PARDISO iparm[37]: 0
MKL_PARDISO iparm[38]: 0
MKL_PARDISO iparm[39]: 0
MKL_PARDISO iparm[40]: 0
MKL_PARDISO iparm[41]: 0
MKL_PARDISO iparm[42]: 0
MKL_PARDISO iparm[43]: 0
MKL_PARDISO iparm[44]: 0
MKL_PARDISO iparm[45]: 0
MKL_PARDISO iparm[46]: 0
MKL_PARDISO iparm[47]: 0
MKL_PARDISO iparm[48]: 0
MKL_PARDISO iparm[49]: 0
MKL_PARDISO iparm[50]: 0
MKL_PARDISO iparm[51]: 0
MKL_PARDISO iparm[52]: 0
MKL_PARDISO iparm[53]: 0
MKL_PARDISO iparm[54]: 0
MKL_PARDISO iparm[55]: 0
MKL_PARDISO iparm[56]: 0
MKL_PARDISO iparm[57]: -1
MKL_PARDISO iparm[58]: 0
MKL_PARDISO iparm[59]: 0
MKL_PARDISO iparm[60]: 0
MKL_PARDISO iparm[61]: 232829
MKL_PARDISO iparm[62]: 103811
MKL_PARDISO iparm[63]: 0
MKL_PARDISO iparm[64]: 0
MKL_PARDISO maxfct: 1
MKL_PARDISO mnum: 1
MKL_PARDISO mtype: 11
MKL_PARDISO n: 108000
MKL_PARDISO nrhs: 1
MKL_PARDISO msglvl: 0
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=108000, cols=108000
total: nonzeros=2868000, allocated nonzeros=8640000
total number of mallocs used during MatSetValues calls =0
not using I-node routines
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
E:\Documents\Visual Studio 2015\Projects\compsim\x64\Release\compsim.exe on a named ALIREZA-PC with 1 processor, by AliReza Tue Aug 28 00:43:44 2018
Using Petsc Release Version 3.9.3, Jul, 02, 2018
Max Max/Min Avg Total
Time (sec): 3.180e+02 1.00000 3.180e+02
Objects: 2.400e+01 1.00000 2.400e+01
Flop: 7.032e+06 1.00000 7.032e+06 7.032e+06
Flop/sec: 2.211e+04 1.00000 2.211e+04 2.211e+04
MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Reductions: 0.000e+00 0.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flop
                          and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
                   Avg %Total Avg %Total counts %Total Avg %Total counts %Total
 0: Main Stage: 3.1802e+02 100.0% 7.0320e+06 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase        %F - percent flop in this phase
      %M - percent messages in this phase    %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
      Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
BuildTwoSidedF       2 1.0 2.2665e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
SNESSolve            1 1.0 2.9114e+02 1.0 7.03e+06 1.0 0.0e+00 0.0e+00 0.0e+00 92 100 0 0 0  92 100 0 0 0     0
SNESFunctionEval     2 1.0 1.3104e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4   0 0 0 0   4   0 0 0 0     0
SNESJacobianEval     1 1.0 9.1535e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 29   0 0 0 0  29   0 0 0 0     0
SNESLineSearch       1 1.0 1.9715e+01 1.0 6.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00  6  97 0 0 0   6  97 0 0 0     0
VecDot               1 1.0 7.6773e-01 1.0 2.16e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0   3 0 0 0   0   3 0 0 0     0
VecNorm              3 1.0 3.8735e-02 1.0 6.48e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0   9 0 0 0   0   9 0 0 0    17
VecCopy              2 1.0 6.6713e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
VecSet               1 1.0 1.6375e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
VecWAXPY             1 1.0 1.3113e-02 1.0 1.08e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0   2 0 0 0   0   2 0 0 0     8
VecAssemblyBegin     2 1.0 3.4640e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
VecAssemblyEnd       2 1.0 6.4148e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
VecReduceArith       2 1.0 5.0032e-01 1.0 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0   6 0 0 0   0   6 0 0 0     1
VecReduceComm        1 1.0 2.8122e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
MatMult              1 1.0 4.8387e+00 1.0 5.63e+06 1.0 0.0e+00 0.0e+00 0.0e+00  2  80 0 0 0   2  80 0 0 0     1
MatSolve             1 1.0 5.9153e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 19   0 0 0 0  19   0 0 0 0     0
MatLUFactorSym       1 1.0 1.9012e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1   0 0 0 0   1   0 0 0 0     0
MatLUFactorNum       1 1.0 1.0291e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 32   0 0 0 0  32   0 0 0 0     0
MatAssemblyBegin     2 1.0 8.5530e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
MatAssemblyEnd       2 1.0 2.5865e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
MatGetRowIJ          1 1.0 1.2557e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
MatGetOrdering       1 1.0 1.3958e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
MatZeroEntries       1 1.0 2.0895e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
MatView              2 1.0 1.4709e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
KSPSetUp             1 1.0 4.2765e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0 0 0 0   0   0 0 0 0     0
KSPSolve             1 1.0 1.7075e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 54   0 0 0 0  54   0 0 0 0     0
PCSetUp              1 1.0 1.0949e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 34   0 0 0 0  34   0 0 0 0     0
PCApply              1 1.0 5.9153e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 19   0 0 0 0  19   0 0 0 0     0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
SNES 1 0 0 0.
SNESLineSearch 1 0 0 0.
DMSNES 1 0 0 0.
Vector 5 0 0 0.
Matrix 2 0 0 0.
Distributed Mesh 2 0 0 0.
Index Set 2 0 0 0.
Star Forest Graph 4 0 0 0.
Discrete System 2 0 0 0.
Krylov Solver 1 0 0 0.
DMKSP interface 1 0 0 0.
Preconditioner 1 0 0 0.
Viewer 1 0 0 0.
========================================================================================================================
Average time to get PetscTime(): 8.55301e-08
#PETSc Option Table entries:
-ksp_atol 1e-6
-ksp_rtol 1e-5
-pc_factor_mat_solver_type mkl_pardiso
-pc_type lu
-snes_rtol 1e-4
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 8
sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --prefix=/home/alireza/Petsc393Install
--with-mkl_pardiso-dir=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl
--with-hypre-include=/cygdrive/E/hypre-2.11.2/Builds/Bins/include
--with-hypre-lib=/cygdrive/E/hypre-2.11.2/Builds/Bins/lib/HYPRE.lib
--with-ml-include=/cygdrive/E/Trilinos-master/Bins/include
--with-ml-lib=/cygdrive/E/Trilinos-master/Bins/lib/ml.lib
--with-openmp --with-cc="win32fe icl" --with-fc="win32fe ifort"
--with-mpi-include=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/include
--with-mpi-lib=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/lib/impi.lib
--with-mpi-mpiexec=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/bin/mpiexec.exe
--with-debugging=0
--with-blas-lib=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
--with-lapack-lib=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
-CFLAGS="-O2 -MT -wd4996 -Qopenmp" -CXXFLAGS="-O2 -MT -wd4996 -Qopenmp" -FFLAGS="-MT -O2 -Qopenmp"
-----------------------------------------
Libraries compiled on 2018-08-21 12:07:21 on AliReza-PC
Machine characteristics: CYGWIN_NT-6.1-2.10.0-0.325-5-3-x86_64-64bit
Using PETSc directory: /home/alireza/Petsc393Install
Using PETSc arch:
-----------------------------------------
Using C compiler: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe icl -O2 -MT -wd4996 -Qopenmp
Using Fortran compiler: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe ifort -MT -O2 -Qopenmp -fpp
-----------------------------------------
Using include paths: -I/home/alireza/Petsc393Install/include -I/home/alireza/Petsc393Install//include
-I/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/include
-I/cygdrive/E/hypre-2.11.2/Builds/Bins/include -I/cygdrive/E/Trilinos-master/Bins/include
-I/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/include
-----------------------------------------
Using C linker: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe icl
Using Fortran linker: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe ifort
Using libraries: -L/home/alireza/Petsc393Install/lib -L/home/alireza/Petsc393Install/lib -lpetsc
/cygdrive/E/hypre-2.11.2/Builds/Bins/lib/HYPRE.lib
/cygdrive/E/Trilinos-master/Bins/lib/ml.lib
/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/lib/impi.lib
Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib
-----------------------------------------