[petsc-users] Use block Jacobi preconditioner with SNES

Ali Reza Khaz'ali arkhazali at cc.iut.ac.ir
Mon Aug 27 16:04:59 CDT 2018


> Normal AIJ.
Can I use block preconditioners (such as block ILU or block Jacobi) with
this matrix format, for a matrix having variable-sized blocks?
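
To make the question concrete, this is roughly what I have in mind (only a
sketch, assuming an assembled AIJ Jacobian and an existing SNES called
"snes"; the block-length array is illustrative, not my real layout):

   PetscErrorCode ierr;
   KSP            ksp;
   PC             pc;
   /* illustrative variable block sizes; the entries must sum to the
      local number of rows of the Jacobian */
   PetscInt       lens[] = {5, 7, 5, 6};

   ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);
   ierr = PCBJacobiSetTotalBlocks(pc, 4, lens);CHKERRQ(ierr);
   /* choose the per-block sub-solver at run time, e.g.
      -sub_ksp_type preonly -sub_pc_type ilu */
   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);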

> This should take on the order of a second to solve with a sparse direct
> solver.  What shape is your domain?  Can you send a solver profile and
> output from running with -snes_view -log_view?
>
I have to admit that I made a mistake. The runtime and memory usage I
reported earlier were for a block Jacobi preconditioner with an MKL PARDISO
sub-preconditioner and a GMRES solver; I apologize. If PARDISO alone is used
to solve the system with KSPPREONLY, the runtime is much better, but memory
usage is still high.
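
For clarity, the two configurations I compared were roughly the following
(the block Jacobi options are from memory and may not be exactly what I ran;
the direct-solver options match the option table at the end of the log):

   # block Jacobi with an MKL PARDISO sub-solver, GMRES outer iteration
   -ksp_type gmres -pc_type bjacobi -sub_pc_type lu -sub_pc_factor_mat_solver_type mkl_pardiso

   # MKL PARDISO used as a direct solver
   -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mkl_pardiso

The log below is for the direct-solver case on a 30x30x10 cubic system with
3 hydrocarbon components (108,000 unknowns in total):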

SNES Object: 1 MPI processes
   type: newtonls
   maximum iterations=2000, maximum function evaluations=2000
   tolerances: relative=0.0001, absolute=1e-05, solution=1e-05
   total number of linear solver iterations=1
   total number of function evaluations=2
   norm schedule ALWAYS
   SNESLineSearch Object: 1 MPI processes
     type: bt
       interpolation: cubic
       alpha=1.000000e-04
     maxstep=1.000000e+08, minlambda=1.000000e-12
     tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
     maximum iterations=40
   KSP Object: 1 MPI processes
     type: preonly
     maximum iterations=5000, initial guess is zero
     tolerances:  relative=1e-08, absolute=1e-08, divergence=10000.
     left preconditioning
     using NONE norm type for convergence test
   PC Object: 1 MPI processes
     type: lu
       out-of-place factorization
       tolerance for zero pivot 2.22045e-14
       matrix ordering: nd
       factor fill ratio given 0., needed 0.
         Factored matrix follows:
           Mat Object: 1 MPI processes
             type: mkl_pardiso
             rows=108000, cols=108000
             package used to perform factorization: mkl_pardiso
             total: nonzeros=2868000, allocated nonzeros=2868000
             total number of mallocs used during MatSetValues calls =0
               MKL_PARDISO run parameters:
               MKL_PARDISO phase:             33
               MKL_PARDISO iparm[1]:     1
               MKL_PARDISO iparm[2]:     2
               MKL_PARDISO iparm[3]:     2
               MKL_PARDISO iparm[4]:     0
               MKL_PARDISO iparm[5]:     0
               MKL_PARDISO iparm[6]:     0
               MKL_PARDISO iparm[7]:     0
               MKL_PARDISO iparm[8]:     0
               MKL_PARDISO iparm[9]:     0
               MKL_PARDISO iparm[10]:     13
               MKL_PARDISO iparm[11]:     1
               MKL_PARDISO iparm[12]:     0
               MKL_PARDISO iparm[13]:     1
               MKL_PARDISO iparm[14]:     0
               MKL_PARDISO iparm[15]:     232829
               MKL_PARDISO iparm[16]:     150886
               MKL_PARDISO iparm[17]:     2171171
               MKL_PARDISO iparm[18]:     272581106
               MKL_PARDISO iparm[19]:     892866
               MKL_PARDISO iparm[20]:     0
               MKL_PARDISO iparm[21]:     0
               MKL_PARDISO iparm[22]:     0
               MKL_PARDISO iparm[23]:     0
               MKL_PARDISO iparm[24]:     0
               MKL_PARDISO iparm[25]:     0
               MKL_PARDISO iparm[26]:     0
               MKL_PARDISO iparm[27]:     0
               MKL_PARDISO iparm[28]:     0
               MKL_PARDISO iparm[29]:     0
               MKL_PARDISO iparm[30]:     0
               MKL_PARDISO iparm[31]:     0
               MKL_PARDISO iparm[32]:     0
               MKL_PARDISO iparm[33]:     0
               MKL_PARDISO iparm[34]:     -1
               MKL_PARDISO iparm[35]:     1
               MKL_PARDISO iparm[36]:     0
               MKL_PARDISO iparm[37]:     0
               MKL_PARDISO iparm[38]:     0
               MKL_PARDISO iparm[39]:     0
               MKL_PARDISO iparm[40]:     0
               MKL_PARDISO iparm[41]:     0
               MKL_PARDISO iparm[42]:     0
               MKL_PARDISO iparm[43]:     0
               MKL_PARDISO iparm[44]:     0
               MKL_PARDISO iparm[45]:     0
               MKL_PARDISO iparm[46]:     0
               MKL_PARDISO iparm[47]:     0
               MKL_PARDISO iparm[48]:     0
               MKL_PARDISO iparm[49]:     0
               MKL_PARDISO iparm[50]:     0
               MKL_PARDISO iparm[51]:     0
               MKL_PARDISO iparm[52]:     0
               MKL_PARDISO iparm[53]:     0
               MKL_PARDISO iparm[54]:     0
               MKL_PARDISO iparm[55]:     0
               MKL_PARDISO iparm[56]:     0
               MKL_PARDISO iparm[57]:     -1
               MKL_PARDISO iparm[58]:     0
               MKL_PARDISO iparm[59]:     0
               MKL_PARDISO iparm[60]:     0
               MKL_PARDISO iparm[61]:     232829
               MKL_PARDISO iparm[62]:     103811
               MKL_PARDISO iparm[63]:     0
               MKL_PARDISO iparm[64]:     0
               MKL_PARDISO maxfct:     1
               MKL_PARDISO mnum:     1
               MKL_PARDISO mtype:     11
               MKL_PARDISO n:     108000
               MKL_PARDISO nrhs:     1
               MKL_PARDISO msglvl:     0
     linear system matrix = precond matrix:
     Mat Object: 1 MPI processes
       type: seqaij
       rows=108000, cols=108000
       total: nonzeros=2868000, allocated nonzeros=8640000
       total number of mallocs used during MatSetValues calls =0
         not using I-node routines
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

E:\Documents\Visual Studio 2015\Projects\compsim\x64\Release\compsim.exe on a  named ALIREZA-PC with 1 processor, by AliReza Tue Aug 28 00:43:44 2018
Using Petsc Release Version 3.9.3, Jul, 02, 2018

                          Max       Max/Min        Avg      Total
Time (sec):           3.180e+02      1.00000   3.180e+02
Objects:              2.400e+01      1.00000   2.400e+01
Flop:                 7.032e+06      1.00000   7.032e+06  7.032e+06
Flop/sec:            2.211e+04      1.00000   2.211e+04  2.211e+04
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00      0.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                             e.g., VecAXPY() for real vectors of length N --> 2N flop
                             and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
  0:      Main Stage: 3.1802e+02 100.0%  7.0320e+06 100.0% 0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
    Count: number of times phase was executed
    Time and Flop: Max - maximum over all processors
                    Ratio - ratio of maximum to minimum over all processors
    Mess: number of messages sent
    Avg. len: average message length (bytes)
    Reduct: number of global reductions
    Global: entire computation
    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
       %T - percent time in this phase         %F - percent flop in this phase
       %M - percent messages in this phase     %L - percent message lengths in this phase
       %R - percent reductions in this phase
    Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec) Flop                             --- Global ---  --- Stage --- Total
                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSidedF         2 1.0 2.2665e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SNESSolve              1 1.0 2.9114e+02 1.0 7.03e+06 1.0 0.0e+00 0.0e+00 0.0e+00 92100  0  0  0  92100  0  0  0     0
SNESFunctionEval       2 1.0 1.3104e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0     0
SNESJacobianEval       1 1.0 9.1535e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 29  0  0  0  0  29  0  0  0  0     0
SNESLineSearch         1 1.0 1.9715e+01 1.0 6.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00  6 97  0  0  0   6 97  0  0  0     0
VecDot                 1 1.0 7.6773e-01 1.0 2.16e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0     0
VecNorm                3 1.0 3.8735e-02 1.0 6.48e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   0  9  0  0  0    17
VecCopy                2 1.0 6.6713e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 1 1.0 1.6375e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecWAXPY               1 1.0 1.3113e-02 1.0 1.08e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0     8
VecAssemblyBegin       2 1.0 3.4640e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         2 1.0 6.4148e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         2 1.0 5.0032e-01 1.0 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   0  6  0  0  0     1
VecReduceComm          1 1.0 2.8122e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult                1 1.0 4.8387e+00 1.0 5.63e+06 1.0 0.0e+00 0.0e+00 0.0e+00  2 80  0  0  0   2 80  0  0  0     1
MatSolve               1 1.0 5.9153e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 19  0  0  0  0  19  0  0  0  0     0
MatLUFactorSym         1 1.0 1.9012e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatLUFactorNum         1 1.0 1.0291e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 32  0  0  0  0  32  0  0  0  0     0
MatAssemblyBegin       2 1.0 8.5530e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 2.5865e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 1.2557e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.3958e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         1 1.0 2.0895e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                2 1.0 1.4709e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               1 1.0 4.2765e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 1.7075e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 54  0  0  0  0  54  0  0  0  0     0
PCSetUp                1 1.0 1.0949e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 34  0  0  0  0  34  0  0  0  0     0
PCApply                1 1.0 5.9153e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 19  0  0  0  0  19  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

                 SNES     1              0            0     0.
       SNESLineSearch     1              0            0     0.
               DMSNES     1              0            0     0.
               Vector     5              0            0     0.
               Matrix     2              0            0     0.
     Distributed Mesh     2              0            0     0.
            Index Set     2              0            0     0.
    Star Forest Graph     4              0            0     0.
      Discrete System     2              0            0     0.
        Krylov Solver     1              0            0     0.
      DMKSP interface     1              0            0     0.
       Preconditioner     1              0            0     0.
               Viewer     1              0            0     0.
========================================================================================================================
Average time to get PetscTime(): 8.55301e-08
#PETSc Option Table entries:
-ksp_atol 1e-6
-ksp_rtol 1e-5
-pc_factor_mat_solver_type mkl_pardiso
-pc_type lu
-snes_rtol 1e-4
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --prefix=/home/alireza/Petsc393Install
--with-mkl_pardiso-dir=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl
--with-hypre-include=/cygdrive/E/hypre-2.11.2/Builds/Bins/include
--with-hypre-lib=/cygdrive/E/hypre-2.11.2/Builds/Bins/lib/HYPRE.lib
--with-ml-include=/cygdrive/E/Trilinos-master/Bins/include
--with-ml-lib=/cygdrive/E/Trilinos-master/Bins/lib/ml.lib
--with-openmp --with-cc="win32fe icl" --with-fc="win32fe ifort"
--with-mpi-include=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/include
--with-mpi-lib=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/lib/impi.lib
--with-mpi-mpiexec=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/bin/mpiexec.exe
--with-debugging=0
--with-blas-lib=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
--with-lapack-lib=/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
-CFLAGS="-O2 -MT -wd4996 -Qopenmp" -CXXFLAGS="-O2 -MT -wd4996 -Qopenmp" -FFLAGS="-MT -O2 -Qopenmp"

-----------------------------------------
Libraries compiled on 2018-08-21 12:07:21 on AliReza-PC
Machine characteristics: CYGWIN_NT-6.1-2.10.0-0.325-5-3-x86_64-64bit
Using PETSc directory: /home/alireza/Petsc393Install
Using PETSc arch:
-----------------------------------------

Using C compiler: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe icl -O2 -MT -wd4996 -Qopenmp
Using Fortran compiler: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe ifort -MT -O2 -Qopenmp -fpp
-----------------------------------------

Using include paths: -I/home/alireza/Petsc393Install/include -I/home/alireza/Petsc393Install//include
-I/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/include
-I/cygdrive/E/hypre-2.11.2/Builds/Bins/include -I/cygdrive/E/Trilinos-master/Bins/include
-I/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/include
-----------------------------------------

Using C linker: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe icl
Using Fortran linker: /home/alireza/petsc-3.9.3/lib/petsc/bin/win32fe/win32fe ifort
Using libraries: -L/home/alireza/Petsc393Install/lib -L/home/alireza/Petsc393Install/lib -lpetsc
/cygdrive/E/hypre-2.11.2/Builds/Bins/lib/HYPRE.lib /cygdrive/E/Trilinos-master/Bins/lib/ml.lib
/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mkl/lib/intel64_win/mkl_rt.lib
/cygdrive/E/Program_Files_x86/IntelSWTools/compilers_and_libraries/windows/mpi/intel64/lib/impi.lib
Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib
-----------------------------------------


