[petsc-users] Direct inversion methods in parallel

Timothée Nicolas timothee.nicolas at gmail.com
Fri Sep 25 00:24:27 CDT 2015


Hi all,

From the manual, I understand that the options

-ksp_type preonly -pc_type lu

for solving a problem by a direct LU factorization are available only for sequential matrices. Should I conclude that there is no way to attempt a direct solve of a big problem in parallel?
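
For reference, here is a minimal, self-contained C sketch of what I understand these two options to mean (sequential direct solve; the toy tridiagonal system and the size n are illustrative, not from my code):

/* Minimal sketch: sequential direct LU solve via KSPPREONLY + PCLU.
   The tridiagonal matrix below is a toy stand-in for the real operator. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PC             pc;
  PetscInt       i, n = 100, Istart, Iend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)   { ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    if (i < n-1) { ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);  /* apply the PC once, no Krylov iterations */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);          /* the "preconditioner" is a full LU factorization */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}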

I plan to use the direct solve only as a check that my approximation of the inverse problem is OK, because so far my algorithm, which should work, is not working at all, and I need to debug what is going on. Specifically, I solve the linear problem approximately using an approximate Schur complement, and I want to know whether my approximation is wrong or whether my matrices are wrong from the start (one possible consistency check is sketched below).
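
Here is the kind of check I have in mind, as a sketch (assuming the operator A is available in assembled form; the helper name CheckResidual is mine):

/* Hypothetical helper: print the true residual ||b - A x||_2 for a
   candidate solution x, so the direct solve and the approximate Schur
   solve can be compared against the same assembled operator A. */
static PetscErrorCode CheckResidual(Mat A, Vec x, Vec b, const char *label)
{
  Vec            r;
  PetscReal      rnorm;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
  ierr = MatMult(A, x, r);CHKERRQ(ierr);      /* r = A x     */
  ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);   /* r = b - A x */
  ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "%s: ||b - A x|| = %g\n", label, (double)rnorm);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the direct solve gives a tiny residual while the Schur-based solve does not, the approximation is at fault; if even the direct solve leaves a large residual, the matrices themselves are suspect.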

I have tried a direct solve on one process with the above options for a fairly small problem (12x12x40 with 8 dof), but it did not work; I suspect a memory limitation (the -log_summary output is attached at the end, just in case).
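
For scale: a 12 x 12 x 40 grid with 8 dof per point gives 12*12*40*8 = 46,080 unknowns. If I read the attached log correctly, the run used 2.362e+09 bytes in total, with the Matrix objects alone accounting for 1,999,326,700 bytes (about 2 GB), so the memory is indeed dominated by the factored matrix.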

Best

Timothee NICOLAS
-------------- next part --------------
 
 This is an implicit MHD code based on the MIPS code 
 
 Setting all options 
 
 Start: Reading HINT2 equilibrium file 
 
 DATA: lr,lz,lphi=          12          12          40
 lsymmetry=           1
 r_min,r_max=   2.7000000000000002        4.5999999999999996     
 z_min,z_max= -0.94999999999999996       0.94999999999999996     
 phi_min,phi_max=   0.0000000000000000        6.2831853071795862     
 dr,dz,dphi=  0.17272727272727267       0.17272727272727273       0.15707963267948966     
 pmax=   4.0818670444571345E-003
 bmax=   2.9663632417550203     
 
 End: Reading HINT2 equilibrium file 
 
 Creating nonlinear solver, getting geometrical info, and setting vectors 
 
 Second order centered finite differences are used  
 Allocating some arrays 
 Masks definition 
 Major radius, angles, and vacuum definition 
 Initializing PETSc Vecs with equilibrium values 
 Set the initial force local vectors (used to enforce the equilibrium) 
 Writing the unperturbed fields 
 Add a coherent perturbation to the velocity 
 Poloidal mode number of the perturbation: m =  1 
 Toroidal mode number of the perturbation: n =  1 
 Writing the perturbed fields 
 Creating the matrices. This takes some time 
 Creating the matrices took 1.7204E+00 
 Entering the main MHD Loop 
 
 Iteration number = 1 
 Time (tau_A) = 9.99999978E-03 
 CPU time used for Building TotLinmat: 7.7378E-01 
 Total CPU time since PetscInitialize: 1.7185E+02 
 CPU time used for SNESSolve: 1.7011E+02 
 Number of linear iterations :  0 
 Number of function evaluations :  1 
 Kinetic Energy =  8.863760E-17 
 Magnetic Energy =  0.000000E+00 
 
 
 Iteration number = 2 
 Time (tau_A) = 1.99999998E-02 
 CPU time used for Building TotLinmat: 7.3685E-01 
 Total CPU time since PetscInitialize: 3.3283E+02 
 CPU time used for SNESSolve: 1.6099E+02 
 Number of linear iterations :  0 
 Number of function evaluations :  1 
 Kinetic Energy =  8.863760E-17 
 Magnetic Energy =  0.000000E+00 
 
 
 Exiting the main MHD Loop 
 
 Deallocating remaining arrays 
 
 Destroying remaining Petsc elements 
 
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./mips_implicit on a arch-darwin-c-debug named iMac27Nicolas.nifs.ac.jp with 1 processor, by timotheenicolas Fri Sep 25 13:57:08 2015
Using Petsc Release Version 3.6.1, Jul, 22, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           3.331e+02      1.00000   3.331e+02
Objects:              7.400e+01      1.00000   7.400e+01
Flops:                3.361e+11      1.00000   3.361e+11  3.361e+11
Flops/sec:            1.009e+09      1.00000   1.009e+09  1.009e+09
Memory:               2.362e+09      1.00000              2.362e+09
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00      0.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 3.3309e+02 100.0%  3.3613e+11 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              2 1.0 3.2958e+02 1.0 3.36e+11 1.0 0.0e+00 0.0e+00 0.0e+00 99100  0  0  0  99100  0  0  0  1020
SNESFunctionEval       2 1.0 3.2958e+02 1.0 3.36e+11 1.0 0.0e+00 0.0e+00 0.0e+00 99100  0  0  0  99100  0  0  0  1020
VecView                5 1.0 1.6438e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNorm                6 1.0 3.7619e-04 1.0 5.53e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1470
VecCopy               14 1.0 5.7548e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                24 1.0 2.5896e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                7 1.0 9.8065e-04 1.0 6.45e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   658
VecWAXPY               3 1.0 4.4243e-04 1.0 1.38e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   312
VecAssemblyBegin       5 1.0 1.7768e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         5 1.0 1.6867e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult       4 1.0 5.7678e-04 1.0 1.84e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   320
VecScatterBegin       10 1.0 1.2419e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatSolve               2 1.0 5.8557e-01 1.0 6.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1077
MatLUFactorSym         1 1.0 7.7204e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
MatLUFactorNum         2 1.0 3.2121e+02 1.0 3.35e+11 1.0 0.0e+00 0.0e+00 0.0e+00 96100  0  0  0  96100  0  0  0  1044
MatAssemblyBegin       3 1.0 1.1565e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         3 1.0 3.6987e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 1.2978e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 5.1436e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               2 1.0 1.4995e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               2 1.0 3.2957e+02 1.0 3.36e+11 1.0 0.0e+00 0.0e+00 0.0e+00 99100  0  0  0  99100  0  0  0  1020
PCSetUp                2 1.0 3.2898e+02 1.0 3.35e+11 1.0 0.0e+00 0.0e+00 0.0e+00 99100  0  0  0  99100  0  0  0  1020
PCApply                2 1.0 5.8560e-01 1.0 6.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1077
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

                SNES     1              1         1324     0
      SNESLineSearch     1              1          856     0
              DMSNES     2              2         1312     0
              Vector    27             27      8261968     0
      Vector Scatter     3              3         1944     0
             MatMFFD     1              1          752     0
              Matrix     3              3   1999326700     0
    Distributed Mesh     3              3        14392     0
Star Forest Bipartite Graph     6              6         4928     0
     Discrete System     3              3         2520     0
           Index Set    11             11       519936     0
   IS L to G Mapping     2              2        51920     0
       Krylov Solver     2              2         2272     0
     DMKSP interface     1              1          640     0
      Preconditioner     2              2         1968     0
              Viewer     6              5         3800     0
========================================================================================================================
Average time to get PetscTime(): 2.61003e-08
#PETSc Option Table entries:
-cimplicit 1
-dt 1e-2
-fixed_dt
-ksp_monitor
-ksp_type preonly
-log_summary
-nts 2
-pc_type lu
-snes_mf
-snes_monitor
-total_matrix
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich
-----------------------------------------
Libraries compiled on Thu Sep 24 09:58:51 2015 on iMac27Nicolas.nifs.ac.jp 
Machine characteristics: Darwin-14.5.0-x86_64-i386-64bit
Using PETSc directory: /Users/timotheenicolas/PETSC/petsc-3.6.1
Using PETSc arch: arch-darwin-c-debug
-----------------------------------------

Using C compiler: /Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/bin/mpicc  -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/bin/mpif90  -fPIC  -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -g -O0  ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/include -I/Users/timotheenicolas/PETSC/petsc-3.6.1/include -I/Users/timotheenicolas/PETSC/petsc-3.6.1/include -I/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/include -I/opt/X11/include
-----------------------------------------

Using C linker: /Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/bin/mpicc
Using Fortran linker: /Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/bin/mpif90
Using libraries: -Wl,-rpath,/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/lib -L/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/lib -lpetsc -Wl,-rpath,/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/lib -L/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/lib -lflapack -lfblas -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -lssl -lcrypto -Wl,-rpath,/Library/Developer/CommandLineTools/usr/lib/clang/7.0.0/lib/darwin -L/Library/Developer/CommandLineTools/usr/lib/clang/7.0.0/lib/darwin -lmpifort -lgfortran -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin14.0.0/5.0.0 -L/usr/local/lib/gcc/x86_64-apple-darwin14.0.0/5.0.0 -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -Wl,-rpath,/Library/Developer/CommandLineTools/usr/bin/../lib/clang/7.0.0/lib/darwin -L/Library/Developer/CommandLineTools/usr/bin/../lib/clang/7.0.0/lib/darwin -lclang_rt.osx -Wl,-rpath,/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/lib -L/Users/timotheenicolas/PETSC/petsc-3.6.1/arch-darwin-c-debug/lib -ldl -lmpi -lpmpi -lSystem -Wl,-rpath,/Library/Developer/CommandLineTools/usr/bin/../lib/clang/7.0.0/lib/darwin -L/Library/Developer/CommandLineTools/usr/bin/../lib/clang/7.0.0/lib/darwin -lclang_rt.osx -ldl 
-----------------------------------------

WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
Option left: name:-ksp_monitor (no value)

