[petsc-users] Null space for Poisson with zero Neumann BCs

Åsmund Ervik asmund.ervik at ntnu.no
Mon Nov 18 11:33:29 CST 2013


Good people of PETSc,

I have read in the list archives that attaching the null space to the operator is the
preferred way of handling the singular pressure Poisson equation that arises with
zero Neumann boundary conditions, and that setting the pressure value at one point
is not recommended.

I have tried this advice using the following code:

! Null space containing only the constant vector (no additional basis vectors)
call MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL_OBJECT,nullspace,ierr)
! Attach it to the operator so the Krylov solver can project it out
call MatSetNullSpace(A,nullspace,ierr)
! Remove the null-space component from the RHS to make the system consistent
call MatNullSpaceRemove(nullspace,rhs_vec,PETSC_NULL_OBJECT,ierr)
call MatNullSpaceDestroy(nullspace,ierr)

These calls come after assembling the matrix and the RHS vector, and before KSPSetOperators, KSPSolve, etc.
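
For context, here is a rough sketch of one time step of my solve sequence. The names ksp and sol_vec are just placeholders, and the MatStructure flag and KSPSetFromOptions call are shown only for illustration; the assembly code is omitted:

! Sketch of one time step (ksp, sol_vec and SAME_NONZERO_PATTERN are placeholders/illustrative)
call MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL_OBJECT,nullspace,ierr)
call MatSetNullSpace(A,nullspace,ierr)
call MatNullSpaceRemove(nullspace,rhs_vec,PETSC_NULL_OBJECT,ierr)
call MatNullSpaceDestroy(nullspace,ierr)
call KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN,ierr)
call KSPSetFromOptions(ksp,ierr)
call KSPSolve(ksp,rhs_vec,sol_vec,ierr)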

With this I get a better-looking pressure field, but the runtime is horrible,
roughly 10x longer than with my old "set the value at some point" method. I also
see that some time steps take a long time while others go quickly, even though the
Poisson equation should be essentially the same at every time step, since the
velocity field is changing slowly.

Below is the log_summary output for the "old method" (first) and the "set nullspace"
method (second). (And yes, I know I'm running an old version of "dev", but I've
been too swamped to switch over to the git setup.) Any suggestions for improving
the runtime?

Best regards,
Åsmund



---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./run on a arch-linux2-c-opt named ty with 1 processor, by asmunder Mon Nov 18 18:29:26 2013
Using Petsc Development HG revision: 3a41f882cfc717ec37b4c7f6b31f43b10211af66  HG Date: Sun Feb 17 13:07:58 2013 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           3.617e+00      1.00000   3.617e+00
Objects:              4.400e+01      1.00000   4.400e+01
Flops:                2.051e+07      1.00000   2.051e+07  2.051e+07
Flops/sec:            5.670e+06      1.00000   5.670e+06  5.670e+06
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       1.620e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 3.6173e+00 100.0%  2.0508e+07 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  1.610e+02  99.4%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                              --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

ThreadCommRunKer    1628 1.0 9.5713e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot               234 1.0 5.8794e-04 1.0 1.92e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   0  9  0  0  0  3260
VecDotNorm2          117 1.0 5.3787e-04 1.0 1.92e+06 1.0 0.0e+00 0.0e+00 1.2e+02  0  9  0  0 72   0  9  0  0 73  3564
VecNorm              177 1.0 4.3344e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCopy               60 1.0 1.6022e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               394 1.0 5.6601e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               30 1.0 1.1802e-04 1.0 2.46e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  2082
VecAXPBYCZ           234 1.0 1.3099e-03 1.0 3.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0 19  0  0  0   0 19  0  0  0  2927
VecWAXPY             234 1.0 1.1146e-03 1.0 1.92e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   0  9  0  0  0  1720
VecAssemblyBegin      30 1.0 4.7684e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        30 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult              264 1.0 5.7216e-03 1.0 1.07e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0 52  0  0  0   0 52  0  0  0  1866
MatConvert            30 1.0 1.8499e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin      30 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd        30 1.0 8.7762e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ           30 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp              30 1.0 1.9836e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  4   0  0  0  0  4     0
KSPSolve              30 1.0 2.1702e-01 1.0 2.05e+07 1.0 0.0e+00 0.0e+00 1.6e+02  6100  0  0 97   6100  0  0 98    95
PCSetUp               30 1.0 8.2594e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  2  0  0  0  2   2  0  0  0  2     0
PCApply              294 1.0 1.2338e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector    40             40      1370240     0
              Matrix     1              1       313964     0
       Krylov Solver     1              1         1288     0
      Preconditioner     1              1         1248     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
#PETSc Option Table entries:
-ksp_type bcgs
-log_summary
-pc_hypre_type boomeramg
-pc_type hypre
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Tue Feb 19 15:18:58 2013
Configure options: --with-threadcomm --with-pthreadclasses --with-openmp
--with-cc=icc --with-fc=ifort --with-debugging=0
--with-shared-libraries=1 --download-mpich --download-hypre
COPTFLAGS=-O3 FOPTFLAGS=-O3
-----------------------------------------
Libraries compiled on Tue Feb 19 15:18:58 2013 on ty
Machine characteristics: Linux-3.7.9-1-ARCH-x86_64-with-glibc2.2.5
Using PETSc directory: /opt/petsc/petsc-dev-ifort
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
 -fPIC -wd1572 -O3 -fopenmp  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler:
/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90  -fPIC  -O3
-fopenmp  ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths:
-I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
-I/opt/petsc/petsc-dev-ifort/include
-I/opt/petsc/petsc-dev-ifort/include
-I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
Using Fortran linker:
/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90
Using libraries:
-Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
-L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lpetsc
-Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
-L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lHYPRE
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
-Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2
-L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2 -lmpichcxx -llapack -lblas
-lX11 -lpthread -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl
-lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lirng -lipgo -ldecimal
-lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------








---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./run on a arch-linux2-c-opt named ty with 1 processor, by asmunder Mon Nov 18 18:22:38 2013
Using Petsc Development HG revision: 3a41f882cfc717ec37b4c7f6b31f43b10211af66  HG Date: Sun Feb 17 13:07:58 2013 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           9.378e+01      1.00000   9.378e+01
Objects:              7.400e+01      1.00000   7.400e+01
Flops:                1.630e+10      1.00000   1.630e+10  1.630e+10
Flops/sec:            1.738e+08      1.00000   1.738e+08  1.738e+08
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       3.004e+05      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 9.3784e+01 100.0%  1.6300e+10 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  3.004e+05 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                              --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

ThreadCommRunKer 1101474 1.0 6.9823e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0   7  0  0  0  0     0
VecDot            200206 1.0 4.7812e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 10  0  0  0  3430
VecDotNorm2       100103 1.0 4.3580e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 1.0e+05  0 10  0  0 33   0 10  0  0 33  3763
VecNorm           100163 1.0 2.1225e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCopy               60 1.0 1.6689e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet            200366 1.0 2.3949e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               30 1.0 1.1563e-04 1.0 2.46e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2125
VecAXPBYCZ        200206 1.0 1.1021e+00 1.0 3.28e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1 20  0  0  0   1 20  0  0  0  2976
VecWAXPY          200206 1.0 9.3084e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 10  0  0  0  1762
VecAssemblyBegin      30 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        30 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult           200236 1.0 4.2707e+00 1.0 8.10e+09 1.0 0.0e+00 0.0e+00 0.0e+00  5 50  0  0  0   5 50  0  0  0  1896
MatConvert            30 1.0 1.9252e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin      30 1.0 1.0967e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd        30 1.0 8.7786e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ           30 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp              30 1.0 1.8668e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              30 1.0 8.9967e+01 1.0 1.63e+10 1.0 0.0e+00 0.0e+00 3.0e+05 96100  0  0100  96100  0  0100   181
PCSetUp               30 1.0 8.3021e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply           200266 1.0 8.0928e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 86  0  0  0  0  86  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector    40             40      1370240     0
              Matrix     1              1       313964     0
   Matrix Null Space    30             30        18120     0
       Krylov Solver     1              1         1288     0
      Preconditioner     1              1         1248     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
#PETSc Option Table entries:
-ksp_type bcgs
-log_summary
-pc_hypre_type boomeramg
-pc_type hypre
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Tue Feb 19 15:18:58 2013
Configure options: --with-threadcomm --with-pthreadclasses --with-openmp
--with-cc=icc --with-fc=ifort --with-debugging=0
--with-shared-libraries=1 --download-mpich --download-hypre
COPTFLAGS=-O3 FOPTFLAGS=-O3
-----------------------------------------
Libraries compiled on Tue Feb 19 15:18:58 2013 on ty
Machine characteristics: Linux-3.7.9-1-ARCH-x86_64-with-glibc2.2.5
Using PETSc directory: /opt/petsc/petsc-dev-ifort
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
 -fPIC -wd1572 -O3 -fopenmp  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler:
/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90  -fPIC  -O3
-fopenmp  ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths:
-I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
-I/opt/petsc/petsc-dev-ifort/include
-I/opt/petsc/petsc-dev-ifort/include
-I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
-----------------------------------------

Using C linker: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
Using Fortran linker:
/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90
Using libraries:
-Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
-L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lpetsc
-Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
-L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lHYPRE
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
-Wl,-rpath,/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
-L/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
-Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2
-L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2 -lmpichcxx -llapack -lblas
-lX11 -lpthread -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl
-lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lirng -lipgo -ldecimal
-lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------




