[petsc-users] -log_view hangs unexpectedly // how to optimize my kspsolve

Sun Jan 8 18:05:28 CST 2017

   Manuel,

     Ok there are two (actually 3) distinct things you need to deal with to get get any kind of performance out of this machine.

0) When running on the machine you cannot share it with other peoples jobs or you will get timings all over the place so run streams and benchmarks of your code when no one else has jobs running (The Unix top command helps)

1)  mpiexec is making bad decisions about process binding (what MPI processes are bound/assigned to what MPI cores).

From streams you have

np  speedup
1 1.0
2 1.95
3 0.57
4 0.6
5 2.79
6 2.8
7 2.74
8 2.67
9 2.55
10 2.68
.....

This is nuts. When going from 2 to 3 processes the performance goes WAY down. If the machine is empty and MPI did a good assignment of processes to cores the speedup should not go down for more cores it should just stagnate. 

So you need to find out how to do process binding with MPI see http://www.mcs.anl.gov/petsc/documentation/faq.html#computers and the links from there.  You can run the streams test with binding by for example make streams NPMAX=4 MPI_BINDING="--binding cpu:sockets". 

Once you have a good binding for your MPI make sure you always run the mpiexec with that binding when running your code.

2) Both preconditioners you have tried for your problem are terrible. With block Jacobi it went from 156 linear iterations (for 5 linear solves) to 546 iterations. With AMG it went from 1463!! iterations to 1760. These are huge numbers of iterations for algebraic multigrid! 

   For some reason AMG doesn't like your pressure matrix (even though AMG generally loves pressure matrices). What do you have for boundary conditions for your pressure? 

   Please run with -ksp_view_mat binary -ksp_view_rhs binary and then send the resulting file binaryoutput to petsc-maint at mcs.anl.gov and we'll see if we can figure out why AMG doesn't like it.

> On Jan 8, 2017, at 4:41 PM, Manuel Valera <mvalera at mail.sdsu.edu> wrote:
> 
> Ok, i just did the streams and log_summary tests, im attaching the output for each run, with NPMAX=4 and NPMAX=32, also -log_summary runs with -pc_type hypre and without it, with 1 and 2 cores, all of this with debugging turned off. 
> 
> The matrix is 200,000x200,000, full curvilinear 3d meshes, non-hydrostatic pressure solver.
> 
> Thanks a lot for your insight,
> 
> Manuel 
> 
> On Sun, Jan 8, 2017 at 9:48 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>   we need to see the -log_summary with hypre on 1 and 2 processes (with debugging tuned off) also we need to see the output from
> 
>    make stream NPMAX=4
> 
> run in the PETSc directory.
> 
> 
> 
> > On Jan 7, 2017, at 7:38 PM, Manuel Valera <mvalera at mail.sdsu.edu> wrote:
> >
> > Ok great, i tried those command line args and this is the result:
> >
> > when i use -pc_type gamg:
> >
> > [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > [1]PETSC ERROR: Petsc has generated inconsistent data
> > [1]PETSC ERROR: Have un-symmetric graph (apparently). Use '-pc_gamg_sym_graph true' to symetrize the graph or '-pc_gamg_threshold -1.0' if the matrix is structurally symmetric.
> > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > [1]PETSC ERROR: Petsc Release Version 3.7.4, unknown
> > [1]PETSC ERROR: ./ucmsMR on a arch-linux2-c-debug named ocean by valera Sat Jan  7 17:35:05 2017
> > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich --download-hdf5 --download-netcdf --download-hypre --download-metis --download-parmetis --download-trillinos
> > [1]PETSC ERROR: #1 smoothAggs() line 462 in /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/agg.c
> > [1]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/agg.c
> > [1]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/gamg.c
> > [1]PETSC ERROR: #4 PCSetUp() line 968 in /usr/dataC/home/valera/petsc/src/ksp/pc/interface/precon.c
> > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /usr/dataC/home/valera/petsc/src/ksp/ksp/interface/itfunc.c
> > application called MPI_Abort(comm=0x84000002, 77) - process 1
> >
> >
> > when i use -pc_type gamg and -pc_gamg_sym_graph true:
> >
> >  ------------------------------------------------------------------------
> > [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero
> > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> > [1]PETSC ERROR: ------------------------------------------------------------------------
> > [1]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
> > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> > [1]PETSC ERROR:       INSTEAD the line number of the start of the function
> > [1]PETSC ERROR:       is given.
> > [1]PETSC ERROR: [1] LAPACKgesvd line 42 /usr/dataC/home/valera/petsc/src/ksp/ksp/impls/gmres/gmreig.c
> > [1]PETSC ERROR: [1] KSPComputeExtremeSingularValues_GMRES line 24 /usr/dataC/home/valera/petsc/src/ksp/ksp/impls/gmres/gmreig.c
> > [1]PETSC ERROR: [1] KSPComputeExtremeSingularValues line 51 /usr/dataC/home/valera/petsc/src/ksp/ksp/interface/itfunc.c
> > [1]PETSC ERROR: [1] PCGAMGOptProlongator_AGG line 1187 /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/agg.c
> > [1]PETSC ERROR: [1] PCSetUp_GAMG line 472 /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/gamg.c
> > [1]PETSC ERROR: [1] PCSetUp line 930 /usr/dataC/home/valera/petsc/src/ksp/pc/interface/precon.c
> > [1]PETSC ERROR: [1] KSPSetUp line 305 /usr/dataC/home/valera/petsc/src/ksp/ksp/interface/itfunc.c
> > [0] PCGAMGOptProlongator_AGG line 1187 /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/agg.c
> > [0]PETSC ERROR: [0] PCSetUp_GAMG line 472 /usr/dataC/home/valera/petsc/src/ksp/pc/impls/gamg/gamg.c
> > [0]PETSC ERROR: [0] PCSetUp line 930 /usr/dataC/home/valera/petsc/src/ksp/pc/interface/precon.c
> > [0]PETSC ERROR: [0] KSPSetUp line 305 /usr/dataC/home/valera/petsc/src/ksp/ksp/interface/itfunc.c
> > [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> >
> > when i use -pc_type hypre it actually shows something different on -ksp_view :
> >
> > KSP Object: 2 MPI processes
> >   type: gcr
> >     GCR: restart = 30
> >     GCR: restarts performed = 37
> >   maximum iterations=10000, initial guess is zero
> >   tolerances:  relative=1e-14, absolute=1e-50, divergence=10000.
> >   right preconditioning
> >   using UNPRECONDITIONED norm type for convergence test
> > PC Object: 2 MPI processes
> >   type: hypre
> >     HYPRE BoomerAMG preconditioning
> >     HYPRE BoomerAMG: Cycle type V
> >     HYPRE BoomerAMG: Maximum number of levels 25
> >     HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
> >     HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.
> >     HYPRE BoomerAMG: Threshold for strong coupling 0.25
> >     HYPRE BoomerAMG: Interpolation truncation factor 0.
> >     HYPRE BoomerAMG: Interpolation: max elements per row 0
> >     HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
> >     HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
> >     HYPRE BoomerAMG: Maximum row sums 0.9
> >     HYPRE BoomerAMG: Sweeps down         1
> >     HYPRE BoomerAMG: Sweeps up           1
> >     HYPRE BoomerAMG: Sweeps on coarse    1
> >     HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
> >     HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
> >     HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
> >     HYPRE BoomerAMG: Relax weight  (all)      1.
> >     HYPRE BoomerAMG: Outer relax weight (all) 1.
> >     HYPRE BoomerAMG: Using CF-relaxation
> >     HYPRE BoomerAMG: Not using more complex smoothers.
> >     HYPRE BoomerAMG: Measure type        local
> >     HYPRE BoomerAMG: Coarsen type        Falgout
> >     HYPRE BoomerAMG: Interpolation type  classical
> >     HYPRE BoomerAMG: Using nodal coarsening (with HYPRE_BOOMERAMGSetNodal() 1
> >     HYPRE BoomerAMG: HYPRE_BoomerAMGSetInterpVecVariant() 1
> >   linear system matrix = precond matrix:
> >   Mat Object:   2 MPI processes
> >     type: mpiaij
> >     rows=200000, cols=200000
> >     total: nonzeros=3373340, allocated nonzeros=3373340
> >     total number of mallocs used during MatSetValues calls =0
> >       not using I-node (on process 0) routines
> >
> >
> > but still the timing is terrible.
> >
> >
> >
> >
> > On Sat, Jan 7, 2017 at 5:28 PM, Jed Brown <jed at jedbrown.org> wrote:
> > Manuel Valera <mvalera at mail.sdsu.edu> writes:
> >
> > > Awesome Matt and Jed,
> > >
> > > The GCR is used because the matrix is not invertible and because this was
> > > the algorithm that the previous library used,
> > >
> > > The Preconditioned im aiming to use is multigrid, i thought i configured
> > > the hypre-boomerAmg solver for this, but i agree in that it doesn't show in
> > > the log anywhere, how can i be sure is being used ? i sent -ksp_view log
> > > before in this thread
> >
> > Did you run with -pc_type hypre?
> >
> > > I had a problem with the matrix block sizes so i couldn't make the petsc
> > > native multigrid solver to work,
> >
> > What block sizes?  If the only variable is pressure, the block size
> > would be 1 (default).
> >
> > > This is a nonhidrostatic pressure solver, it is an elliptic problem so
> > > multigrid is a must,
> >
> > Yes, multigrid should work well.
> >
> 
> 
> <logsumm1hypre.txt><logsumm1jacobi.txt><logsumm2hypre.txt><logsumm2jacobi.txt><steams4.txt><steams32.txt>