From gcdiwan at gmail.com Tue May 1 06:15:02 2018 From: gcdiwan at gmail.com (Ganesh Diwan) Date: Tue, 1 May 2018 12:15:02 +0100 Subject: [petsc-users] understanding gmres output In-Reply-To: References: Message-ID: > > You may/likely won't get the same convergence with Matlab not due to GMRES > but due to differences in the ILU in Matlab and PETSc. Better to run with > Jacobi or no preconditioner if you wish to get consistent results. Not sure how to specify 'no' preconditioner but I tried ksp.setOperators(A,Id) where Id is the nxn identity matrix and this gave me consistent results with Matlab. I am working on the Helmholtz problems with finite elements and I use a preconditioning matrix which is supposed to help with the GMRES convergence. From the options set in the code in my previous post, I believe PETSc will compute the norm of preconditioned residual (default for left preconditioning) and use it as the stopping criterion. From the Matlab documentation, I understand that it will use the norm of the preconditioned residual (same as PETSc). Despite using the same parameters (number of restarts, maximum iterations, tolerance and the same preconditioning matrix), I see PETSc and Matlab diverge in terms of the GMRES iterations as I refine my finite element mesh (I get consistent results only up to a certain mesh refinement level). Could you comment? Thank you, Ganesh On Thu, Apr 26, 2018 at 7:24 PM, Smith, Barry F. wrote: > > * You may/likely won't get the same convergence with Matlab not due to > GMRES but due to differences in the ILU in Matlab and PETSc. Better to run > with Jacobi or no preconditioner if you wish to get consistent results. > > * Check carefully if Matlab using left or right preconditioning for > GMRES (PETSc uses left by default). > > > I see that the residual < norm(b)*rtol for the 4th iteration itself, I > am not sure then why GMRES continues for one more iteration. > > * Since PETSc is using left preconditioning it is norm(B*b)*rtol that > determines the convergence, not norm(b)*rtol. Where B is application of > preconditioner. > > * The residual history for GMRES includes everything through any restarts > so if it takes 90 total iterations (and the restart is 30) the residual > history should have 90 values. > > > In the above specific example, it took GMRES a total of 5 iterations to > converge. Does it mean the convergence was achieved without having to > restart? > > In most cases if you need to restart then you need a better > preconditioner. > > > Barry > > > > > > On Apr 26, 2018, at 9:46 AM, Ganesh Diwan wrote: > > > > Dear Petsc developers, > > > > I am trying to learn to use PETSc GMRES with petsc4py and also trying to > follow the codes in petsc/src/ksp/ksp/examples/tutorials. 
Here is the > Python code I use to read a linear system and solve it with GMRES in PETSc > (petsc4py) > > > > # gmrestest.py > > # usual imports > > import numpy as np > > import scipy.io as sio > > import scipy > > from sys import getrefcount > > from petsc4py import PETSc > > > > # read the matrix data > > mat_contents = sio.loadmat('data.mat') > > mat_A = mat_contents['A'] > > vec_b = mat_contents['b'] > > n = mat_contents['n'] > > vec_x = mat_contents['x'] > > mat_nnzA = mat_contents['nnzA'] > > > > # form the petsc matrices > > x = PETSc.Vec().createWithArray(vec_x) > > b = PETSc.Vec().createWithArray(vec_b) > > p1=mat_A.indptr > > p2=mat_A.indices > > p3=mat_A.data > > A = PETSc.Mat().createAIJ(size=mat_A.shape,csr=(p1,p2,p3)) > > A = scipy.transpose(A) # transpose the csr format as my matrix is column > major originally > > A.assemblyBegin() > > A.assemblyEnd() > > > > # solve > > ksp = PETSc.KSP() > > ksp.create(PETSc.COMM_WORLD) > > ksp.setType('gmres') > > ksp.setOperators(A) > > ksp.setFromOptions() > > rtol = 1e-4 > > ksp.setTolerances(rtol, 1e-5, 10000., 10000) > > ksp.view() > > ksp.setConvergenceHistory() > > ksp.solve(b, x) > > > > # print > > print 'iterations = ', ksp.getIterationNumber() > > print 'residual = ', '{:.2e}'.format(ksp.getResidualNorm())# %.2E > ksp.getResidualNorm() > > print 'norm(b)*rtol = ', '{:.2e}'.format(PETSc.Vec.norm(b)*rtol)# %.2E > PETSc.Vec.norm(b)*rtol > > print 'converge reason# = ', ksp.getConvergedReason() > > print 'residuals at each iter = ', ksp.getConvergenceHistory() > > # > > > > Here is the output from the above code for a linear system of dimension > 100 by 100. > > > > > > KSP Object: 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=0.0001, absolute=1e-05, divergence=10000. > > left preconditioning > > using DEFAULT norm type for convergence test > > PC Object: 1 MPI processes > > type: ilu > > PC has not been set up so information may be incomplete > > out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=100, cols=100 > > total: nonzeros=2704, allocated nonzeros=2704 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 25 nodes, limit used is 5 > > iterations = 5 > > residual = 1.38e-03 > > norm(b)*rtol = 1.99e-01 > > converge reason# = 2 > > residuals at each iter = [ 2.05677686e+01 4.97916031e+00 > 4.82888782e-01 1.16849581e-01 > > 8.87159777e-03 1.37992327e-03] > > > > Sorry if this sounds stupid, but I am trying to understand the output > from PETSc by contrasting it with Matlab GMRES. > > I see that the residual < norm(b)*rtol for the 4th iteration itself, I > am not sure then why GMRES continues for one more iteration. Secondly, how > does one get total number of iterations? For eg. let's say if it takes 3 > outer iterations each with the default restart of 30, then one would expect > the length of residual vector to be 150. In the above specific example, it > took GMRES a total of 5 iterations to converge. Does it mean the > convergence was achieved without having to restart? > > > > Thank you, Ganesh > > -------------- next part -------------- An HTML attachment was scrubbed... 
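A minimal petsc4py sketch of the two settings discussed above: selecting Jacobi or no preconditioner explicitly instead of passing an identity operator, and making GMRES test convergence on the true residual so that the stopping criterion is norm(b - A*x) < rtol*norm(b), directly comparable with Matlab. The small tridiagonal system is only a placeholder for the data.mat system in gmrestest.py; the equivalent command-line options are -pc_type none (or jacobi), -ksp_pc_side right and -ksp_norm_type unpreconditioned.

# pc_sketch.py - placeholder system, not the data.mat matrix from gmrestest.py
from petsc4py import PETSc

n = 100
A = PETSc.Mat().createAIJ([n, n])
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):           # simple SPD tridiagonal placeholder
    A.setValue(i, i, 2.0)
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemble()

b = A.createVecLeft()
b.set(1.0)
x = A.createVecRight()

ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setType('gmres')
ksp.setOperators(A)
ksp.getPC().setType('none')             # 'none' or 'jacobi' instead of an identity operator
ksp.setPCSide(PETSc.PC.Side.RIGHT)      # right preconditioning ...
ksp.setNormType(PETSc.KSP.NormType.UNPRECONDITIONED)  # ... lets GMRES monitor ||b - A x||
ksp.setTolerances(rtol=1e-4)
ksp.setFromOptions()
ksp.solve(b, x)
print('iterations =', ksp.getIterationNumber(),
      ' reason =', ksp.getConvergedReason())

With pc type none the left-preconditioned and true residuals coincide anyway; the norm-type and side settings matter once a nontrivial preconditioning matrix is used.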
URL: From knepley at gmail.com Tue May 1 07:13:22 2018 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 1 May 2018 08:13:22 -0400 Subject: [petsc-users] understanding gmres output In-Reply-To: References: Message-ID: On Tue, May 1, 2018 at 7:15 AM, Ganesh Diwan wrote: > You may/likely won't get the same convergence with Matlab not due to GMRES >> but due to differences in the ILU in Matlab and PETSc. Better to run with >> Jacobi or no preconditioner if you wish to get consistent results. > > > Not sure how to specify 'no' preconditioner > -pc_type none > but I tried ksp.setOperators(A,Id) where Id is the nxn identity matrix and > this gave me consistent results with Matlab. > > I am working on the Helmholtz problems with finite elements > Notoriously difficult. For example https://www.unige.ch/~gander/Preprints/HelmholtzReview.pdf > and I use a preconditioning matrix which is supposed to help with the > GMRES convergence. From the options set in the code in my previous post, I > believe PETSc will compute the norm of preconditioned residual (default for > left preconditioning) and use it as the stopping criterion. From the > Matlab documentation, I understand that it will use the norm of the > preconditioned residual (same as PETSc). Despite using the same > parameters (number of restarts, maximum iterations, tolerance and the same > preconditioning matrix), I see PETSc and Matlab diverge in terms of the > GMRES iterations as I refine my finite element mesh (I get consistent > results only up to a certain mesh refinement level). Could you comment? > The problem becomes more ill-conditioned as you refine, and thus more sensitive to small numerical errors such as round-off. This sensitivity results in different answers when equivalent computations are done in a different order. Moreover, its possible that PETSc and Matlab for using different orthogonalization routines. Classical Gram-Schmidt, modified Gram-Schmidt, and Householder will likely give different answers. Thanks, Matt > > Thank you, Ganesh > > On Thu, Apr 26, 2018 at 7:24 PM, Smith, Barry F. > wrote: > >> >> * You may/likely won't get the same convergence with Matlab not due to >> GMRES but due to differences in the ILU in Matlab and PETSc. Better to run >> with Jacobi or no preconditioner if you wish to get consistent results. >> >> * Check carefully if Matlab using left or right preconditioning for >> GMRES (PETSc uses left by default). >> >> > I see that the residual < norm(b)*rtol for the 4th iteration itself, I >> am not sure then why GMRES continues for one more iteration. >> >> * Since PETSc is using left preconditioning it is norm(B*b)*rtol that >> determines the convergence, not norm(b)*rtol. Where B is application of >> preconditioner. >> >> * The residual history for GMRES includes everything through any restarts >> so if it takes 90 total iterations (and the restart is 30) the residual >> history should have 90 values. >> >> > In the above specific example, it took GMRES a total of 5 iterations to >> converge. Does it mean the convergence was achieved without having to >> restart? >> >> In most cases if you need to restart then you need a better >> preconditioner. >> >> >> Barry >> >> >> >> >> > On Apr 26, 2018, at 9:46 AM, Ganesh Diwan wrote: >> > >> > Dear Petsc developers, >> > >> > I am trying to learn to use PETSc GMRES with petsc4py and also trying >> to follow the codes in petsc/src/ksp/ksp/examples/tutorials. 
Here is the >> Python code I use to read a linear system and solve it with GMRES in PETSc >> (petsc4py) >> > >> > # gmrestest.py >> > # usual imports >> > import numpy as np >> > import scipy.io as sio >> > import scipy >> > from sys import getrefcount >> > from petsc4py import PETSc >> > >> > # read the matrix data >> > mat_contents = sio.loadmat('data.mat') >> > mat_A = mat_contents['A'] >> > vec_b = mat_contents['b'] >> > n = mat_contents['n'] >> > vec_x = mat_contents['x'] >> > mat_nnzA = mat_contents['nnzA'] >> > >> > # form the petsc matrices >> > x = PETSc.Vec().createWithArray(vec_x) >> > b = PETSc.Vec().createWithArray(vec_b) >> > p1=mat_A.indptr >> > p2=mat_A.indices >> > p3=mat_A.data >> > A = PETSc.Mat().createAIJ(size=mat_A.shape,csr=(p1,p2,p3)) >> > A = scipy.transpose(A) # transpose the csr format as my matrix is >> column major originally >> > A.assemblyBegin() >> > A.assemblyEnd() >> > >> > # solve >> > ksp = PETSc.KSP() >> > ksp.create(PETSc.COMM_WORLD) >> > ksp.setType('gmres') >> > ksp.setOperators(A) >> > ksp.setFromOptions() >> > rtol = 1e-4 >> > ksp.setTolerances(rtol, 1e-5, 10000., 10000) >> > ksp.view() >> > ksp.setConvergenceHistory() >> > ksp.solve(b, x) >> > >> > # print >> > print 'iterations = ', ksp.getIterationNumber() >> > print 'residual = ', '{:.2e}'.format(ksp.getResidualNorm())# %.2E >> ksp.getResidualNorm() >> > print 'norm(b)*rtol = ', '{:.2e}'.format(PETSc.Vec.norm(b)*rtol)# >> %.2E PETSc.Vec.norm(b)*rtol >> > print 'converge reason# = ', ksp.getConvergedReason() >> > print 'residuals at each iter = ', ksp.getConvergenceHistory() >> > # >> > >> > Here is the output from the above code for a linear system of dimension >> 100 by 100. >> > >> > >> > KSP Object: 1 MPI processes >> > type: gmres >> > restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> > happy breakdown tolerance 1e-30 >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=0.0001, absolute=1e-05, divergence=10000. >> > left preconditioning >> > using DEFAULT norm type for convergence test >> > PC Object: 1 MPI processes >> > type: ilu >> > PC has not been set up so information may be incomplete >> > out-of-place factorization >> > 0 levels of fill >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > linear system matrix = precond matrix: >> > Mat Object: 1 MPI processes >> > type: seqaij >> > rows=100, cols=100 >> > total: nonzeros=2704, allocated nonzeros=2704 >> > total number of mallocs used during MatSetValues calls =0 >> > using I-node routines: found 25 nodes, limit used is 5 >> > iterations = 5 >> > residual = 1.38e-03 >> > norm(b)*rtol = 1.99e-01 >> > converge reason# = 2 >> > residuals at each iter = [ 2.05677686e+01 4.97916031e+00 >> 4.82888782e-01 1.16849581e-01 >> > 8.87159777e-03 1.37992327e-03] >> > >> > Sorry if this sounds stupid, but I am trying to understand the output >> from PETSc by contrasting it with Matlab GMRES. >> > I see that the residual < norm(b)*rtol for the 4th iteration itself, I >> am not sure then why GMRES continues for one more iteration. Secondly, how >> does one get total number of iterations? For eg. let's say if it takes 3 >> outer iterations each with the default restart of 30, then one would expect >> the length of residual vector to be 150. In the above specific example, it >> took GMRES a total of 5 iterations to converge. Does it mean the >> convergence was achieved without having to restart? 
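On the orthogonalization point above: the ksp.view() output in this thread shows GMRES using classical (unmodified) Gram-Schmidt with no iterative refinement, and the routine Matlab uses may differ. A short sketch, assuming a ksp object set up as in gmrestest.py, of switching PETSc's GMRES to modified Gram-Schmidt (or adding refinement to classical Gram-Schmidt) through the options database:

# set these before ksp.setFromOptions() so they are picked up
from petsc4py import PETSc

opts = PETSc.Options()
opts['ksp_gmres_modifiedgramschmidt'] = 'true'
# alternatively, keep classical Gram-Schmidt but add iterative refinement:
# opts['ksp_gmres_cgs_refinement_type'] = 'refine_ifneeded'

ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setType('gmres')
ksp.setFromOptions()   # ksp.view() / -ksp_view will then report the orthogonalization in use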
>> > >> > Thank you, Ganesh >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhbaghaei at mail.sjtu.edu.cn Tue May 1 11:41:33 2018 From: mhbaghaei at mail.sjtu.edu.cn (Mohammad Hassan Baghaei) Date: Wed, 2 May 2018 00:41:33 +0800 (CST) Subject: [petsc-users] TSMonitor: TSConvergedReason Message-ID: <000801d3e16b$3feaf770$bfc0e650$@mail.sjtu.edu.cn> Hello Using the TSMonitorSet , I am trying to monitor the time-stepping solver. I use the TSSetConvergedReason() to set a limit on norm error. I found that even though the error is below the limit at the current step, the solver continues to solve that step. Then, in the next step, it will be converged. I want to know whether there is a way to monitor the solver before the stepping. Sorry if my explanation is not clear. I would inform you, if need more information. Thanks for your time considering. Amir -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 1 13:20:12 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 1 May 2018 18:20:12 +0000 Subject: [petsc-users] TSMonitor: TSConvergedReason In-Reply-To: <000801d3e16b$3feaf770$bfc0e650$@mail.sjtu.edu.cn> References: <000801d3e16b$3feaf770$bfc0e650$@mail.sjtu.edu.cn> Message-ID: TSSetPreStep() > On May 1, 2018, at 11:41 AM, Mohammad Hassan Baghaei wrote: > > Hello > Using the TSMonitorSet , I am trying to monitor the time-stepping solver. I use the TSSetConvergedReason() to set a limit on norm error. I found that even though the error is below the limit at the current step, the solver continues to solve that step. Then, in the next step, it will be converged. I want to know whether there is a way to monitor the solver before the stepping. Sorry if my explanation is not clear. I would inform you, if need more information. Thanks for your time considering. > Amir From fande.kong at inl.gov Tue May 1 15:10:34 2018 From: fande.kong at inl.gov (Kong, Fande) Date: Tue, 1 May 2018 14:10:34 -0600 Subject: [petsc-users] Could not determine how to create a shared library! Message-ID: Hi All, I can build a static petsc library on a supercomputer, but could not do the same thing with " --with-shared-libraries=1". The log file is attached. Fande, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.zip Type: application/zip Size: 27161 bytes Desc: not available URL: From balay at mcs.anl.gov Tue May 1 15:22:50 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 1 May 2018 15:22:50 -0500 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: This is theta.. Try: using --LDFLAGS=-dynamic option [as listed in config/examples/arch-cray-xc40-knl-opt.py] Satish On Tue, 1 May 2018, Kong, Fande wrote: > Hi All, > > I can build a static petsc library on a supercomputer, but could not do the > same thing with " --with-shared-libraries=1". > > The log file is attached. 
> > > Fande, > From mvalera-w at sdsu.edu Wed May 2 15:19:11 2018 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Wed, 2 May 2018 13:19:11 -0700 Subject: [petsc-users] Help my solver scale Message-ID: Hello guys, We are working in writing a paper about the parallelization of our model using PETSc, which is very exciting since is the first time we see our model scaling, but so far i feel my results for the laplacian solver could be much better, For example, using CG/Multigrid i get less than 20% of efficiency after 16 cores, up to 64 cores where i get only 8% efficiency, I am defining efficiency as speedup over number of cores, and speedup as twall_n/twall_1 where n is the number of cores, i think that's pretty standard, The ksp_view for a distributed solve looks like this: KSP Object: 16 MPI processes type: cg maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 16 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical Using nodal coarsening (with HYPRE_BOOMERAMGSetNodal() 1 HYPRE_BoomerAMGSetInterpVecVariant() 1 linear system matrix = precond matrix: Mat Object: 16 MPI processes type: mpiaij rows=213120, cols=213120 total: nonzeros=3934732, allocated nonzeros=8098560 total number of mallocs used during MatSetValues calls =0 has attached near null space And the log_view for the same case would be: ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./gcmSeamount on a timings named ocean with 16 processors, by valera Wed May 2 13:18:21 2018 Using Petsc Development GIT revision: v3.9-163-gbe3efd4 GIT Date: 2018-04-16 10:45:40 -0500 Max Max/Min Avg Total Time (sec): 1.355e+00 1.00004 1.355e+00 Objects: 4.140e+02 1.00000 4.140e+02 Flop: 7.582e+05 1.09916 7.397e+05 1.183e+07 Flop/sec: 5.594e+05 1.09918 5.458e+05 8.732e+06 MPI Messages: 1.588e+03 1.19167 1.468e+03 2.348e+04 MPI Message Lengths: 7.112e+07 1.37899 4.462e+04 1.048e+09 MPI Reductions: 4.760e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3553e+00 100.0% 1.1835e+07 100.0% 2.348e+04 100.0% 4.462e+04 100.0% 4.670e+02 98.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSidedF 2 1.0 9.1908e-03 2.2 0.00e+00 0.0 3.6e+02 1.6e+05 0.0e+00 1 0 2 6 0 1 0 2 6 0 0 VecTDot 1 1.0 6.4135e-05 1.1 2.66e+04 1.0 0.0e+00 0.0e+00 1.0e+00 0 4 0 0 0 0 4 0 0 0 6646 VecNorm 1 1.0 1.4589e-0347.1 2.66e+04 1.0 0.0e+00 0.0e+00 1.0e+00 0 4 0 0 0 0 4 0 0 0 292 VecScale 14 1.0 3.6144e-04 1.3 4.80e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 62 0 0 0 0 62 0 0 0 20346 VecCopy 7 1.0 1.0152e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 83 1.0 3.0013e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecPointwiseMult 12 1.0 2.7585e-04 1.4 2.43e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 31 0 0 0 0 31 0 0 0 13153 VecScatterBegin 111 1.0 2.5293e-02 1.8 0.00e+00 0.0 9.5e+03 3.4e+04 1.9e+01 1 0 40 31 4 1 0 40 31 4 0 VecScatterEnd 92 1.0 4.8771e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecNormalize 1 1.0 2.6941e-05 2.3 1.33e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 7911 MatConvert 1 1.0 1.1009e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 1 0 0 0 1 1 0 0 0 1 0 MatAssemblyBegin 3 1.0 2.8401e-02 1.0 0.00e+00 0.0 3.6e+02 1.6e+05 0.0e+00 2 0 2 6 0 2 0 2 6 0 0 MatAssemblyEnd 3 1.0 2.9033e-02 1.0 0.00e+00 0.0 6.0e+01 1.2e+04 2.0e+01 2 0 0 0 4 2 0 0 0 4 0 MatGetRowIJ 2 1.0 1.9073e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 1 1.0 3.0398e-04 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 1 1.0 4.7994e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 2.4850e-03 2.0 5.33e+04 1.0 0.0e+00 0.0e+00 2.0e+00 0 7 0 0 0 0 7 0 0 0 343 PCSetUp 2 1.0 2.2953e-02 1.0 1.33e+04 1.0 0.0e+00 0.0e+00 6.0e+00 2 2 0 0 1 2 2 0 0 1 9 PCApply 1 1.0 1.3151e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Vector 172 170 70736264 0. Matrix 5 5 7125104 0. Matrix Null Space 1 1 608 0. Distributed Mesh 18 16 84096 0. Index Set 73 73 10022204 0. IS L to G Mapping 18 16 1180828 0. Star Forest Graph 36 32 27968 0. Discrete System 18 16 15040 0. Vec Scatter 67 64 38240520 0. Krylov Solver 2 2 2504 0. Preconditioner 2 2 2528 0. Viewer 2 1 848 0. ======================================================================================================================== Average time to get PetscTime(): 0. 
Average time for MPI_Barrier(): 2.38419e-06 Average time for zero size MPI_Send(): 2.11596e-06 #PETSc Option Table entries: -da_processors_z 1 -ksp_type cg -ksp_view -log_view -pc_hypre_boomeramg_nodal_coarsen 1 -pc_hypre_boomeramg_vec_interp_variant 1 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=timings --with-mpi-dir=/usr/lib64/openmpi --with-blaslapack-dir=/usr/lib64 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-shared-libraries=1 --download-hypre --with-debugging=no --with-batch -known-mpi-shared-libraries=0 --known-64-bit-blas-indices=0 ----------------------------------------- Libraries compiled on 2018-04-27 21:13:11 on ocean Machine characteristics: Linux-3.10.0-327.36.3.el7.x86_64-x86_64-with-centos-7.2.1511-Core Using PETSc directory: /home/valera/petsc Using PETSc arch: timings ----------------------------------------- Using C compiler: /usr/lib64/openmpi/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -O3 Using Fortran compiler: /usr/lib64/openmpi/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ----------------------------------------- Using include paths: -I/home/valera/petsc/include -I/home/valera/petsc/timings/include -I/usr/lib64/openmpi/include ----------------------------------------- Using C linker: /usr/lib64/openmpi/bin/mpicc Using Fortran linker: /usr/lib64/openmpi/bin/mpif90 Using libraries: -Wl,-rpath,/home/valera/petsc/timings/lib -L/home/valera/petsc/timings/lib -lpetsc -Wl,-rpath,/home/valera/petsc/timings/lib -L/home/valera/petsc/timings/lib -Wl,-rpath,/usr/lib64 -L/usr/lib64 -Wl,-rpath,/usr/lib64/openmpi/lib -L/usr/lib64/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lHYPRE -llapack -lblas -lm -lstdc++ -ldl -lmpi_usempi -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl What do you see wrong here? what options could i try to improve my solver scaling? Thanks so much, Manuel -------------- next part -------------- An HTML attachment was scrubbed... 
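One way to make the -log_view output above easier to interpret is to wrap assembly, preconditioner setup, and the solve in user-defined logging stages rather than leaving everything in the Main Stage; in the Fortran model the corresponding calls would be PetscLogStageRegister/PetscLogStagePush/PetscLogStagePop. A small petsc4py sketch of the idea, with a placeholder system rather than the gcmSeamount matrix:

# stage_sketch.py - wrap setup and solve in their own -log_view stages
from petsc4py import PETSc

n = 1000
A = PETSc.Mat().createAIJ([n, n])
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):           # placeholder tridiagonal system
    A.setValue(i, i, 2.0)
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemble()

b = A.createVecLeft()
b.set(1.0)
x = A.createVecRight()

ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setType('cg')
ksp.setOperators(A)
ksp.setFromOptions()

setup_stage = PETSc.Log.Stage('Setup')
solve_stage = PETSc.Log.Stage('Solve')

setup_stage.push()
ksp.setUp()          # preconditioner/AMG setup is logged under 'Setup'
setup_stage.pop()

solve_stage.push()
ksp.solve(b, x)      # Krylov iterations are logged under 'Solve'
solve_stage.pop()

Running with -log_view then reports times, flops and messages per stage, so the scaling of KSPSolve can be read off separately from the setup cost.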
URL: From knepley at gmail.com Wed May 2 15:24:32 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 May 2018 16:24:32 -0400 Subject: [petsc-users] Help my solver scale In-Reply-To: References: Message-ID: On Wed, May 2, 2018 at 4:19 PM, Manuel Valera wrote: > Hello guys, > > We are working in writing a paper about the parallelization of our model > using PETSc, which is very exciting since is the first time we see our > model scaling, but so far i feel my results for the laplacian solver could > be much better, > > For example, using CG/Multigrid i get less than 20% of efficiency after 16 > cores, up to 64 cores where i get only 8% efficiency, > > I am defining efficiency as speedup over number of cores, and speedup as > twall_n/twall_1 where n is the number of cores, i think that's pretty > standard, > This is the first big problem. Not all "cores" are created equal. First, you need to run streams in the exact same configuration, so that you can see how much speedup to expect. The program is here cd src/benchmarks/streams and make streams will run it. You will probably need to submit the program yourself to the batch system to get the same configuration as your solver. This really matter because 16 cores on one nodes probably only has the potential for 5x speedup, so that your 20% is misguided. Thanks, Matt > The ksp_view for a distributed solve looks like this: > > KSP Object: 16 MPI processes > type: cg > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 16 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > Cycle type V > Maximum number of levels 25 > Maximum number of iterations PER hypre call 1 > Convergence tolerance PER hypre call 0. > Threshold for strong coupling 0.25 > Interpolation truncation factor 0. > Interpolation: max elements per row 0 > Number of levels of aggressive coarsening 0 > Number of paths for aggressive coarsening 1 > Maximum row sums 0.9 > Sweeps down 1 > Sweeps up 1 > Sweeps on coarse 1 > Relax down symmetric-SOR/Jacobi > Relax up symmetric-SOR/Jacobi > Relax on coarse Gaussian-elimination > Relax weight (all) 1. > Outer relax weight (all) 1. > Using CF-relaxation > Not using more complex smoothers. > Measure type local > Coarsen type Falgout > Interpolation type classical > Using nodal coarsening (with HYPRE_BOOMERAMGSetNodal() 1 > HYPRE_BoomerAMGSetInterpVecVariant() 1 > linear system matrix = precond matrix: > Mat Object: 16 MPI processes > type: mpiaij > rows=213120, cols=213120 > total: nonzeros=3934732, allocated nonzeros=8098560 > total number of mallocs used during MatSetValues calls =0 > has attached near null space > > > And the log_view for the same case would be: > > ************************************************************ > ************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r > -fCourier9' to print this document *** > ************************************************************ > ************************************************************ > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > ./gcmSeamount on a timings named ocean with 16 processors, by valera Wed > May 2 13:18:21 2018 > Using Petsc Development GIT revision: v3.9-163-gbe3efd4 GIT Date: > 2018-04-16 10:45:40 -0500 > > Max Max/Min Avg Total > Time (sec): 1.355e+00 1.00004 1.355e+00 > Objects: 4.140e+02 1.00000 4.140e+02 > Flop: 7.582e+05 1.09916 7.397e+05 1.183e+07 > Flop/sec: 5.594e+05 1.09918 5.458e+05 8.732e+06 > MPI Messages: 1.588e+03 1.19167 1.468e+03 2.348e+04 > MPI Message Lengths: 7.112e+07 1.37899 4.462e+04 1.048e+09 > MPI Reductions: 4.760e+02 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flop > and VecAXPY() for complex vectors of length N > --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.3553e+00 100.0% 1.1835e+07 100.0% 2.348e+04 > 100.0% 4.462e+04 100.0% 4.670e+02 98.1% > > ------------------------------------------------------------ > ------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > ------------------------------------------------------------ > ------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------ > ------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > BuildTwoSidedF 2 1.0 9.1908e-03 2.2 0.00e+00 0.0 3.6e+02 1.6e+05 > 0.0e+00 1 0 2 6 0 1 0 2 6 0 0 > VecTDot 1 1.0 6.4135e-05 1.1 2.66e+04 1.0 0.0e+00 0.0e+00 > 1.0e+00 0 4 0 0 0 0 4 0 0 0 6646 > VecNorm 1 1.0 1.4589e-0347.1 2.66e+04 1.0 0.0e+00 0.0e+00 > 1.0e+00 0 4 0 0 0 0 4 0 0 0 292 > VecScale 14 1.0 3.6144e-04 1.3 4.80e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 62 0 0 0 0 62 0 0 0 20346 > VecCopy 7 1.0 1.0152e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 83 1.0 3.0013e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecPointwiseMult 12 1.0 2.7585e-04 1.4 2.43e+05 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 31 0 0 0 0 31 0 0 0 13153 > VecScatterBegin 111 1.0 2.5293e-02 1.8 0.00e+00 0.0 9.5e+03 3.4e+04 > 1.9e+01 1 0 40 31 4 1 0 40 31 4 0 > VecScatterEnd 92 1.0 4.8771e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 > VecNormalize 1 1.0 2.6941e-05 2.3 1.33e+04 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 7911 > MatConvert 1 1.0 1.1009e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 4.0e+00 1 0 0 0 1 1 0 0 0 1 0 > MatAssemblyBegin 3 1.0 2.8401e-02 1.0 0.00e+00 0.0 3.6e+02 1.6e+05 > 0.0e+00 2 0 2 6 0 2 0 2 6 0 0 > MatAssemblyEnd 3 1.0 2.9033e-02 1.0 0.00e+00 0.0 6.0e+01 1.2e+04 > 2.0e+01 2 0 0 0 4 2 0 0 0 4 0 > MatGetRowIJ 2 1.0 1.9073e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatView 1 1.0 3.0398e-04 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 1 1.0 4.7994e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.4850e-03 2.0 5.33e+04 1.0 0.0e+00 0.0e+00 > 2.0e+00 0 7 0 0 0 0 7 0 0 0 343 > PCSetUp 2 1.0 2.2953e-02 1.0 1.33e+04 1.0 0.0e+00 0.0e+00 > 6.0e+00 2 2 0 0 1 2 2 0 0 1 9 > PCApply 1 1.0 1.3151e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------ > ------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Vector 172 170 70736264 0. > Matrix 5 5 7125104 0. > Matrix Null Space 1 1 608 0. > Distributed Mesh 18 16 84096 0. > Index Set 73 73 10022204 0. > IS L to G Mapping 18 16 1180828 0. > Star Forest Graph 36 32 27968 0. > Discrete System 18 16 15040 0. > Vec Scatter 67 64 38240520 0. > Krylov Solver 2 2 2504 0. > Preconditioner 2 2 2528 0. > Viewer 2 1 848 0. > ============================================================ > ============================================================ > Average time to get PetscTime(): 0. 
> Average time for MPI_Barrier(): 2.38419e-06 > Average time for zero size MPI_Send(): 2.11596e-06 > #PETSc Option Table entries: > -da_processors_z 1 > -ksp_type cg > -ksp_view > -log_view > -pc_hypre_boomeramg_nodal_coarsen 1 > -pc_hypre_boomeramg_vec_interp_variant 1 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --known-level1-dcache-size=32768 --known-level1-dcache-linesize=64 > --known-level1-dcache-assoc=8 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 > --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 > --known-mpi-long-double=1 --known-mpi-int64_t=1 > --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 > PETSC_ARCH=timings --with-mpi-dir=/usr/lib64/openmpi > --with-blaslapack-dir=/usr/lib64 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 > FOPTFLAGS=-O3 --with-shared-libraries=1 --download-hypre > --with-debugging=no --with-batch -known-mpi-shared-libraries=0 > --known-64-bit-blas-indices=0 > ----------------------------------------- > Libraries compiled on 2018-04-27 21:13:11 on ocean > Machine characteristics: Linux-3.10.0-327.36.3.el7.x86_ > 64-x86_64-with-centos-7.2.1511-Core > Using PETSc directory: /home/valera/petsc > Using PETSc arch: timings > ----------------------------------------- > > Using C compiler: /usr/lib64/openmpi/bin/mpicc -fPIC -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector > -fvisibility=hidden -O3 > Using Fortran compiler: /usr/lib64/openmpi/bin/mpif90 -fPIC -Wall > -ffree-line-length-0 -Wno-unused-dummy-argument -O3 > ----------------------------------------- > > Using include paths: -I/home/valera/petsc/include > -I/home/valera/petsc/timings/include -I/usr/lib64/openmpi/include > ----------------------------------------- > > Using C linker: /usr/lib64/openmpi/bin/mpicc > Using Fortran linker: /usr/lib64/openmpi/bin/mpif90 > Using libraries: -Wl,-rpath,/home/valera/petsc/timings/lib > -L/home/valera/petsc/timings/lib -lpetsc -Wl,-rpath,/home/valera/petsc/timings/lib > -L/home/valera/petsc/timings/lib -Wl,-rpath,/usr/lib64 -L/usr/lib64 > -Wl,-rpath,/usr/lib64/openmpi/lib -L/usr/lib64/openmpi/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lHYPRE -llapack -lblas -lm > -lstdc++ -ldl -lmpi_usempi -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm > -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > > > > > What do you see wrong here? what options could i try to improve my solver > scaling? > > Thanks so much, > > Manuel > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
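For experimenting outside the Fortran model, the solver from the option table in the quoted log can be reproduced from petsc4py by setting the same options before KSPSetFromOptions. This sketch assumes a PETSc build with --download-hypre, as in the configure line shown earlier, and uses a trivial placeholder matrix:

# boomeramg_sketch.py - same CG + hypre options as in the -log_view above
from petsc4py import PETSc

opts = PETSc.Options()
opts['ksp_type'] = 'cg'
opts['pc_type'] = 'hypre'
opts['pc_hypre_type'] = 'boomeramg'
# options from the log above, left commented here because they rely on the
# near null space attached to the real system matrix, which this placeholder lacks:
# opts['pc_hypre_boomeramg_nodal_coarsen'] = '1'
# opts['pc_hypre_boomeramg_vec_interp_variant'] = '1'

n = 1000
A = PETSc.Mat().createAIJ([n, n])
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 1.0 + i)   # placeholder SPD diagonal matrix
A.assemble()

b = A.createVecLeft()
b.set(1.0)
x = A.createVecRight()

ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setOperators(A)
ksp.setFromOptions()            # picks up the cg + boomeramg options set above
ksp.solve(b, x)
print('iterations =', ksp.getIterationNumber())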
URL: From mvalera-w at sdsu.edu Wed May 2 16:59:46 2018 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Wed, 2 May 2018 14:59:46 -0700 Subject: [petsc-users] Help my solver scale In-Reply-To: References: Message-ID: Thanks Matt, I just remade the streams tests on the machine and i got the following table, my question would be, is this the maximum speedup i may get on my machine, and thus should compare the efficiency and scaling tests up to this figure instead? I have 20-cores nodes so this was made over 4 nodes, Thanks, np speedup 1 1.0 2 1.82 3 2.43 4 2.79 5 2.99 6 3.13 7 3.13 8 3.19 9 3.17 10 3.17 11 3.44 12 3.81 13 4.13 14 4.43 15 4.72 16 5.05 17 5.4 18 5.69 19 5.99 20 6.29 21 6.66 22 6.96 23 7.26 24 7.6 25 7.86 26 8.25 27 8.54 28 8.88 29 9.2 30 9.44 31 9.84 32 10.06 33 10.43 34 10.72 35 11.11 36 11.42 37 11.75 38 12.07 39 12.27 40 12.65 41 12.94 42 13.34 43 13.6 44 13.83 45 14.27 46 14.56 47 14.84 48 15.24 49 15.49 50 15.85 51 15.87 52 16.35 53 16.76 54 17.02 55 17.17 56 17.7 57 17.9 58 18.28 59 18.56 60 18.82 61 19.37 62 19.62 63 19.88 64 20.21 On Wed, May 2, 2018 at 1:24 PM, Matthew Knepley wrote: > On Wed, May 2, 2018 at 4:19 PM, Manuel Valera wrote: > >> Hello guys, >> >> We are working in writing a paper about the parallelization of our model >> using PETSc, which is very exciting since is the first time we see our >> model scaling, but so far i feel my results for the laplacian solver could >> be much better, >> >> For example, using CG/Multigrid i get less than 20% of efficiency after >> 16 cores, up to 64 cores where i get only 8% efficiency, >> >> I am defining efficiency as speedup over number of cores, and speedup as >> twall_n/twall_1 where n is the number of cores, i think that's pretty >> standard, >> > > This is the first big problem. Not all "cores" are created equal. First, > you need to run streams in the exact same configuration, so that you can see > how much speedup to expect. The program is here > > cd src/benchmarks/streams > > and > > make streams > > will run it. You will probably need to submit the program yourself to the > batch system to get the same configuration as your solver. > > This really matter because 16 cores on one nodes probably only has the > potential for 5x speedup, so that your 20% is misguided. > > Thanks, > > Matt > > >> The ksp_view for a distributed solve looks like this: >> >> KSP Object: 16 MPI processes >> type: cg >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 16 MPI processes >> type: hypre >> HYPRE BoomerAMG preconditioning >> Cycle type V >> Maximum number of levels 25 >> Maximum number of iterations PER hypre call 1 >> Convergence tolerance PER hypre call 0. >> Threshold for strong coupling 0.25 >> Interpolation truncation factor 0. >> Interpolation: max elements per row 0 >> Number of levels of aggressive coarsening 0 >> Number of paths for aggressive coarsening 1 >> Maximum row sums 0.9 >> Sweeps down 1 >> Sweeps up 1 >> Sweeps on coarse 1 >> Relax down symmetric-SOR/Jacobi >> Relax up symmetric-SOR/Jacobi >> Relax on coarse Gaussian-elimination >> Relax weight (all) 1. >> Outer relax weight (all) 1. >> Using CF-relaxation >> Not using more complex smoothers. 
>> Measure type local >> Coarsen type Falgout >> Interpolation type classical >> Using nodal coarsening (with HYPRE_BOOMERAMGSetNodal() 1 >> HYPRE_BoomerAMGSetInterpVecVariant() 1 >> linear system matrix = precond matrix: >> Mat Object: 16 MPI processes >> type: mpiaij >> rows=213120, cols=213120 >> total: nonzeros=3934732, allocated nonzeros=8098560 >> total number of mallocs used during MatSetValues calls =0 >> has attached near null space >> >> >> And the log_view for the same case would be: >> >> ************************************************************ >> ************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r >> -fCourier9' to print this document *** >> ************************************************************ >> ************************************************************ >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./gcmSeamount on a timings named ocean with 16 processors, by valera Wed >> May 2 13:18:21 2018 >> Using Petsc Development GIT revision: v3.9-163-gbe3efd4 GIT Date: >> 2018-04-16 10:45:40 -0500 >> >> Max Max/Min Avg Total >> Time (sec): 1.355e+00 1.00004 1.355e+00 >> Objects: 4.140e+02 1.00000 4.140e+02 >> Flop: 7.582e+05 1.09916 7.397e+05 1.183e+07 >> Flop/sec: 5.594e+05 1.09918 5.458e+05 8.732e+06 >> MPI Messages: 1.588e+03 1.19167 1.468e+03 2.348e+04 >> MPI Message Lengths: 7.112e+07 1.37899 4.462e+04 1.048e+09 >> MPI Reductions: 4.760e+02 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flop >> and VecAXPY() for complex vectors of length N >> --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 1.3553e+00 100.0% 1.1835e+07 100.0% 2.348e+04 >> 100.0% 4.462e+04 100.0% 4.670e+02 98.1% >> >> ------------------------------------------------------------ >> ------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this >> phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >> over all processors) >> ------------------------------------------------------------ >> ------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------ >> ------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSidedF 2 1.0 9.1908e-03 2.2 0.00e+00 0.0 3.6e+02 1.6e+05 >> 0.0e+00 1 0 2 6 0 1 0 2 6 0 0 >> VecTDot 1 1.0 6.4135e-05 1.1 2.66e+04 1.0 0.0e+00 0.0e+00 >> 1.0e+00 0 4 0 0 0 0 4 0 0 0 6646 >> VecNorm 1 1.0 1.4589e-0347.1 2.66e+04 1.0 0.0e+00 0.0e+00 >> 1.0e+00 0 4 0 0 0 0 4 0 0 0 292 >> VecScale 14 1.0 3.6144e-04 1.3 4.80e+05 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 62 0 0 0 0 62 0 0 0 20346 >> VecCopy 7 1.0 1.0152e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 83 1.0 3.0013e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> VecPointwiseMult 12 1.0 2.7585e-04 1.4 2.43e+05 1.2 0.0e+00 0.0e+00 >> 0.0e+00 0 31 0 0 0 0 31 0 0 0 13153 >> VecScatterBegin 111 1.0 2.5293e-02 1.8 0.00e+00 0.0 9.5e+03 3.4e+04 >> 1.9e+01 1 0 40 31 4 1 0 40 31 4 0 >> VecScatterEnd 92 1.0 4.8771e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 >> VecNormalize 1 1.0 2.6941e-05 2.3 1.33e+04 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 2 0 0 0 7911 >> MatConvert 1 1.0 1.1009e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 4.0e+00 1 0 0 0 1 1 0 0 0 1 0 >> MatAssemblyBegin 3 1.0 2.8401e-02 1.0 0.00e+00 0.0 3.6e+02 1.6e+05 >> 0.0e+00 2 0 2 6 0 2 0 2 6 0 0 >> MatAssemblyEnd 3 1.0 2.9033e-02 1.0 0.00e+00 0.0 6.0e+01 1.2e+04 >> 2.0e+01 2 0 0 0 4 2 0 0 0 4 0 >> MatGetRowIJ 2 1.0 1.9073e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatView 1 1.0 3.0398e-04 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 1 1.0 4.7994e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 1 1.0 2.4850e-03 2.0 5.33e+04 1.0 0.0e+00 0.0e+00 >> 2.0e+00 0 7 0 0 0 0 7 0 0 0 343 >> PCSetUp 2 1.0 2.2953e-02 1.0 1.33e+04 1.0 0.0e+00 0.0e+00 >> 6.0e+00 2 2 0 0 1 2 2 0 0 1 9 >> PCApply 1 1.0 1.3151e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ------------------------------------------------------------ >> ------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Vector 172 170 70736264 0. >> Matrix 5 5 7125104 0. >> Matrix Null Space 1 1 608 0. >> Distributed Mesh 18 16 84096 0. >> Index Set 73 73 10022204 0. >> IS L to G Mapping 18 16 1180828 0. >> Star Forest Graph 36 32 27968 0. >> Discrete System 18 16 15040 0. >> Vec Scatter 67 64 38240520 0. >> Krylov Solver 2 2 2504 0. >> Preconditioner 2 2 2528 0. >> Viewer 2 1 848 0. >> ============================================================ >> ============================================================ >> Average time to get PetscTime(): 0. 
>> Average time for MPI_Barrier(): 2.38419e-06 >> Average time for zero size MPI_Send(): 2.11596e-06 >> #PETSc Option Table entries: >> -da_processors_z 1 >> -ksp_type cg >> -ksp_view >> -log_view >> -pc_hypre_boomeramg_nodal_coarsen 1 >> -pc_hypre_boomeramg_vec_interp_variant 1 >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >> Configure options: --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 >> --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 >> --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 >> --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 >> --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 >> --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 >> PETSC_ARCH=timings --with-mpi-dir=/usr/lib64/openmpi >> --with-blaslapack-dir=/usr/lib64 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 >> FOPTFLAGS=-O3 --with-shared-libraries=1 --download-hypre >> --with-debugging=no --with-batch -known-mpi-shared-libraries=0 >> --known-64-bit-blas-indices=0 >> ----------------------------------------- >> Libraries compiled on 2018-04-27 21:13:11 on ocean >> Machine characteristics: Linux-3.10.0-327.36.3.el7.x86_ >> 64-x86_64-with-centos-7.2.1511-Core >> Using PETSc directory: /home/valera/petsc >> Using PETSc arch: timings >> ----------------------------------------- >> >> Using C compiler: /usr/lib64/openmpi/bin/mpicc -fPIC -Wall >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector >> -fvisibility=hidden -O3 >> Using Fortran compiler: /usr/lib64/openmpi/bin/mpif90 -fPIC -Wall >> -ffree-line-length-0 -Wno-unused-dummy-argument -O3 >> ----------------------------------------- >> >> Using include paths: -I/home/valera/petsc/include >> -I/home/valera/petsc/timings/include -I/usr/lib64/openmpi/include >> ----------------------------------------- >> >> Using C linker: /usr/lib64/openmpi/bin/mpicc >> Using Fortran linker: /usr/lib64/openmpi/bin/mpif90 >> Using libraries: -Wl,-rpath,/home/valera/petsc/timings/lib >> -L/home/valera/petsc/timings/lib -lpetsc -Wl,-rpath,/home/valera/petsc/timings/lib >> -L/home/valera/petsc/timings/lib -Wl,-rpath,/usr/lib64 -L/usr/lib64 >> -Wl,-rpath,/usr/lib64/openmpi/lib -L/usr/lib64/openmpi/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lHYPRE -llapack -lblas -lm >> -lstdc++ -ldl -lmpi_usempi -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm >> -lgcc_s -lquadmath -lpthread -lstdc++ -ldl >> >> >> >> >> >> What do you see wrong here? what options could i try to improve my solver >> scaling? >> >> Thanks so much, >> >> Manuel >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... 
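Following the suggestion in the previous message, the efficiency can be measured against the streams (memory bandwidth) speedup rather than the raw core count. A small Python sketch using the streams table above and the approximate 20% / 8% solver efficiencies quoted at the start of the thread; the solver speedups below are reconstructed from those percentages, so they are only approximate:

# efficiency_sketch.py - compare solver speedup against the streams limit
streams_speedup = {16: 5.05, 64: 20.21}          # from the table above
solver_speedup = {16: 0.20 * 16, 64: 0.08 * 64}  # from the ~20% / ~8% figures quoted earlier

for np in (16, 64):
    eff_cores = solver_speedup[np] / np
    eff_bandwidth = solver_speedup[np] / streams_speedup[np]
    print('np = %2d: %3.0f%% of linear speedup, %3.0f%% of the streams limit'
          % (np, 100 * eff_cores, 100 * eff_bandwidth))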
URL: From knepley at gmail.com Wed May 2 17:40:24 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 May 2018 18:40:24 -0400 Subject: [petsc-users] Help my solver scale In-Reply-To: References: Message-ID: On Wed, May 2, 2018 at 5:59 PM, Manuel Valera wrote: > Thanks Matt, > > I just remade the streams tests on the machine and i got the following > table, my question would be, is this the maximum speedup i may get on my > machine, and thus should compare the efficiency and scaling tests up to > this figure instead? > > I have 20-cores nodes > Are you sure they are not 16 core nodes? > so this was made over 4 nodes, > Okay, you get a speedup of 20 using all 4 nodes (64 processes). This means that the maximum speedup is 30% in your terminology. We can see that this is consistent scaling, since for 16 processes (I assume 1 node) we get a speedup of 5, which is also 30%. Using the bandwidth limit as peak instead of core count, then your strong scaling is about 70% for 16 processes (okay, not great), and 30% for 64 processes. That is believable, but could probably be improved. The next things to look at are: How big are the problem sizes per process? Are the iteration counts increasing? What do you get looking only are solve time? Only at setup time? Do you really care about strong scaling rather than weak scaling? For anything else we would need to see the output from -ksp_view -ksp_converged_reason -log_view Thanks, Matt > Thanks, > > np speedup > 1 1.0 > 2 1.82 > 3 2.43 > 4 2.79 > 5 2.99 > 6 3.13 > 7 3.13 > 8 3.19 > 9 3.17 > 10 3.17 > 11 3.44 > 12 3.81 > 13 4.13 > 14 4.43 > 15 4.72 > 16 5.05 > 17 5.4 > 18 5.69 > 19 5.99 > 20 6.29 > 21 6.66 > 22 6.96 > 23 7.26 > 24 7.6 > 25 7.86 > 26 8.25 > 27 8.54 > 28 8.88 > 29 9.2 > 30 9.44 > 31 9.84 > 32 10.06 > 33 10.43 > 34 10.72 > 35 11.11 > 36 11.42 > 37 11.75 > 38 12.07 > 39 12.27 > 40 12.65 > 41 12.94 > 42 13.34 > 43 13.6 > 44 13.83 > 45 14.27 > 46 14.56 > 47 14.84 > 48 15.24 > 49 15.49 > 50 15.85 > 51 15.87 > 52 16.35 > 53 16.76 > 54 17.02 > 55 17.17 > 56 17.7 > 57 17.9 > 58 18.28 > 59 18.56 > 60 18.82 > 61 19.37 > 62 19.62 > 63 19.88 > 64 20.21 > > > > On Wed, May 2, 2018 at 1:24 PM, Matthew Knepley wrote: > >> On Wed, May 2, 2018 at 4:19 PM, Manuel Valera wrote: >> >>> Hello guys, >>> >>> We are working in writing a paper about the parallelization of our model >>> using PETSc, which is very exciting since is the first time we see our >>> model scaling, but so far i feel my results for the laplacian solver could >>> be much better, >>> >>> For example, using CG/Multigrid i get less than 20% of efficiency after >>> 16 cores, up to 64 cores where i get only 8% efficiency, >>> >>> I am defining efficiency as speedup over number of cores, and speedup as >>> twall_n/twall_1 where n is the number of cores, i think that's pretty >>> standard, >>> >> >> This is the first big problem. Not all "cores" are created equal. First, >> you need to run streams in the exact same configuration, so that you can see >> how much speedup to expect. The program is here >> >> cd src/benchmarks/streams >> >> and >> >> make streams >> >> will run it. You will probably need to submit the program yourself to the >> batch system to get the same configuration as your solver. >> >> This really matter because 16 cores on one nodes probably only has the >> potential for 5x speedup, so that your 20% is misguided. 
>> >> Thanks, >> >> Matt >> >> >>> The ksp_view for a distributed solve looks like this: >>> >>> KSP Object: 16 MPI processes >>> type: cg >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 16 MPI processes >>> type: hypre >>> HYPRE BoomerAMG preconditioning >>> Cycle type V >>> Maximum number of levels 25 >>> Maximum number of iterations PER hypre call 1 >>> Convergence tolerance PER hypre call 0. >>> Threshold for strong coupling 0.25 >>> Interpolation truncation factor 0. >>> Interpolation: max elements per row 0 >>> Number of levels of aggressive coarsening 0 >>> Number of paths for aggressive coarsening 1 >>> Maximum row sums 0.9 >>> Sweeps down 1 >>> Sweeps up 1 >>> Sweeps on coarse 1 >>> Relax down symmetric-SOR/Jacobi >>> Relax up symmetric-SOR/Jacobi >>> Relax on coarse Gaussian-elimination >>> Relax weight (all) 1. >>> Outer relax weight (all) 1. >>> Using CF-relaxation >>> Not using more complex smoothers. >>> Measure type local >>> Coarsen type Falgout >>> Interpolation type classical >>> Using nodal coarsening (with HYPRE_BOOMERAMGSetNodal() 1 >>> HYPRE_BoomerAMGSetInterpVecVariant() 1 >>> linear system matrix = precond matrix: >>> Mat Object: 16 MPI processes >>> type: mpiaij >>> rows=213120, cols=213120 >>> total: nonzeros=3934732, allocated nonzeros=8098560 >>> total number of mallocs used during MatSetValues calls =0 >>> has attached near null space >>> >>> >>> And the log_view for the same case would be: >>> >>> ************************************************************ >>> ************************************************************ >>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r >>> -fCourier9' to print this document *** >>> ************************************************************ >>> ************************************************************ >>> >>> ---------------------------------------------- PETSc Performance >>> Summary: ---------------------------------------------- >>> >>> ./gcmSeamount on a timings named ocean with 16 processors, by valera Wed >>> May 2 13:18:21 2018 >>> Using Petsc Development GIT revision: v3.9-163-gbe3efd4 GIT Date: >>> 2018-04-16 10:45:40 -0500 >>> >>> Max Max/Min Avg Total >>> Time (sec): 1.355e+00 1.00004 1.355e+00 >>> Objects: 4.140e+02 1.00000 4.140e+02 >>> Flop: 7.582e+05 1.09916 7.397e+05 1.183e+07 >>> Flop/sec: 5.594e+05 1.09918 5.458e+05 8.732e+06 >>> MPI Messages: 1.588e+03 1.19167 1.468e+03 2.348e+04 >>> MPI Message Lengths: 7.112e+07 1.37899 4.462e+04 1.048e+09 >>> MPI Reductions: 4.760e+02 1.00000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N >>> --> 2N flop >>> and VecAXPY() for complex vectors of length >>> N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages >>> --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total counts >>> %Total Avg %Total counts %Total >>> 0: Main Stage: 1.3553e+00 100.0% 1.1835e+07 100.0% 2.348e+04 >>> 100.0% 4.462e+04 100.0% 4.670e+02 98.1% >>> >>> ------------------------------------------------------------ >>> ------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. 
>>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all >>> processors >>> Mess: number of messages sent >>> Avg. len: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() >>> and PetscLogStagePop(). >>> %T - percent time in this phase %F - percent flop in this >>> phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >>> over all processors) >>> ------------------------------------------------------------ >>> ------------------------------------------------------------ >>> Event Count Time (sec) Flop >>> --- Global --- --- Stage --- Total >>> Max Ratio Max Ratio Max Ratio Mess Avg len >>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------ >>> ------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSidedF 2 1.0 9.1908e-03 2.2 0.00e+00 0.0 3.6e+02 1.6e+05 >>> 0.0e+00 1 0 2 6 0 1 0 2 6 0 0 >>> VecTDot 1 1.0 6.4135e-05 1.1 2.66e+04 1.0 0.0e+00 0.0e+00 >>> 1.0e+00 0 4 0 0 0 0 4 0 0 0 6646 >>> VecNorm 1 1.0 1.4589e-0347.1 2.66e+04 1.0 0.0e+00 0.0e+00 >>> 1.0e+00 0 4 0 0 0 0 4 0 0 0 292 >>> VecScale 14 1.0 3.6144e-04 1.3 4.80e+05 1.1 0.0e+00 0.0e+00 >>> 0.0e+00 0 62 0 0 0 0 62 0 0 0 20346 >>> VecCopy 7 1.0 1.0152e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 83 1.0 3.0013e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >>> VecPointwiseMult 12 1.0 2.7585e-04 1.4 2.43e+05 1.2 0.0e+00 0.0e+00 >>> 0.0e+00 0 31 0 0 0 0 31 0 0 0 13153 >>> VecScatterBegin 111 1.0 2.5293e-02 1.8 0.00e+00 0.0 9.5e+03 3.4e+04 >>> 1.9e+01 1 0 40 31 4 1 0 40 31 4 0 >>> VecScatterEnd 92 1.0 4.8771e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 >>> VecNormalize 1 1.0 2.6941e-05 2.3 1.33e+04 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 2 0 0 0 0 2 0 0 0 7911 >>> MatConvert 1 1.0 1.1009e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 4.0e+00 1 0 0 0 1 1 0 0 0 1 0 >>> MatAssemblyBegin 3 1.0 2.8401e-02 1.0 0.00e+00 0.0 3.6e+02 1.6e+05 >>> 0.0e+00 2 0 2 6 0 2 0 2 6 0 0 >>> MatAssemblyEnd 3 1.0 2.9033e-02 1.0 0.00e+00 0.0 6.0e+01 1.2e+04 >>> 2.0e+01 2 0 0 0 4 2 0 0 0 4 0 >>> MatGetRowIJ 2 1.0 1.9073e-06 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatView 1 1.0 3.0398e-04 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSetUp 1 1.0 4.7994e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 1 1.0 2.4850e-03 2.0 5.33e+04 1.0 0.0e+00 0.0e+00 >>> 2.0e+00 0 7 0 0 0 0 7 0 0 0 343 >>> PCSetUp 2 1.0 2.2953e-02 1.0 1.33e+04 1.0 0.0e+00 0.0e+00 >>> 6.0e+00 2 2 0 0 1 2 2 0 0 1 9 >>> PCApply 1 1.0 1.3151e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> ------------------------------------------------------------ >>> ------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' >>> Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Vector 172 170 70736264 0. >>> Matrix 5 5 7125104 0. >>> Matrix Null Space 1 1 608 0. 
>>> Distributed Mesh 18 16 84096 0. >>> Index Set 73 73 10022204 0. >>> IS L to G Mapping 18 16 1180828 0. >>> Star Forest Graph 36 32 27968 0. >>> Discrete System 18 16 15040 0. >>> Vec Scatter 67 64 38240520 0. >>> Krylov Solver 2 2 2504 0. >>> Preconditioner 2 2 2528 0. >>> Viewer 2 1 848 0. >>> ============================================================ >>> ============================================================ >>> Average time to get PetscTime(): 0. >>> Average time for MPI_Barrier(): 2.38419e-06 >>> Average time for zero size MPI_Send(): 2.11596e-06 >>> #PETSc Option Table entries: >>> -da_processors_z 1 >>> -ksp_type cg >>> -ksp_view >>> -log_view >>> -pc_hypre_boomeramg_nodal_coarsen 1 >>> -pc_hypre_boomeramg_vec_interp_variant 1 >>> #End of PETSc Option Table entries >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >>> Configure options: --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 >>> --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 >>> --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 >>> --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 >>> --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 >>> --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 >>> PETSC_ARCH=timings --with-mpi-dir=/usr/lib64/openmpi >>> --with-blaslapack-dir=/usr/lib64 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 >>> FOPTFLAGS=-O3 --with-shared-libraries=1 --download-hypre >>> --with-debugging=no --with-batch -known-mpi-shared-libraries=0 >>> --known-64-bit-blas-indices=0 >>> ----------------------------------------- >>> Libraries compiled on 2018-04-27 21:13:11 on ocean >>> Machine characteristics: Linux-3.10.0-327.36.3.el7.x86_ >>> 64-x86_64-with-centos-7.2.1511-Core >>> Using PETSc directory: /home/valera/petsc >>> Using PETSc arch: timings >>> ----------------------------------------- >>> >>> Using C compiler: /usr/lib64/openmpi/bin/mpicc -fPIC -Wall >>> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector >>> -fvisibility=hidden -O3 >>> Using Fortran compiler: /usr/lib64/openmpi/bin/mpif90 -fPIC -Wall >>> -ffree-line-length-0 -Wno-unused-dummy-argument -O3 >>> ----------------------------------------- >>> >>> Using include paths: -I/home/valera/petsc/include >>> -I/home/valera/petsc/timings/include -I/usr/lib64/openmpi/include >>> ----------------------------------------- >>> >>> Using C linker: /usr/lib64/openmpi/bin/mpicc >>> Using Fortran linker: /usr/lib64/openmpi/bin/mpif90 >>> Using libraries: -Wl,-rpath,/home/valera/petsc/timings/lib >>> -L/home/valera/petsc/timings/lib -lpetsc -Wl,-rpath,/home/valera/petsc/timings/lib >>> -L/home/valera/petsc/timings/lib -Wl,-rpath,/usr/lib64 -L/usr/lib64 >>> -Wl,-rpath,/usr/lib64/openmpi/lib -L/usr/lib64/openmpi/lib >>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lHYPRE -llapack -lblas -lm >>> -lstdc++ -ldl -lmpi_usempi -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm >>> -lgcc_s -lquadmath -lpthread -lstdc++ -ldl >>> >>> >>> >>> >>> >>> What do you see wrong here? what options could i try to improve my >>> solver scaling? 
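One way to separate setup time from solve time in a -log_view like the one above is to register separate logging stages around the two calls. Below is a minimal sketch, assuming a petsc4py driver (in C or Fortran the equivalent calls are PetscLogStagePush()/PetscLogStagePop(), as the -log_view header notes); the tridiagonal stand-in matrix is only there so the snippet runs on its own.

# Run with -log_view to get a per-stage breakdown of setup vs. solve time.
from petsc4py import PETSc

n = 100
A = PETSc.Mat().create()
A.setSizes([n, n])
A.setType('aij')
A.setUp()                       # stand-in operator; use the real one instead
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 2.0)
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemblyBegin()
A.assemblyEnd()

b = PETSc.Vec().createMPI(n)
b.set(1.0)
x = b.duplicate()

ksp = PETSc.KSP().create()
ksp.setOperators(A)
ksp.setFromOptions()

stage_setup = PETSc.Log.Stage('Solver setup')
stage_solve = PETSc.Log.Stage('Solver solve')

stage_setup.push()
ksp.setUp()          # KSP/PC setup (e.g. the AMG construction) is logged here
stage_setup.pop()

stage_solve.push()
ksp.solve(b, x)      # only the Krylov iterations are logged here
stage_solve.pop()

With -log_view, the summary then reports 'Solver setup' and 'Solver solve' as separate stages, which makes the PCSetUp-versus-KSPSolve split explicit.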
>>> >>> Thanks so much, >>> >>> Manuel >>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed May 2 19:25:03 2018 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 2 May 2018 20:25:03 -0400 Subject: [petsc-users] Help my solver scale In-Reply-To: References: Message-ID: > > > KSPSetUp 1 1.0 4.7994e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.4850e-03 2.0 5.33e+04 1.0 0.0e+00 0.0e+00 > 2.0e+00 0 7 0 0 0 0 7 0 0 0 343 > PCSetUp 2 1.0 2.2953e-02 1.0 1.33e+04 1.0 0.0e+00 0.0e+00 > 6.0e+00 2 2 0 0 1 2 2 0 0 1 9 > PCApply 1 1.0 1.3151e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > Not following whole thread but these are tiny (fast) problems. The setup is 5x the solve. That is a poor balance that could come from a number of things, but it looks like you are past strong speedup roll over. Test larger problems. Try not to go below 5K eqs per process if you expect to see decent scaling. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fande.kong at inl.gov Thu May 3 11:00:39 2018 From: fande.kong at inl.gov (Kong, Fande) Date: Thu, 3 May 2018 10:00:39 -0600 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: Thanks, I get the PETSc complied, but theta does not like the shared lib, I think. I am switching back to a static lib. I ever successfully built and ran the PETSc with the static compiling. But I encountered a problem this time on building blaslapack. Thanks, Fande On Tue, May 1, 2018 at 2:22 PM, Satish Balay wrote: > This is theta.. > > Try: using --LDFLAGS=-dynamic option > > [as listed in config/examples/arch-cray-xc40-knl-opt.py] > > Satish > > On Tue, 1 May 2018, Kong, Fande wrote: > > > Hi All, > > > > I can build a static petsc library on a supercomputer, but could not do > the > > same thing with " --with-shared-libraries=1". > > > > The log file is attached. > > > > > > Fande, > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log 2.zip Type: application/zip Size: 211483 bytes Desc: not available URL: From balay at mcs.anl.gov Thu May 3 11:02:22 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 3 May 2018 11:02:22 -0500 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: Perhaps you should use MKL on theta? Again check config/examples/arch-cray-xc40-knl-opt.py Satish On Thu, 3 May 2018, Kong, Fande wrote: > Thanks, > > I get the PETSc complied, but theta does not like the shared lib, I think. > > I am switching back to a static lib. I ever successfully built and ran > the PETSc with the static compiling. > > But I encountered a problem this time on building blaslapack. > > > Thanks, > > Fande > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay wrote: > > > This is theta.. 
> > > > Try: using --LDFLAGS=-dynamic option > > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py] > > > > Satish > > > > On Tue, 1 May 2018, Kong, Fande wrote: > > > > > Hi All, > > > > > > I can build a static petsc library on a supercomputer, but could not do > > the > > > same thing with " --with-shared-libraries=1". > > > > > > The log file is attached. > > > > > > > > > Fande, > > > > > > > > From balay at mcs.anl.gov Thu May 3 11:09:32 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 3 May 2018 11:09:32 -0500 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: Ok you are not 'building blaslapack' - but using mkl [as per configure.log]. I'll have to check the issue. It might be something to do with using mkl as a static library.. Hong might have some suggestions wrt theta builds. Satish On Thu, 3 May 2018, Satish Balay wrote: > Perhaps you should use MKL on theta? Again check config/examples/arch-cray-xc40-knl-opt.py > > Satish > > On Thu, 3 May 2018, Kong, Fande wrote: > > > Thanks, > > > > I get the PETSc complied, but theta does not like the shared lib, I think. > > > > I am switching back to a static lib. I ever successfully built and ran > > the PETSc with the static compiling. > > > > But I encountered a problem this time on building blaslapack. > > > > > > Thanks, > > > > Fande > > > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay wrote: > > > > > This is theta.. > > > > > > Try: using --LDFLAGS=-dynamic option > > > > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py] > > > > > > Satish > > > > > > On Tue, 1 May 2018, Kong, Fande wrote: > > > > > > > Hi All, > > > > > > > > I can build a static petsc library on a supercomputer, but could not do > > > the > > > > same thing with " --with-shared-libraries=1". > > > > > > > > The log file is attached. > > > > > > > > > > > > Fande, > > > > > > > > > > > > > > From fdkong.jd at gmail.com Thu May 3 11:32:31 2018 From: fdkong.jd at gmail.com (Fande Kong) Date: Thu, 3 May 2018 10:32:31 -0600 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: --with-blaslapack-lib=-mkl -L' + os.environ['MKLROOT'] + '/lib/intel64 works. Fande, On Thu, May 3, 2018 at 10:09 AM, Satish Balay wrote: > Ok you are not 'building blaslapack' - but using mkl [as per > configure.log]. > > I'll have to check the issue. It might be something to do with using > mkl as a static library.. > > Hong might have some suggestions wrt theta builds. > > Satish > > On Thu, 3 May 2018, Satish Balay wrote: > > > Perhaps you should use MKL on theta? Again check > config/examples/arch-cray-xc40-knl-opt.py > > > > Satish > > > > On Thu, 3 May 2018, Kong, Fande wrote: > > > > > Thanks, > > > > > > I get the PETSc complied, but theta does not like the shared lib, I > think. > > > > > > I am switching back to a static lib. I ever successfully built and > ran > > > the PETSc with the static compiling. > > > > > > But I encountered a problem this time on building blaslapack. > > > > > > > > > Thanks, > > > > > > Fande > > > > > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay > wrote: > > > > > > > This is theta.. 
> > > > > > > > Try: using --LDFLAGS=-dynamic option > > > > > > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py] > > > > > > > > Satish > > > > > > > > On Tue, 1 May 2018, Kong, Fande wrote: > > > > > > > > > Hi All, > > > > > > > > > > I can build a static petsc library on a supercomputer, but could > not do > > > > the > > > > > same thing with " --with-shared-libraries=1". > > > > > > > > > > The log file is attached. > > > > > > > > > > > > > > > Fande, > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Thu May 3 12:50:32 2018 From: hongzhang at anl.gov (Zhang, Hong) Date: Thu, 3 May 2018 17:50:32 +0000 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: <638F9653-E5DD-467B-A69F-37B87D25AAFD@anl.gov> Alternatively you can use --with-blaslapack-dir=/opt/intel/compilers_and_libraries/linux/mkl/lib/intel64 to let petsc pick the right libs for you. Hong (Mr.) On May 3, 2018, at 11:32 AM, Fande Kong > wrote: --with-blaslapack-lib=-mkl -L' + os.environ['MKLROOT'] + '/lib/intel64 works. Fande, On Thu, May 3, 2018 at 10:09 AM, Satish Balay > wrote: Ok you are not 'building blaslapack' - but using mkl [as per configure.log]. I'll have to check the issue. It might be something to do with using mkl as a static library.. Hong might have some suggestions wrt theta builds. Satish On Thu, 3 May 2018, Satish Balay wrote: > Perhaps you should use MKL on theta? Again check config/examples/arch-cray-xc40-knl-opt.py > > Satish > > On Thu, 3 May 2018, Kong, Fande wrote: > > > Thanks, > > > > I get the PETSc complied, but theta does not like the shared lib, I think. > > > > I am switching back to a static lib. I ever successfully built and ran > > the PETSc with the static compiling. > > > > But I encountered a problem this time on building blaslapack. > > > > > > Thanks, > > > > Fande > > > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay > wrote: > > > > > This is theta.. > > > > > > Try: using --LDFLAGS=-dynamic option > > > > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py] > > > > > > Satish > > > > > > On Tue, 1 May 2018, Kong, Fande wrote: > > > > > > > Hi All, > > > > > > > > I can build a static petsc library on a supercomputer, but could not do > > > the > > > > same thing with " --with-shared-libraries=1". > > > > > > > > The log file is attached. > > > > > > > > > > > > Fande, > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fande.kong at inl.gov Thu May 3 13:25:33 2018 From: fande.kong at inl.gov (Kong, Fande) Date: Thu, 3 May 2018 12:25:33 -0600 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: <638F9653-E5DD-467B-A69F-37B87D25AAFD@anl.gov> References: <638F9653-E5DD-467B-A69F-37B87D25AAFD@anl.gov> Message-ID: On Thu, May 3, 2018 at 11:50 AM, Zhang, Hong wrote: > Alternatively you can use --with-blaslapack-dir=/opt/ > intel/compilers_and_libraries/linux/mkl/lib/intel64 to let petsc pick the > right libs for you. > This used to work, but does not work any more. Thanks, Fande, > > Hong (Mr.) > > > On May 3, 2018, at 11:32 AM, Fande Kong wrote: > > --with-blaslapack-lib=-mkl -L' + os.environ['MKLROOT'] + '/lib/intel64 > > works. > > Fande, > > On Thu, May 3, 2018 at 10:09 AM, Satish Balay wrote: > >> Ok you are not 'building blaslapack' - but using mkl [as per >> configure.log]. >> >> I'll have to check the issue. 
It might be something to do with using >> mkl as a static library.. >> >> Hong might have some suggestions wrt theta builds. >> >> Satish >> >> On Thu, 3 May 2018, Satish Balay wrote: >> >> > Perhaps you should use MKL on theta? Again check >> config/examples/arch-cray-xc40-knl-opt.py >> > >> > Satish >> > >> > On Thu, 3 May 2018, Kong, Fande wrote: >> > >> > > Thanks, >> > > >> > > I get the PETSc complied, but theta does not like the shared lib, I >> think. >> > > >> > > I am switching back to a static lib. I ever successfully built and >> ran >> > > the PETSc with the static compiling. >> > > >> > > But I encountered a problem this time on building blaslapack. >> > > >> > > >> > > Thanks, >> > > >> > > Fande >> > > >> > > On Tue, May 1, 2018 at 2:22 PM, Satish Balay >> wrote: >> > > >> > > > This is theta.. >> > > > >> > > > Try: using --LDFLAGS=-dynamic option >> > > > >> > > > [as listed in config/examples/arch-cray-xc40-knl-opt.py] >> > > > >> > > > Satish >> > > > >> > > > On Tue, 1 May 2018, Kong, Fande wrote: >> > > > >> > > > > Hi All, >> > > > > >> > > > > I can build a static petsc library on a supercomputer, but could >> not do >> > > > the >> > > > > same thing with " --with-shared-libraries=1". >> > > > > >> > > > > The log file is attached. >> > > > > >> > > > > >> > > > > Fande, >> > > > > >> > > > >> > > > >> > > >> > >> > >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 3 14:02:14 2018 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 3 May 2018 15:02:14 -0400 Subject: [petsc-users] configure error Message-ID: I am getting this configure error on the new SUMMIT machine at ORNL. I have built a v3.7 on this machine, but when moved up to 'maint' I get this error. Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2241628 bytes Desc: not available URL: From balay at mcs.anl.gov Thu May 3 14:06:51 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 3 May 2018 14:06:51 -0500 Subject: [petsc-users] configure error In-Reply-To: References: Message-ID: >>>>>>> Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2 --FOPTFLAGS=-O2 --with-ssl=0 --with-batch=0 --prefix=/ccs/proj/env003/petscv3.9-opt64-summit-pgi --download-metis --with-hwloc=0 --download-parmetis --download-cmake --with-cc=mpicc --with-fc=mpif90 --with-shared-libraries=0 --known-mpi-shared-libraries=1 --with-x=0 --with-64-bit-indices --with-debugging=0 PETSC_ARCH=arch-summit-opt64-pgi.v3.9 <<<<< You did not specify --with-cxx option here Satish On Thu, 3 May 2018, Mark Adams wrote: > I am getting this configure error on the new SUMMIT machine at ORNL. I have > built a v3.7 on this machine, but when moved up to 'maint' I get this error. > Thanks, > Mark > From bsmith at mcs.anl.gov Thu May 3 14:14:32 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 3 May 2018 19:14:32 +0000 Subject: [petsc-users] configure error In-Reply-To: References: Message-ID: <76236144-E605-4021-9936-89FBA819524D@anl.gov> Mark, You pass the C and C++ compiler names to ./configure with --with-cc=mpicc --with-fc=mpif90 but do not pass a C++ compiler hence it defaults to g++ which does not know about MPI. 
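For reference, a minimal configure-driver sketch patterned after the config/examples scripts mentioned earlier in this thread; the wrapper names and option values below are placeholders, the point being only that --with-cxx is given explicitly (or set to 0 when no C++ is needed) rather than left to default to g++.

#!/usr/bin/env python
# Sketch of a config/examples-style driver; compiler names are placeholders
# for whatever MPI wrappers the machine's programming environment provides.
if __name__ == '__main__':
    import sys, os
    sys.path.insert(0, os.path.abspath('config'))
    import configure
    configure_options = [
        '--with-cc=mpicc',
        '--with-cxx=mpicxx',   # or '--with-cxx=0' if no C++ is needed at all
        '--with-fc=mpif90',
        '--with-debugging=0',
        'COPTFLAGS=-O2', 'CXXOPTFLAGS=-O2', 'FOPTFLAGS=-O2',
    ]
    configure.petsc_configure(configure_options)

The same options can also be passed directly on the ./configure command line.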
Barry > On May 3, 2018, at 2:02 PM, Mark Adams wrote: > > I am getting this configure error on the new SUMMIT machine at ORNL. I have built a v3.7 on this machine, but when moved up to 'maint' I get this error. > Thanks, > Mark > From hongzhang at anl.gov Thu May 3 14:19:06 2018 From: hongzhang at anl.gov (Zhang, Hong) Date: Thu, 3 May 2018 19:19:06 +0000 Subject: [petsc-users] Could not determine how to create a shared library! In-Reply-To: References: Message-ID: The MKL version you are trying to use does not match the environment setting on Theta. hongzh at thetalogin6:~/Projects/petsc (master)$ module list Currently Loaded Modulefiles: 1) modules/3.2.10.6 10) dmapp/7.1.1-6.0.4.0_46.2__gb8abda2.ari 19) PrgEnv-intel/6.0.4 2) eproxy/2.0.16-6.0.4.1_3.1__g001b199.ari 11) gni-headers/5.0.11-6.0.4.0_7.2__g7136988.ari 20) craype-mic-knl 3) intel/18.0.0.128 12) xpmem/2.2.2-6.0.4.1_18.2__g43b0535.ari 21) cray-mpich/7.7.0 4) craype-network-aries 13) job/2.2.2-6.0.4.0_8.2__g3c644b5.ari 22) nompirun/nompirun 5) craype/2.5.14 14) dvs/2.7_2.2.36-6.0.4.1_16.2__g4c8274a 23) darshan/3.1.5 6) cray-libsci/18.04.1 15) alps/6.4.2-6.0.4.1_3.1__gb8adc61.ari 24) trackdeps 7) udreg/2.3.2-6.0.4.0_12.2__g2f9c3ee.ari 16) rca/2.2.15-6.0.4.1_13.1__g46acb0f.ari 25) xalt 8) ugni/6.0.14-6.0.4.0_14.1__ge7db4a2.ari 17) atp/2.1.1 26) cray-hdf5-parallel/1.10.1.1 9) pmi/5.0.13 18) perftools-base/7.0.1 By default, 18.0.0.128 is loaded. Your petsc configuration should work with 18.0.2.199 if you do module swap intel/18.0.0.128 intel/18.0.2.199 Hong (Mr.) On May 3, 2018, at 11:00 AM, Kong, Fande > wrote: Thanks, I get the PETSc complied, but theta does not like the shared lib, I think. I am switching back to a static lib. I ever successfully built and ran the PETSc with the static compiling. But I encountered a problem this time on building blaslapack. Thanks, Fande On Tue, May 1, 2018 at 2:22 PM, Satish Balay > wrote: This is theta.. Try: using --LDFLAGS=-dynamic option [as listed in config/examples/arch-cray-xc40-knl-opt.py] Satish On Tue, 1 May 2018, Kong, Fande wrote: > Hi All, > > I can build a static petsc library on a supercomputer, but could not do the > same thing with " --with-shared-libraries=1". > > The log file is attached. > > > Fande, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 3 19:45:35 2018 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 3 May 2018 20:45:35 -0400 Subject: [petsc-users] configure error In-Reply-To: <76236144-E605-4021-9936-89FBA819524D@anl.gov> References: <76236144-E605-4021-9936-89FBA819524D@anl.gov> Message-ID: Thanks, I can not find the damn (PGI) c++ compiler (on SUMMIT) but this is for a Fortran code so I just set it to =0 and it seems to be chugging along. On Thu, May 3, 2018 at 3:14 PM, Smith, Barry F. wrote: > > Mark, > > You pass the C and C++ compiler names to ./configure with > --with-cc=mpicc --with-fc=mpif90 but do not pass a C++ compiler hence it > defaults to g++ which does not know about MPI. > > Barry > > > > On May 3, 2018, at 2:02 PM, Mark Adams wrote: > > > > I am getting this configure error on the new SUMMIT machine at ORNL. I > have built a v3.7 on this machine, but when moved up to 'maint' I get this > error. > > Thanks, > > Mark > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From harshad.sahasrabudhe at gmail.com Thu May 3 19:58:53 2018 From: harshad.sahasrabudhe at gmail.com (Harshad Sahasrabudhe) Date: Thu, 3 May 2018 20:58:53 -0400 Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD Message-ID: Hello, I am solving for the lowest eigenvalues and eigenvectors of symmetric positive definite matrices in the generalized eigenvalue problem. I am using the GD solver with the default settings of PCBJACOBI. When I run a standalone executable on 16 processes which loads the matrices from a file and solves the eigenproblem, I get converged results in ~600 iterations. I am using PETSc/SLEPc 3.5.4. However, when I use the same settings in my software, which uses LibMesh (0.9.5) for FEM discretization, I get a SIGFPE. The backtrace is: Program received signal SIGFPE, Arithmetic exception. 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 1112 if (d->nR[i]/a < data->fix) { #0 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 #1 0x00002aaab3774316 in dvd_improvex_jd_gen (d=0x1d29078, r_s=0, r_e=1, size_D=0x7fffffff821c) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:316 #2 0x00002aaab3731ec4 in dvd_updateV_update_gen (d=0x1d29078) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_updatev.c:360 #3 0x00002aaab3730296 in dvd_updateV_extrapol (d=0x1d29078) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_updatev.c:193 #4 0x00002aaab3727cbc in EPSSolve_XD (eps=0x1d0ee10) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/davidson.c:299 #5 0x00002aaab35bafc8 in EPSSolve (eps=0x1d0ee10) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/interface/epssolve.c:99 #6 0x00002aaab30dbaf9 in libMesh::SlepcEigenSolver::_solve_generalized_helper (this=0x1b19880, mat_A=0x1c906d0, mat_B=0x1cb16b0, nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_solver.C:519 #7 0x00002aaab30da56a in libMesh::SlepcEigenSolver::solve_generalized (this=0x1b19880, matrix_A_in=..., matrix_B_in=..., nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_solver.C:316 #8 0x00002aaab30fb02e in libMesh::EigenSystem::solve (this=0x1b19930) at src/systems/eigen_system.C:241 #9 0x00002aaab30e48a9 in libMesh::CondensedEigenSystem::solve (this=0x1b19930) at src/systems/condensed_eigen_system.C:106 #10 0x00002aaaacce0e78 in EMSchrodingerFEM::do_solve (this=0x19d6a90) at EMSchrodingerFEM.cpp:879 #11 0x00002aaaadaae3e5 in Simulation::solve (this=0x19d6a90) at Simulation.cpp:789 #12 0x00002aaaad52458b in NonlinearPoissonFEM::do_my_assemble (this=0x19da050, x=..., residual=0x7fffffff9eb0, jacobian=0x0) at NonlinearPoissonFEM.cpp:179 #13 0x00002aaaad555eec in NonlinearPoisson::my_assemble_residual (x=..., r=..., s=...) 
at NonlinearPoisson.cpp:1469 #14 0x00002aaab30c5dc3 in libMesh::__libmesh_petsc_snes_residual (snes=0x1b9ed70, x=0x1a50330, r=0x1a47a50, ctx=0x19e5a60) at src/solvers/petsc_nonlinear_solver.C:137 #15 0x00002aaab41048b9 in SNESComputeFunction (snes=0x1b9ed70, x=0x1a50330, y=0x1a47a50) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:2033 #16 0x00002aaaad1c9ad8 in SNESShellSolve_PredictorCorrector (snes=0x1b9ed70, vec_sol=0x1a2a5a0) at PredictorCorrectorModule.cpp:413 #17 0x00002aaab4653e3d in SNESSolve_Shell (snes=0x1b9ed70) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/shell/snesshell.c:167 #18 0x00002aaab4116fb7 in SNESSolve (snes=0x1b9ed70, b=0x0, x=0x1a2a5a0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 #19 0x00002aaab30c7c3c in libMesh::PetscNonlinearSolver::solve (this=0x19e5a60, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 #20 0x00002aaab3136ad9 in libMesh::NonlinearImplicitSystem::solve (this=0x19e4b80) at src/systems/nonlinear_implicit_system.C:183 #21 0x00002aaaad5791f3 in NonlinearPoisson::execute_solver (this=0x19da050) at NonlinearPoisson.cpp:1218 #22 0x00002aaaad554a99 in NonlinearPoisson::do_solve (this=0x19da050) at NonlinearPoisson.cpp:961 #23 0x00002aaaadaae3e5 in Simulation::solve (this=0x19da050) at Simulation.cpp:789 #24 0x00002aaaad1c9657 in PredictorCorrectorModule::do_solve (this=0x19c0210) at PredictorCorrectorModule.cpp:334 #25 0x00002aaaadaae3e5 in Simulation::solve (this=0x19c0210) at Simulation.cpp:789 #26 0x00002aaaad9e8f4a in Nemo::run_simulations (this=0x63ba80 ) at Nemo.cpp:1367 #27 0x0000000000426f36 in main (argc=2, argv=0x7fffffffd0f8) at main.cpp:452 Here is the log_view from the standalone executable: ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./libmesh_solve_eigenproblem on a linux named conte-a373.rcac.purdue.edu with 16 processors, by hsahasra Thu May 3 20:56:03 2018 Using Petsc Release Version 3.5.4, May, 23, 2015 Max Max/Min Avg Total Time (sec): 2.628e+01 1.00158 2.625e+01 Objects: 6.400e+03 1.00000 6.400e+03 Flops: 3.576e+09 1.00908 3.564e+09 5.702e+10 Flops/sec: 1.363e+08 1.00907 1.358e+08 2.172e+09 MPI Messages: 1.808e+04 2.74920 1.192e+04 1.907e+05 MPI Message Lengths: 4.500e+07 1.61013 3.219e+03 6.139e+08 MPI Reductions: 8.522e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.6254e+01 100.0% 5.7023e+10 100.0% 1.907e+05 100.0% 3.219e+03 100.0% 8.521e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. 
Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 1639 1.0 4.7509e+00 1.7 3.64e+08 1.1 1.9e+05 3.0e+03 0.0e+00 13 10100 93 0 13 10100 93 0 1209 MatSolve 1045 1.0 6.4188e-01 1.0 2.16e+08 1.1 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 5163 MatLUFactorNum 1 1.0 2.0798e-02 3.5 9.18e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 664 MatILUFactorSym 1 1.0 1.1777e-02 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 4 1.0 1.3677e-01 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 4 1.0 3.7882e-02 1.3 0.00e+00 0.0 4.6e+02 7.5e+02 1.6e+01 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 2.3198e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 33 1.0 1.1992e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLoad 2 1.0 2.9271e-01 1.0 0.00e+00 0.0 5.5e+02 7.6e+04 2.6e+01 1 0 0 7 0 1 0 0 7 0 0 VecCopy 2096 1.0 4.0181e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 1047 1.0 1.7598e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1639 1.0 4.3395e-01 2.0 0.00e+00 0.0 1.9e+05 3.0e+03 0.0e+00 1 0100 93 0 1 0100 93 0 0 VecScatterEnd 1639 1.0 3.2399e+00 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 VecReduceArith 2096 1.0 5.6402e-02 1.1 3.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9287 VecReduceComm 1572 1.0 5.5213e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+03 20 0 0 0 18 20 0 0 0 18 0 EPSSetUp 1 1.0 9.0121e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.2e+01 0 0 0 0 1 0 0 0 0 1 0 EPSSolve 1 1.0 2.5917e+01 1.0 3.58e+09 1.0 1.9e+05 3.0e+03 8.5e+03 99100100 93100 99100100 93100 2200 STSetUp 1 1.0 4.8380e-03 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1045 1.0 6.8107e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4886 PCSetUp 2 1.0 2.3827e-02 2.8 9.18e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 580 PCApply 1045 1.0 7.0819e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4699 BVCreate 529 1.0 3.7145e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+03 11 0 0 0 37 11 0 0 0 37 0 BVCopy 1048 1.0 1.3941e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMult 3761 1.0 3.6953e+00 1.1 2.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 56 0 0 0 14 56 0 0 0 8674 
BVDot 2675 1.0 9.6611e+00 1.3 1.08e+09 1.0 6.8e+04 3.0e+03 2.7e+03 34 30 36 33 31 34 30 36 33 31 1791 BVOrthogonalize 526 1.0 4.0705e+00 1.1 7.89e+08 1.0 6.8e+04 3.0e+03 5.9e+02 15 22 36 33 7 15 22 36 33 7 3092 BVScale 1047 1.0 1.6144e-02 1.1 8.18e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8105 BVSetRandom 5 1.0 4.7204e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMatProject 1046 1.0 5.1708e+00 1.4 6.11e+08 1.0 0.0e+00 0.0e+00 1.6e+03 18 17 0 0 18 18 17 0 0 18 1891 DSSolve 533 1.0 9.7243e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 DSVectors 1048 1.0 1.3440e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSOther 2123 1.0 8.8778e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 3 2 1504 0 Matrix 3196 3190 31529868 0 Vector 2653 2651 218802920 0 Vector Scatter 2 0 0 0 Index Set 7 7 84184 0 Eigenvalue Problem Solver 1 1 4564 0 PetscRandom 1 1 632 0 Spectral Transform 1 1 828 0 Krylov Solver 2 2 2320 0 Preconditioner 2 2 1912 0 Basis Vectors 530 530 1111328 0 Region 1 1 648 0 Direct solver 1 1 201200 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 0.0004704 Average time for zero size MPI_Send(): 0.000118256 #PETSc Option Table entries: -eps_monitor -f1 A.mat -f2 B.mat -log_view -matload_block_size 1 -ncv 70 -st_ksp_tol 1e-12 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-x=0 --download-hdf5=1 --with-scalar-type=real --with-single-library=1 --with-pic=1 --with-shared-libraries=0 --with-clanguage=C++ --with-fortran=1 --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1/ --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 --download-scalapack --download-fblaslapack=1 ----------------------------------------- Libraries compiled on Thu Sep 22 10:19:43 2016 on carter-g008.rcac.purdue.edu Machine characteristics: Linux-2.6.32-573.8.1.el6.x86_64-x86_64-with-redhat-6.7-Santiago Using PETSc directory: /depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real Using PETSc arch: linux ----------------------------------------- Using C compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O -fPIC ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -O ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include -I/apps/rhel6/valgrind/3.8.1/include -I/depot/apps/ncn/conte/mpich-3.1/include ----------------------------------------- Using C linker: mpicxx Using Fortran 
linker: mpif90 Using libraries: -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -lpetsc -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_3.3 -lflapack -lfblas -lparmetis -lmetis -lpthread -lssl -lcrypto -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib -L/apps/rhel6/gcc/5.2.0/lib -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib -L/apps/rhel6/gcc/5.2.0/lib -ldl -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl ----------------------------------------- Can you please point me to what could be going wrong with the larger software? Thanks! Harshad -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 3 22:11:10 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 4 May 2018 03:11:10 +0000 Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD In-Reply-To: References: Message-ID: <553A9019-E913-4716-816D-B646D06E1969@anl.gov> at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 1112 if (d->nR[i]/a < data->fix) { Likely the problem is due to the division by a when a is zero. Perhaps the code needs above a check that a is not zero. Or rewrite the check as if (d->nR[i] < a*data->fix) { Barry > On May 3, 2018, at 7:58 PM, Harshad Sahasrabudhe wrote: > > Hello, > > I am solving for the lowest eigenvalues and eigenvectors of symmetric positive definite matrices in the generalized eigenvalue problem. I am using the GD solver with the default settings of PCBJACOBI. When I run a standalone executable on 16 processes which loads the matrices from a file and solves the eigenproblem, I get converged results in ~600 iterations. I am using PETSc/SLEPc 3.5.4. > > However, when I use the same settings in my software, which uses LibMesh (0.9.5) for FEM discretization, I get a SIGFPE. The backtrace is: > > Program received signal SIGFPE, Arithmetic exception. 
> 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > 1112 if (d->nR[i]/a < data->fix) { > > #0 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > #1 0x00002aaab3774316 in dvd_improvex_jd_gen (d=0x1d29078, r_s=0, r_e=1, size_D=0x7fffffff821c) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:316 > #2 0x00002aaab3731ec4 in dvd_updateV_update_gen (d=0x1d29078) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_updatev.c:360 > #3 0x00002aaab3730296 in dvd_updateV_extrapol (d=0x1d29078) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_updatev.c:193 > #4 0x00002aaab3727cbc in EPSSolve_XD (eps=0x1d0ee10) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/davidson.c:299 > #5 0x00002aaab35bafc8 in EPSSolve (eps=0x1d0ee10) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/interface/epssolve.c:99 > #6 0x00002aaab30dbaf9 in libMesh::SlepcEigenSolver::_solve_generalized_helper (this=0x1b19880, mat_A=0x1c906d0, mat_B=0x1cb16b0, nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_solver.C:519 > #7 0x00002aaab30da56a in libMesh::SlepcEigenSolver::solve_generalized (this=0x1b19880, matrix_A_in=..., matrix_B_in=..., nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_solver.C:316 > #8 0x00002aaab30fb02e in libMesh::EigenSystem::solve (this=0x1b19930) at src/systems/eigen_system.C:241 > #9 0x00002aaab30e48a9 in libMesh::CondensedEigenSystem::solve (this=0x1b19930) at src/systems/condensed_eigen_system.C:106 > #10 0x00002aaaacce0e78 in EMSchrodingerFEM::do_solve (this=0x19d6a90) at EMSchrodingerFEM.cpp:879 > #11 0x00002aaaadaae3e5 in Simulation::solve (this=0x19d6a90) at Simulation.cpp:789 > #12 0x00002aaaad52458b in NonlinearPoissonFEM::do_my_assemble (this=0x19da050, x=..., residual=0x7fffffff9eb0, jacobian=0x0) at NonlinearPoissonFEM.cpp:179 > #13 0x00002aaaad555eec in NonlinearPoisson::my_assemble_residual (x=..., r=..., s=...) 
at NonlinearPoisson.cpp:1469 > #14 0x00002aaab30c5dc3 in libMesh::__libmesh_petsc_snes_residual (snes=0x1b9ed70, x=0x1a50330, r=0x1a47a50, ctx=0x19e5a60) at src/solvers/petsc_nonlinear_solver.C:137 > #15 0x00002aaab41048b9 in SNESComputeFunction (snes=0x1b9ed70, x=0x1a50330, y=0x1a47a50) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:2033 > #16 0x00002aaaad1c9ad8 in SNESShellSolve_PredictorCorrector (snes=0x1b9ed70, vec_sol=0x1a2a5a0) at PredictorCorrectorModule.cpp:413 > #17 0x00002aaab4653e3d in SNESSolve_Shell (snes=0x1b9ed70) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/shell/snesshell.c:167 > #18 0x00002aaab4116fb7 in SNESSolve (snes=0x1b9ed70, b=0x0, x=0x1a2a5a0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 > #19 0x00002aaab30c7c3c in libMesh::PetscNonlinearSolver::solve (this=0x19e5a60, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 > #20 0x00002aaab3136ad9 in libMesh::NonlinearImplicitSystem::solve (this=0x19e4b80) at src/systems/nonlinear_implicit_system.C:183 > #21 0x00002aaaad5791f3 in NonlinearPoisson::execute_solver (this=0x19da050) at NonlinearPoisson.cpp:1218 > #22 0x00002aaaad554a99 in NonlinearPoisson::do_solve (this=0x19da050) at NonlinearPoisson.cpp:961 > #23 0x00002aaaadaae3e5 in Simulation::solve (this=0x19da050) at Simulation.cpp:789 > #24 0x00002aaaad1c9657 in PredictorCorrectorModule::do_solve (this=0x19c0210) at PredictorCorrectorModule.cpp:334 > #25 0x00002aaaadaae3e5 in Simulation::solve (this=0x19c0210) at Simulation.cpp:789 > #26 0x00002aaaad9e8f4a in Nemo::run_simulations (this=0x63ba80 ) at Nemo.cpp:1367 > #27 0x0000000000426f36 in main (argc=2, argv=0x7fffffffd0f8) at main.cpp:452 > > > Here is the log_view from the standalone executable: > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ./libmesh_solve_eigenproblem on a linux named conte-a373.rcac.purdue.edu with 16 processors, by hsahasra Thu May 3 20:56:03 2018 > Using Petsc Release Version 3.5.4, May, 23, 2015 > > Max Max/Min Avg Total > Time (sec): 2.628e+01 1.00158 2.625e+01 > Objects: 6.400e+03 1.00000 6.400e+03 > Flops: 3.576e+09 1.00908 3.564e+09 5.702e+10 > Flops/sec: 1.363e+08 1.00907 1.358e+08 2.172e+09 > MPI Messages: 1.808e+04 2.74920 1.192e+04 1.907e+05 > MPI Message Lengths: 4.500e+07 1.61013 3.219e+03 6.139e+08 > MPI Reductions: 8.522e+03 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 2.6254e+01 100.0% 5.7023e+10 100.0% 1.907e+05 100.0% 3.219e+03 100.0% 8.521e+03 100.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 1639 1.0 4.7509e+00 1.7 3.64e+08 1.1 1.9e+05 3.0e+03 0.0e+00 13 10100 93 0 13 10100 93 0 1209 > MatSolve 1045 1.0 6.4188e-01 1.0 2.16e+08 1.1 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 5163 > MatLUFactorNum 1 1.0 2.0798e-02 3.5 9.18e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 664 > MatILUFactorSym 1 1.0 1.1777e-02 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 4 1.0 1.3677e-01 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 4 1.0 3.7882e-02 1.3 0.00e+00 0.0 4.6e+02 7.5e+02 1.6e+01 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 1 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 2.3198e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 33 1.0 1.1992e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLoad 2 1.0 2.9271e-01 1.0 0.00e+00 0.0 5.5e+02 7.6e+04 2.6e+01 1 0 0 7 0 1 0 0 7 0 0 > VecCopy 2096 1.0 4.0181e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 1047 1.0 1.7598e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1639 1.0 4.3395e-01 2.0 0.00e+00 0.0 1.9e+05 3.0e+03 0.0e+00 1 0100 93 0 1 0100 93 0 0 > VecScatterEnd 1639 1.0 3.2399e+00 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 > VecReduceArith 2096 1.0 5.6402e-02 1.1 3.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9287 > VecReduceComm 1572 1.0 5.5213e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+03 20 0 0 0 18 20 0 0 0 18 0 > EPSSetUp 1 1.0 9.0121e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.2e+01 0 0 0 0 1 0 0 0 0 1 0 > EPSSolve 1 1.0 2.5917e+01 1.0 3.58e+09 1.0 1.9e+05 3.0e+03 8.5e+03 99100100 93100 99100100 93100 2200 > STSetUp 1 1.0 4.8380e-03 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1045 1.0 6.8107e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4886 > PCSetUp 2 1.0 2.3827e-02 2.8 9.18e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 580 > PCApply 1045 1.0 7.0819e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4699 > BVCreate 529 1.0 3.7145e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+03 11 0 0 0 37 11 0 0 0 37 0 > BVCopy 1048 1.0 1.3941e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > BVMult 3761 1.0 3.6953e+00 1.1 2.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 56 0 0 0 14 56 0 0 0 8674 > BVDot 2675 1.0 9.6611e+00 1.3 1.08e+09 1.0 6.8e+04 3.0e+03 2.7e+03 34 30 36 33 31 34 30 36 33 31 1791 > BVOrthogonalize 526 1.0 4.0705e+00 1.1 7.89e+08 1.0 6.8e+04 3.0e+03 5.9e+02 15 22 36 33 7 15 22 36 33 7 3092 > BVScale 1047 1.0 1.6144e-02 1.1 8.18e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8105 > 
BVSetRandom 5 1.0 4.7204e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > BVMatProject 1046 1.0 5.1708e+00 1.4 6.11e+08 1.0 0.0e+00 0.0e+00 1.6e+03 18 17 0 0 18 18 17 0 0 18 1891 > DSSolve 533 1.0 9.7243e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 > DSVectors 1048 1.0 1.3440e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DSOther 2123 1.0 8.8778e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Viewer 3 2 1504 0 > Matrix 3196 3190 31529868 0 > Vector 2653 2651 218802920 0 > Vector Scatter 2 0 0 0 > Index Set 7 7 84184 0 > Eigenvalue Problem Solver 1 1 4564 0 > PetscRandom 1 1 632 0 > Spectral Transform 1 1 828 0 > Krylov Solver 2 2 2320 0 > Preconditioner 2 2 1912 0 > Basis Vectors 530 530 1111328 0 > Region 1 1 648 0 > Direct solver 1 1 201200 0 > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 0.0004704 > Average time for zero size MPI_Send(): 0.000118256 > #PETSc Option Table entries: > -eps_monitor > -f1 A.mat > -f2 B.mat > -log_view > -matload_block_size 1 > -ncv 70 > -st_ksp_tol 1e-12 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --with-x=0 --download-hdf5=1 --with-scalar-type=real --with-single-library=1 --with-pic=1 --with-shared-libraries=0 --with-clanguage=C++ --with-fortran=1 --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1/ --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 --download-scalapack --download-fblaslapack=1 > ----------------------------------------- > Libraries compiled on Thu Sep 22 10:19:43 2016 on carter-g008.rcac.purdue.edu > Machine characteristics: Linux-2.6.32-573.8.1.el6.x86_64-x86_64-with-redhat-6.7-Santiago > Using PETSc directory: /depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real > Using PETSc arch: linux > ----------------------------------------- > > Using C compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O -fPIC ${COPTFLAGS} ${CFLAGS} > Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -O ${FOPTFLAGS} ${FFLAGS} > ----------------------------------------- > > Using include paths: -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include -I/apps/rhel6/valgrind/3.8.1/include -I/depot/apps/ncn/conte/mpich-3.1/include > ----------------------------------------- > > Using C linker: mpicxx > Using Fortran linker: mpif90 > Using libraries: -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib 
-L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -lpetsc -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_3.3 -lflapack -lfblas -lparmetis -lmetis -lpthread -lssl -lcrypto -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib -L/apps/rhel6/gcc/5.2.0/lib -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib -L/apps/rhel6/gcc/5.2.0/lib -ldl -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > ----------------------------------------- > > Can you please point me to what could be going wrong with the larger software? > > Thanks! > Harshad From harshad.sahasrabudhe at gmail.com Thu May 3 22:24:15 2018 From: harshad.sahasrabudhe at gmail.com (Harshad Sahasrabudhe) Date: Thu, 3 May 2018 23:24:15 -0400 Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD In-Reply-To: <553A9019-E913-4716-816D-B646D06E1969@anl.gov> References: <553A9019-E913-4716-816D-B646D06E1969@anl.gov> Message-ID: Hi Barry, There's an overflow in the division: Program received signal SIGFPE, Arithmetic exception. 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 1112 if (d->nR[i]/a < data->fix) { (gdb) p d->nR[i] $7 = 1.7976931348623157e+308 (gdb) p a $8 = 0.15744695659409991 It looks like the residual is very high in the first GD iteration and rapidly drops in the second iteration 240 EPS nconv=0 first unconverged value (error) 0.0999172 (1.11889357e-12) 241 EPS nconv=1 first unconverged value (error) 0.100311 (1.79769313e+308) 242 EPS nconv=1 first unconverged value (error) 0.100311 (2.39980067e-04) Thanks, Harshad On Thu, May 3, 2018 at 11:11 PM, Smith, Barry F. wrote: > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > 1112 if (d->nR[i]/a < data->fix) { > > Likely the problem is due to the division by a when a is zero. Perhaps the > code needs above a check that a is not zero. Or rewrite the check as > > if (d->nR[i] < a*data->fix) { > > Barry > > > > On May 3, 2018, at 7:58 PM, Harshad Sahasrabudhe < > harshad.sahasrabudhe at gmail.com> wrote: > > > > Hello, > > > > I am solving for the lowest eigenvalues and eigenvectors of symmetric > positive definite matrices in the generalized eigenvalue problem. I am > using the GD solver with the default settings of PCBJACOBI. 
When I run a > standalone executable on 16 processes which loads the matrices from a file > and solves the eigenproblem, I get converged results in ~600 iterations. I > am using PETSc/SLEPc 3.5.4. > > > > However, when I use the same settings in my software, which uses LibMesh > (0.9.5) for FEM discretization, I get a SIGFPE. The backtrace is: > > > > Program received signal SIGFPE, Arithmetic exception. > > 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, > theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, > tol=0x7fffffff8140) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > 1112 if (d->nR[i]/a < data->fix) { > > > > #0 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, > theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, > tol=0x7fffffff8140) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > #1 0x00002aaab3774316 in dvd_improvex_jd_gen (d=0x1d29078, r_s=0, > r_e=1, size_D=0x7fffffff821c) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/ > davidson/common/dvd_improvex.c:316 > > #2 0x00002aaab3731ec4 in dvd_updateV_update_gen (d=0x1d29078) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_updatev.c:360 > > #3 0x00002aaab3730296 in dvd_updateV_extrapol (d=0x1d29078) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_updatev.c:193 > > #4 0x00002aaab3727cbc in EPSSolve_XD (eps=0x1d0ee10) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/davidson.c:299 > > #5 0x00002aaab35bafc8 in EPSSolve (eps=0x1d0ee10) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/interface/epssolve.c:99 > > #6 0x00002aaab30dbaf9 in libMesh::SlepcEigenSolver< > double>::_solve_generalized_helper (this=0x1b19880, mat_A=0x1c906d0, > mat_B=0x1cb16b0, nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at > src/solvers/slepc_eigen_solver.C:519 > > #7 0x00002aaab30da56a in libMesh::SlepcEigenSolver::solve_generalized > (this=0x1b19880, matrix_A_in=..., matrix_B_in=..., nev=5, ncv=20, > tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_ > solver.C:316 > > #8 0x00002aaab30fb02e in libMesh::EigenSystem::solve (this=0x1b19930) > at src/systems/eigen_system.C:241 > > #9 0x00002aaab30e48a9 in libMesh::CondensedEigenSystem::solve > (this=0x1b19930) at src/systems/condensed_eigen_system.C:106 > > #10 0x00002aaaacce0e78 in EMSchrodingerFEM::do_solve (this=0x19d6a90) at > EMSchrodingerFEM.cpp:879 > > #11 0x00002aaaadaae3e5 in Simulation::solve (this=0x19d6a90) at > Simulation.cpp:789 > > #12 0x00002aaaad52458b in NonlinearPoissonFEM::do_my_assemble > (this=0x19da050, x=..., residual=0x7fffffff9eb0, jacobian=0x0) at > NonlinearPoissonFEM.cpp:179 > > #13 0x00002aaaad555eec in NonlinearPoisson::my_assemble_residual > (x=..., r=..., s=...) 
at NonlinearPoisson.cpp:1469 > > #14 0x00002aaab30c5dc3 in libMesh::__libmesh_petsc_snes_residual > (snes=0x1b9ed70, x=0x1a50330, r=0x1a47a50, ctx=0x19e5a60) at > src/solvers/petsc_nonlinear_solver.C:137 > > #15 0x00002aaab41048b9 in SNESComputeFunction (snes=0x1b9ed70, > x=0x1a50330, y=0x1a47a50) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:2033 > > #16 0x00002aaaad1c9ad8 in SNESShellSolve_PredictorCorrector > (snes=0x1b9ed70, vec_sol=0x1a2a5a0) at PredictorCorrectorModule.cpp:413 > > #17 0x00002aaab4653e3d in SNESSolve_Shell (snes=0x1b9ed70) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/shell/snesshell.c:167 > > #18 0x00002aaab4116fb7 in SNESSolve (snes=0x1b9ed70, b=0x0, x=0x1a2a5a0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/interface/snes.c:3743 > > #19 0x00002aaab30c7c3c in libMesh::PetscNonlinearSolver::solve > (this=0x19e5a60, jac_in=..., x_in=..., r_in=...) at > src/solvers/petsc_nonlinear_solver.C:714 > > #20 0x00002aaab3136ad9 in libMesh::NonlinearImplicitSystem::solve > (this=0x19e4b80) at src/systems/nonlinear_implicit_system.C:183 > > #21 0x00002aaaad5791f3 in NonlinearPoisson::execute_solver > (this=0x19da050) at NonlinearPoisson.cpp:1218 > > #22 0x00002aaaad554a99 in NonlinearPoisson::do_solve (this=0x19da050) at > NonlinearPoisson.cpp:961 > > #23 0x00002aaaadaae3e5 in Simulation::solve (this=0x19da050) at > Simulation.cpp:789 > > #24 0x00002aaaad1c9657 in PredictorCorrectorModule::do_solve > (this=0x19c0210) at PredictorCorrectorModule.cpp:334 > > #25 0x00002aaaadaae3e5 in Simulation::solve (this=0x19c0210) at > Simulation.cpp:789 > > #26 0x00002aaaad9e8f4a in Nemo::run_simulations (this=0x63ba80 > ) at Nemo.cpp:1367 > > #27 0x0000000000426f36 in main (argc=2, argv=0x7fffffffd0f8) at > main.cpp:452 > > > > > > Here is the log_view from the standalone executable: > > > > ************************************************************ > ************************************************************ > > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r > -fCourier9' to print this document *** > > ************************************************************ > ************************************************************ > > > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > > > ./libmesh_solve_eigenproblem on a linux named conte-a373.rcac.purdue.edu > with 16 processors, by hsahasra Thu May 3 20:56:03 2018 > > Using Petsc Release Version 3.5.4, May, 23, 2015 > > > > Max Max/Min Avg Total > > Time (sec): 2.628e+01 1.00158 2.625e+01 > > Objects: 6.400e+03 1.00000 6.400e+03 > > Flops: 3.576e+09 1.00908 3.564e+09 5.702e+10 > > Flops/sec: 1.363e+08 1.00907 1.358e+08 2.172e+09 > > MPI Messages: 1.808e+04 2.74920 1.192e+04 1.907e+05 > > MPI Message Lengths: 4.500e+07 1.61013 3.219e+03 6.139e+08 > > MPI Reductions: 8.522e+03 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of length > N --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 2.6254e+01 100.0% 5.7023e+10 100.0% 1.907e+05 > 100.0% 3.219e+03 100.0% 8.521e+03 100.0% > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all > processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message > lengths in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > ------------------------------------------------------------ > ------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 1639 1.0 4.7509e+00 1.7 3.64e+08 1.1 1.9e+05 3.0e+03 > 0.0e+00 13 10100 93 0 13 10100 93 0 1209 > > MatSolve 1045 1.0 6.4188e-01 1.0 2.16e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 2 6 0 0 0 2 6 0 0 0 5163 > > MatLUFactorNum 1 1.0 2.0798e-02 3.5 9.18e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 664 > > MatILUFactorSym 1 1.0 1.1777e-02 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyBegin 4 1.0 1.3677e-01 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyEnd 4 1.0 3.7882e-02 1.3 0.00e+00 0.0 4.6e+02 7.5e+02 > 1.6e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 1 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetOrdering 1 1.0 2.3198e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatZeroEntries 33 1.0 1.1992e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatLoad 2 1.0 2.9271e-01 1.0 0.00e+00 0.0 5.5e+02 7.6e+04 > 2.6e+01 1 0 0 7 0 1 0 0 7 0 0 > > VecCopy 2096 1.0 4.0181e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 1047 1.0 1.7598e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1639 1.0 4.3395e-01 2.0 0.00e+00 0.0 1.9e+05 3.0e+03 > 0.0e+00 1 0100 93 0 1 0100 93 0 0 > > VecScatterEnd 1639 1.0 3.2399e+00 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 > > VecReduceArith 2096 1.0 5.6402e-02 1.1 3.27e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 9287 > > VecReduceComm 1572 1.0 5.5213e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.6e+03 20 0 0 0 18 20 0 0 0 18 0 > > EPSSetUp 1 1.0 9.0121e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 5.2e+01 0 0 0 0 1 0 0 0 0 1 0 > > EPSSolve 1 1.0 2.5917e+01 1.0 3.58e+09 1.0 1.9e+05 3.0e+03 > 8.5e+03 99100100 93100 99100100 93100 2200 > > STSetUp 1 1.0 4.8380e-03 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 1045 1.0 6.8107e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 3 6 0 0 0 3 6 0 0 0 4886 > > PCSetUp 2 1.0 2.3827e-02 2.8 9.18e+05 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 580 > > PCApply 1045 1.0 7.0819e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 3 6 0 0 0 3 6 0 0 0 4699 > > BVCreate 529 1.0 3.7145e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 3.2e+03 11 0 0 0 37 11 0 0 0 37 0 > > BVCopy 1048 1.0 1.3941e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > BVMult 3761 1.0 3.6953e+00 1.1 2.00e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 14 56 0 0 0 14 56 0 0 0 8674 > > BVDot 2675 1.0 9.6611e+00 1.3 1.08e+09 1.0 6.8e+04 3.0e+03 > 2.7e+03 34 30 36 33 31 34 30 36 33 31 1791 > > BVOrthogonalize 526 1.0 4.0705e+00 1.1 7.89e+08 1.0 6.8e+04 
3.0e+03 > 5.9e+02 15 22 36 33 7 15 22 36 33 7 3092 > > BVScale 1047 1.0 1.6144e-02 1.1 8.18e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 8105 > > BVSetRandom 5 1.0 4.7204e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > BVMatProject 1046 1.0 5.1708e+00 1.4 6.11e+08 1.0 0.0e+00 0.0e+00 > 1.6e+03 18 17 0 0 18 18 17 0 0 18 1891 > > DSSolve 533 1.0 9.7243e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 > > DSVectors 1048 1.0 1.3440e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > DSOther 2123 1.0 8.8778e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Viewer 3 2 1504 0 > > Matrix 3196 3190 31529868 0 > > Vector 2653 2651 218802920 0 > > Vector Scatter 2 0 0 0 > > Index Set 7 7 84184 0 > > Eigenvalue Problem Solver 1 1 4564 0 > > PetscRandom 1 1 632 0 > > Spectral Transform 1 1 828 0 > > Krylov Solver 2 2 2320 0 > > Preconditioner 2 2 1912 0 > > Basis Vectors 530 530 1111328 0 > > Region 1 1 648 0 > > Direct solver 1 1 201200 0 > > ============================================================ > ============================================================ > > Average time to get PetscTime(): 9.53674e-08 > > Average time for MPI_Barrier(): 0.0004704 > > Average time for zero size MPI_Send(): 0.000118256 > > #PETSc Option Table entries: > > -eps_monitor > > -f1 A.mat > > -f2 B.mat > > -log_view > > -matload_block_size 1 > > -ncv 70 > > -st_ksp_tol 1e-12 > > #End of PETSc Option Table entries > > Compiled without FORTRAN kernels > > Compiled with full precision matrices (default) > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > > Configure options: --with-x=0 --download-hdf5=1 --with-scalar-type=real > --with-single-library=1 --with-pic=1 --with-shared-libraries=0 > --with-clanguage=C++ --with-fortran=1 --with-debugging=0 --with-cc=mpicc > --with-fc=mpif90 --with-cxx=mpicxx --download-metis=1 --download-parmetis=1 > --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1/ --download-mumps=1 > --with-fortran-kernels=0 --download-superlu_dist=1 --download-scalapack > --download-fblaslapack=1 > > ----------------------------------------- > > Libraries compiled on Thu Sep 22 10:19:43 2016 on > carter-g008.rcac.purdue.edu > > Machine characteristics: Linux-2.6.32-573.8.1.el6.x86_ > 64-x86_64-with-redhat-6.7-Santiago > > Using PETSc directory: /depot/ncn/apps/conte/conte- > gcc-petsc35/libs/petsc/build-real > > Using PETSc arch: linux > > ----------------------------------------- > > > > Using C compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -O -fPIC ${COPTFLAGS} ${CFLAGS} > > Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable > -ffree-line-length-0 -O ${FOPTFLAGS} ${FFLAGS} > > ----------------------------------------- > > > > Using include paths: -I/depot/ncn/apps/conte/conte- > gcc-petsc35/libs/petsc/build-real/linux/include > -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include > -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include > -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include > 
-I/apps/rhel6/valgrind/3.8.1/include -I/depot/apps/ncn/conte/mpich- > 3.1/include > > ----------------------------------------- > > > > Using C linker: mpicxx > > Using Fortran linker: mpif90 > > Using libraries: -Wl,-rpath,/depot/ncn/apps/ > conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -lpetsc -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack > -lsuperlu_dist_3.3 -lflapack -lfblas -lparmetis -lmetis -lpthread -lssl > -lcrypto -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz > -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib > -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5. > 2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/ > gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 > -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib > -L/apps/rhel6/gcc/5.2.0/lib -lmpichf90 -lgfortran -lm -lgfortran -lm > -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib > -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5. > 2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/ > gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 > -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib > -L/apps/rhel6/gcc/5.2.0/lib -ldl -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib > -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > > ----------------------------------------- > > > > Can you please point me to what could be going wrong with the larger > software? > > > > Thanks! > > Harshad > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 3 22:28:03 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 4 May 2018 03:28:03 +0000 Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD In-Reply-To: References: <553A9019-E913-4716-816D-B646D06E1969@anl.gov> Message-ID: <1C4E12DD-E92D-433C-924F-1925730B4EA5@mcs.anl.gov> > On May 3, 2018, at 10:24 PM, Harshad Sahasrabudhe wrote: > > Hi Barry, > > There's an overflow in the division: > > Program received signal SIGFPE, Arithmetic exception. > 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > 1112 if (d->nR[i]/a < data->fix) { > > (gdb) p d->nR[i] > $7 = 1.7976931348623157e+308 > (gdb) p a > $8 = 0.15744695659409991 > > It looks like the residual is very high in the first GD iteration and rapidly drops in the second iteration > > 240 EPS nconv=0 first unconverged value (error) 0.0999172 (1.11889357e-12) > 241 EPS nconv=1 first unconverged value (error) 0.100311 (1.79769313e+308) Jumps like this usually indicate something is going very poorly in the algorithm. I hope the algorithmic experts have a chance to look at your case. Barry > 242 EPS nconv=1 first unconverged value (error) 0.100311 (2.39980067e-04) > > Thanks, > Harshad > > On Thu, May 3, 2018 at 11:11 PM, Smith, Barry F. 
wrote:

From harshad.sahasrabudhe at gmail.com Thu May 3 22:39:25 2018
From: harshad.sahasrabudhe at gmail.com (Harshad Sahasrabudhe)
Date: Thu, 3 May 2018 23:39:25 -0400
Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD
In-Reply-To: <1C4E12DD-E92D-433C-924F-1925730B4EA5@mcs.anl.gov>
References: <553A9019-E913-4716-816D-B646D06E1969@anl.gov>
	<1C4E12DD-E92D-433C-924F-1925730B4EA5@mcs.anl.gov>
Message-ID:

> Jumps like this usually indicate something is going very poorly in the
> algorithm. I hope the algorithmic experts have a chance to look at your
> case.

I think the residual of the first unconverged eigenvalue is set to INF on
purpose and not calculated in the iteration when a converged eigenvalue is
found. Dividing the INF residual is a mistake:

if (d->nR[i]/a < data->fix) {

I think your suggestion will certainly fix this issue:

if (d->nR[i] < a*data->fix) {

Just my guess.
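For reference, the failure mode can be reproduced outside SLEPc with a few lines of C. This is only a sketch: DBL_MAX stands in for the PETSC_MAX_REAL sentinel, the value of a comes from the gdb output above, the fix threshold value is hypothetical, and it assumes the larger application traps floating-point exceptions (for example via PETSc's -fp_trap), which is what turns the overflow into a SIGFPE; feenableexcept() is a glibc extension.

#define _GNU_SOURCE              /* for feenableexcept() on glibc */
#include <fenv.h>
#include <float.h>
#include <stdio.h>

int main(void)
{
  double nR  = DBL_MAX;               /* sentinel residual, the 1.7976931348623157e+308 printed by gdb */
  double a   = 0.15744695659409991;   /* the value of a printed by gdb */
  double fix = 0.01;                  /* hypothetical stand-in for data->fix */

  feenableexcept(FE_OVERFLOW | FE_DIVBYZERO);  /* mimic an application that traps FP exceptions */

  /* Multiplied form suggested above: no division, the sentinel is handled quietly. */
  printf("nR <  a*fix : %s\n", (nR < a*fix) ? "yes" : "no");

  /* Original form: DBL_MAX/a overflows to +inf and, with trapping enabled, raises SIGFPE here. */
  printf("nR/a < fix  : %s\n", (nR/a < fix) ? "yes" : "no");
  return 0;
}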
From harshad.sahasrabudhe at gmail.com Thu May 3 22:45:37 2018
From: harshad.sahasrabudhe at gmail.com (Harshad Sahasrabudhe)
Date: Thu, 3 May 2018 23:45:37 -0400
Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD
In-Reply-To:
References: <553A9019-E913-4716-816D-B646D06E1969@anl.gov>
	<1C4E12DD-E92D-433C-924F-1925730B4EA5@mcs.anl.gov>
Message-ID:

> I think the residual of the first unconverged eigenvalue is set to INF on
> purpose and not calculated in the iteration when a converged eigenvalue is
> found. Dividing the INF residual is a mistake:

Found the line where d->nR[i] is set to infinity:

src/eps/impls/davidson/common/dvd_updatev.c

static PetscErrorCode dvd_updateV_start(dvdDashboard *d)
..
124       for (i=0;i<d->eps->ncv;i++) d->nR[i] = PETSC_MAX_REAL;
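Since the sentinel is a finite PETSC_MAX_REAL rather than a true Inf, another way to write the guard is to test for the sentinel explicitly before comparing. The helper below is purely illustrative (the name residual_below_fix is made up, and this is not the actual SLEPc patch); it only uses types and constants from petscsys.h:

#include <petscsys.h>

/* PETSC_TRUE only when the residual has actually been computed this sweep
   and is below the "fix" threshold; the sentinel value is rejected outright. */
static PetscBool residual_below_fix(PetscReal nR,PetscReal a,PetscReal fix)
{
  if (nR >= PETSC_MAX_REAL) return PETSC_FALSE;  /* still the "not yet computed" sentinel */
  return (nR < a*fix) ? PETSC_TRUE : PETSC_FALSE;
}

The test in dvd_improvex_jd_lit_const_0 would then read something like
if (residual_below_fix(d->nR[i],a,data->fix)) { ... }, which avoids the
division entirely, just as the rewrite suggested earlier in the thread does.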
Use 'enscript >> -r -fCourier9' to print this document *** >> > > ************************************************************ >> ************************************************************ >> > > >> > > ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> > > >> > > ./libmesh_solve_eigenproblem on a linux named >> conte-a373.rcac.purdue.edu with 16 processors, by hsahasra Thu May 3 >> 20:56:03 2018 >> > > Using Petsc Release Version 3.5.4, May, 23, 2015 >> > > >> > > Max Max/Min Avg Total >> > > Time (sec): 2.628e+01 1.00158 2.625e+01 >> > > Objects: 6.400e+03 1.00000 6.400e+03 >> > > Flops: 3.576e+09 1.00908 3.564e+09 5.702e+10 >> > > Flops/sec: 1.363e+08 1.00907 1.358e+08 2.172e+09 >> > > MPI Messages: 1.808e+04 2.74920 1.192e+04 1.907e+05 >> > > MPI Message Lengths: 4.500e+07 1.61013 3.219e+03 6.139e+08 >> > > MPI Reductions: 8.522e+03 1.00000 >> > > >> > > Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> > > e.g., VecAXPY() for real vectors of >> length N --> 2N flops >> > > and VecAXPY() for complex vectors of >> length N --> 8N flops >> > > >> > > Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> > > Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> > > 0: Main Stage: 2.6254e+01 100.0% 5.7023e+10 100.0% 1.907e+05 >> 100.0% 3.219e+03 100.0% 8.521e+03 100.0% >> > > >> > > ------------------------------------------------------------ >> ------------------------------------------------------------ >> > > See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> > > Phase summary info: >> > > Count: number of times phase was executed >> > > Time and Flops: Max - maximum over all processors >> > > Ratio - ratio of maximum to minimum over all >> processors >> > > Mess: number of messages sent >> > > Avg. len: average message length (bytes) >> > > Reduct: number of global reductions >> > > Global: entire computation >> > > Stage: stages of a computation. Set stages with >> PetscLogStagePush() and PetscLogStagePop(). 
>> > > %T - percent time in this phase %F - percent flops in >> this phase >> > > %M - percent messages in this phase %L - percent message >> lengths in this phase >> > > %R - percent reductions in this phase >> > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >> time over all processors) >> > > ------------------------------------------------------------ >> ------------------------------------------------------------ >> > > Event Count Time (sec) Flops >> --- Global --- --- Stage --- Total >> > > Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> > > ------------------------------------------------------------ >> ------------------------------------------------------------ >> > > >> > > --- Event Stage 0: Main Stage >> > > >> > > MatMult 1639 1.0 4.7509e+00 1.7 3.64e+08 1.1 1.9e+05 >> 3.0e+03 0.0e+00 13 10100 93 0 13 10100 93 0 1209 >> > > MatSolve 1045 1.0 6.4188e-01 1.0 2.16e+08 1.1 0.0e+00 >> 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 5163 >> > > MatLUFactorNum 1 1.0 2.0798e-02 3.5 9.18e+05 1.1 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 664 >> > > MatILUFactorSym 1 1.0 1.1777e-02 5.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > MatAssemblyBegin 4 1.0 1.3677e-01 6.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > MatAssemblyEnd 4 1.0 3.7882e-02 1.3 0.00e+00 0.0 4.6e+02 >> 7.5e+02 1.6e+01 0 0 0 0 0 0 0 0 0 0 0 >> > > MatGetRowIJ 1 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > MatGetOrdering 1 1.0 2.3198e-04 1.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > MatZeroEntries 33 1.0 1.1992e-04 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > MatLoad 2 1.0 2.9271e-01 1.0 0.00e+00 0.0 5.5e+02 >> 7.6e+04 2.6e+01 1 0 0 7 0 1 0 0 7 0 0 >> > > VecCopy 2096 1.0 4.0181e-02 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > VecSet 1047 1.0 1.7598e-02 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > VecScatterBegin 1639 1.0 4.3395e-01 2.0 0.00e+00 0.0 1.9e+05 >> 3.0e+03 0.0e+00 1 0100 93 0 1 0100 93 0 0 >> > > VecScatterEnd 1639 1.0 3.2399e+00 2.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 >> > > VecReduceArith 2096 1.0 5.6402e-02 1.1 3.27e+07 1.0 0.0e+00 >> 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9287 >> > > VecReduceComm 1572 1.0 5.5213e+00 1.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.6e+03 20 0 0 0 18 20 0 0 0 18 0 >> > > EPSSetUp 1 1.0 9.0121e-02 1.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 5.2e+01 0 0 0 0 1 0 0 0 0 1 0 >> > > EPSSolve 1 1.0 2.5917e+01 1.0 3.58e+09 1.0 1.9e+05 >> 3.0e+03 8.5e+03 99100100 93100 99100100 93100 2200 >> > > STSetUp 1 1.0 4.8380e-03 5.6 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > KSPSolve 1045 1.0 6.8107e-01 1.0 2.17e+08 1.1 0.0e+00 >> 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4886 >> > > PCSetUp 2 1.0 2.3827e-02 2.8 9.18e+05 1.1 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 580 >> > > PCApply 1045 1.0 7.0819e-01 1.0 2.17e+08 1.1 0.0e+00 >> 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4699 >> > > BVCreate 529 1.0 3.7145e+00 1.6 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.2e+03 11 0 0 0 37 11 0 0 0 37 0 >> > > BVCopy 1048 1.0 1.3941e-02 1.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > BVMult 3761 1.0 3.6953e+00 1.1 2.00e+09 1.0 0.0e+00 >> 0.0e+00 0.0e+00 14 56 0 0 0 14 56 0 0 0 8674 >> > > BVDot 2675 1.0 9.6611e+00 
1.3 1.08e+09 1.0 6.8e+04 >> 3.0e+03 2.7e+03 34 30 36 33 31 34 30 36 33 31 1791 >> > > BVOrthogonalize 526 1.0 4.0705e+00 1.1 7.89e+08 1.0 6.8e+04 >> 3.0e+03 5.9e+02 15 22 36 33 7 15 22 36 33 7 3092 >> > > BVScale 1047 1.0 1.6144e-02 1.1 8.18e+06 1.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8105 >> > > BVSetRandom 5 1.0 4.7204e-02 2.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > BVMatProject 1046 1.0 5.1708e+00 1.4 6.11e+08 1.0 0.0e+00 >> 0.0e+00 1.6e+03 18 17 0 0 18 18 17 0 0 18 1891 >> > > DSSolve 533 1.0 9.7243e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 >> > > DSVectors 1048 1.0 1.3440e-03 1.4 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > DSOther 2123 1.0 8.8778e-03 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> > > ------------------------------------------------------------ >> ------------------------------------------------------------ >> > > >> > > Memory usage is given in bytes: >> > > >> > > Object Type Creations Destructions Memory >> Descendants' Mem. >> > > Reports information only for process 0. >> > > >> > > --- Event Stage 0: Main Stage >> > > >> > > Viewer 3 2 1504 0 >> > > Matrix 3196 3190 31529868 0 >> > > Vector 2653 2651 218802920 0 >> > > Vector Scatter 2 0 0 0 >> > > Index Set 7 7 84184 0 >> > > Eigenvalue Problem Solver 1 1 4564 0 >> > > PetscRandom 1 1 632 0 >> > > Spectral Transform 1 1 828 0 >> > > Krylov Solver 2 2 2320 0 >> > > Preconditioner 2 2 1912 0 >> > > Basis Vectors 530 530 1111328 0 >> > > Region 1 1 648 0 >> > > Direct solver 1 1 201200 0 >> > > ============================================================ >> ============================================================ >> > > Average time to get PetscTime(): 9.53674e-08 >> > > Average time for MPI_Barrier(): 0.0004704 >> > > Average time for zero size MPI_Send(): 0.000118256 >> > > #PETSc Option Table entries: >> > > -eps_monitor >> > > -f1 A.mat >> > > -f2 B.mat >> > > -log_view >> > > -matload_block_size 1 >> > > -ncv 70 >> > > -st_ksp_tol 1e-12 >> > > #End of PETSc Option Table entries >> > > Compiled without FORTRAN kernels >> > > Compiled with full precision matrices (default) >> > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >> > > Configure options: --with-x=0 --download-hdf5=1 >> --with-scalar-type=real --with-single-library=1 --with-pic=1 >> --with-shared-libraries=0 --with-clanguage=C++ --with-fortran=1 >> --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx >> --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1/ >> --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 >> --download-scalapack --download-fblaslapack=1 >> > > ----------------------------------------- >> > > Libraries compiled on Thu Sep 22 10:19:43 2016 on >> carter-g008.rcac.purdue.edu >> > > Machine characteristics: Linux-2.6.32-573.8.1.el6.x86_6 >> 4-x86_64-with-redhat-6.7-Santiago >> > > Using PETSc directory: /depot/ncn/apps/conte/conte-gc >> c-petsc35/libs/petsc/build-real >> > > Using PETSc arch: linux >> > > ----------------------------------------- >> > > >> > > Using C compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -O -fPIC ${COPTFLAGS} ${CFLAGS} >> > > Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable >> -ffree-line-length-0 -O ${FOPTFLAGS} ${FFLAGS} >> > > ----------------------------------------- >> > > >> > > Using include paths: 
-I/depot/ncn/apps/conte/conte- >> gcc-petsc35/libs/petsc/build-real/linux/include >> -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include >> -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include >> -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include >> -I/apps/rhel6/valgrind/3.8.1/include -I/depot/apps/ncn/conte/mpich- >> 3.1/include >> > > ----------------------------------------- >> > > >> > > Using C linker: mpicxx >> > > Using Fortran linker: mpif90 >> > > Using libraries: -Wl,-rpath,/depot/ncn/apps/con >> te/conte-gcc-petsc35/libs/petsc/build-real/linux/lib >> -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib >> -lpetsc -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib >> -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib >> -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack >> -lsuperlu_dist_3.3 -lflapack -lfblas -lparmetis -lmetis -lpthread -lssl >> -lcrypto -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >> -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib >> -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2 >> .0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gc >> c/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 >> -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib >> -L/apps/rhel6/gcc/5.2.0/lib -lmpichf90 -lgfortran -lm -lgfortran -lm >> -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib >> -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2 >> .0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gc >> c/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 >> -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib >> -L/apps/rhel6/gcc/5.2.0/lib -ldl -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib >> -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl >> > > ----------------------------------------- >> > > >> > > Can you please point me to what could be going wrong with the larger >> software? >> > > >> > > Thanks! >> > > Harshad >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri May 4 08:37:32 2018 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 4 May 2018 15:37:32 +0200 Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD In-Reply-To: References: <553A9019-E913-4716-816D-B646D06E1969@anl.gov> <1C4E12DD-E92D-433C-924F-1925730B4EA5@mcs.anl.gov> Message-ID: That version is 4-year old, I recommend you to upgrade to the latest version. We have made the change suggested by Barry. It will be included in 3.9.2. Jose > El 4 may 2018, a las 5:45, Harshad Sahasrabudhe escribi?: > > I think the residual of the first unconverged eigenvalue is set to INF on purpose and not calculated in the iteration when a converged eigenvalue is found. Dividing the INF residual is a mistake: > > Found the line where d->nR[i] is set to infinity: > > src/eps/impls/davidson/common/dvd_updatev.c > static PetscErrorCode dvd_updateV_start(dvdDashboard *d) > .. > 124 for (i=0;ieps->ncv;i++) d->nR[i] = PETSC_MAX_REAL; > > > On Thu, May 3, 2018 at 11:39 PM, Harshad Sahasrabudhe wrote: > Jumps like this usually indicate something is going very poorly in the algorithm. I hope the algorithmic experts have a chance to look at your case. 
> > I think the residual of the first unconverged eigenvalue is set to INF on purpose and not calculated in the iteration when a converged eigenvalue is found. Dividing the INF residual is a mistake: > > if (d->nR[i]/a < data->fix) { > > I think your suggestion will certainly fix this issue: > > if (d->nR[i] < a*data->fix) { > > Just my guess. > > > On Thu, May 3, 2018 at 11:28 PM, Smith, Barry F. wrote: > > > > On May 3, 2018, at 10:24 PM, Harshad Sahasrabudhe wrote: > > > > Hi Barry, > > > > There's an overflow in the division: > > > > Program received signal SIGFPE, Arithmetic exception. > > 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > 1112 if (d->nR[i]/a < data->fix) { > > > > (gdb) p d->nR[i] > > $7 = 1.7976931348623157e+308 > > (gdb) p a > > $8 = 0.15744695659409991 > > > > It looks like the residual is very high in the first GD iteration and rapidly drops in the second iteration > > > > 240 EPS nconv=0 first unconverged value (error) 0.0999172 (1.11889357e-12) > > 241 EPS nconv=1 first unconverged value (error) 0.100311 (1.79769313e+308) > > Jumps like this usually indicate something is going very poorly in the algorithm. I hope the algorithmic experts have a chance to look at your case. > > Barry > > > 242 EPS nconv=1 first unconverged value (error) 0.100311 (2.39980067e-04) > > > > Thanks, > > Harshad > > > > On Thu, May 3, 2018 at 11:11 PM, Smith, Barry F. wrote: > > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > 1112 if (d->nR[i]/a < data->fix) { > > > > Likely the problem is due to the division by a when a is zero. Perhaps the code needs above a check that a is not zero. Or rewrite the check as > > > > if (d->nR[i] < a*data->fix) { > > > > Barry > > > > > > > On May 3, 2018, at 7:58 PM, Harshad Sahasrabudhe wrote: > > > > > > Hello, > > > > > > I am solving for the lowest eigenvalues and eigenvectors of symmetric positive definite matrices in the generalized eigenvalue problem. I am using the GD solver with the default settings of PCBJACOBI. When I run a standalone executable on 16 processes which loads the matrices from a file and solves the eigenproblem, I get converged results in ~600 iterations. I am using PETSc/SLEPc 3.5.4. > > > > > > However, when I use the same settings in my software, which uses LibMesh (0.9.5) for FEM discretization, I get a SIGFPE. The backtrace is: > > > > > > Program received signal SIGFPE, Arithmetic exception. 
> > > 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > > 1112 if (d->nR[i]/a < data->fix) { > > > > > > #0 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, tol=0x7fffffff8140) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > > #1 0x00002aaab3774316 in dvd_improvex_jd_gen (d=0x1d29078, r_s=0, r_e=1, size_D=0x7fffffff821c) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_improvex.c:316 > > > #2 0x00002aaab3731ec4 in dvd_updateV_update_gen (d=0x1d29078) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_updatev.c:360 > > > #3 0x00002aaab3730296 in dvd_updateV_extrapol (d=0x1d29078) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/dvd_updatev.c:193 > > > #4 0x00002aaab3727cbc in EPSSolve_XD (eps=0x1d0ee10) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/davidson/common/davidson.c:299 > > > #5 0x00002aaab35bafc8 in EPSSolve (eps=0x1d0ee10) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/build-real/src/eps/interface/epssolve.c:99 > > > #6 0x00002aaab30dbaf9 in libMesh::SlepcEigenSolver::_solve_generalized_helper (this=0x1b19880, mat_A=0x1c906d0, mat_B=0x1cb16b0, nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_solver.C:519 > > > #7 0x00002aaab30da56a in libMesh::SlepcEigenSolver::solve_generalized (this=0x1b19880, matrix_A_in=..., matrix_B_in=..., nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_solver.C:316 > > > #8 0x00002aaab30fb02e in libMesh::EigenSystem::solve (this=0x1b19930) at src/systems/eigen_system.C:241 > > > #9 0x00002aaab30e48a9 in libMesh::CondensedEigenSystem::solve (this=0x1b19930) at src/systems/condensed_eigen_system.C:106 > > > #10 0x00002aaaacce0e78 in EMSchrodingerFEM::do_solve (this=0x19d6a90) at EMSchrodingerFEM.cpp:879 > > > #11 0x00002aaaadaae3e5 in Simulation::solve (this=0x19d6a90) at Simulation.cpp:789 > > > #12 0x00002aaaad52458b in NonlinearPoissonFEM::do_my_assemble (this=0x19da050, x=..., residual=0x7fffffff9eb0, jacobian=0x0) at NonlinearPoissonFEM.cpp:179 > > > #13 0x00002aaaad555eec in NonlinearPoisson::my_assemble_residual (x=..., r=..., s=...) 
at NonlinearPoisson.cpp:1469 > > > #14 0x00002aaab30c5dc3 in libMesh::__libmesh_petsc_snes_residual (snes=0x1b9ed70, x=0x1a50330, r=0x1a47a50, ctx=0x19e5a60) at src/solvers/petsc_nonlinear_solver.C:137 > > > #15 0x00002aaab41048b9 in SNESComputeFunction (snes=0x1b9ed70, x=0x1a50330, y=0x1a47a50) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:2033 > > > #16 0x00002aaaad1c9ad8 in SNESShellSolve_PredictorCorrector (snes=0x1b9ed70, vec_sol=0x1a2a5a0) at PredictorCorrectorModule.cpp:413 > > > #17 0x00002aaab4653e3d in SNESSolve_Shell (snes=0x1b9ed70) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/shell/snesshell.c:167 > > > #18 0x00002aaab4116fb7 in SNESSolve (snes=0x1b9ed70, b=0x0, x=0x1a2a5a0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 > > > #19 0x00002aaab30c7c3c in libMesh::PetscNonlinearSolver::solve (this=0x19e5a60, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 > > > #20 0x00002aaab3136ad9 in libMesh::NonlinearImplicitSystem::solve (this=0x19e4b80) at src/systems/nonlinear_implicit_system.C:183 > > > #21 0x00002aaaad5791f3 in NonlinearPoisson::execute_solver (this=0x19da050) at NonlinearPoisson.cpp:1218 > > > #22 0x00002aaaad554a99 in NonlinearPoisson::do_solve (this=0x19da050) at NonlinearPoisson.cpp:961 > > > #23 0x00002aaaadaae3e5 in Simulation::solve (this=0x19da050) at Simulation.cpp:789 > > > #24 0x00002aaaad1c9657 in PredictorCorrectorModule::do_solve (this=0x19c0210) at PredictorCorrectorModule.cpp:334 > > > #25 0x00002aaaadaae3e5 in Simulation::solve (this=0x19c0210) at Simulation.cpp:789 > > > #26 0x00002aaaad9e8f4a in Nemo::run_simulations (this=0x63ba80 ) at Nemo.cpp:1367 > > > #27 0x0000000000426f36 in main (argc=2, argv=0x7fffffffd0f8) at main.cpp:452 > > > > > > > > > Here is the log_view from the standalone executable: > > > > > > ************************************************************************************************************************ > > > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** > > > ************************************************************************************************************************ > > > > > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > > > > > ./libmesh_solve_eigenproblem on a linux named conte-a373.rcac.purdue.edu with 16 processors, by hsahasra Thu May 3 20:56:03 2018 > > > Using Petsc Release Version 3.5.4, May, 23, 2015 > > > > > > Max Max/Min Avg Total > > > Time (sec): 2.628e+01 1.00158 2.625e+01 > > > Objects: 6.400e+03 1.00000 6.400e+03 > > > Flops: 3.576e+09 1.00908 3.564e+09 5.702e+10 > > > Flops/sec: 1.363e+08 1.00907 1.358e+08 2.172e+09 > > > MPI Messages: 1.808e+04 2.74920 1.192e+04 1.907e+05 > > > MPI Message Lengths: 4.500e+07 1.61013 3.219e+03 6.139e+08 > > > MPI Reductions: 8.522e+03 1.00000 > > > > > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > > > e.g., VecAXPY() for real vectors of length N --> 2N flops > > > and VecAXPY() for complex vectors of length N --> 8N flops > > > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > > > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > > > 0: Main Stage: 2.6254e+01 100.0% 5.7023e+10 100.0% 1.907e+05 100.0% 3.219e+03 100.0% 8.521e+03 100.0% > > > > > > ------------------------------------------------------------------------------------------------------------------------ > > > See the 'Profiling' chapter of the users' manual for details on interpreting output. > > > Phase summary info: > > > Count: number of times phase was executed > > > Time and Flops: Max - maximum over all processors > > > Ratio - ratio of maximum to minimum over all processors > > > Mess: number of messages sent > > > Avg. len: average message length (bytes) > > > Reduct: number of global reductions > > > Global: entire computation > > > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> > > %T - percent time in this phase %F - percent flops in this phase > > > %M - percent messages in this phase %L - percent message lengths in this phase > > > %R - percent reductions in this phase > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > > > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > --- Event Stage 0: Main Stage > > > > > > MatMult 1639 1.0 4.7509e+00 1.7 3.64e+08 1.1 1.9e+05 3.0e+03 0.0e+00 13 10100 93 0 13 10100 93 0 1209 > > > MatSolve 1045 1.0 6.4188e-01 1.0 2.16e+08 1.1 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 5163 > > > MatLUFactorNum 1 1.0 2.0798e-02 3.5 9.18e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 664 > > > MatILUFactorSym 1 1.0 1.1777e-02 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatAssemblyBegin 4 1.0 1.3677e-01 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatAssemblyEnd 4 1.0 3.7882e-02 1.3 0.00e+00 0.0 4.6e+02 7.5e+02 1.6e+01 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetRowIJ 1 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetOrdering 1 1.0 2.3198e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatZeroEntries 33 1.0 1.1992e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatLoad 2 1.0 2.9271e-01 1.0 0.00e+00 0.0 5.5e+02 7.6e+04 2.6e+01 1 0 0 7 0 1 0 0 7 0 0 > > > VecCopy 2096 1.0 4.0181e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecSet 1047 1.0 1.7598e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecScatterBegin 1639 1.0 4.3395e-01 2.0 0.00e+00 0.0 1.9e+05 3.0e+03 0.0e+00 1 0100 93 0 1 0100 93 0 0 > > > VecScatterEnd 1639 1.0 3.2399e+00 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 > > > VecReduceArith 2096 1.0 5.6402e-02 1.1 3.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9287 > > > VecReduceComm 1572 1.0 5.5213e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+03 20 0 0 0 18 20 0 0 0 18 0 > > > EPSSetUp 1 1.0 9.0121e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.2e+01 0 0 0 0 1 0 0 0 0 1 0 > > > EPSSolve 1 1.0 2.5917e+01 1.0 3.58e+09 1.0 1.9e+05 3.0e+03 8.5e+03 99100100 93100 99100100 93100 2200 > > > STSetUp 1 1.0 4.8380e-03 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSolve 1045 1.0 6.8107e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4886 > > > PCSetUp 2 1.0 2.3827e-02 2.8 9.18e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 580 > > > PCApply 1045 1.0 7.0819e-01 1.0 2.17e+08 1.1 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4699 > > > BVCreate 529 1.0 3.7145e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+03 11 0 0 0 37 11 0 0 0 37 0 > > > BVCopy 1048 1.0 1.3941e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > BVMult 3761 1.0 3.6953e+00 1.1 2.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 56 0 0 0 14 56 0 0 0 8674 > > > BVDot 2675 1.0 9.6611e+00 1.3 1.08e+09 1.0 6.8e+04 3.0e+03 2.7e+03 34 30 36 33 31 34 30 36 33 31 1791 > > > BVOrthogonalize 526 1.0 4.0705e+00 1.1 7.89e+08 1.0 
6.8e+04 3.0e+03 5.9e+02 15 22 36 33 7 15 22 36 33 7 3092 > > > BVScale 1047 1.0 1.6144e-02 1.1 8.18e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8105 > > > BVSetRandom 5 1.0 4.7204e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > BVMatProject 1046 1.0 5.1708e+00 1.4 6.11e+08 1.0 0.0e+00 0.0e+00 1.6e+03 18 17 0 0 18 18 17 0 0 18 1891 > > > DSSolve 533 1.0 9.7243e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 > > > DSVectors 1048 1.0 1.3440e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > DSOther 2123 1.0 8.8778e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > Memory usage is given in bytes: > > > > > > Object Type Creations Destructions Memory Descendants' Mem. > > > Reports information only for process 0. > > > > > > --- Event Stage 0: Main Stage > > > > > > Viewer 3 2 1504 0 > > > Matrix 3196 3190 31529868 0 > > > Vector 2653 2651 218802920 0 > > > Vector Scatter 2 0 0 0 > > > Index Set 7 7 84184 0 > > > Eigenvalue Problem Solver 1 1 4564 0 > > > PetscRandom 1 1 632 0 > > > Spectral Transform 1 1 828 0 > > > Krylov Solver 2 2 2320 0 > > > Preconditioner 2 2 1912 0 > > > Basis Vectors 530 530 1111328 0 > > > Region 1 1 648 0 > > > Direct solver 1 1 201200 0 > > > ======================================================================================================================== > > > Average time to get PetscTime(): 9.53674e-08 > > > Average time for MPI_Barrier(): 0.0004704 > > > Average time for zero size MPI_Send(): 0.000118256 > > > #PETSc Option Table entries: > > > -eps_monitor > > > -f1 A.mat > > > -f2 B.mat > > > -log_view > > > -matload_block_size 1 > > > -ncv 70 > > > -st_ksp_tol 1e-12 > > > #End of PETSc Option Table entries > > > Compiled without FORTRAN kernels > > > Compiled with full precision matrices (default) > > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > > > Configure options: --with-x=0 --download-hdf5=1 --with-scalar-type=real --with-single-library=1 --with-pic=1 --with-shared-libraries=0 --with-clanguage=C++ --with-fortran=1 --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1/ --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 --download-scalapack --download-fblaslapack=1 > > > ----------------------------------------- > > > Libraries compiled on Thu Sep 22 10:19:43 2016 on carter-g008.rcac.purdue.edu > > > Machine characteristics: Linux-2.6.32-573.8.1.el6.x86_64-x86_64-with-redhat-6.7-Santiago > > > Using PETSc directory: /depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real > > > Using PETSc arch: linux > > > ----------------------------------------- > > > > > > Using C compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O -fPIC ${COPTFLAGS} ${CFLAGS} > > > Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -O ${FOPTFLAGS} ${FFLAGS} > > > ----------------------------------------- > > > > > > Using include paths: -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include 
-I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include -I/apps/rhel6/valgrind/3.8.1/include -I/depot/apps/ncn/conte/mpich-3.1/include > > > ----------------------------------------- > > > > > > Using C linker: mpicxx > > > Using Fortran linker: mpif90 > > > Using libraries: -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -lpetsc -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_dist_3.3 -lflapack -lfblas -lparmetis -lmetis -lpthread -lssl -lcrypto -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib -L/apps/rhel6/gcc/5.2.0/lib -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib -L/apps/rhel6/gcc/5.2.0/lib -ldl -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > > > ----------------------------------------- > > > > > > Can you please point me to what could be going wrong with the larger software? > > > > > > Thanks! > > > Harshad > > > > > > > From ys453 at cam.ac.uk Fri May 4 08:40:04 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 04 May 2018 14:40:04 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation Message-ID: Dear PETSc users, I am currently using MUMPS to solve linear systems directly. Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing step and then solve the system. In my code, the values in the matrix is changed in each iteration, but the structure of the matrix stays the same, which means the performance can be improved if symbolic factorisation is only performed once. Hence, it is necessary to split the symbolic and numeric factorisation. However, I cannot find a specific step (control parameter) to perform the numeric factorisation. I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, it seems that the symbolic and numeric factorisation always perform together. So I am wondering if anyone has an idea about it. 
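For reference, the kind of split I am after is the one in the plain MUMPS calling
sequence, sketched below. This is only a rough sketch modelled on the MUMPS
c_example driver; the field names, the ICNTL macro and the tiny 2x2 test matrix
are illustrative assumptions rather than my actual code. JOB=1 performs the
analysis (the symbolic factorisation) once, and JOB=2 then repeats only the
numeric factorisation whenever the matrix values change:

/* sketch: analysis once, numeric factorisation repeated per iteration */
#include <mpi.h>
#include "dmumps_c.h"
#define JOB_INIT       -1
#define JOB_END        -2
#define USE_COMM_WORLD -987654
#define ICNTL(I)       icntl[(I)-1]   /* 1-based access to the control array */

int main(int argc, char **argv)
{
  DMUMPS_STRUC_C id;
  /* tiny 2x2 test system in coordinate format (1-based indices) */
  MUMPS_INT n = 2, nz = 2;
  MUMPS_INT irn[2] = {1, 2}, jcn[2] = {1, 2};
  double    a[2]   = {1.0, 2.0};
  double    rhs[2];
  int       it;

  MPI_Init(&argc, &argv);
  id.comm_fortran = USE_COMM_WORLD;
  id.par = 1;  id.sym = 0;
  id.job = JOB_INIT;
  dmumps_c(&id);                  /* initialise the MUMPS instance */

  id.n = n;  id.nz = nz;  id.irn = irn;  id.jcn = jcn;
  id.ICNTL(7) = 0;                /* ordering choice */
  id.job = 1;                     /* analysis = symbolic factorisation, done once */
  dmumps_c(&id);

  for (it = 0; it < 3; it++) {
    a[0] = 1.0 + it;              /* values change, the structure does not */
    id.a = a;
    id.job = 2;                   /* numeric factorisation only */
    dmumps_c(&id);

    rhs[0] = 1.0;  rhs[1] = 4.0;  /* solution overwrites rhs after the solve */
    id.rhs = rhs;
    id.job = 3;                   /* solve with the current factors */
    dmumps_c(&id);
  }

  id.job = JOB_END;
  dmumps_c(&id);                  /* release MUMPS internal data */
  MPI_Finalize();
  return 0;
}

What I cannot find is the equivalent control of this split through the PETSc
interface.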
Below is how I set up MUMPS solver: PC pc; PetscBool flg_mumps, flg_mumps_ch; flg_mumps = PETSC_FALSE; flg_mumps_ch = PETSC_FALSE; PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, NULL); PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, NULL); if(flg_mumps ||flg_mumps_ch) { KSPSetType(_ksp, KSPPREONLY); PetscInt ival,icntl; PetscReal val; KSPGetPC(_ksp, &pc); /// Set preconditioner type if(flg_mumps) { PCSetType(pc, PCLU); } else if(flg_mumps_ch) { MatSetOption(A, MAT_SPD, PETSC_TRUE); PCSetType(pc, PCCHOLESKY); } PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); PCFactorSetUpMatSolverPackage(pc); PCFactorGetMatrix(pc, &_F); icntl = 7; ival = 0; MatMumpsSetIcntl( _F, icntl, ival ); MatMumpsSetIcntl(_F, 3, 6); MatMumpsSetIcntl(_F, 4, 2); } KSPSetUp(_ksp); Kind Regards, Shidi From knepley at gmail.com Fri May 4 08:44:18 2018 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 4 May 2018 09:44:18 -0400 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: Message-ID: On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: > Dear PETSc users, > > I am currently using MUMPS to solve linear systems directly. > Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing > step and then solve the system. > > In my code, the values in the matrix is changed in each iteration, > but the structure of the matrix stays the same, which means the > performance can be improved if symbolic factorisation is only > performed once. Hence, it is necessary to split the symbolic > and numeric factorisation. However, I cannot find a specific step > (control parameter) to perform the numeric factorisation. > I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, > it seems that the symbolic and numeric factorisation always perform > together. > If you use KSPSolve instead, it will automatically preserve the symbolic factorization. Thanks, Matt > So I am wondering if anyone has an idea about it. > > Below is how I set up MUMPS solver: > PC pc; > PetscBool flg_mumps, flg_mumps_ch; > flg_mumps = PETSC_FALSE; > flg_mumps_ch = PETSC_FALSE; > PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, NULL); > PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, NULL); > if(flg_mumps ||flg_mumps_ch) > { > KSPSetType(_ksp, KSPPREONLY); > PetscInt ival,icntl; > PetscReal val; > KSPGetPC(_ksp, &pc); > /// Set preconditioner type > if(flg_mumps) > { > PCSetType(pc, PCLU); > } > else if(flg_mumps_ch) > { > MatSetOption(A, MAT_SPD, PETSC_TRUE); > PCSetType(pc, PCCHOLESKY); > } > PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(pc); > PCFactorGetMatrix(pc, &_F); > icntl = 7; ival = 0; > MatMumpsSetIcntl( _F, icntl, ival ); > MatMumpsSetIcntl(_F, 3, 6); > MatMumpsSetIcntl(_F, 4, 2); > } > KSPSetUp(_ksp); > > Kind Regards, > Shidi > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From harshad.sahasrabudhe at gmail.com Fri May 4 14:12:07 2018 From: harshad.sahasrabudhe at gmail.com (Harshad Sahasrabudhe) Date: Fri, 4 May 2018 15:12:07 -0400 Subject: [petsc-users] [SLEPc] SIGFPE Arithmetic exception in EPSGD In-Reply-To: References: <553A9019-E913-4716-816D-B646D06E1969@anl.gov> <1C4E12DD-E92D-433C-924F-1925730B4EA5@mcs.anl.gov> Message-ID: Thanks. 
We will plan an upgrade to 3.9.2. On Fri, May 4, 2018 at 9:37 AM, Jose E. Roman wrote: > That version is 4-year old, I recommend you to upgrade to the latest > version. > We have made the change suggested by Barry. It will be included in 3.9.2. > Jose > > > > El 4 may 2018, a las 5:45, Harshad Sahasrabudhe < > harshad.sahasrabudhe at gmail.com> escribi?: > > > > I think the residual of the first unconverged eigenvalue is set to INF > on purpose and not calculated in the iteration when a converged eigenvalue > is found. Dividing the INF residual is a mistake: > > > > Found the line where d->nR[i] is set to infinity: > > > > src/eps/impls/davidson/common/dvd_updatev.c > > static PetscErrorCode dvd_updateV_start(dvdDashboard *d) > > .. > > 124 for (i=0;ieps->ncv;i++) d->nR[i] = PETSC_MAX_REAL; > > > > > > On Thu, May 3, 2018 at 11:39 PM, Harshad Sahasrabudhe < > harshad.sahasrabudhe at gmail.com> wrote: > > Jumps like this usually indicate something is going very poorly in the > algorithm. I hope the algorithmic experts have a chance to look at your > case. > > > > I think the residual of the first unconverged eigenvalue is set to INF > on purpose and not calculated in the iteration when a converged eigenvalue > is found. Dividing the INF residual is a mistake: > > > > if (d->nR[i]/a < data->fix) { > > > > I think your suggestion will certainly fix this issue: > > > > if (d->nR[i] < a*data->fix) { > > > > Just my guess. > > > > > > On Thu, May 3, 2018 at 11:28 PM, Smith, Barry F. > wrote: > > > > > > > On May 3, 2018, at 10:24 PM, Harshad Sahasrabudhe < > harshad.sahasrabudhe at gmail.com> wrote: > > > > > > Hi Barry, > > > > > > There's an overflow in the division: > > > > > > Program received signal SIGFPE, Arithmetic exception. > > > 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, > theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, > tol=0x7fffffff8140) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > > 1112 if (d->nR[i]/a < data->fix) { > > > > > > (gdb) p d->nR[i] > > > $7 = 1.7976931348623157e+308 > > > (gdb) p a > > > $8 = 0.15744695659409991 > > > > > > It looks like the residual is very high in the first GD iteration and > rapidly drops in the second iteration > > > > > > 240 EPS nconv=0 first unconverged value (error) 0.0999172 > (1.11889357e-12) > > > 241 EPS nconv=1 first unconverged value (error) 0.100311 > (1.79769313e+308) > > > > Jumps like this usually indicate something is going very poorly in the > algorithm. I hope the algorithmic experts have a chance to look at your > case. > > > > Barry > > > > > 242 EPS nconv=1 first unconverged value (error) 0.100311 > (2.39980067e-04) > > > > > > Thanks, > > > Harshad > > > > > > On Thu, May 3, 2018 at 11:11 PM, Smith, Barry F. > wrote: > > > > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > > 1112 if (d->nR[i]/a < data->fix) { > > > > > > Likely the problem is due to the division by a when a is zero. Perhaps > the code needs above a check that a is not zero. Or rewrite the check as > > > > > > if (d->nR[i] < a*data->fix) { > > > > > > Barry > > > > > > > > > > On May 3, 2018, at 7:58 PM, Harshad Sahasrabudhe < > harshad.sahasrabudhe at gmail.com> wrote: > > > > > > > > Hello, > > > > > > > > I am solving for the lowest eigenvalues and eigenvectors of > symmetric positive definite matrices in the generalized eigenvalue problem. 
> I am using the GD solver with the default settings of PCBJACOBI. When I run > a standalone executable on 16 processes which loads the matrices from a > file and solves the eigenproblem, I get converged results in ~600 > iterations. I am using PETSc/SLEPc 3.5.4. > > > > > > > > However, when I use the same settings in my software, which uses > LibMesh (0.9.5) for FEM discretization, I get a SIGFPE. The backtrace is: > > > > > > > > Program received signal SIGFPE, Arithmetic exception. > > > > 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, i=0, > theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, > tol=0x7fffffff8140) > > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > > > 1112 if (d->nR[i]/a < data->fix) { > > > > > > > > #0 0x00002aaab377ea26 in dvd_improvex_jd_lit_const_0 (d=0x1d29078, > i=0, theta=0x1f396f8, thetai=0x1f39718, maxits=0x7fffffff816c, > tol=0x7fffffff8140) > > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_improvex.c:1112 > > > > #1 0x00002aaab3774316 in dvd_improvex_jd_gen (d=0x1d29078, r_s=0, > r_e=1, size_D=0x7fffffff821c) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/slepc/build-real/src/eps/impls/ > davidson/common/dvd_improvex.c:316 > > > > #2 0x00002aaab3731ec4 in dvd_updateV_update_gen (d=0x1d29078) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_updatev.c:360 > > > > #3 0x00002aaab3730296 in dvd_updateV_extrapol (d=0x1d29078) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/dvd_updatev.c:193 > > > > #4 0x00002aaab3727cbc in EPSSolve_XD (eps=0x1d0ee10) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/impls/davidson/common/davidson.c:299 > > > > #5 0x00002aaab35bafc8 in EPSSolve (eps=0x1d0ee10) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/slepc/ > build-real/src/eps/interface/epssolve.c:99 > > > > #6 0x00002aaab30dbaf9 in libMesh::SlepcEigenSolver< > double>::_solve_generalized_helper (this=0x1b19880, mat_A=0x1c906d0, > mat_B=0x1cb16b0, nev=5, ncv=20, tol=9.9999999999999998e-13, m_its=3000) at > src/solvers/slepc_eigen_solver.C:519 > > > > #7 0x00002aaab30da56a in libMesh::SlepcEigenSolver::solve_generalized > (this=0x1b19880, matrix_A_in=..., matrix_B_in=..., nev=5, ncv=20, > tol=9.9999999999999998e-13, m_its=3000) at src/solvers/slepc_eigen_ > solver.C:316 > > > > #8 0x00002aaab30fb02e in libMesh::EigenSystem::solve > (this=0x1b19930) at src/systems/eigen_system.C:241 > > > > #9 0x00002aaab30e48a9 in libMesh::CondensedEigenSystem::solve > (this=0x1b19930) at src/systems/condensed_eigen_system.C:106 > > > > #10 0x00002aaaacce0e78 in EMSchrodingerFEM::do_solve > (this=0x19d6a90) at EMSchrodingerFEM.cpp:879 > > > > #11 0x00002aaaadaae3e5 in Simulation::solve (this=0x19d6a90) at > Simulation.cpp:789 > > > > #12 0x00002aaaad52458b in NonlinearPoissonFEM::do_my_assemble > (this=0x19da050, x=..., residual=0x7fffffff9eb0, jacobian=0x0) at > NonlinearPoissonFEM.cpp:179 > > > > #13 0x00002aaaad555eec in NonlinearPoisson::my_assemble_residual > (x=..., r=..., s=...) 
at NonlinearPoisson.cpp:1469 > > > > #14 0x00002aaab30c5dc3 in libMesh::__libmesh_petsc_snes_residual > (snes=0x1b9ed70, x=0x1a50330, r=0x1a47a50, ctx=0x19e5a60) at > src/solvers/petsc_nonlinear_solver.C:137 > > > > #15 0x00002aaab41048b9 in SNESComputeFunction (snes=0x1b9ed70, > x=0x1a50330, y=0x1a47a50) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:2033 > > > > #16 0x00002aaaad1c9ad8 in SNESShellSolve_PredictorCorrector > (snes=0x1b9ed70, vec_sol=0x1a2a5a0) at PredictorCorrectorModule.cpp:413 > > > > #17 0x00002aaab4653e3d in SNESSolve_Shell (snes=0x1b9ed70) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/shell/snesshell.c:167 > > > > #18 0x00002aaab4116fb7 in SNESSolve (snes=0x1b9ed70, b=0x0, > x=0x1a2a5a0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/interface/snes.c:3743 > > > > #19 0x00002aaab30c7c3c in libMesh::PetscNonlinearSolver::solve > (this=0x19e5a60, jac_in=..., x_in=..., r_in=...) at > src/solvers/petsc_nonlinear_solver.C:714 > > > > #20 0x00002aaab3136ad9 in libMesh::NonlinearImplicitSystem::solve > (this=0x19e4b80) at src/systems/nonlinear_implicit_system.C:183 > > > > #21 0x00002aaaad5791f3 in NonlinearPoisson::execute_solver > (this=0x19da050) at NonlinearPoisson.cpp:1218 > > > > #22 0x00002aaaad554a99 in NonlinearPoisson::do_solve > (this=0x19da050) at NonlinearPoisson.cpp:961 > > > > #23 0x00002aaaadaae3e5 in Simulation::solve (this=0x19da050) at > Simulation.cpp:789 > > > > #24 0x00002aaaad1c9657 in PredictorCorrectorModule::do_solve > (this=0x19c0210) at PredictorCorrectorModule.cpp:334 > > > > #25 0x00002aaaadaae3e5 in Simulation::solve (this=0x19c0210) at > Simulation.cpp:789 > > > > #26 0x00002aaaad9e8f4a in Nemo::run_simulations (this=0x63ba80 > ) at Nemo.cpp:1367 > > > > #27 0x0000000000426f36 in main (argc=2, argv=0x7fffffffd0f8) at > main.cpp:452 > > > > > > > > > > > > Here is the log_view from the standalone executable: > > > > > > > > ************************************************************ > ************************************************************ > > > > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript > -r -fCourier9' to print this document *** > > > > ************************************************************ > ************************************************************ > > > > > > > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > > > > > > > ./libmesh_solve_eigenproblem on a linux named > conte-a373.rcac.purdue.edu with 16 processors, by hsahasra Thu May 3 > 20:56:03 2018 > > > > Using Petsc Release Version 3.5.4, May, 23, 2015 > > > > > > > > Max Max/Min Avg Total > > > > Time (sec): 2.628e+01 1.00158 2.625e+01 > > > > Objects: 6.400e+03 1.00000 6.400e+03 > > > > Flops: 3.576e+09 1.00908 3.564e+09 5.702e+10 > > > > Flops/sec: 1.363e+08 1.00907 1.358e+08 2.172e+09 > > > > MPI Messages: 1.808e+04 2.74920 1.192e+04 1.907e+05 > > > > MPI Message Lengths: 4.500e+07 1.61013 3.219e+03 6.139e+08 > > > > MPI Reductions: 8.522e+03 1.00000 > > > > > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > > > e.g., VecAXPY() for real vectors of > length N --> 2N flops > > > > and VecAXPY() for complex vectors of > length N --> 8N flops > > > > > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > > > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > > > 0: Main Stage: 2.6254e+01 100.0% 5.7023e+10 100.0% 1.907e+05 > 100.0% 3.219e+03 100.0% 8.521e+03 100.0% > > > > > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > > > Phase summary info: > > > > Count: number of times phase was executed > > > > Time and Flops: Max - maximum over all processors > > > > Ratio - ratio of maximum to minimum over all > processors > > > > Mess: number of messages sent > > > > Avg. len: average message length (bytes) > > > > Reduct: number of global reductions > > > > Global: entire computation > > > > Stage: stages of a computation. Set stages with > PetscLogStagePush() and PetscLogStagePop(). 
> > > > %T - percent time in this phase %F - percent flops in > this phase > > > > %M - percent messages in this phase %L - percent message > lengths in this phase > > > > %R - percent reductions in this phase > > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max > time over all processors) > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > > > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > > > > > --- Event Stage 0: Main Stage > > > > > > > > MatMult 1639 1.0 4.7509e+00 1.7 3.64e+08 1.1 1.9e+05 > 3.0e+03 0.0e+00 13 10100 93 0 13 10100 93 0 1209 > > > > MatSolve 1045 1.0 6.4188e-01 1.0 2.16e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 5163 > > > > MatLUFactorNum 1 1.0 2.0798e-02 3.5 9.18e+05 1.1 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 664 > > > > MatILUFactorSym 1 1.0 1.1777e-02 5.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > MatAssemblyBegin 4 1.0 1.3677e-01 6.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > MatAssemblyEnd 4 1.0 3.7882e-02 1.3 0.00e+00 0.0 4.6e+02 > 7.5e+02 1.6e+01 0 0 0 0 0 0 0 0 0 0 0 > > > > MatGetRowIJ 1 1.0 7.1526e-06 3.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > MatGetOrdering 1 1.0 2.3198e-04 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > MatZeroEntries 33 1.0 1.1992e-04 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > MatLoad 2 1.0 2.9271e-01 1.0 0.00e+00 0.0 5.5e+02 > 7.6e+04 2.6e+01 1 0 0 7 0 1 0 0 7 0 0 > > > > VecCopy 2096 1.0 4.0181e-02 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > VecSet 1047 1.0 1.7598e-02 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > VecScatterBegin 1639 1.0 4.3395e-01 2.0 0.00e+00 0.0 1.9e+05 > 3.0e+03 0.0e+00 1 0100 93 0 1 0100 93 0 0 > > > > VecScatterEnd 1639 1.0 3.2399e+00 2.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 > > > > VecReduceArith 2096 1.0 5.6402e-02 1.1 3.27e+07 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9287 > > > > VecReduceComm 1572 1.0 5.5213e+00 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.6e+03 20 0 0 0 18 20 0 0 0 18 0 > > > > EPSSetUp 1 1.0 9.0121e-02 1.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 5.2e+01 0 0 0 0 1 0 0 0 0 1 0 > > > > EPSSolve 1 1.0 2.5917e+01 1.0 3.58e+09 1.0 1.9e+05 > 3.0e+03 8.5e+03 99100100 93100 99100100 93100 2200 > > > > STSetUp 1 1.0 4.8380e-03 5.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > KSPSolve 1045 1.0 6.8107e-01 1.0 2.17e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4886 > > > > PCSetUp 2 1.0 2.3827e-02 2.8 9.18e+05 1.1 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 580 > > > > PCApply 1045 1.0 7.0819e-01 1.0 2.17e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 4699 > > > > BVCreate 529 1.0 3.7145e+00 1.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 3.2e+03 11 0 0 0 37 11 0 0 0 37 0 > > > > BVCopy 1048 1.0 1.3941e-02 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > BVMult 3761 1.0 3.6953e+00 1.1 2.00e+09 1.0 0.0e+00 > 0.0e+00 0.0e+00 14 56 0 0 0 14 56 0 0 0 8674 > > > > BVDot 2675 1.0 
9.6611e+00 1.3 1.08e+09 1.0 6.8e+04 > 3.0e+03 2.7e+03 34 30 36 33 31 34 30 36 33 31 1791 > > > > BVOrthogonalize 526 1.0 4.0705e+00 1.1 7.89e+08 1.0 6.8e+04 > 3.0e+03 5.9e+02 15 22 36 33 7 15 22 36 33 7 3092 > > > > BVScale 1047 1.0 1.6144e-02 1.1 8.18e+06 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8105 > > > > BVSetRandom 5 1.0 4.7204e-02 2.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > BVMatProject 1046 1.0 5.1708e+00 1.4 6.11e+08 1.0 0.0e+00 > 0.0e+00 1.6e+03 18 17 0 0 18 18 17 0 0 18 1891 > > > > DSSolve 533 1.0 9.7243e-01 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 > > > > DSVectors 1048 1.0 1.3440e-03 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > DSOther 2123 1.0 8.8778e-03 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > > > > > Memory usage is given in bytes: > > > > > > > > Object Type Creations Destructions Memory > Descendants' Mem. > > > > Reports information only for process 0. > > > > > > > > --- Event Stage 0: Main Stage > > > > > > > > Viewer 3 2 1504 0 > > > > Matrix 3196 3190 31529868 0 > > > > Vector 2653 2651 218802920 0 > > > > Vector Scatter 2 0 0 0 > > > > Index Set 7 7 84184 0 > > > > Eigenvalue Problem Solver 1 1 4564 0 > > > > PetscRandom 1 1 632 0 > > > > Spectral Transform 1 1 828 0 > > > > Krylov Solver 2 2 2320 0 > > > > Preconditioner 2 2 1912 0 > > > > Basis Vectors 530 530 1111328 0 > > > > Region 1 1 648 0 > > > > Direct solver 1 1 201200 0 > > > > ============================================================ > ============================================================ > > > > Average time to get PetscTime(): 9.53674e-08 > > > > Average time for MPI_Barrier(): 0.0004704 > > > > Average time for zero size MPI_Send(): 0.000118256 > > > > #PETSc Option Table entries: > > > > -eps_monitor > > > > -f1 A.mat > > > > -f2 B.mat > > > > -log_view > > > > -matload_block_size 1 > > > > -ncv 70 > > > > -st_ksp_tol 1e-12 > > > > #End of PETSc Option Table entries > > > > Compiled without FORTRAN kernels > > > > Compiled with full precision matrices (default) > > > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > > > > Configure options: --with-x=0 --download-hdf5=1 > --with-scalar-type=real --with-single-library=1 --with-pic=1 > --with-shared-libraries=0 --with-clanguage=C++ --with-fortran=1 > --with-debugging=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx > --download-metis=1 --download-parmetis=1 --with-valgrind-dir=/apps/rhel6/valgrind/3.8.1/ > --download-mumps=1 --with-fortran-kernels=0 --download-superlu_dist=1 > --download-scalapack --download-fblaslapack=1 > > > > ----------------------------------------- > > > > Libraries compiled on Thu Sep 22 10:19:43 2016 on > carter-g008.rcac.purdue.edu > > > > Machine characteristics: Linux-2.6.32-573.8.1.el6.x86_ > 64-x86_64-with-redhat-6.7-Santiago > > > > Using PETSc directory: /depot/ncn/apps/conte/conte- > gcc-petsc35/libs/petsc/build-real > > > > Using PETSc arch: linux > > > > ----------------------------------------- > > > > > > > > Using C compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -O -fPIC ${COPTFLAGS} ${CFLAGS} > > > > Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable > -ffree-line-length-0 -O ${FOPTFLAGS} ${FFLAGS} > > > > 
----------------------------------------- > > > > > > > > Using include paths: -I/depot/ncn/apps/conte/conte- > gcc-petsc35/libs/petsc/build-real/linux/include > -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include > -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/include > -I/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/include > -I/apps/rhel6/valgrind/3.8.1/include -I/depot/apps/ncn/conte/mpich- > 3.1/include > > > > ----------------------------------------- > > > > > > > > Using C linker: mpicxx > > > > Using Fortran linker: mpif90 > > > > Using libraries: -Wl,-rpath,/depot/ncn/apps/ > conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -lpetsc -Wl,-rpath,/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -L/depot/ncn/apps/conte/conte-gcc-petsc35/libs/petsc/build-real/linux/lib > -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack > -lsuperlu_dist_3.3 -lflapack -lfblas -lparmetis -lmetis -lpthread -lssl > -lcrypto -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz > -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib > -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5. > 2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/ > gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 > -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib > -L/apps/rhel6/gcc/5.2.0/lib -lmpichf90 -lgfortran -lm -lgfortran -lm > -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib > -L/depot/apps/ncn/conte/mpich-3.1/lib -Wl,-rpath,/apps/rhel6/gcc/5. > 2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -L/apps/rhel6/gcc/5.2.0/lib/ > gcc/x86_64-unknown-linux-gnu/5.2.0 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib64 > -L/apps/rhel6/gcc/5.2.0/lib64 -Wl,-rpath,/apps/rhel6/gcc/5.2.0/lib > -L/apps/rhel6/gcc/5.2.0/lib -ldl -Wl,-rpath,/depot/apps/ncn/conte/mpich-3.1/lib > -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > > > > ----------------------------------------- > > > > > > > > Can you please point me to what could be going wrong with the larger > software? > > > > > > > > Thanks! > > > > Harshad > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Fri May 4 14:54:52 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 04 May 2018 20:54:52 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: Message-ID: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> Dear Matt, Thank you very much for your reply! So what you mean is that I can just do the KSPSolve() every iteration once the MUMPS is set? That means inside the KSPSolve() the numerical factorization is performed. If that is the case, it seems that the ksp object is not changed when the values in the matrix are changed. Or do I need to call both KSPSetOperators() and KSPSolve()? On 2018-05-04 14:44, Matthew Knepley wrote: > On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: > >> Dear PETSc users, >> >> I am currently using MUMPS to solve linear systems directly. >> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >> step and then solve the system. >> >> In my code, the values in the matrix is changed in each iteration, >> but the structure of the matrix stays the same, which means the >> performance can be improved if symbolic factorisation is only >> performed once. 
Hence, it is necessary to split the symbolic >> and numeric factorisation. However, I cannot find a specific step >> (control parameter) to perform the numeric factorisation. >> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >> it seems that the symbolic and numeric factorisation always perform >> together. > > If you use KSPSolve instead, it will automatically preserve the > symbolic > factorization. > > Thanks, > > Matt > >> So I am wondering if anyone has an idea about it. >> >> Below is how I set up MUMPS solver: >> PC pc; >> PetscBool flg_mumps, flg_mumps_ch; >> flg_mumps = PETSC_FALSE; >> flg_mumps_ch = PETSC_FALSE; >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >> NULL); >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >> NULL); >> if(flg_mumps ||flg_mumps_ch) >> { >> KSPSetType(_ksp, KSPPREONLY); >> PetscInt ival,icntl; >> PetscReal val; >> KSPGetPC(_ksp, &pc); >> /// Set preconditioner type >> if(flg_mumps) >> { >> PCSetType(pc, PCLU); >> } >> else if(flg_mumps_ch) >> { >> MatSetOption(A, MAT_SPD, PETSC_TRUE); >> PCSetType(pc, PCCHOLESKY); >> } >> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >> PCFactorSetUpMatSolverPackage(pc); >> PCFactorGetMatrix(pc, &_F); >> icntl = 7; ival = 0; >> MatMumpsSetIcntl( _F, icntl, ival ); >> MatMumpsSetIcntl(_F, 3, 6); >> MatMumpsSetIcntl(_F, 4, 2); >> } >> KSPSetUp(_ksp); >> >> Kind Regards, >> Shidi > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [1] > > > Links: > ------ > [1] http://www.caam.rice.edu/~mk51/ From knepley at gmail.com Fri May 4 15:05:51 2018 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 4 May 2018 16:05:51 -0400 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> Message-ID: On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: > Dear Matt, > > Thank you very much for your reply! > So what you mean is that I can just do the KSPSolve() every iteration > once the MUMPS is set? > Yes. > That means inside the KSPSolve() the numerical factorization is > performed. If that is the case, it seems that the ksp object is > not changed when the values in the matrix are changed. > Yes. > Or do I need to call both KSPSetOperators() and KSPSolve()? > If you do SetOperators, it will redo the factorization. If you do not, it will look at the Mat object, determine that the structure has not changed, and just redo the numerical factorization. Thanks, Matt > On 2018-05-04 14:44, Matthew Knepley wrote: > >> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >> >> Dear PETSc users, >>> >>> I am currently using MUMPS to solve linear systems directly. >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>> step and then solve the system. >>> >>> In my code, the values in the matrix is changed in each iteration, >>> but the structure of the matrix stays the same, which means the >>> performance can be improved if symbolic factorisation is only >>> performed once. Hence, it is necessary to split the symbolic >>> and numeric factorisation. However, I cannot find a specific step >>> (control parameter) to perform the numeric factorisation. 
>>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>> it seems that the symbolic and numeric factorisation always perform >>> together. >>> >> >> If you use KSPSolve instead, it will automatically preserve the >> symbolic >> factorization. >> >> Thanks, >> >> Matt >> >> So I am wondering if anyone has an idea about it. >>> >>> Below is how I set up MUMPS solver: >>> PC pc; >>> PetscBool flg_mumps, flg_mumps_ch; >>> flg_mumps = PETSC_FALSE; >>> flg_mumps_ch = PETSC_FALSE; >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>> NULL); >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>> NULL); >>> if(flg_mumps ||flg_mumps_ch) >>> { >>> KSPSetType(_ksp, KSPPREONLY); >>> PetscInt ival,icntl; >>> PetscReal val; >>> KSPGetPC(_ksp, &pc); >>> /// Set preconditioner type >>> if(flg_mumps) >>> { >>> PCSetType(pc, PCLU); >>> } >>> else if(flg_mumps_ch) >>> { >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>> PCSetType(pc, PCCHOLESKY); >>> } >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>> PCFactorSetUpMatSolverPackage(pc); >>> PCFactorGetMatrix(pc, &_F); >>> icntl = 7; ival = 0; >>> MatMumpsSetIcntl( _F, icntl, ival ); >>> MatMumpsSetIcntl(_F, 3, 6); >>> MatMumpsSetIcntl(_F, 4, 2); >>> } >>> KSPSetUp(_ksp); >>> >>> Kind Regards, >>> Shidi >>> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] >> >> >> Links: >> ------ >> [1] http://www.caam.rice.edu/~mk51/ >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Fri May 4 15:10:55 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 04 May 2018 21:10:55 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> Message-ID: Thank you very much for your reply. That is really clear. Kind Regards, Shidi On 2018-05-04 21:05, Matthew Knepley wrote: > On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: > >> Dear Matt, >> >> Thank you very much for your reply! >> So what you mean is that I can just do the KSPSolve() every >> iteration >> once the MUMPS is set? > > Yes. > >> That means inside the KSPSolve() the numerical factorization is >> performed. If that is the case, it seems that the ksp object is >> not changed when the values in the matrix are changed. > > Yes. > >> Or do I need to call both KSPSetOperators() and KSPSolve()? > > If you do SetOperators, it will redo the factorization. If you do not, > it will look > at the Mat object, determine that the structure has not changed, and > just redo > the numerical factorization. > > Thanks, > > Matt > >> On 2018-05-04 14:44, Matthew Knepley wrote: >> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >> >> Dear PETSc users, >> >> I am currently using MUMPS to solve linear systems directly. >> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >> step and then solve the system. 
>> >> In my code, the values in the matrix is changed in each iteration, >> but the structure of the matrix stays the same, which means the >> performance can be improved if symbolic factorisation is only >> performed once. Hence, it is necessary to split the symbolic >> and numeric factorisation. However, I cannot find a specific step >> (control parameter) to perform the numeric factorisation. >> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >> it seems that the symbolic and numeric factorisation always perform >> together. >> >> If you use KSPSolve instead, it will automatically preserve the >> symbolic >> factorization. >> >> Thanks, >> >> Matt >> >> So I am wondering if anyone has an idea about it. >> >> Below is how I set up MUMPS solver: >> PC pc; >> PetscBool flg_mumps, flg_mumps_ch; >> flg_mumps = PETSC_FALSE; >> flg_mumps_ch = PETSC_FALSE; >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >> NULL); >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >> NULL); >> if(flg_mumps ||flg_mumps_ch) >> { >> KSPSetType(_ksp, KSPPREONLY); >> PetscInt ival,icntl; >> PetscReal val; >> KSPGetPC(_ksp, &pc); >> /// Set preconditioner type >> if(flg_mumps) >> { >> PCSetType(pc, PCLU); >> } >> else if(flg_mumps_ch) >> { >> MatSetOption(A, MAT_SPD, PETSC_TRUE); >> PCSetType(pc, PCCHOLESKY); >> } >> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >> PCFactorSetUpMatSolverPackage(pc); >> PCFactorGetMatrix(pc, &_F); >> icntl = 7; ival = 0; >> MatMumpsSetIcntl( _F, icntl, ival ); >> MatMumpsSetIcntl(_F, 3, 6); >> MatMumpsSetIcntl(_F, 4, 2); >> } >> KSPSetUp(_ksp); >> >> Kind Regards, >> Shidi >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] [1] >> >> Links: >> ------ >> [1] http://www.caam.rice.edu/~mk51/ [2] > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [2] > > > Links: > ------ > [1] https://www.cse.buffalo.edu/~knepley/ > [2] http://www.caam.rice.edu/~mk51/ From nshervt at gmail.com Sat May 5 19:51:55 2018 From: nshervt at gmail.com (Navid Shervani-Tabar) Date: Sat, 5 May 2018 20:51:55 -0400 Subject: [petsc-users] Refactoring an MPI code to call from another mpi code Message-ID: Dear All, I have a question and I was advised that I might find people with related expertise here. I have two programs Main and Solver, each of which uses MPI for parallel processing. I am keeping Main as the main code and modifying Solver as a subroutine which is being called by Main. The problem arises when both codes initiate the MPI (since both were originally standalone programs) and then each have their own parallel structure, which leads the code to crash as MPI_init can only be called once in the now combined program. I thought this must have been a common problem and there should be discussions on how to approach this; However, to my surprise, I couldn't find any related topics on common websites like stackoverflow and the archives of this mailing list. Thanks, Navid PS1: When connecting the two codes, I decided to call Aux by Main rather than running the Aux as executable by Main to have a more optimized and robust setting and also avoid the overhead. 
PS2: Code Main is written in c++ and code Aux is written in Fortran. PS3: The Main code was originally using Deal.II as solver. -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.speck at fz-juelich.de Sun May 6 03:40:16 2018 From: r.speck at fz-juelich.de (Robert Speck) Date: Sun, 6 May 2018 10:40:16 +0200 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication Message-ID: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> Hi! I would like to do a matrix-vector multiplication (besides using linear solvers and so on) with petsc4py. I took the matrix from this example (https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py) and applied it to a PETSc Vector. All works well in serial, but in parallel (in particular if ordering becomes relevant) the resulting vector looks very different. Using the shell matrix of this example (https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py) helps, but then I cannot use matrix-based preconditioners for KSP directly (right?). I also tried using DMDA for creating vectors and matrix and for taking care of their ordering (which seems to be my problem here), but that did not help either. So, my question is this: How do I do easy parallel matrix-vector multiplication with petsc4py in a way that allows me to use parallel linear solvers etc. later on? I want to deal with spatial decomposition as little as possible. What data structures should I use? DMDA or PETSc.Vec() and PETSc.Mat() or something else? Thanks! -Robert- -- Dr. Robert Speck Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Tel: +49 2461 61 1644 Fax: +49 2461 61 6656 Email: r.speck at fz-juelich.de Website: http://www.fz-juelich.de/ias/jsc/speck_r PinT: http://www.fz-juelich.de/ias/jsc/pint ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ From knepley at gmail.com Sun May 6 06:20:24 2018 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 6 May 2018 07:20:24 -0400 Subject: [petsc-users] Refactoring an MPI code to call from another mpi code In-Reply-To: References: Message-ID: On Sat, May 5, 2018 at 8:51 PM, Navid Shervani-Tabar wrote: > Dear All, > > I have a question and I was advised that I might find people with related > expertise here. > > I have two programs Main and Solver, each of which uses MPI for parallel > processing. I am keeping Main as the main code and modifying Solver as a > subroutine which is being called by Main. > > The problem arises when both codes initiate the MPI (since both were > originally standalone programs) and then each have their own parallel > structure, which leads the code to crash as MPI_init can only be called > once in the now combined program. 
> > The easiest thing to do is call MPI_Initialized() in the programs to check whether MPI_Init() has already been called (same for MPI_Finalized()). > I thought this must have been a common problem and there should be > discussions on how to approach this; However, to my surprise, I couldn't > find any related topics on common websites like stackoverflow and the > archives of this mailing list. > > Thanks, > > Navid > > PS1: When connecting the two codes, I decided to call Aux by Main rather > than running the Aux as executable by Main to have a more optimized and > robust setting and also avoid the overhead. > > PS2: Code Main is written in c++ and code Aux is written in Fortran. > > PS3: The Main code was originally using Deal.II as solver. > No one answered on the Deal.II list? :) Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 6 06:22:04 2018 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 6 May 2018 07:22:04 -0400 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> Message-ID: On Sun, May 6, 2018 at 4:40 AM, Robert Speck wrote: > Hi! > > I would like to do a matrix-vector multiplication (besides using linear > solvers and so on) with petsc4py. I took the matrix from this example > (https://bitbucket.org/petsc/petsc4py/src/master/demo/ > kspsolve/petsc-mat.py) > and applied it to a PETSc Vector. All works well in serial, but in > parallel (in particular if ordering becomes relevant) the resulting > vector looks very different. You will have to be more specific about "looks different". If you look inside any of the solver code, you can see we are just calling MatMult() everywhere. Matt > Using the shell matrix of this example > (https://bitbucket.org/petsc/petsc4py/src/master/demo/ > poisson2d/poisson2d.py) > helps, but then I cannot use matrix-based preconditioners for KSP > directly (right?). I also tried using DMDA for creating vectors and > matrix and for taking care of their ordering (which seems to be my > problem here), but that did not help either. > > So, my question is this: How do I do easy parallel matrix-vector > multiplication with petsc4py in a way that allows me to use parallel > linear solvers etc. later on? I want to deal with spatial decomposition > as little as possible. What data structures should I use? DMDA or > PETSc.Vec() and PETSc.Mat() or something else? > > Thanks! > -Robert- > > -- > Dr. Robert Speck > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Tel: +49 2461 61 1644 > Fax: +49 2461 61 6656 > > Email: r.speck at fz-juelich.de > Website: http://www.fz-juelich.de/ias/jsc/speck_r > PinT: http://www.fz-juelich.de/ias/jsc/pint > > > > ------------------------------------------------------------ > ------------------------------------ > ------------------------------------------------------------ > ------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzender des Aufsichtsrats: MinDir Dr. 
Karl Eugen Huthmacher > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, > Prof. Dr. Sebastian M. Schmidt > ------------------------------------------------------------ > ------------------------------------ > ------------------------------------------------------------ > ------------------------------------ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Sun May 6 07:44:13 2018 From: dave.mayhem23 at gmail.com (Dave May) Date: Sun, 06 May 2018 12:44:13 +0000 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> Message-ID: On Sun, 6 May 2018 at 10:40, Robert Speck wrote: > Hi! > > I would like to do a matrix-vector multiplication (besides using linear > solvers and so on) with petsc4py. I took the matrix from this example > (https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py) This example only produces a matrix. And from the code the matrix produced is identical in serial or parallel. > and applied it to a PETSc Vector. All works well in serial, but in > parallel (in particular if ordering becomes relevant) the resulting > vector looks very different. Given this, the way you defined the x vector in y = A x must be different when run on 1 versus N mpi ranks. Using the shell matrix of this example > ( > https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py > ) > helps, but then I cannot use matrix-based preconditioners for KSP > directly (right?). I also tried using DMDA for creating vectors and > matrix and for taking care of their ordering (which seems to be my > problem here), but that did not help either. > > So, my question is this: How do I do easy parallel matrix-vector > multiplication with petsc4py in a way that allows me to use parallel > linear solvers etc. later on? I want to deal with spatial decomposition > as little as possible. What's the context - are you solving a PDE? Assuming you are using your own grid object (e.g. as you might have if solving a PDE), and assuming you are not solving a 1D problem, you actually have to "deal" with the spatial decomposition otherwise performance could be quite terrible - even for something simple like a 5 point Laplacian on a structured grid in 2D What data structures should I use? DMDA or > PETSc.Vec() and PETSc.Mat() or something else? The mat vec product is not causing you a problem. Your issue appears to be that you do not have a way to label entries in a vector in a consistent manner. What's the objective? Are you solving a PDE? If yes, structured grid? If yes again, use the DMDA. It takes care of all the local-to-global and global-to-local mapping you need. Thanks, Dave > > Thanks! > -Robert- > > -- > Dr. 
Robert Speck > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Tel: +49 2461 61 1644 > Fax: +49 2461 61 6656 > > Email: r.speck at fz-juelich.de > Website: http://www.fz-juelich.de/ias/jsc/speck_r > PinT: http://www.fz-juelich.de/ias/jsc/pint > > > > > ------------------------------------------------------------------------------------------------ > > ------------------------------------------------------------------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, > Prof. Dr. Sebastian M. Schmidt > > ------------------------------------------------------------------------------------------------ > > ------------------------------------------------------------------------------------------------ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.speck at fz-juelich.de Sun May 6 09:34:04 2018 From: r.speck at fz-juelich.de (Robert Speck) Date: Sun, 6 May 2018 16:34:04 +0200 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> Message-ID: Thanks for your reply and help. Yes, this is going to be a PDE solver for structured grids. The first goal would be IDC (or Crank-Nicholson) for the heat equation, which would require both solving a linear system and application of the matrix. The code I wrote for testing parallel matrix-vector multiplication can be found here: https://github.com/Parallel-in-Time/pySDC/blob/petsc/pySDC/playgrounds/PETSc/playground_matmult.py Both vectors and matrix come from the DMDA, but I guess filling them is done in a wrong way? Or do I need to convert global/natural vectors to local ones somewhere? Best -Robert- On 06.05.18 14:44, Dave May wrote: > On Sun, 6 May 2018 at 10:40, Robert Speck > wrote: > > Hi! > > I would like to do a matrix-vector multiplication (besides using linear > solvers and so on) with petsc4py. I took the matrix from this example > > (https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py) > > > This example only produces a matrix. And from the code the matrix > produced is identical in serial or parallel.? > > > > and applied it to a PETSc Vector. All works well in serial, but in > parallel (in particular if ordering becomes relevant) the resulting > vector looks very different. > > > Given this, the way you defined the x vector in y = A x must be > different when run on 1 versus N mpi ranks.? > > > Using the shell matrix of this example > (https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py) > helps, but then I cannot use matrix-based preconditioners for KSP > directly (right?). I also tried using DMDA for creating vectors and > matrix and for taking care of their ordering (which seems to be my > problem here), but that did not help either. > > So, my question is this: How do I do easy parallel matrix-vector > multiplication with petsc4py in a way that allows me to use parallel > linear solvers etc. later on? I want to deal with spatial decomposition > as little as possible. > > > What's the context - are you solving a PDE? 
> > Assuming you are using your own grid object (e.g. as you might have if > solving a PDE), and assuming you are not solving a 1D problem, you > actually have to "deal" with the spatial decomposition otherwise > performance could be quite terrible - even for something simple like a 5 > point Laplacian on a structured grid in 2D > > What data structures should I use? DMDA or > PETSc.Vec() and PETSc.Mat() or something else? > > > The mat vec product is not causing you a problem. Your issue appears to > be that you do not have a way to label entries in a vector in a > consistent manner. > > What's the objective? Are you solving a PDE? If yes, structured grid? If > yes again, use the DMDA. It takes care of all the local-to-global and > global-to-local mapping you need. > > Thanks, > ? Dave > > > > Thanks! > -Robert- > > -- > Dr. Robert Speck > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Tel: +49 2461 61 1644 > Fax: +49 2461 61 6656 > > Email:? ?r.speck at fz-juelich.de > Website: http://www.fz-juelich.de/ias/jsc/speck_r > PinT:? ? http://www.fz-juelich.de/ias/jsc/pint > > > > ------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, > Prof. Dr. Sebastian M. Schmidt > ------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------ > -- Dr. Robert Speck Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Tel: +49 2461 61 1644 Fax: +49 2461 61 6656 Email: r.speck at fz-juelich.de Website: http://www.fz-juelich.de/ias/jsc/speck_r PinT: http://www.fz-juelich.de/ias/jsc/pint From jed at jedbrown.org Sun May 6 09:52:58 2018 From: jed at jedbrown.org (Jed Brown) Date: Sun, 06 May 2018 08:52:58 -0600 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> Message-ID: <877eog4z05.fsf@jedbrown.org> Robert Speck writes: > Thanks for your reply and help. Yes, this is going to be a PDE solver > for structured grids. The first goal would be IDC (or Crank-Nicholson) > for the heat equation, which would require both solving a linear system > and application of the matrix. > > The code I wrote for testing parallel matrix-vector multiplication can > be found here: > https://github.com/Parallel-in-Time/pySDC/blob/petsc/pySDC/playgrounds/PETSc/playground_matmult.py > > Both vectors and matrix come from the DMDA, but I guess filling them is > done in a wrong way? Or do I need to convert global/natural vectors to > local ones somewhere? Global and Natural are not the same (see user's manual for details). The matrix acts on a Global vector. See petsc4py/demo/bratu3d/bratu3d.py for examples of efficiently setting values (and computing residuals) using Global vectors. It should be simpler/cleaner code than you currently have. 
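For illustration, here is a minimal petsc4py sketch of that pattern: assemble the operator through the DMDA with setValueStencil() (so each rank only touches its own slice of the Global ordering) and then apply it to Global vectors. This is an editor's sketch, not code from this thread or from bratu3d.py: the grid size, the Laplacian-style interior entries, and the Dirichlet boundary rows are placeholder assumptions, and depending on the petsc4py release the matrix-creation method is named createMatrix() or createMat().

    from petsc4py import PETSc

    # 2D structured grid; PETSc picks the parallel decomposition
    da = PETSc.DMDA().create([8, 8], stencil_width=1)
    nx, ny = da.getSizes()

    A = da.createMatrix()          # named createMat() in some petsc4py releases
    row = PETSc.Mat.Stencil()
    col = PETSc.Mat.Stencil()
    (xs, xe), (ys, ye) = da.getRanges()   # this rank's share of the Global grid
    for j in range(ys, ye):
        for i in range(xs, xe):
            row.index = (i, j)
            if i == 0 or j == 0 or i == nx - 1 or j == ny - 1:
                # placeholder Dirichlet row on the physical boundary
                A.setValueStencil(row, row, 1.0)
            else:
                # 5-point Laplacian-like interior stencil (illustrative values)
                for di, dj, v in [(0, 0, 4.0), (-1, 0, -1.0), (1, 0, -1.0),
                                  (0, -1, -1.0), (0, 1, -1.0)]:
                    col.index = (i + di, j + dj)
                    A.setValueStencil(row, col, v)
    A.assemble()

    x = da.createGlobalVec()       # Global vectors match the matrix layout
    y = da.createGlobalVec()
    x.set(1.0)
    A.mult(x, y)                   # parallel mat-vec, the same call KSP makes internally

Because A and the vectors come from the same DMDA, the assembled matrix can be handed straight to a KSP (ksp.setOperators(A); ksp.solve(b, x)), so matrix-based preconditioners remain available, which is not the case with the shell matrix in poisson2d.py.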
> > Best > -Robert- > > > > On 06.05.18 14:44, Dave May wrote: >> On Sun, 6 May 2018 at 10:40, Robert Speck > > wrote: >> >> Hi! >> >> I would like to do a matrix-vector multiplication (besides using linear >> solvers and so on) with petsc4py. I took the matrix from this example >> >> (https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py) >> >> >> This example only produces a matrix. And from the code the matrix >> produced is identical in serial or parallel.? >> >> >> >> and applied it to a PETSc Vector. All works well in serial, but in >> parallel (in particular if ordering becomes relevant) the resulting >> vector looks very different. >> >> >> Given this, the way you defined the x vector in y = A x must be >> different when run on 1 versus N mpi ranks.? >> >> >> Using the shell matrix of this example >> (https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py) >> helps, but then I cannot use matrix-based preconditioners for KSP >> directly (right?). I also tried using DMDA for creating vectors and >> matrix and for taking care of their ordering (which seems to be my >> problem here), but that did not help either. >> >> So, my question is this: How do I do easy parallel matrix-vector >> multiplication with petsc4py in a way that allows me to use parallel >> linear solvers etc. later on? I want to deal with spatial decomposition >> as little as possible. >> >> >> What's the context - are you solving a PDE? >> >> Assuming you are using your own grid object (e.g. as you might have if >> solving a PDE), and assuming you are not solving a 1D problem, you >> actually have to "deal" with the spatial decomposition otherwise >> performance could be quite terrible - even for something simple like a 5 >> point Laplacian on a structured grid in 2D >> >> What data structures should I use? DMDA or >> PETSc.Vec() and PETSc.Mat() or something else? >> >> >> The mat vec product is not causing you a problem. Your issue appears to >> be that you do not have a way to label entries in a vector in a >> consistent manner. >> >> What's the objective? Are you solving a PDE? If yes, structured grid? If >> yes again, use the DMDA. It takes care of all the local-to-global and >> global-to-local mapping you need. >> >> Thanks, >> ? Dave >> >> >> >> Thanks! >> -Robert- >> >> -- >> Dr. Robert Speck >> Juelich Supercomputing Centre >> Institute for Advanced Simulation >> Forschungszentrum Juelich GmbH >> 52425 Juelich, Germany >> >> Tel: +49 2461 61 1644 >> Fax: +49 2461 61 6656 >> >> Email:? ?r.speck at fz-juelich.de >> Website: http://www.fz-juelich.de/ias/jsc/speck_r >> PinT:? ? http://www.fz-juelich.de/ias/jsc/pint >> >> >> >> ------------------------------------------------------------------------------------------------ >> ------------------------------------------------------------------------------------------------ >> Forschungszentrum Juelich GmbH >> 52425 Juelich >> Sitz der Gesellschaft: Juelich >> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 >> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher >> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), >> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, >> Prof. Dr. Sebastian M. Schmidt >> ------------------------------------------------------------------------------------------------ >> ------------------------------------------------------------------------------------------------ >> > > -- > Dr. 
Robert Speck > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Tel: +49 2461 61 1644 > Fax: +49 2461 61 6656 > > Email: r.speck at fz-juelich.de > Website: http://www.fz-juelich.de/ias/jsc/speck_r > PinT: http://www.fz-juelich.de/ias/jsc/pint From r.speck at fz-juelich.de Sun May 6 10:52:39 2018 From: r.speck at fz-juelich.de (Robert Speck) Date: Sun, 6 May 2018 17:52:39 +0200 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: <877eog4z05.fsf@jedbrown.org> References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> <877eog4z05.fsf@jedbrown.org> Message-ID: OK, thanks, I see. This is basically what happens in the poisson2d.py example, too, right? I tried it with the shell matrix (?) used in the poisson2d example and it works right away, but then I fail to see how to make use of the preconditioners for KSP (see my original message).. Thanks again! -Robert- On 06.05.18 16:52, Jed Brown wrote: > Robert Speck writes: > >> Thanks for your reply and help. Yes, this is going to be a PDE solver >> for structured grids. The first goal would be IDC (or Crank-Nicholson) >> for the heat equation, which would require both solving a linear system >> and application of the matrix. >> >> The code I wrote for testing parallel matrix-vector multiplication can >> be found here: >> https://github.com/Parallel-in-Time/pySDC/blob/petsc/pySDC/playgrounds/PETSc/playground_matmult.py >> >> Both vectors and matrix come from the DMDA, but I guess filling them is >> done in a wrong way? Or do I need to convert global/natural vectors to >> local ones somewhere? > > Global and Natural are not the same (see user's manual for details). > The matrix acts on a Global vector. See > petsc4py/demo/bratu3d/bratu3d.py for examples of efficiently setting > values (and computing residuals) using Global vectors. It should be > simpler/cleaner code than you currently have. > >> >> Best >> -Robert- >> >> >> >> On 06.05.18 14:44, Dave May wrote: >>> On Sun, 6 May 2018 at 10:40, Robert Speck >> > wrote: >>> >>> Hi! >>> >>> I would like to do a matrix-vector multiplication (besides using linear >>> solvers and so on) with petsc4py. I took the matrix from this example >>> >>> (https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py) >>> >>> >>> This example only produces a matrix. And from the code the matrix >>> produced is identical in serial or parallel.? >>> >>> >>> >>> and applied it to a PETSc Vector. All works well in serial, but in >>> parallel (in particular if ordering becomes relevant) the resulting >>> vector looks very different. >>> >>> >>> Given this, the way you defined the x vector in y = A x must be >>> different when run on 1 versus N mpi ranks.? >>> >>> >>> Using the shell matrix of this example >>> (https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py) >>> helps, but then I cannot use matrix-based preconditioners for KSP >>> directly (right?). I also tried using DMDA for creating vectors and >>> matrix and for taking care of their ordering (which seems to be my >>> problem here), but that did not help either. >>> >>> So, my question is this: How do I do easy parallel matrix-vector >>> multiplication with petsc4py in a way that allows me to use parallel >>> linear solvers etc. later on? I want to deal with spatial decomposition >>> as little as possible. >>> >>> >>> What's the context - are you solving a PDE? 
>>> >>> Assuming you are using your own grid object (e.g. as you might have if >>> solving a PDE), and assuming you are not solving a 1D problem, you >>> actually have to "deal" with the spatial decomposition otherwise >>> performance could be quite terrible - even for something simple like a 5 >>> point Laplacian on a structured grid in 2D >>> >>> What data structures should I use? DMDA or >>> PETSc.Vec() and PETSc.Mat() or something else? >>> >>> >>> The mat vec product is not causing you a problem. Your issue appears to >>> be that you do not have a way to label entries in a vector in a >>> consistent manner. >>> >>> What's the objective? Are you solving a PDE? If yes, structured grid? If >>> yes again, use the DMDA. It takes care of all the local-to-global and >>> global-to-local mapping you need. >>> >>> Thanks, >>> ? Dave >>> >>> >>> >>> Thanks! >>> -Robert- >>> >>> -- >>> Dr. Robert Speck >>> Juelich Supercomputing Centre >>> Institute for Advanced Simulation >>> Forschungszentrum Juelich GmbH >>> 52425 Juelich, Germany >>> >>> Tel: +49 2461 61 1644 >>> Fax: +49 2461 61 6656 >>> >>> Email:? ?r.speck at fz-juelich.de >>> Website: http://www.fz-juelich.de/ias/jsc/speck_r >>> PinT:? ? http://www.fz-juelich.de/ias/jsc/pint >>> >>> >>> >>> ------------------------------------------------------------------------------------------------ >>> ------------------------------------------------------------------------------------------------ >>> Forschungszentrum Juelich GmbH >>> 52425 Juelich >>> Sitz der Gesellschaft: Juelich >>> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 >>> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher >>> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), >>> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, >>> Prof. Dr. Sebastian M. Schmidt >>> ------------------------------------------------------------------------------------------------ >>> ------------------------------------------------------------------------------------------------ >>> >> >> -- >> Dr. Robert Speck >> Juelich Supercomputing Centre >> Institute for Advanced Simulation >> Forschungszentrum Juelich GmbH >> 52425 Juelich, Germany >> >> Tel: +49 2461 61 1644 >> Fax: +49 2461 61 6656 >> >> Email: r.speck at fz-juelich.de >> Website: http://www.fz-juelich.de/ias/jsc/speck_r >> PinT: http://www.fz-juelich.de/ias/jsc/pint -- Dr. Robert Speck Juelich Supercomputing Centre Institute for Advanced Simulation Forschungszentrum Juelich GmbH 52425 Juelich, Germany Tel: +49 2461 61 1644 Fax: +49 2461 61 6656 Email: r.speck at fz-juelich.de Website: http://www.fz-juelich.de/ias/jsc/speck_r PinT: http://www.fz-juelich.de/ias/jsc/pint From dave.mayhem23 at gmail.com Sun May 6 11:14:14 2018 From: dave.mayhem23 at gmail.com (Dave May) Date: Sun, 06 May 2018 16:14:14 +0000 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> <877eog4z05.fsf@jedbrown.org> Message-ID: On Sun, 6 May 2018 at 17:52, Robert Speck wrote: > OK, thanks, I see. This is basically what happens in the poisson2d.py > example, too, right? > > I tried it with the shell matrix (?) used in the poisson2d example and > it works right away, but then I fail to see how to make use of the > preconditioners for KSP (see my original message).. 
If you want assemble something within the operator created by DMCreateMatrix(), take a look at this function http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html Study some of the examples listed at the bottom of the page. (But you will have to concern yourself with the spatial decomposition, just like in the poisson2d and Bratu example.) Thanks, Dave > > Thanks again! > -Robert- > > > On 06.05.18 16:52, Jed Brown wrote: > > Robert Speck writes: > > > >> Thanks for your reply and help. Yes, this is going to be a PDE solver > >> for structured grids. The first goal would be IDC (or Crank-Nicholson) > >> for the heat equation, which would require both solving a linear system > >> and application of the matrix. > >> > >> The code I wrote for testing parallel matrix-vector multiplication can > >> be found here: > >> > https://github.com/Parallel-in-Time/pySDC/blob/petsc/pySDC/playgrounds/PETSc/playground_matmult.py > >> > >> Both vectors and matrix come from the DMDA, but I guess filling them is > >> done in a wrong way? Or do I need to convert global/natural vectors to > >> local ones somewhere? > > > > Global and Natural are not the same (see user's manual for details). > > The matrix acts on a Global vector. See > > petsc4py/demo/bratu3d/bratu3d.py for examples of efficiently setting > > values (and computing residuals) using Global vectors. It should be > > simpler/cleaner code than you currently have. > > > >> > >> Best > >> -Robert- > >> > >> > >> > >> On 06.05.18 14:44, Dave May wrote: > >>> On Sun, 6 May 2018 at 10:40, Robert Speck >>> > wrote: > >>> > >>> Hi! > >>> > >>> I would like to do a matrix-vector multiplication (besides using > linear > >>> solvers and so on) with petsc4py. I took the matrix from this > example > >>> > >>> ( > https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py > ) > >>> > >>> > >>> This example only produces a matrix. And from the code the matrix > >>> produced is identical in serial or parallel. > >>> > >>> > >>> > >>> and applied it to a PETSc Vector. All works well in serial, but in > >>> parallel (in particular if ordering becomes relevant) the resulting > >>> vector looks very different. > >>> > >>> > >>> Given this, the way you defined the x vector in y = A x must be > >>> different when run on 1 versus N mpi ranks. > >>> > >>> > >>> Using the shell matrix of this example > >>> ( > https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py > ) > >>> helps, but then I cannot use matrix-based preconditioners for KSP > >>> directly (right?). I also tried using DMDA for creating vectors and > >>> matrix and for taking care of their ordering (which seems to be my > >>> problem here), but that did not help either. > >>> > >>> So, my question is this: How do I do easy parallel matrix-vector > >>> multiplication with petsc4py in a way that allows me to use > parallel > >>> linear solvers etc. later on? I want to deal with spatial > decomposition > >>> as little as possible. > >>> > >>> > >>> What's the context - are you solving a PDE? > >>> > >>> Assuming you are using your own grid object (e.g. as you might have if > >>> solving a PDE), and assuming you are not solving a 1D problem, you > >>> actually have to "deal" with the spatial decomposition otherwise > >>> performance could be quite terrible - even for something simple like a > 5 > >>> point Laplacian on a structured grid in 2D > >>> > >>> What data structures should I use? 
DMDA or > >>> PETSc.Vec() and PETSc.Mat() or something else? > >>> > >>> > >>> The mat vec product is not causing you a problem. Your issue appears to > >>> be that you do not have a way to label entries in a vector in a > >>> consistent manner. > >>> > >>> What's the objective? Are you solving a PDE? If yes, structured grid? > If > >>> yes again, use the DMDA. It takes care of all the local-to-global and > >>> global-to-local mapping you need. > >>> > >>> Thanks, > >>> Dave > >>> > >>> > >>> > >>> Thanks! > >>> -Robert- > >>> > >>> -- > >>> Dr. Robert Speck > >>> Juelich Supercomputing Centre > >>> Institute for Advanced Simulation > >>> Forschungszentrum Juelich GmbH > >>> 52425 Juelich, Germany > >>> > >>> Tel: +49 2461 61 1644 > >>> Fax: +49 2461 61 6656 > >>> > >>> Email: r.speck at fz-juelich.de > >>> Website: http://www.fz-juelich.de/ias/jsc/speck_r > >>> PinT: http://www.fz-juelich.de/ias/jsc/pint > >>> > >>> > >>> > >>> > ------------------------------------------------------------------------------------------------ > >>> > ------------------------------------------------------------------------------------------------ > >>> Forschungszentrum Juelich GmbH > >>> 52425 Juelich > >>> Sitz der Gesellschaft: Juelich > >>> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B > 3498 > >>> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher > >>> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt > (Vorsitzender), > >>> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, > >>> Prof. Dr. Sebastian M. Schmidt > >>> > ------------------------------------------------------------------------------------------------ > >>> > ------------------------------------------------------------------------------------------------ > >>> > >> > >> -- > >> Dr. Robert Speck > >> Juelich Supercomputing Centre > >> Institute for Advanced Simulation > >> Forschungszentrum Juelich GmbH > >> 52425 Juelich, Germany > >> > >> Tel: +49 2461 61 1644 > >> Fax: +49 2461 61 6656 > >> > >> Email: r.speck at fz-juelich.de > >> Website: http://www.fz-juelich.de/ias/jsc/speck_r > >> PinT: http://www.fz-juelich.de/ias/jsc/pint > > -- > Dr. Robert Speck > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Tel: +49 2461 61 1644 > Fax: +49 2461 61 6656 > > Email: r.speck at fz-juelich.de > Website: http://www.fz-juelich.de/ias/jsc/speck_r > PinT: http://www.fz-juelich.de/ias/jsc/pint > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun May 6 11:18:38 2018 From: jed at jedbrown.org (Jed Brown) Date: Sun, 06 May 2018 10:18:38 -0600 Subject: [petsc-users] petsc4py: parallel matrix-vector multiplication In-Reply-To: References: <54b02348-8c86-c7ef-c49e-3b97da9eea10@fz-juelich.de> <877eog4z05.fsf@jedbrown.org> Message-ID: <871seo4v1d.fsf@jedbrown.org> See formJacobian() in bratu3d.py for matrix assembly using DMDA. Robert Speck writes: > OK, thanks, I see. This is basically what happens in the poisson2d.py > example, too, right? > > I tried it with the shell matrix (?) used in the poisson2d example and > it works right away, but then I fail to see how to make use of the > preconditioners for KSP (see my original message).. > > Thanks again! > -Robert- > > > On 06.05.18 16:52, Jed Brown wrote: >> Robert Speck writes: >> >>> Thanks for your reply and help. Yes, this is going to be a PDE solver >>> for structured grids. 
The first goal would be IDC (or Crank-Nicholson) >>> for the heat equation, which would require both solving a linear system >>> and application of the matrix. >>> >>> The code I wrote for testing parallel matrix-vector multiplication can >>> be found here: >>> https://github.com/Parallel-in-Time/pySDC/blob/petsc/pySDC/playgrounds/PETSc/playground_matmult.py >>> >>> Both vectors and matrix come from the DMDA, but I guess filling them is >>> done in a wrong way? Or do I need to convert global/natural vectors to >>> local ones somewhere? >> >> Global and Natural are not the same (see user's manual for details). >> The matrix acts on a Global vector. See >> petsc4py/demo/bratu3d/bratu3d.py for examples of efficiently setting >> values (and computing residuals) using Global vectors. It should be >> simpler/cleaner code than you currently have. >> >>> >>> Best >>> -Robert- >>> >>> >>> >>> On 06.05.18 14:44, Dave May wrote: >>>> On Sun, 6 May 2018 at 10:40, Robert Speck >>> > wrote: >>>> >>>> Hi! >>>> >>>> I would like to do a matrix-vector multiplication (besides using linear >>>> solvers and so on) with petsc4py. I took the matrix from this example >>>> >>>> (https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py) >>>> >>>> >>>> This example only produces a matrix. And from the code the matrix >>>> produced is identical in serial or parallel.? >>>> >>>> >>>> >>>> and applied it to a PETSc Vector. All works well in serial, but in >>>> parallel (in particular if ordering becomes relevant) the resulting >>>> vector looks very different. >>>> >>>> >>>> Given this, the way you defined the x vector in y = A x must be >>>> different when run on 1 versus N mpi ranks.? >>>> >>>> >>>> Using the shell matrix of this example >>>> (https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py) >>>> helps, but then I cannot use matrix-based preconditioners for KSP >>>> directly (right?). I also tried using DMDA for creating vectors and >>>> matrix and for taking care of their ordering (which seems to be my >>>> problem here), but that did not help either. >>>> >>>> So, my question is this: How do I do easy parallel matrix-vector >>>> multiplication with petsc4py in a way that allows me to use parallel >>>> linear solvers etc. later on? I want to deal with spatial decomposition >>>> as little as possible. >>>> >>>> >>>> What's the context - are you solving a PDE? >>>> >>>> Assuming you are using your own grid object (e.g. as you might have if >>>> solving a PDE), and assuming you are not solving a 1D problem, you >>>> actually have to "deal" with the spatial decomposition otherwise >>>> performance could be quite terrible - even for something simple like a 5 >>>> point Laplacian on a structured grid in 2D >>>> >>>> What data structures should I use? DMDA or >>>> PETSc.Vec() and PETSc.Mat() or something else? >>>> >>>> >>>> The mat vec product is not causing you a problem. Your issue appears to >>>> be that you do not have a way to label entries in a vector in a >>>> consistent manner. >>>> >>>> What's the objective? Are you solving a PDE? If yes, structured grid? If >>>> yes again, use the DMDA. It takes care of all the local-to-global and >>>> global-to-local mapping you need. >>>> >>>> Thanks, >>>> ? Dave >>>> >>>> >>>> >>>> Thanks! >>>> -Robert- >>>> >>>> -- >>>> Dr. 
Robert Speck >>>> Juelich Supercomputing Centre >>>> Institute for Advanced Simulation >>>> Forschungszentrum Juelich GmbH >>>> 52425 Juelich, Germany >>>> >>>> Tel: +49 2461 61 1644 >>>> Fax: +49 2461 61 6656 >>>> >>>> Email:? ?r.speck at fz-juelich.de >>>> Website: http://www.fz-juelich.de/ias/jsc/speck_r >>>> PinT:? ? http://www.fz-juelich.de/ias/jsc/pint >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------------------------ >>>> ------------------------------------------------------------------------------------------------ >>>> Forschungszentrum Juelich GmbH >>>> 52425 Juelich >>>> Sitz der Gesellschaft: Juelich >>>> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 >>>> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher >>>> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), >>>> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, >>>> Prof. Dr. Sebastian M. Schmidt >>>> ------------------------------------------------------------------------------------------------ >>>> ------------------------------------------------------------------------------------------------ >>>> >>> >>> -- >>> Dr. Robert Speck >>> Juelich Supercomputing Centre >>> Institute for Advanced Simulation >>> Forschungszentrum Juelich GmbH >>> 52425 Juelich, Germany >>> >>> Tel: +49 2461 61 1644 >>> Fax: +49 2461 61 6656 >>> >>> Email: r.speck at fz-juelich.de >>> Website: http://www.fz-juelich.de/ias/jsc/speck_r >>> PinT: http://www.fz-juelich.de/ias/jsc/pint > > -- > Dr. Robert Speck > Juelich Supercomputing Centre > Institute for Advanced Simulation > Forschungszentrum Juelich GmbH > 52425 Juelich, Germany > > Tel: +49 2461 61 1644 > Fax: +49 2461 61 6656 > > Email: r.speck at fz-juelich.de > Website: http://www.fz-juelich.de/ias/jsc/speck_r > PinT: http://www.fz-juelich.de/ias/jsc/pint From zhaonanavril at gmail.com Mon May 7 10:34:46 2018 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Mon, 7 May 2018 10:34:46 -0500 Subject: [petsc-users] Performance of MatSetValues and display the converged tolerance archieved in KSPSolve Message-ID: Dear all, I am trying to integrated PETSC in a legacy FEM code, but I had a few troubles to get the performance of MatSetValues to match the old subroutines in the sequtial implementation : 1. I use MatSetVaules per element, to add the matrix value to each row and col this element had, and find it really slow compared with my old subroutine which directly add the value to a 1-d array to store the value in CRS format. For a case with 12K unknown, my old subroutines takes several seconds, but MatSetValues takes around 50 seconds to finish the matrix calculation part.... Did I do something wrong, I do have preallocation giving non-zeros for each row in this MATSEQ matrix... 2. I switched to MatCreateSeqAIJWithArrays, and the performance seems to be OK. I do not understand the difference, does MatCreateSeqAIJWithArrays call MatSetValues internally? or it is just the difference with INSERT_VALUES vs ADD_VALUES? 3, I want to know the converged rtol,atol of a KSPSolve, how to I do it? 4. I want to do a parallel implement of this too, but worried about the performance of MatSetValues, should I use MatCreateMPIAIJWithArrays? Thanks, Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 7 15:24:55 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Mon, 7 May 2018 20:24:55 +0000 Subject: [petsc-users] Performance of MatSetValues and display the converged tolerance archieved in KSPSolve In-Reply-To: References: Message-ID: <3E92A44D-1BE7-45A9-96CD-64B4B2B853FF@anl.gov> > On May 7, 2018, at 10:34 AM, NAN ZHAO wrote: > > Dear all, > > I am trying to integrated PETSC in a legacy FEM code, but I had a few troubles to get the performance of MatSetValues to match the old subroutines in the sequtial implementation : > > 1. I use MatSetVaules per element, to add the matrix value to each row and col this element had, and find it really slow compared with my old subroutine which directly add the value to a 1-d array to store the value in CRS format. For a case with 12K unknown, my old subroutines takes several seconds, but MatSetValues takes around 50 seconds to finish the matrix calculation part.... Did I do something wrong, I do have preallocation giving non-zeros for each row in this MATSEQ matrix... 50 seconds means something is incorrect with the preallocation: http://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly > > 2. I switched to MatCreateSeqAIJWithArrays, and the performance seems to be OK. I do not understand the difference, does MatCreateSeqAIJWithArrays call MatSetValues internally? The i, j, and a arrays are not copied by this routine, PETSc uses exactly what you provided without copying hence it is fast. > or it is just the difference with INSERT_VALUES vs ADD_VALUES? Insert vs add shouldn't make a performance difference. > > > 3, I want to know the converged rtol,atol of a KSPSolve, how to I do it? If you want to know the convergence criteria used by the test you can call KSPGetTolerances() You can run with -ksp_monitor to see the norm of the (preconditioned) residual at each iteration or -ksp_monitor_true_residual to see the norm of the non-preconditioned residual. You can use KSPSetResidualHistory() to have the residual norm saved at each iteration then after KSPSolve() you can access the array to see the final residual norm. > > 4. I want to do a parallel implement of this too, but worried about the performance of MatSetValues, should I use MatCreateMPIAIJWithArrays? First you need to get the preallocation working for sequential than change to parallel. > > Thanks, > > Nan From epscodes at gmail.com Mon May 7 15:25:52 2018 From: epscodes at gmail.com (Xiangdong) Date: Mon, 7 May 2018 16:25:52 -0400 Subject: [petsc-users] questions about dmdacreaterestriction Message-ID: Hello everyone, I have a 2D DMDA with Nx=6 and Ny=4. If I set refine_x=3 and refine_y=2, I can get a coarse DMDA by DMCoarsen(). However, when I tried to call the DMCreateRestriction between these two DMDAs, I got error about " No support for this operation for this object type. DMCreateRestriction not implemented for this type." Can you help me get around this? Another simple question, where can I find the explanation/meaning of DMDA_Q0 and DMDA_Q1? Below is my test example. Thank you. 
Best, Xiangdong #include int main(int argc, char **argv) { PetscErrorCode ierr; PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); int Nx=6, Ny=4, Nz=1; DM da; ierr = DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,Nx,Ny,PETSC_DECIDE, PETSC_DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); int refine_x=3, refine_y=2, refine_z=1; ierr = DMDASetRefinementFactor(da,refine_x,refine_y,refine_z); ierr = DMDASetInterpolationType(da,DMDA_Q0); ierr = DMSetFromOptions(da); CHKERRQ(ierr); ierr = DMSetUp(da); CHKERRQ(ierr); DM dac; ierr = DMCoarsen(da,NULL,&dac); Mat M; Vec V; ierr = DMCreateRestriction(dac,da,&M); MatView(M,PETSC_VIEWER_STDOUT_WORLD); ierr = PetscFinalize(); CHKERRQ(ierr); PetscFunctionReturn(0); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From nishantnangia329 at gmail.com Mon May 7 16:24:00 2018 From: nishantnangia329 at gmail.com (Nishant Nangia) Date: Mon, 7 May 2018 16:24:00 -0500 Subject: [petsc-users] Solving advection equations implicitly Message-ID: Hi all, I want to implicitly solve a linear advection equation of the form: dQ/dt + div(u*Q) = 0 for a scalar quantity Q, with some known velocity field u. Note that it is purely advection with no diffusion term. Is there a recommended solver/preconditioner combination to solve something like this? *Nishant Nangia* Northwestern University Ph.D. Candidate | Engineering Sciences and Applied Mathematics Tech L386 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 7 16:56:48 2018 From: jed at jedbrown.org (Jed Brown) Date: Mon, 07 May 2018 15:56:48 -0600 Subject: [petsc-users] Solving advection equations implicitly In-Reply-To: References: Message-ID: <87d0y7165b.fsf@jedbrown.org> Do you want it to be time accurate (implies CFL number is modest) or do you want very large time steps? If very large time steps, why not steady state? Nishant Nangia writes: > Hi all, > > I want to implicitly solve a linear advection equation of the form: > dQ/dt + div(u*Q) = 0 > > for a scalar quantity Q, with some known velocity field u. Note that it is > purely advection with no diffusion term. > > Is there a recommended solver/preconditioner combination to solve something > like this? > > *Nishant Nangia* > Northwestern University > Ph.D. Candidate | Engineering Sciences and Applied Mathematics > Tech L386 From ling.zou at inl.gov Mon May 7 17:00:58 2018 From: ling.zou at inl.gov (Zou, Ling) Date: Mon, 7 May 2018 16:00:58 -0600 Subject: [petsc-users] Solving advection equations implicitly In-Reply-To: References: Message-ID: I played with implicitly (BDF1 and BDF2) solving a linear advection equation before. For spatial discretization, I used upwind donor cell method. I did not give any specifications to the solver, so I suppose default works just fine. I did supply a Jacobian matrix for preconditioning purpose. Here a screen shot of -snes_view and -ksp_view is attached. Maybe PETSc team member could provide more useful info from this screenshot. Hope this helps. 
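For readers who want the scheme Ling describes spelled out: below is a minimal sketch -- not Ling's actual code, function and variable names are made up -- of the BDF1 (backward Euler) + upwind donor-cell residual for dQ/dt + d(u*Q)/dx = 0 on a uniform 1-D grid, written as F(Q) = 0 so a Newton/SNES-style solver can drive it to zero. It assumes constant u > 0 and a prescribed inflow value at the left boundary.

#include <stddef.h>

/* Residual of BDF1 + first-order upwind (donor cell) for constant u > 0. */
void bdf1_upwind_residual(size_t n, const double *Q, const double *Qold,
                          double u, double dt, double dx, double Qin,
                          double *F)
{
  size_t i;
  for (i = 0; i < n; ++i) {
    double Qdonor = (i == 0) ? Qin : Q[i - 1];  /* upwind (donor-cell) value */
    F[i] = (Q[i] - Qold[i]) / dt + u * (Q[i] - Qdonor) / dx;
  }
}

With this residual the implicit system is lower bidiagonal: the Jacobian Ling supplies for preconditioning would carry 1/dt + u/dx on the diagonal and -u/dx on the subdiagonal, which is why the default solver settings shown below handle it easily.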
-Ling SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=6 total number of function evaluations=11 norm schedule ALWAYS SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=30, initial guess is zero tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=600, cols=600 package used to perform factorization: petsc total: nonzeros=2994, allocated nonzeros=2994 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI processes type: mffd rows=600, cols=600 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 1 MPI processes type: seqaij rows=600, cols=600 total: nonzeros=2994, allocated nonzeros=2994 total number of mallocs used during MatSetValues calls =0 not using I-node routines On Mon, May 7, 2018 at 3:24 PM, Nishant Nangia wrote: > Hi all, > > I want to implicitly solve a linear advection equation of the form: > dQ/dt + div(u*Q) = 0 > > for a scalar quantity Q, with some known velocity field u. Note that it is > purely advection with no diffusion term. > > Is there a recommended solver/preconditioner combination to solve > something like this? > > *Nishant Nangia* > Northwestern University > Ph.D. Candidate | Engineering Sciences and Applied Mathematics > Tech L386 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nishantnangia329 at gmail.com Mon May 7 17:15:19 2018 From: nishantnangia329 at gmail.com (Nishant Nangia) Date: Mon, 7 May 2018 17:15:19 -0500 Subject: [petsc-users] Solving advection equations implicitly In-Reply-To: <87d0y7165b.fsf@jedbrown.org> References: <87d0y7165b.fsf@jedbrown.org> Message-ID: We basically want something that is stable and non-oscillatory for any given time step size. A little more context: Q is a fluid density variable, which we are currently updating using an explicit forward Euler step. This updated quantity is later used in a conservative discretization of the Navier-Stokes (NS) equations on a staggered mesh. We are using finite volume/finite differences. We observe oscillations in Q after a few time steps, which causes the overall mass of the domain to change and breaks the linear solvers for NS. We are experimenting with some strong stability preserving, TVD schemes to update this density, but were thinking of trying an implicit update instead. *Nishant Nangia* Northwestern University Ph.D. 
Candidate | Engineering Sciences and Applied Mathematics Tech L386 On Mon, May 7, 2018 at 4:56 PM, Jed Brown wrote: > Do you want it to be time accurate (implies CFL number is modest) or do > you want very large time steps? If very large time steps, why not > steady state? > > Nishant Nangia writes: > > > Hi all, > > > > I want to implicitly solve a linear advection equation of the form: > > dQ/dt + div(u*Q) = 0 > > > > for a scalar quantity Q, with some known velocity field u. Note that it > is > > purely advection with no diffusion term. > > > > Is there a recommended solver/preconditioner combination to solve > something > > like this? > > > > *Nishant Nangia* > > Northwestern University > > Ph.D. Candidate | Engineering Sciences and Applied Mathematics > > Tech L386 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 7 17:16:33 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 7 May 2018 22:16:33 +0000 Subject: [petsc-users] questions about dmdacreaterestriction In-Reply-To: References: Message-ID: <892440E3-B469-4419-BB7F-0196D036C0E9@anl.gov> > On May 7, 2018, at 3:25 PM, Xiangdong wrote: > > Hello everyone, > > I have a 2D DMDA with Nx=6 and Ny=4. If I set refine_x=3 and refine_y=2, I can get a coarse DMDA by DMCoarsen(). However, when I tried to call the DMCreateRestriction between these two DMDAs, I got error about " No support for this operation for this object type. DMCreateRestriction not implemented for this type." > > Can you help me get around this? Us DMCreateInterpolation() and then use MatMultTranspose() for example to apply the restriction. > > Another simple question, where can I find the explanation/meaning of DMDA_Q0 and DMDA_Q1? This is finite element terminology, Q0 means piecewise constant basis functions (and hence piecewise constant interpolation) while Q1 means linear interpolation. For Q0 you should think of the unknowns as living on the centers of each element while for Q1 the unknowns live on the vertices of the elements. Barry > > Below is my test example. > > Thank you. > > Best, > Xiangdong > > #include > > int main(int argc, char **argv) > { > PetscErrorCode ierr; > > PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); > > int Nx=6, Ny=4, Nz=1; > > DM da; > > ierr = DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,Nx,Ny,PETSC_DECIDE, PETSC_DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); > > int refine_x=3, refine_y=2, refine_z=1; > ierr = DMDASetRefinementFactor(da,refine_x,refine_y,refine_z); > > ierr = DMDASetInterpolationType(da,DMDA_Q0); > > ierr = DMSetFromOptions(da); CHKERRQ(ierr); > ierr = DMSetUp(da); CHKERRQ(ierr); > > DM dac; > ierr = DMCoarsen(da,NULL,&dac); > > Mat M; > Vec V; > > ierr = DMCreateRestriction(dac,da,&M); > > MatView(M,PETSC_VIEWER_STDOUT_WORLD); > > > ierr = PetscFinalize(); CHKERRQ(ierr); > PetscFunctionReturn(0); > } > From jed at jedbrown.org Mon May 7 17:20:23 2018 From: jed at jedbrown.org (Jed Brown) Date: Mon, 07 May 2018 16:20:23 -0600 Subject: [petsc-users] Solving advection equations implicitly In-Reply-To: References: <87d0y7165b.fsf@jedbrown.org> Message-ID: <877eof1520.fsf@jedbrown.org> There are not unconditionally strongly stable integrators of order higher than 1, unless you look into downwind methods which present more grave solver challenges. You can use implicit Euler, but it will be very diffusive. 
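A brief aside, not part of Jed's message, to quantify that diffusivity: for first-order upwind in 1-D with constant u > 0 and CFL number c = u*dt/dx, the backward (implicit) Euler scheme

  (Q_i^{n+1} - Q_i^n)/dt + u*(Q_i^{n+1} - Q_{i-1}^{n+1})/dx = 0

has the modified equation Q_t + u*Q_x = nu*Q_xx with numerical diffusion nu ~ (u*dx/2)*(1 + c). It is stable for any time step, but the smearing grows with dt; the explicit counterpart has nu ~ (u*dx/2)*(1 - c) and is restricted to c <= 1.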
You can test this out using a direct solve, but I think you'll be disappointed in the results of this discretization, in which case fast preconditioning is irrelevant. Nishant Nangia writes: > We basically want something that is stable and non-oscillatory for any > given time step size. > > A little more context: Q is a fluid density variable, which we are > currently updating using an explicit forward Euler step. This updated > quantity is later used in a conservative discretization of the > Navier-Stokes (NS) equations on a staggered mesh. We are using finite > volume/finite differences. > > We observe oscillations in Q after a few time steps, which causes the > overall mass of the domain to change and breaks the linear solvers for NS. > We are experimenting with some strong stability preserving, TVD schemes to > update this density, but were thinking of trying an implicit update instead. > > > *Nishant Nangia* > Northwestern University > Ph.D. Candidate | Engineering Sciences and Applied Mathematics > Tech L386 > > On Mon, May 7, 2018 at 4:56 PM, Jed Brown wrote: > >> Do you want it to be time accurate (implies CFL number is modest) or do >> you want very large time steps? If very large time steps, why not >> steady state? >> >> Nishant Nangia writes: >> >> > Hi all, >> > >> > I want to implicitly solve a linear advection equation of the form: >> > dQ/dt + div(u*Q) = 0 >> > >> > for a scalar quantity Q, with some known velocity field u. Note that it >> is >> > purely advection with no diffusion term. >> > >> > Is there a recommended solver/preconditioner combination to solve >> something >> > like this? >> > >> > *Nishant Nangia* >> > Northwestern University >> > Ph.D. Candidate | Engineering Sciences and Applied Mathematics >> > Tech L386 >> From ling.zou at inl.gov Mon May 7 17:24:53 2018 From: ling.zou at inl.gov (Zou, Ling) Date: Mon, 7 May 2018 16:24:53 -0600 Subject: [petsc-users] Solving advection equations implicitly In-Reply-To: References: <87d0y7165b.fsf@jedbrown.org> Message-ID: On Mon, May 7, 2018 at 4:15 PM, Nishant Nangia wrote: > We basically want something that is stable and non-oscillatory for any > given time step size. > TVD upwind + 1st order BDF (BDF1) As Jed already pointed out, there is no strong stability-reserving (SSP) scheme with oder > 1. > > A little more context: Q is a fluid density variable, which we are > currently updating using an explicit forward Euler step. This updated > quantity is later used in a conservative discretization of the > Navier-Stokes (NS) equations on a staggered mesh. We are using finite > volume/finite differences. > > We observe oscillations in Q after a few time steps, which causes the > overall mass of the domain to change and breaks the linear solvers for NS. > We are experimenting with some strong stability preserving, TVD schemes to > update this density, but were thinking of trying an implicit update instead. > > > *Nishant Nangia* > Northwestern University > Ph.D. Candidate | Engineering Sciences and Applied Mathematics > Tech L386 > > On Mon, May 7, 2018 at 4:56 PM, Jed Brown wrote: > >> Do you want it to be time accurate (implies CFL number is modest) or do >> you want very large time steps? If very large time steps, why not >> steady state? >> >> Nishant Nangia writes: >> >> > Hi all, >> > >> > I want to implicitly solve a linear advection equation of the form: >> > dQ/dt + div(u*Q) = 0 >> > >> > for a scalar quantity Q, with some known velocity field u. Note that it >> is >> > purely advection with no diffusion term. 
>> > >> > Is there a recommended solver/preconditioner combination to solve >> something >> > like this? >> > >> > *Nishant Nangia* >> > Northwestern University >> > Ph.D. Candidate | Engineering Sciences and Applied Mathematics >> > Tech L386 >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Mon May 7 20:11:42 2018 From: epscodes at gmail.com (Xiangdong) Date: Mon, 7 May 2018 21:11:42 -0400 Subject: [petsc-users] questions about dmdacreaterestriction In-Reply-To: <892440E3-B469-4419-BB7F-0196D036C0E9@anl.gov> References: <892440E3-B469-4419-BB7F-0196D036C0E9@anl.gov> Message-ID: Hi Barry, Thanks for your message. When I use DMCreateInterpolation, I got the error message like "[0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Coarsening factor in x must be 2". The error will be gone if I change refine_x = 3 to refine_x=2. Why must the coarsening factor be a factor of 2? Thank you. Xiangdong Below is my test code: int main(int argc, char **argv) { PetscErrorCode ierr; PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); int Nx=6, Ny=4, Nz=1; DM da; ierr = DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,Nx,Ny,PETSC_DECIDE, PETSC_DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); int refine_x=3, refine_y=2, refine_z=1; ierr = DMDASetRefinementFactor(da,refine_x,refine_y,refine_z); ierr = DMDASetInterpolationType(da,DMDA_Q0); ierr = DMSetFromOptions(da); CHKERRQ(ierr); ierr = DMSetUp(da); CHKERRQ(ierr); DM dac; ierr = DMCoarsen(da,NULL,&dac); Mat M; Vec V; ierr = DMCreateInterpolation(dac,da,&M,&V); MatView(M,PETSC_VIEWER_STDOUT_WORLD); ierr = PetscFinalize(); CHKERRQ(ierr); PetscFunctionReturn(0); } On Mon, May 7, 2018 at 6:16 PM, Smith, Barry F. wrote: > > > > > On May 7, 2018, at 3:25 PM, Xiangdong wrote: > > > > Hello everyone, > > > > I have a 2D DMDA with Nx=6 and Ny=4. If I set refine_x=3 and refine_y=2, > I can get a coarse DMDA by DMCoarsen(). However, when I tried to call the > DMCreateRestriction between these two DMDAs, I got error about " No support > for this operation for this object type. DMCreateRestriction not > implemented for this type." > > > > Can you help me get around this? > > Us DMCreateInterpolation() and then use MatMultTranspose() for example > to apply the restriction. > > > > > Another simple question, where can I find the explanation/meaning of > DMDA_Q0 and DMDA_Q1? > > This is finite element terminology, Q0 means piecewise constant basis > functions (and hence piecewise constant interpolation) while Q1 means > linear interpolation. > For Q0 you should think of the unknowns as living on the centers of each > element while for Q1 the unknowns live on the vertices of the elements. > > Barry > > > > > Below is my test example. > > > > Thank you. 
> > > > Best, > > Xiangdong > > > > #include > > > > int main(int argc, char **argv) > > { > > PetscErrorCode ierr; > > > > PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); > > > > int Nx=6, Ny=4, Nz=1; > > > > DM da; > > > > ierr = DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE, > DMDA_STENCIL_STAR,Nx,Ny,PETSC_DECIDE, PETSC_DECIDE,1,1,NULL,NULL,& > da);CHKERRQ(ierr); > > > > int refine_x=3, refine_y=2, refine_z=1; > > ierr = DMDASetRefinementFactor(da,refine_x,refine_y,refine_z); > > > > ierr = DMDASetInterpolationType(da,DMDA_Q0); > > > > ierr = DMSetFromOptions(da); CHKERRQ(ierr); > > ierr = DMSetUp(da); CHKERRQ(ierr); > > > > DM dac; > > ierr = DMCoarsen(da,NULL,&dac); > > > > Mat M; > > Vec V; > > > > ierr = DMCreateRestriction(dac,da,&M); > > > > MatView(M,PETSC_VIEWER_STDOUT_WORLD); > > > > > > ierr = PetscFinalize(); CHKERRQ(ierr); > > PetscFunctionReturn(0); > > } > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 7 21:45:31 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 8 May 2018 02:45:31 +0000 Subject: [petsc-users] questions about dmdacreaterestriction In-Reply-To: References: <892440E3-B469-4419-BB7F-0196D036C0E9@anl.gov> Message-ID: > On May 7, 2018, at 8:11 PM, Xiangdong wrote: > > Hi Barry, > > Thanks for your message. When I use DMCreateInterpolation, I got the error message like "[0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Coarsening factor in x must be 2". The error will be gone if I change refine_x = 3 to refine_x=2. Why must the coarsening factor be a factor of 2? We've only coded interpolation for some simple situations. DMDA is not fully function to all kinds of grid operations, it just does a few basic things to demonstrate how to use the solvers. Barry > > Thank you. > > Xiangdong > > Below is my test code: > > int main(int argc, char **argv) > { > PetscErrorCode ierr; > > PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); > > int Nx=6, Ny=4, Nz=1; > > DM da; > > ierr = DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,Nx,Ny,PETSC_DECIDE, PETSC_DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); > > int refine_x=3, refine_y=2, refine_z=1; > ierr = DMDASetRefinementFactor(da,refine_x,refine_y,refine_z); > > ierr = DMDASetInterpolationType(da,DMDA_Q0); > > ierr = DMSetFromOptions(da); CHKERRQ(ierr); > ierr = DMSetUp(da); CHKERRQ(ierr); > > DM dac; > ierr = DMCoarsen(da,NULL,&dac); > > Mat M; > Vec V; > > ierr = DMCreateInterpolation(dac,da,&M,&V); > > MatView(M,PETSC_VIEWER_STDOUT_WORLD); > > > ierr = PetscFinalize(); CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > On Mon, May 7, 2018 at 6:16 PM, Smith, Barry F. wrote: > > > > > On May 7, 2018, at 3:25 PM, Xiangdong wrote: > > > > Hello everyone, > > > > I have a 2D DMDA with Nx=6 and Ny=4. If I set refine_x=3 and refine_y=2, I can get a coarse DMDA by DMCoarsen(). However, when I tried to call the DMCreateRestriction between these two DMDAs, I got error about " No support for this operation for this object type. DMCreateRestriction not implemented for this type." > > > > Can you help me get around this? > > Us DMCreateInterpolation() and then use MatMultTranspose() for example to apply the restriction. > > > > > Another simple question, where can I find the explanation/meaning of DMDA_Q0 and DMDA_Q1? > > This is finite element terminology, Q0 means piecewise constant basis functions (and hence piecewise constant interpolation) while Q1 means linear interpolation. 
> For Q0 you should think of the unknowns as living on the centers of each element while for Q1 the unknowns live on the vertices of the elements. > > Barry > > > > > Below is my test example. > > > > Thank you. > > > > Best, > > Xiangdong > > > > #include > > > > int main(int argc, char **argv) > > { > > PetscErrorCode ierr; > > > > PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL); > > > > int Nx=6, Ny=4, Nz=1; > > > > DM da; > > > > ierr = DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,Nx,Ny,PETSC_DECIDE, PETSC_DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); > > > > int refine_x=3, refine_y=2, refine_z=1; > > ierr = DMDASetRefinementFactor(da,refine_x,refine_y,refine_z); > > > > ierr = DMDASetInterpolationType(da,DMDA_Q0); > > > > ierr = DMSetFromOptions(da); CHKERRQ(ierr); > > ierr = DMSetUp(da); CHKERRQ(ierr); > > > > DM dac; > > ierr = DMCoarsen(da,NULL,&dac); > > > > Mat M; > > Vec V; > > > > ierr = DMCreateRestriction(dac,da,&M); > > > > MatView(M,PETSC_VIEWER_STDOUT_WORLD); > > > > > > ierr = PetscFinalize(); CHKERRQ(ierr); > > PetscFunctionReturn(0); > > } > > > > From mhbaghaei at mail.sjtu.edu.cn Wed May 9 13:34:23 2018 From: mhbaghaei at mail.sjtu.edu.cn (Amir) Date: Thu, 10 May 2018 02:34:23 +0800 Subject: [petsc-users] 2D ploar domain Message-ID: <1525890109.local-a037e657-7ca2-v1.2.1-7e7447b6@getmailspring.com> Hello I have a 2D polar computational domain, which equation specified on a tensor-product mesh.As I increase the number of nodes at the near center, I encounter that the LineSearch which I use generallly as bt, does not go well. In fact, I would not be able to converge. Do you have any idea on this? Thanks Amir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 9 16:38:20 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 9 May 2018 17:38:20 -0400 Subject: [petsc-users] 2D ploar domain In-Reply-To: <1525890109.local-a037e657-7ca2-v1.2.1-7e7447b6@getmailspring.com> References: <1525890109.local-a037e657-7ca2-v1.2.1-7e7447b6@getmailspring.com> Message-ID: On Wed, May 9, 2018 at 2:34 PM, Amir wrote: > Hello > I have a 2D polar computational domain, which equation specified on a > tensor-product mesh.As I increase the number of nodes at the near center, I > encounter that the LineSearch which I use generallly as bt, does not go > well. In fact, I would not be able to converge. Do you have any idea on > this? > The convergence of nonlinear iterations depends strongly on the equation being solved. Its hard for us to say anything. Maybe with more specifics, but there is not a lot that can be said. Thanks, Matt > Thanks > Amir > > [image: Open Tracking] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dayedut123 at 163.com Fri May 11 03:23:26 2018 From: dayedut123 at 163.com (=?GBK?B?ztI=?=) Date: Fri, 11 May 2018 16:23:26 +0800 (CST) Subject: [petsc-users] Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11 Message-ID: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> Hello all, I use the function MatCreateMPIAIJWithArrays to construct my matrix. But the number of local rows m and local columns n may change during the timestep advancing. 
When the local size changes, the error like "Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11" will appear. Any suggestions about it ? Thank you very much! Daye -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Fri May 11 06:14:29 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 11 May 2018 12:14:29 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> Message-ID: Dear Matt, Thank you for your help last time. I want to get more detail about the Petsc-MUMPS factorisation; so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". And I found the following functions are quite important to the question: PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS r,const MatFactorInfo *info); PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const MatFactorInfo *info); PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); I print some sentence to trace when these functions are called. Then I test my code; the values in the matrix is changing but the structure stays the same. Below is the output. We can see that at 0th step, all the symbolic, numeric and solve are called; in the subsequent steps only the solve stage is called, the numeric step is not called. Iteration 0 Step 0.0005 Time 0.0005 [INFO]: Direct Solver setup MatCholeskyFactorSymbolic_MUMPS finish MatCholeskyFactorSymbolic_MUMPS MatFactorNumeric_MUMPS finish MatFactorNumeric_MUMPS MatSolve_MUMPS Iteration 1 Step 0.0005 Time 0.0005 MatSolve_MUMPS Iteration 2 Step 0.0005 Time 0.001 MatSolve_MUMPS [INFO]: End of program!!! I am wondering if there is any possibility to split the numeric and solve stage (as you mentioned using KSPSolve). Thank you very much indeed. Kind Regards, Shidi On 2018-05-04 21:10, Y. Shidi wrote: > Thank you very much for your reply. > That is really clear. > > Kind Regards, > Shidi > > On 2018-05-04 21:05, Matthew Knepley wrote: >> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >> >>> Dear Matt, >>> >>> Thank you very much for your reply! >>> So what you mean is that I can just do the KSPSolve() every >>> iteration >>> once the MUMPS is set? >> >> Yes. >> >>> That means inside the KSPSolve() the numerical factorization is >>> performed. If that is the case, it seems that the ksp object is >>> not changed when the values in the matrix are changed. >> >> Yes. >> >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >> >> If you do SetOperators, it will redo the factorization. If you do not, >> it will look >> at the Mat object, determine that the structure has not changed, and >> just redo >> the numerical factorization. >> >> Thanks, >> >> Matt >> >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>> >>> Dear PETSc users, >>> >>> I am currently using MUMPS to solve linear systems directly. >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>> step and then solve the system. >>> >>> In my code, the values in the matrix is changed in each iteration, >>> but the structure of the matrix stays the same, which means the >>> performance can be improved if symbolic factorisation is only >>> performed once. Hence, it is necessary to split the symbolic >>> and numeric factorisation. However, I cannot find a specific step >>> (control parameter) to perform the numeric factorisation. 
>>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>> it seems that the symbolic and numeric factorisation always perform >>> together. >>> >>> If you use KSPSolve instead, it will automatically preserve the >>> symbolic >>> factorization. >>> >>> Thanks, >>> >>> Matt >>> >>> So I am wondering if anyone has an idea about it. >>> >>> Below is how I set up MUMPS solver: >>> PC pc; >>> PetscBool flg_mumps, flg_mumps_ch; >>> flg_mumps = PETSC_FALSE; >>> flg_mumps_ch = PETSC_FALSE; >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>> NULL); >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>> NULL); >>> if(flg_mumps ||flg_mumps_ch) >>> { >>> KSPSetType(_ksp, KSPPREONLY); >>> PetscInt ival,icntl; >>> PetscReal val; >>> KSPGetPC(_ksp, &pc); >>> /// Set preconditioner type >>> if(flg_mumps) >>> { >>> PCSetType(pc, PCLU); >>> } >>> else if(flg_mumps_ch) >>> { >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>> PCSetType(pc, PCCHOLESKY); >>> } >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>> PCFactorSetUpMatSolverPackage(pc); >>> PCFactorGetMatrix(pc, &_F); >>> icntl = 7; ival = 0; >>> MatMumpsSetIcntl( _F, icntl, ival ); >>> MatMumpsSetIcntl(_F, 3, 6); >>> MatMumpsSetIcntl(_F, 4, 2); >>> } >>> KSPSetUp(_ksp); >>> >>> Kind Regards, >>> Shidi >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] >>> >>> Links: >>> ------ >>> [1] http://www.caam.rice.edu/~mk51/ [2] >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [2] >> >> >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ >> [2] http://www.caam.rice.edu/~mk51/ From knepley at gmail.com Fri May 11 06:59:52 2018 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 May 2018 07:59:52 -0400 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> Message-ID: On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: > Dear Matt, > > Thank you for your help last time. > I want to get more detail about the Petsc-MUMPS factorisation; > so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". > And I found the following functions are quite important to > the question: > > PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS r,const > MatFactorInfo *info); > PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const MatFactorInfo > *info); > PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); > > I print some sentence to trace when these functions are called. > Then I test my code; the values in the matrix is changing but the > structure stays the same. Below is the output. > We can see that at 0th step, all the symbolic, numeric and solve > are called; in the subsequent steps only the solve stage is called, > the numeric step is not called. > How are you changing the matrix? Do you remember to assemble? 
Matt > Iteration 0 Step 0.0005 Time 0.0005 > [INFO]: Direct Solver setup > MatCholeskyFactorSymbolic_MUMPS > finish MatCholeskyFactorSymbolic_MUMPS > MatFactorNumeric_MUMPS > finish MatFactorNumeric_MUMPS > MatSolve_MUMPS > > Iteration 1 Step 0.0005 Time 0.0005 > MatSolve_MUMPS > > Iteration 2 Step 0.0005 Time 0.001 > MatSolve_MUMPS > > [INFO]: End of program!!! > > > I am wondering if there is any possibility to split the numeric > and solve stage (as you mentioned using KSPSolve). > > Thank you very much indeed. > > Kind Regards, > Shidi > > On 2018-05-04 21:10, Y. Shidi wrote: > >> Thank you very much for your reply. >> That is really clear. >> >> Kind Regards, >> Shidi >> >> On 2018-05-04 21:05, Matthew Knepley wrote: >> >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>> >>> Dear Matt, >>>> >>>> Thank you very much for your reply! >>>> So what you mean is that I can just do the KSPSolve() every >>>> iteration >>>> once the MUMPS is set? >>>> >>> >>> Yes. >>> >>> That means inside the KSPSolve() the numerical factorization is >>>> performed. If that is the case, it seems that the ksp object is >>>> not changed when the values in the matrix are changed. >>>> >>> >>> Yes. >>> >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>>> >>> >>> If you do SetOperators, it will redo the factorization. If you do not, >>> it will look >>> at the Mat object, determine that the structure has not changed, and >>> just redo >>> the numerical factorization. >>> >>> Thanks, >>> >>> Matt >>> >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>>> >>>> Dear PETSc users, >>>> >>>> I am currently using MUMPS to solve linear systems directly. >>>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>>> step and then solve the system. >>>> >>>> In my code, the values in the matrix is changed in each iteration, >>>> but the structure of the matrix stays the same, which means the >>>> performance can be improved if symbolic factorisation is only >>>> performed once. Hence, it is necessary to split the symbolic >>>> and numeric factorisation. However, I cannot find a specific step >>>> (control parameter) to perform the numeric factorisation. >>>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>>> it seems that the symbolic and numeric factorisation always perform >>>> together. >>>> >>>> If you use KSPSolve instead, it will automatically preserve the >>>> symbolic >>>> factorization. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> So I am wondering if anyone has an idea about it. 
>>>> >>>> Below is how I set up MUMPS solver: >>>> PC pc; >>>> PetscBool flg_mumps, flg_mumps_ch; >>>> flg_mumps = PETSC_FALSE; >>>> flg_mumps_ch = PETSC_FALSE; >>>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>>> NULL); >>>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>>> NULL); >>>> if(flg_mumps ||flg_mumps_ch) >>>> { >>>> KSPSetType(_ksp, KSPPREONLY); >>>> PetscInt ival,icntl; >>>> PetscReal val; >>>> KSPGetPC(_ksp, &pc); >>>> /// Set preconditioner type >>>> if(flg_mumps) >>>> { >>>> PCSetType(pc, PCLU); >>>> } >>>> else if(flg_mumps_ch) >>>> { >>>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>>> PCSetType(pc, PCCHOLESKY); >>>> } >>>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>>> PCFactorSetUpMatSolverPackage(pc); >>>> PCFactorGetMatrix(pc, &_F); >>>> icntl = 7; ival = 0; >>>> MatMumpsSetIcntl( _F, icntl, ival ); >>>> MatMumpsSetIcntl(_F, 3, 6); >>>> MatMumpsSetIcntl(_F, 4, 2); >>>> } >>>> KSPSetUp(_ksp); >>>> >>>> Kind Regards, >>>> Shidi >>>> >>>> -- >>>> >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to >>>> which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ [1] [1] >>>> >>>> Links: >>>> ------ >>>> [1] http://www.caam.rice.edu/~mk51/ [2] >>>> >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ [2] >>> >>> >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ >>> [2] http://www.caam.rice.edu/~mk51/ >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 11 07:07:35 2018 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 May 2018 08:07:35 -0400 Subject: [petsc-users] Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11 In-Reply-To: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> References: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> Message-ID: On Fri, May 11, 2018 at 4:23 AM, ? wrote: > Hello all, > I use the function MatCreateMPIAIJWithArrays to construct my matrix. But > the number of local rows m and local columns n may change during the > timestep advancing. When the local size changes, the error like "Petsc > error: cannot chang local size of Amat after use old sizes 10 10 new sizes > 11 11" will appear. Any suggestions about it ? > If the parallel layout changes, you need to create a new matrix. Thanks, Matt > Thank you very much! > Daye > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Fri May 11 08:14:01 2018 From: ys453 at cam.ac.uk (Y. 
Shidi) Date: Fri, 11 May 2018 14:14:01 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> Message-ID: <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> Thank you for your reply. > How are you changing the matrix? Do you remember to assemble? I use MatCreateMPIAIJWithArrays() to create the matrix, and after that I call MatAssemblyBegin() and MatAssemblyEnd(). But I actually destroy the matrix at the end of each iteration and create the matrix at the beginning of each iteration. Cheers, Shidi On 2018-05-11 12:59, Matthew Knepley wrote: > On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: > >> Dear Matt, >> >> Thank you for your help last time. >> I want to get more detail about the Petsc-MUMPS factorisation; >> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >> And I found the following functions are quite important to >> the question: >> >> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >> r,const MatFactorInfo *info); >> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >> MatFactorInfo *info); >> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >> >> I print some sentence to trace when these functions are called. >> Then I test my code; the values in the matrix is changing but the >> structure stays the same. Below is the output. >> We can see that at 0th step, all the symbolic, numeric and solve >> are called; in the subsequent steps only the solve stage is called, >> the numeric step is not called. > > How are you changing the matrix? Do you remember to assemble? > > Matt > >> Iteration 0 Step 0.0005 Time 0.0005 >> [INFO]: Direct Solver setup >> MatCholeskyFactorSymbolic_MUMPS >> finish MatCholeskyFactorSymbolic_MUMPS >> MatFactorNumeric_MUMPS >> finish MatFactorNumeric_MUMPS >> MatSolve_MUMPS >> >> Iteration 1 Step 0.0005 Time 0.0005 >> MatSolve_MUMPS >> >> Iteration 2 Step 0.0005 Time 0.001 >> MatSolve_MUMPS >> >> [INFO]: End of program!!! >> >> I am wondering if there is any possibility to split the numeric >> and solve stage (as you mentioned using KSPSolve). >> >> Thank you very much indeed. >> >> Kind Regards, >> Shidi >> >> On 2018-05-04 21:10, Y. Shidi wrote: >> Thank you very much for your reply. >> That is really clear. >> >> Kind Regards, >> Shidi >> >> On 2018-05-04 21:05, Matthew Knepley wrote: >> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >> >> Dear Matt, >> >> Thank you very much for your reply! >> So what you mean is that I can just do the KSPSolve() every >> iteration >> once the MUMPS is set? >> >> Yes. >> >> That means inside the KSPSolve() the numerical factorization is >> performed. If that is the case, it seems that the ksp object is >> not changed when the values in the matrix are changed. >> >> Yes. >> >> Or do I need to call both KSPSetOperators() and KSPSolve()? >> >> If you do SetOperators, it will redo the factorization. If you do >> not, >> it will look >> at the Mat object, determine that the structure has not changed, >> and >> just redo >> the numerical factorization. >> >> Thanks, >> >> Matt >> >> On 2018-05-04 14:44, Matthew Knepley wrote: >> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >> >> Dear PETSc users, >> >> I am currently using MUMPS to solve linear systems directly. >> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >> step and then solve the system. 
>> >> In my code, the values in the matrix is changed in each iteration, >> but the structure of the matrix stays the same, which means the >> performance can be improved if symbolic factorisation is only >> performed once. Hence, it is necessary to split the symbolic >> and numeric factorisation. However, I cannot find a specific step >> (control parameter) to perform the numeric factorisation. >> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >> it seems that the symbolic and numeric factorisation always perform >> together. >> >> If you use KSPSolve instead, it will automatically preserve the >> symbolic >> factorization. >> >> Thanks, >> >> Matt >> >> So I am wondering if anyone has an idea about it. >> >> Below is how I set up MUMPS solver: >> PC pc; >> PetscBool flg_mumps, flg_mumps_ch; >> flg_mumps = PETSC_FALSE; >> flg_mumps_ch = PETSC_FALSE; >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >> NULL); >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >> NULL); >> if(flg_mumps ||flg_mumps_ch) >> { >> KSPSetType(_ksp, KSPPREONLY); >> PetscInt ival,icntl; >> PetscReal val; >> KSPGetPC(_ksp, &pc); >> /// Set preconditioner type >> if(flg_mumps) >> { >> PCSetType(pc, PCLU); >> } >> else if(flg_mumps_ch) >> { >> MatSetOption(A, MAT_SPD, PETSC_TRUE); >> PCSetType(pc, PCCHOLESKY); >> } >> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >> PCFactorSetUpMatSolverPackage(pc); >> PCFactorGetMatrix(pc, &_F); >> icntl = 7; ival = 0; >> MatMumpsSetIcntl( _F, icntl, ival ); >> MatMumpsSetIcntl(_F, 3, 6); >> MatMumpsSetIcntl(_F, 4, 2); >> } >> KSPSetUp(_ksp); >> >> Kind Regards, >> Shidi >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] >> >> Links: >> ------ >> [1] http://www.caam.rice.edu/~mk51/ [2] [2] >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] [2] >> >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ [1] >> [2] http://www.caam.rice.edu/~mk51/ [2] > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [2] > > > Links: > ------ > [1] https://www.cse.buffalo.edu/~knepley/ > [2] http://www.caam.rice.edu/~mk51/ From bsmith at mcs.anl.gov Fri May 11 10:13:18 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 11 May 2018 15:13:18 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> Message-ID: > On May 11, 2018, at 8:14 AM, Y. Shidi wrote: > > Thank you for your reply. > >> How are you changing the matrix? Do you remember to assemble? > I use MatCreateMPIAIJWithArrays() to create the matrix, > and after that I call MatAssemblyBegin() and MatAssemblyEnd(). If you use MatCreateMPIAIJWithArrays() you don't need to call MatAssemblyBegin() and MatAssemblyEnd(). 
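For illustration, a minimal sketch of that call -- toy sizes and values, not taken from Shidi's code: each rank passes its local rows in CSR form (row pointers local and zero-based, column indices global), and the returned Mat comes back already assembled, so no MatAssemblyBegin/End is needed.

#include <petscmat.h>

int main(int argc, char **argv)
{
  /* 2x2 diagonal toy matrix; run on one process for simplicity */
  PetscInt       rowptr[] = {0, 1, 2};
  PetscInt       colind[] = {0, 1};
  PetscScalar    vals[]   = {4.0, 9.0};
  Mat            A;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, 2, 2, PETSC_DETERMINE,
                                   PETSC_DETERMINE, rowptr, colind, vals, &A);CHKERRQ(ierr);
  ierr = MatView(A, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);  /* no assembly calls needed */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}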
> But I actually destroy the matrix at the end of each iteration > and create the matrix at the beginning of each iteration. This is a bug in PETSc. Since you are providing a new matrix with the same "state" value as the previous matrix the PC code the following code kicks in: ierr = PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); if (!pc->setupcalled) { ierr = PetscInfo(pc,"Setting up PC for first time\n");CHKERRQ(ierr); pc->flag = DIFFERENT_NONZERO_PATTERN; } else if (matstate == pc->matstate) { ierr = PetscInfo(pc,"Leaving PC with identical preconditioner since operator is unchanged\n");CHKERRQ(ierr); PetscFunctionReturn(0); and it returns without refactoring. We need an additional check that the matrix also remains the same. We will also need a test example that reproduces the problem to confirm that we have fixed it. Barry > > Cheers, > Shidi > > On 2018-05-11 12:59, Matthew Knepley wrote: >> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>> Dear Matt, >>> Thank you for your help last time. >>> I want to get more detail about the Petsc-MUMPS factorisation; >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>> And I found the following functions are quite important to >>> the question: >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>> r,const MatFactorInfo *info); >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>> MatFactorInfo *info); >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>> I print some sentence to trace when these functions are called. >>> Then I test my code; the values in the matrix is changing but the >>> structure stays the same. Below is the output. >>> We can see that at 0th step, all the symbolic, numeric and solve >>> are called; in the subsequent steps only the solve stage is called, >>> the numeric step is not called. >> How are you changing the matrix? Do you remember to assemble? >> Matt >>> Iteration 0 Step 0.0005 Time 0.0005 >>> [INFO]: Direct Solver setup >>> MatCholeskyFactorSymbolic_MUMPS >>> finish MatCholeskyFactorSymbolic_MUMPS >>> MatFactorNumeric_MUMPS >>> finish MatFactorNumeric_MUMPS >>> MatSolve_MUMPS >>> Iteration 1 Step 0.0005 Time 0.0005 >>> MatSolve_MUMPS >>> Iteration 2 Step 0.0005 Time 0.001 >>> MatSolve_MUMPS >>> [INFO]: End of program!!! >>> I am wondering if there is any possibility to split the numeric >>> and solve stage (as you mentioned using KSPSolve). >>> Thank you very much indeed. >>> Kind Regards, >>> Shidi >>> On 2018-05-04 21:10, Y. Shidi wrote: >>> Thank you very much for your reply. >>> That is really clear. >>> Kind Regards, >>> Shidi >>> On 2018-05-04 21:05, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>> Dear Matt, >>> Thank you very much for your reply! >>> So what you mean is that I can just do the KSPSolve() every >>> iteration >>> once the MUMPS is set? >>> Yes. >>> That means inside the KSPSolve() the numerical factorization is >>> performed. If that is the case, it seems that the ksp object is >>> not changed when the values in the matrix are changed. >>> Yes. >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>> If you do SetOperators, it will redo the factorization. If you do >>> not, >>> it will look >>> at the Mat object, determine that the structure has not changed, >>> and >>> just redo >>> the numerical factorization. 
>>> Thanks, >>> Matt >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>> Dear PETSc users, >>> I am currently using MUMPS to solve linear systems directly. >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>> step and then solve the system. >>> In my code, the values in the matrix is changed in each iteration, >>> but the structure of the matrix stays the same, which means the >>> performance can be improved if symbolic factorisation is only >>> performed once. Hence, it is necessary to split the symbolic >>> and numeric factorisation. However, I cannot find a specific step >>> (control parameter) to perform the numeric factorisation. >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>> it seems that the symbolic and numeric factorisation always perform >>> together. >>> If you use KSPSolve instead, it will automatically preserve the >>> symbolic >>> factorization. >>> Thanks, >>> Matt >>> So I am wondering if anyone has an idea about it. >>> Below is how I set up MUMPS solver: >>> PC pc; >>> PetscBool flg_mumps, flg_mumps_ch; >>> flg_mumps = PETSC_FALSE; >>> flg_mumps_ch = PETSC_FALSE; >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>> NULL); >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>> NULL); >>> if(flg_mumps ||flg_mumps_ch) >>> { >>> KSPSetType(_ksp, KSPPREONLY); >>> PetscInt ival,icntl; >>> PetscReal val; >>> KSPGetPC(_ksp, &pc); >>> /// Set preconditioner type >>> if(flg_mumps) >>> { >>> PCSetType(pc, PCLU); >>> } >>> else if(flg_mumps_ch) >>> { >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>> PCSetType(pc, PCCHOLESKY); >>> } >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>> PCFactorSetUpMatSolverPackage(pc); >>> PCFactorGetMatrix(pc, &_F); >>> icntl = 7; ival = 0; >>> MatMumpsSetIcntl( _F, icntl, ival ); >>> MatMumpsSetIcntl(_F, 3, 6); >>> MatMumpsSetIcntl(_F, 4, 2); >>> } >>> KSPSetUp(_ksp); >>> Kind Regards, >>> Shidi >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] >>> Links: >>> ------ >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>> [2] http://www.caam.rice.edu/~mk51/ [2] >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ [2] >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ >> [2] http://www.caam.rice.edu/~mk51/ From ys453 at cam.ac.uk Fri May 11 12:02:16 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 11 May 2018 18:02:16 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> Message-ID: <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> Thank you very much for your reply, Barry. > This is a bug in PETSc. 
Since you are providing a new matrix with > the same "state" value as the previous matrix the PC code the > following code So what you mean is that every time I change the value in the matrix, the PETSc only determines if the nonzero pattern change but not the values, and if it is unchanged neither of symbolic and numeric happens. I found the following code: if (!pc->setupcalled) { ierr = PetscInfo(pc,"Setting up PC for first time\n");CHKERRQ(ierr); pc->flag = DIFFERENT_NONZERO_PATTERN; } else if (matstate == pc->matstate) { ierr = PetscInfo(pc,"Leaving PC with identical preconditioner since operator is unchanged\n");CHKERRQ(ierr); PetscFunctionReturn(0); } else { if (matnonzerostate > pc->matnonzerostate) { ierr = PetscInfo(pc,"Setting up PC with different nonzero pattern\n");CHKERRQ(ierr); pc->flag = DIFFERENT_NONZERO_PATTERN; } else { ierr = PetscInfo(pc,"Setting up PC with same nonzero pattern\n");CHKERRQ(ierr); pc->flag = SAME_NONZERO_PATTERN; } } and I commend out "else if (matstate == pc->matstate){}", so it will do "Setting up PC with same nonzero pattern\n"; and it seems work in my case, only "MatFactorNumeric_MUMPS()" is calling in the subsequent iterations. But I am not quite sure, need some more tests. Thank you very much for your help indeed. Kind Regards, Shidi On 2018-05-11 16:13, Smith, Barry F. wrote: >> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >> >> Thank you for your reply. >> >>> How are you changing the matrix? Do you remember to assemble? >> I use MatCreateMPIAIJWithArrays() to create the matrix, >> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). > > If you use MatCreateMPIAIJWithArrays() you don't need to call > MatAssemblyBegin() and MatAssemblyEnd(). > >> But I actually destroy the matrix at the end of each iteration >> and create the matrix at the beginning of each iteration. > > This is a bug in PETSc. Since you are providing a new matrix with > the same "state" value as the previous matrix the PC code the > following code > kicks in: > > ierr = > PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); > ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); > if (!pc->setupcalled) { > ierr = PetscInfo(pc,"Setting up PC for first > time\n");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else if (matstate == pc->matstate) { > ierr = PetscInfo(pc,"Leaving PC with identical preconditioner > since operator is unchanged\n");CHKERRQ(ierr); > PetscFunctionReturn(0); > > and it returns without refactoring. > > We need an additional check that the matrix also remains the same. > > We will also need a test example that reproduces the problem to > confirm that we have fixed it. > > Barry > >> >> Cheers, >> Shidi >> >> On 2018-05-11 12:59, Matthew Knepley wrote: >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>>> Dear Matt, >>>> Thank you for your help last time. >>>> I want to get more detail about the Petsc-MUMPS factorisation; >>>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>>> And I found the following functions are quite important to >>>> the question: >>>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>>> r,const MatFactorInfo *info); >>>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>>> MatFactorInfo *info); >>>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>>> I print some sentence to trace when these functions are called. >>>> Then I test my code; the values in the matrix is changing but the >>>> structure stays the same. Below is the output. 
>>>> We can see that at 0th step, all the symbolic, numeric and solve >>>> are called; in the subsequent steps only the solve stage is called, >>>> the numeric step is not called. >>> How are you changing the matrix? Do you remember to assemble? >>> Matt >>>> Iteration 0 Step 0.0005 Time 0.0005 >>>> [INFO]: Direct Solver setup >>>> MatCholeskyFactorSymbolic_MUMPS >>>> finish MatCholeskyFactorSymbolic_MUMPS >>>> MatFactorNumeric_MUMPS >>>> finish MatFactorNumeric_MUMPS >>>> MatSolve_MUMPS >>>> Iteration 1 Step 0.0005 Time 0.0005 >>>> MatSolve_MUMPS >>>> Iteration 2 Step 0.0005 Time 0.001 >>>> MatSolve_MUMPS >>>> [INFO]: End of program!!! >>>> I am wondering if there is any possibility to split the numeric >>>> and solve stage (as you mentioned using KSPSolve). >>>> Thank you very much indeed. >>>> Kind Regards, >>>> Shidi >>>> On 2018-05-04 21:10, Y. Shidi wrote: >>>> Thank you very much for your reply. >>>> That is really clear. >>>> Kind Regards, >>>> Shidi >>>> On 2018-05-04 21:05, Matthew Knepley wrote: >>>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>>> Dear Matt, >>>> Thank you very much for your reply! >>>> So what you mean is that I can just do the KSPSolve() every >>>> iteration >>>> once the MUMPS is set? >>>> Yes. >>>> That means inside the KSPSolve() the numerical factorization is >>>> performed. If that is the case, it seems that the ksp object is >>>> not changed when the values in the matrix are changed. >>>> Yes. >>>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>>> If you do SetOperators, it will redo the factorization. If you do >>>> not, >>>> it will look >>>> at the Mat object, determine that the structure has not changed, >>>> and >>>> just redo >>>> the numerical factorization. >>>> Thanks, >>>> Matt >>>> On 2018-05-04 14:44, Matthew Knepley wrote: >>>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>>> Dear PETSc users, >>>> I am currently using MUMPS to solve linear systems directly. >>>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>>> step and then solve the system. >>>> In my code, the values in the matrix is changed in each iteration, >>>> but the structure of the matrix stays the same, which means the >>>> performance can be improved if symbolic factorisation is only >>>> performed once. Hence, it is necessary to split the symbolic >>>> and numeric factorisation. However, I cannot find a specific step >>>> (control parameter) to perform the numeric factorisation. >>>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>>> it seems that the symbolic and numeric factorisation always perform >>>> together. >>>> If you use KSPSolve instead, it will automatically preserve the >>>> symbolic >>>> factorization. >>>> Thanks, >>>> Matt >>>> So I am wondering if anyone has an idea about it. 
>>>> Below is how I set up MUMPS solver: >>>> PC pc; >>>> PetscBool flg_mumps, flg_mumps_ch; >>>> flg_mumps = PETSC_FALSE; >>>> flg_mumps_ch = PETSC_FALSE; >>>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>>> NULL); >>>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>>> NULL); >>>> if(flg_mumps ||flg_mumps_ch) >>>> { >>>> KSPSetType(_ksp, KSPPREONLY); >>>> PetscInt ival,icntl; >>>> PetscReal val; >>>> KSPGetPC(_ksp, &pc); >>>> /// Set preconditioner type >>>> if(flg_mumps) >>>> { >>>> PCSetType(pc, PCLU); >>>> } >>>> else if(flg_mumps_ch) >>>> { >>>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>>> PCSetType(pc, PCCHOLESKY); >>>> } >>>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>>> PCFactorSetUpMatSolverPackage(pc); >>>> PCFactorGetMatrix(pc, &_F); >>>> icntl = 7; ival = 0; >>>> MatMumpsSetIcntl( _F, icntl, ival ); >>>> MatMumpsSetIcntl(_F, 3, 6); >>>> MatMumpsSetIcntl(_F, 4, 2); >>>> } >>>> KSPSetUp(_ksp); >>>> Kind Regards, >>>> Shidi >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to >>>> which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] >>>> Links: >>>> ------ >>>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to >>>> which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>>> Links: >>>> ------ >>>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>>> [2] http://www.caam.rice.edu/~mk51/ [2] >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [2] >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ >>> [2] http://www.caam.rice.edu/~mk51/ From knepley at gmail.com Fri May 11 12:10:38 2018 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 May 2018 13:10:38 -0400 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> Message-ID: On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: > Thank you very much for your reply, Barry. > > This is a bug in PETSc. Since you are providing a new matrix with >> the same "state" value as the previous matrix the PC code the >> following code >> > So what you mean is that every time I change the value in the matrix, > the PETSc only determines if the nonzero pattern change but not the > values, and if it is unchanged neither of symbolic and numeric > happens. > No, that is not what Barry is saying. PETSc looks at the matrix. If the structure has changed, it does symbolic and numeric factorization. If only values have changes, it does numeric factorization. HOWEVER, you gave it a new matrix with accidentally the same state marker, so it thought nothing had changed. We will fix this by also checking the pointer. For now, if you give the same Mat object back, it will do what you expect. 
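As an illustration of the pattern Matt describes -- hypothetical sizes and toy tridiagonal values, not code from this thread -- the sketch below keeps one Mat for the whole run, overwrites its values and reassembles it every step, and hands the same object back to the KSP; with a factorization preconditioner (e.g. -pc_type cholesky -pc_factor_mat_solver_package mumps) the symbolic factorization is then done only at step 0 and later steps redo just the numeric phase.

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscInt       n = 10, i, Istart, Iend, step;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n,
                      3, NULL, 1, NULL, &A);CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

  for (step = 0; step < 3; ++step) {
    ierr = MatZeroEntries(A);CHKERRQ(ierr);            /* same structure, new values */
    ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
    for (i = Istart; i < Iend; ++i) {                  /* toy tridiagonal fill */
      PetscInt    col[3], nc = 0;
      PetscScalar v[3];
      if (i > 0)     { col[nc] = i - 1; v[nc++] = -1.0; }
      col[nc] = i; v[nc++] = 2.0 + step;               /* values change each step */
      if (i < n - 1) { col[nc] = i + 1; v[nc++] = -1.0; }
      ierr = MatSetValues(A, 1, &i, nc, col, v, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

    ierr = VecSet(b, 1.0);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);   /* same Mat object every step */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);          /* step 0: symbolic + numeric;
                                                          later steps: numeric only  */
  }

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}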
Matt > I found the following code: > > if (!pc->setupcalled) { > ierr = PetscInfo(pc,"Setting up PC for first > time\n");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else if (matstate == pc->matstate) { > ierr = PetscInfo(pc,"Leaving PC with identical preconditioner since > operator is unchanged\n");CHKERRQ(ierr); > PetscFunctionReturn(0); > } else { > if (matnonzerostate > pc->matnonzerostate) { > ierr = PetscInfo(pc,"Setting up PC with different nonzero > pattern\n");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else { > ierr = PetscInfo(pc,"Setting up PC with same nonzero > pattern\n");CHKERRQ(ierr); > pc->flag = SAME_NONZERO_PATTERN; > } > } > > and I commend out "else if (matstate == pc->matstate){}", so it > will do "Setting up PC with same nonzero pattern\n"; and it seems > work in my case, only "MatFactorNumeric_MUMPS()" is calling in the > subsequent iterations. But I am not quite sure, need some more tests. > > Thank you very much for your help indeed. > > Kind Regards, > Shidi > > > > On 2018-05-11 16:13, Smith, Barry F. wrote: > >> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >>> >>> Thank you for your reply. >>> >>> How are you changing the matrix? Do you remember to assemble? >>>> >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >>> >> >> If you use MatCreateMPIAIJWithArrays() you don't need to call >> MatAssemblyBegin() and MatAssemblyEnd(). >> >> But I actually destroy the matrix at the end of each iteration >>> and create the matrix at the beginning of each iteration. >>> >> >> This is a bug in PETSc. Since you are providing a new matrix with >> the same "state" value as the previous matrix the PC code the >> following code >> kicks in: >> >> ierr = PetscObjectStateGet((PetscObject)pc->pmat,&matstate); >> CHKERRQ(ierr); >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >> if (!pc->setupcalled) { >> ierr = PetscInfo(pc,"Setting up PC for first >> time\n");CHKERRQ(ierr); >> pc->flag = DIFFERENT_NONZERO_PATTERN; >> } else if (matstate == pc->matstate) { >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >> since operator is unchanged\n");CHKERRQ(ierr); >> PetscFunctionReturn(0); >> >> and it returns without refactoring. >> >> We need an additional check that the matrix also remains the same. >> >> We will also need a test example that reproduces the problem to >> confirm that we have fixed it. >> >> Barry >> >> >>> Cheers, >>> Shidi >>> >>> On 2018-05-11 12:59, Matthew Knepley wrote: >>> >>>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>>> >>>>> Dear Matt, >>>>> Thank you for your help last time. >>>>> I want to get more detail about the Petsc-MUMPS factorisation; >>>>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>>>> And I found the following functions are quite important to >>>>> the question: >>>>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>>>> r,const MatFactorInfo *info); >>>>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>>>> MatFactorInfo *info); >>>>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>>>> I print some sentence to trace when these functions are called. >>>>> Then I test my code; the values in the matrix is changing but the >>>>> structure stays the same. Below is the output. >>>>> We can see that at 0th step, all the symbolic, numeric and solve >>>>> are called; in the subsequent steps only the solve stage is called, >>>>> the numeric step is not called. 
>>>>> [...]
>>>>> Below is how I set up MUMPS solver: >>>>> PC pc; >>>>> PetscBool flg_mumps, flg_mumps_ch; >>>>> flg_mumps = PETSC_FALSE; >>>>> flg_mumps_ch = PETSC_FALSE; >>>>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>>>> NULL); >>>>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>>>> NULL); >>>>> if(flg_mumps ||flg_mumps_ch) >>>>> { >>>>> KSPSetType(_ksp, KSPPREONLY); >>>>> PetscInt ival,icntl; >>>>> PetscReal val; >>>>> KSPGetPC(_ksp, &pc); >>>>> /// Set preconditioner type >>>>> if(flg_mumps) >>>>> { >>>>> PCSetType(pc, PCLU); >>>>> } >>>>> else if(flg_mumps_ch) >>>>> { >>>>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>>>> PCSetType(pc, PCCHOLESKY); >>>>> } >>>>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>>>> PCFactorSetUpMatSolverPackage(pc); >>>>> PCFactorGetMatrix(pc, &_F); >>>>> icntl = 7; ival = 0; >>>>> MatMumpsSetIcntl( _F, icntl, ival ); >>>>> MatMumpsSetIcntl(_F, 3, 6); >>>>> MatMumpsSetIcntl(_F, 4, 2); >>>>> } >>>>> KSPSetUp(_ksp); >>>>> Kind Regards, >>>>> Shidi >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to >>>>> which >>>>> their experiments lead. >>>>> -- Norbert Wiener >>>>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] >>>>> Links: >>>>> ------ >>>>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to >>>>> which >>>>> their experiments lead. >>>>> -- Norbert Wiener >>>>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>>>> Links: >>>>> ------ >>>>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>>>> [2] http://www.caam.rice.edu/~mk51/ [2] >>>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> https://www.cse.buffalo.edu/~knepley/ [2] >>>> Links: >>>> ------ >>>> [1] https://www.cse.buffalo.edu/~knepley/ >>>> [2] http://www.caam.rice.edu/~mk51/ >>>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From evanum at gmail.com Fri May 11 13:28:56 2018 From: evanum at gmail.com (Evan Um) Date: Fri, 11 May 2018 11:28:56 -0700 Subject: [petsc-users] Placing FindPETSc.cmake Message-ID: Hi, I would like to ask a question about FindPETSc.cmake. I place the cmake file in the same directory where main.cpp is placed. I also placed the file in /usr/share/cmake_xx/Modules. Where should i put the file? What else should I do to use the file in cmake? Do I need any other lines in my cmakelists.txt except find_package(petsc)? Thanks for your comments. Evan ------------------------------------------------------------- cmake_minimum_required(VERSION 3.10) project(hellopetsc) SET(CMAKE_CXX_STANDARD 11) SET(CMAKE_C_COMPILER mpicc) SET(CMAKE_CXX_COMPILER mpicxx) find_package(PETSC COMPONENTS CXX) add_executable(hellopetsc main.cpp) ------------------------------------------------------------ CMake Warning at CMakeLists.txt:9 (find_package): By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this project has asked CMake to find a package configuration file provided by "PETSC", but CMake did not find one. 
Could not find a package configuration file provided by "PETSC" with any of the following names: PETSCConfig.cmake petsc-config.cmake Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set "PETSC_DIR" to a directory containing one of the above files. If "PETSC" provides a separate development package or SDK, be sure it has been installed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Fri May 11 13:34:40 2018 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Fri, 11 May 2018 21:34:40 +0300 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: Message-ID: CMAKE is case sensitive on this. You should use find_package(PETSc ?.) > On May 11, 2018, at 9:28 PM, Evan Um wrote: > > Hi, > > I would like to ask a question about FindPETSc.cmake. I place the cmake file in the same directory where main.cpp is placed. I also placed the file in /usr/share/cmake_xx/Modules. Actually, it can be put in any directory pointed by the variable CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ > Where should i put the file? What else should I do to use the file in cmake? Do I need any other lines in my cmakelists.txt except find_package(petsc)? Thanks for your comments. > > Evan > > ------------------------------------------------------------- > > cmake_minimum_required(VERSION 3.10) > > project(hellopetsc) > > SET(CMAKE_CXX_STANDARD 11) > SET(CMAKE_C_COMPILER mpicc) > SET(CMAKE_CXX_COMPILER mpicxx) > > find_package(PETSC COMPONENTS CXX) > > add_executable(hellopetsc main.cpp) > > ------------------------------------------------------------ > CMake Warning at CMakeLists.txt:9 (find_package): > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this project has > asked CMake to find a package configuration file provided by "PETSC", but > CMake did not find one. > > Could not find a package configuration file provided by "PETSC" with any of > the following names: > > PETSCConfig.cmake > petsc-config.cmake > > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set > "PETSC_DIR" to a directory containing one of the above files. If "PETSC" > provides a separate development package or SDK, be sure it has been > installed. From evanum at gmail.com Fri May 11 13:56:38 2018 From: evanum at gmail.com (Evan Um) Date: Fri, 11 May 2018 11:56:38 -0700 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: Message-ID: Hi Stefano, Thanks for your comment. Now, cmake was able to locate FindPETSc.cmake file in my project directory, but I see a new error. Evan Messages ---------------------------------------------- /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc CMake Error at FindPETSc.cmake:123 (include): include could not find load file: FindPackageMultipass Call Stack (most recent call first): CMakeLists.txt:10 (find_package) CMake Error at FindPETSc.cmake:124 (find_package_multipass): Unknown CMake command "find_package_multipass". Call Stack (most recent call first): CMakeLists.txt:10 (find_package) -- Configuring incomplete, errors occurred! See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". 
---------------------------------------------- CMakeLists.txt ---------------------------------------------- /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc CMake Error at FindPETSc.cmake:123 (include): include could not find load file: FindPackageMultipass Call Stack (most recent call first): CMakeLists.txt:10 (find_package) CMake Error at FindPETSc.cmake:124 (find_package_multipass): Unknown CMake command "find_package_multipass". Call Stack (most recent call first): CMakeLists.txt:10 (find_package) -- Configuring incomplete, errors occurred! See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". [Finished] On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini wrote: > CMAKE is case sensitive on this. You should use find_package(PETSc ?.) > > > > On May 11, 2018, at 9:28 PM, Evan Um wrote: > > > > Hi, > > > > I would like to ask a question about FindPETSc.cmake. I place the cmake > file in the same directory where main.cpp is placed. I also placed the file > in /usr/share/cmake_xx/Modules. > > Actually, it can be put in any directory pointed by the variable > CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ > > > Where should i put the file? What else should I do to use the file in > cmake? Do I need any other lines in my cmakelists.txt except > find_package(petsc)? Thanks for your comments. > > > > Evan > > > > ------------------------------------------------------------- > > > > cmake_minimum_required(VERSION 3.10) > > > > project(hellopetsc) > > > > SET(CMAKE_CXX_STANDARD 11) > > SET(CMAKE_C_COMPILER mpicc) > > SET(CMAKE_CXX_COMPILER mpicxx) > > > > find_package(PETSC COMPONENTS CXX) > > > > add_executable(hellopetsc main.cpp) > > > > ------------------------------------------------------------ > > CMake Warning at CMakeLists.txt:9 (find_package): > > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this project > has > > asked CMake to find a package configuration file provided by "PETSC", > but > > CMake did not find one. > > > > Could not find a package configuration file provided by "PETSC" with > any of > > the following names: > > > > PETSCConfig.cmake > > petsc-config.cmake > > > > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set > > "PETSC_DIR" to a directory containing one of the above files. If > "PETSC" > > provides a separate development package or SDK, be sure it has been > > installed. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri May 11 14:08:42 2018 From: jed at jedbrown.org (Jed Brown) Date: Fri, 11 May 2018 13:08:42 -0600 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: Message-ID: <874ljeau2t.fsf@jedbrown.org> Yes, it depends on this module from the same repository. Note that you can use pkg-config to find PETSc these days. Evan Um writes: > Hi Stefano, > > Thanks for your comment. Now, cmake was able to locate FindPETSc.cmake file > in my project directory, but I see a new error. 
> > Evan > > > Messages > ---------------------------------------------- > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > CMake Error at FindPETSc.cmake:123 (include): > include could not find load file: > > FindPackageMultipass > Call Stack (most recent call first): > CMakeLists.txt:10 (find_package) > > > CMake Error at FindPETSc.cmake:124 (find_package_multipass): > Unknown CMake command "find_package_multipass". > Call Stack (most recent call first): > CMakeLists.txt:10 (find_package) > > > -- Configuring incomplete, errors occurred! > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". > ---------------------------------------------- > > CMakeLists.txt > ---------------------------------------------- > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > CMake Error at FindPETSc.cmake:123 (include): > include could not find load file: > > FindPackageMultipass > Call Stack (most recent call first): > CMakeLists.txt:10 (find_package) > > > CMake Error at FindPETSc.cmake:124 (find_package_multipass): > Unknown CMake command "find_package_multipass". > Call Stack (most recent call first): > CMakeLists.txt:10 (find_package) > > > -- Configuring incomplete, errors occurred! > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". > > [Finished] > > > > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini > wrote: > >> CMAKE is case sensitive on this. You should use find_package(PETSc ?.) >> >> >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: >> > >> > Hi, >> > >> > I would like to ask a question about FindPETSc.cmake. I place the cmake >> file in the same directory where main.cpp is placed. I also placed the file >> in /usr/share/cmake_xx/Modules. >> >> Actually, it can be put in any directory pointed by the variable >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ >> >> > Where should i put the file? What else should I do to use the file in >> cmake? Do I need any other lines in my cmakelists.txt except >> find_package(petsc)? Thanks for your comments. >> > >> > Evan >> > >> > ------------------------------------------------------------- >> > >> > cmake_minimum_required(VERSION 3.10) >> > >> > project(hellopetsc) >> > >> > SET(CMAKE_CXX_STANDARD 11) >> > SET(CMAKE_C_COMPILER mpicc) >> > SET(CMAKE_CXX_COMPILER mpicxx) >> > >> > find_package(PETSC COMPONENTS CXX) >> > >> > add_executable(hellopetsc main.cpp) >> > >> > ------------------------------------------------------------ >> > CMake Warning at CMakeLists.txt:9 (find_package): >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this project >> has >> > asked CMake to find a package configuration file provided by "PETSC", >> but >> > CMake did not find one. >> > >> > Could not find a package configuration file provided by "PETSC" with >> any of >> > the following names: >> > >> > PETSCConfig.cmake >> > petsc-config.cmake >> > >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set >> > "PETSC_DIR" to a directory containing one of the above files. 
If >> "PETSC" >> > provides a separate development package or SDK, be sure it has been >> > installed. >> >> From bsmith at mcs.anl.gov Fri May 11 14:08:54 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 11 May 2018 19:08:54 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> Message-ID: <60859FBB-436A-4BD4-8473-AEF5F5C6516C@mcs.anl.gov> > On May 11, 2018, at 12:10 PM, Matthew Knepley wrote: > > On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: > Thank you very much for your reply, Barry. > > This is a bug in PETSc. Since you are providing a new matrix with > the same "state" value as the previous matrix the PC code the > following code > So what you mean is that every time I change the value in the matrix, > the PETSc only determines if the nonzero pattern change but not the > values, and if it is unchanged neither of symbolic and numeric > happens. > > No, that is not what Barry is saying. Matt, what you say below is exactly what I was saying. > > PETSc looks at the matrix. > If the structure has changed, it does symbolic and numeric factorization. > If only values have changes, it does numeric factorization. > > HOWEVER, you gave it a new matrix with accidentally the same state marker, > so it thought nothing had changed. We will fix this by also checking the pointer. > For now, if you give the same Mat object back, it will do what you expect. > > Matt > > I found the following code: > > if (!pc->setupcalled) { > ierr = PetscInfo(pc,"Setting up PC for first time\n");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else if (matstate == pc->matstate) { > ierr = PetscInfo(pc,"Leaving PC with identical preconditioner since operator is unchanged\n");CHKERRQ(ierr); > PetscFunctionReturn(0); > } else { > if (matnonzerostate > pc->matnonzerostate) { > ierr = PetscInfo(pc,"Setting up PC with different nonzero pattern\n");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else { > ierr = PetscInfo(pc,"Setting up PC with same nonzero pattern\n");CHKERRQ(ierr); > pc->flag = SAME_NONZERO_PATTERN; > } > } > > and I commend out "else if (matstate == pc->matstate){}", so it > will do "Setting up PC with same nonzero pattern\n"; and it seems > work in my case, only "MatFactorNumeric_MUMPS()" is calling in the > subsequent iterations. But I am not quite sure, need some more tests. > > Thank you very much for your help indeed. > > Kind Regards, > Shidi > > > > On 2018-05-11 16:13, Smith, Barry F. wrote: > On May 11, 2018, at 8:14 AM, Y. Shidi wrote: > > Thank you for your reply. > > How are you changing the matrix? Do you remember to assemble? > I use MatCreateMPIAIJWithArrays() to create the matrix, > and after that I call MatAssemblyBegin() and MatAssemblyEnd(). > > If you use MatCreateMPIAIJWithArrays() you don't need to call > MatAssemblyBegin() and MatAssemblyEnd(). > > But I actually destroy the matrix at the end of each iteration > and create the matrix at the beginning of each iteration. > > This is a bug in PETSc. 
Since you are providing a new matrix with > the same "state" value as the previous matrix the PC code the > following code > kicks in: > > ierr = PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); > ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); > if (!pc->setupcalled) { > ierr = PetscInfo(pc,"Setting up PC for first > time\n");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else if (matstate == pc->matstate) { > ierr = PetscInfo(pc,"Leaving PC with identical preconditioner > since operator is unchanged\n");CHKERRQ(ierr); > PetscFunctionReturn(0); > > and it returns without refactoring. > > We need an additional check that the matrix also remains the same. > > We will also need a test example that reproduces the problem to > confirm that we have fixed it. > > Barry > > > Cheers, > Shidi > > On 2018-05-11 12:59, Matthew Knepley wrote: > On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: > Dear Matt, > Thank you for your help last time. > I want to get more detail about the Petsc-MUMPS factorisation; > so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". > And I found the following functions are quite important to > the question: > PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS > r,const MatFactorInfo *info); > PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const > MatFactorInfo *info); > PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); > I print some sentence to trace when these functions are called. > Then I test my code; the values in the matrix is changing but the > structure stays the same. Below is the output. > We can see that at 0th step, all the symbolic, numeric and solve > are called; in the subsequent steps only the solve stage is called, > the numeric step is not called. > How are you changing the matrix? Do you remember to assemble? > Matt > Iteration 0 Step 0.0005 Time 0.0005 > [INFO]: Direct Solver setup > MatCholeskyFactorSymbolic_MUMPS > finish MatCholeskyFactorSymbolic_MUMPS > MatFactorNumeric_MUMPS > finish MatFactorNumeric_MUMPS > MatSolve_MUMPS > Iteration 1 Step 0.0005 Time 0.0005 > MatSolve_MUMPS > Iteration 2 Step 0.0005 Time 0.001 > MatSolve_MUMPS > [INFO]: End of program!!! > I am wondering if there is any possibility to split the numeric > and solve stage (as you mentioned using KSPSolve). > Thank you very much indeed. > Kind Regards, > Shidi > On 2018-05-04 21:10, Y. Shidi wrote: > Thank you very much for your reply. > That is really clear. > Kind Regards, > Shidi > On 2018-05-04 21:05, Matthew Knepley wrote: > On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: > Dear Matt, > Thank you very much for your reply! > So what you mean is that I can just do the KSPSolve() every > iteration > once the MUMPS is set? > Yes. > That means inside the KSPSolve() the numerical factorization is > performed. If that is the case, it seems that the ksp object is > not changed when the values in the matrix are changed. > Yes. > Or do I need to call both KSPSetOperators() and KSPSolve()? > If you do SetOperators, it will redo the factorization. If you do > not, > it will look > at the Mat object, determine that the structure has not changed, > and > just redo > the numerical factorization. > Thanks, > Matt > On 2018-05-04 14:44, Matthew Knepley wrote: > On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: > Dear PETSc users, > I am currently using MUMPS to solve linear systems directly. > Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing > step and then solve the system. 
> In my code, the values in the matrix is changed in each iteration, > but the structure of the matrix stays the same, which means the > performance can be improved if symbolic factorisation is only > performed once. Hence, it is necessary to split the symbolic > and numeric factorisation. However, I cannot find a specific step > (control parameter) to perform the numeric factorisation. > I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, > it seems that the symbolic and numeric factorisation always perform > together. > If you use KSPSolve instead, it will automatically preserve the > symbolic > factorization. > Thanks, > Matt > So I am wondering if anyone has an idea about it. > Below is how I set up MUMPS solver: > PC pc; > PetscBool flg_mumps, flg_mumps_ch; > flg_mumps = PETSC_FALSE; > flg_mumps_ch = PETSC_FALSE; > PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, > NULL); > PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, > NULL); > if(flg_mumps ||flg_mumps_ch) > { > KSPSetType(_ksp, KSPPREONLY); > PetscInt ival,icntl; > PetscReal val; > KSPGetPC(_ksp, &pc); > /// Set preconditioner type > if(flg_mumps) > { > PCSetType(pc, PCLU); > } > else if(flg_mumps_ch) > { > MatSetOption(A, MAT_SPD, PETSC_TRUE); > PCSetType(pc, PCCHOLESKY); > } > PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(pc); > PCFactorGetMatrix(pc, &_F); > icntl = 7; ival = 0; > MatMumpsSetIcntl( _F, icntl, ival ); > MatMumpsSetIcntl(_F, 3, 6); > MatMumpsSetIcntl(_F, 4, 2); > } > KSPSetUp(_ksp); > Kind Regards, > Shidi > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which > their experiments lead. > -- Norbert Wiener > https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] > Links: > ------ > [1] http://www.caam.rice.edu/~mk51/ [2] [2] > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which > their experiments lead. > -- Norbert Wiener > https://www.cse.buffalo.edu/~knepley/ [1] [2] > Links: > ------ > [1] https://www.cse.buffalo.edu/~knepley/ [1] > [2] http://www.caam.rice.edu/~mk51/ [2] > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > https://www.cse.buffalo.edu/~knepley/ [2] > Links: > ------ > [1] https://www.cse.buffalo.edu/~knepley/ > [2] http://www.caam.rice.edu/~mk51/ > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From evanum at gmail.com Fri May 11 16:06:21 2018 From: evanum at gmail.com (Evan Um) Date: Fri, 11 May 2018 14:06:21 -0700 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: <874ljeau2t.fsf@jedbrown.org> References: <874ljeau2t.fsf@jedbrown.org> Message-ID: Hi Jed, Thanks for the comment. I added the module but still saw errors (before I arrived here, I had to do cp - r /home/evan/petsc//lib/petsc/ conf/rules /home/evan/petsc//lib/petsc/conf/petscrules). They are supposed to be defined by FindPETSc.cmake. How could I solve these errors? Could you also explain a little bit about how to use pkg-config to find PETSc? Thanks! 
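(One way the pkg-config route Jed mentions can look — a sketch, not from this thread: it assumes the PETSc build generated $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig/PETSc.pc and that this directory is visible to pkg-config when cmake is run.)

----------------
cmake_minimum_required(VERSION 3.10)
project(hellopetsc CXX)

# Make PETSc.pc visible to pkg-config first, e.g.
#   export PKG_CONFIG_PATH=$PETSC_DIR/$PETSC_ARCH/lib/pkgconfig
find_package(PkgConfig REQUIRED)
pkg_search_module(PETSC REQUIRED PETSc petsc)

include_directories(${PETSC_INCLUDE_DIRS})
link_directories(${PETSC_LIBRARY_DIRS})

add_executable(hellopetsc main.cpp)
target_link_libraries(hellopetsc ${PETSC_LIBRARIES})
----------------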
Evan ---------------- cmake_minimum_required(VERSION 3.10) project(hellopetsc) list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) SET(CMAKE_CXX_STANDARD 11) SET(CMAKE_C_COMPILER mpicc) SET(CMAKE_CXX_COMPILER mpicxx) find_package(PETSc COMPONENTS CXX) add_executable(hellopetsc main.cpp) ----------------- ----------------- /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found version "3.9.1") -- Configuring done -- Generating done -- Build files have been written to: /home/evan/CLionProjects/ hellopetsc/cmake-build-debug [Finished] ----------------- On Fri, May 11, 2018 at 12:08 PM, Jed Brown wrote: > Yes, it depends on this module from the same repository. > > Note that you can use pkg-config to find PETSc these days. > > Evan Um writes: > > > Hi Stefano, > > > > Thanks for your comment. Now, cmake was able to locate FindPETSc.cmake > file > > in my project directory, but I see a new error. > > > > Evan > > > > > > Messages > > ---------------------------------------------- > > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > -DCMAKE_BUILD_TYPE=Debug > > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > > CMake Error at FindPETSc.cmake:123 (include): > > include could not find load file: > > > > FindPackageMultipass > > Call Stack (most recent call first): > > CMakeLists.txt:10 (find_package) > > > > > > CMake Error at FindPETSc.cmake:124 (find_package_multipass): > > Unknown CMake command "find_package_multipass". > > Call Stack (most recent call first): > > CMakeLists.txt:10 (find_package) > > > > > > -- Configuring incomplete, errors occurred! > > See also > > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ > CMakeFiles/CMakeOutput.log". > > See also > > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ > CMakeFiles/CMakeError.log". > > ---------------------------------------------- > > > > CMakeLists.txt > > ---------------------------------------------- > > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > -DCMAKE_BUILD_TYPE=Debug > > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > > CMake Error at FindPETSc.cmake:123 (include): > > include could not find load file: > > > > FindPackageMultipass > > Call Stack (most recent call first): > > CMakeLists.txt:10 (find_package) > > > > > > CMake Error at FindPETSc.cmake:124 (find_package_multipass): > > Unknown CMake command "find_package_multipass". > > Call Stack (most recent call first): > > CMakeLists.txt:10 (find_package) > > > > > > -- Configuring incomplete, errors occurred! > > See also > > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ > CMakeFiles/CMakeOutput.log". > > See also > > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ > CMakeFiles/CMakeError.log". > > > > [Finished] > > > > > > > > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini < > stefano.zampini at gmail.com > >> wrote: > > > >> CMAKE is case sensitive on this. You should use find_package(PETSc ?.) > >> > >> > >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: > >> > > >> > Hi, > >> > > >> > I would like to ask a question about FindPETSc.cmake. I place the > cmake > >> file in the same directory where main.cpp is placed. I also placed the > file > >> in /usr/share/cmake_xx/Modules. 
> >> > >> Actually, it can be put in any directory pointed by the variable > >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ > >> > >> > Where should i put the file? What else should I do to use the file in > >> cmake? Do I need any other lines in my cmakelists.txt except > >> find_package(petsc)? Thanks for your comments. > >> > > >> > Evan > >> > > >> > ------------------------------------------------------------- > >> > > >> > cmake_minimum_required(VERSION 3.10) > >> > > >> > project(hellopetsc) > >> > > >> > SET(CMAKE_CXX_STANDARD 11) > >> > SET(CMAKE_C_COMPILER mpicc) > >> > SET(CMAKE_CXX_COMPILER mpicxx) > >> > > >> > find_package(PETSC COMPONENTS CXX) > >> > > >> > add_executable(hellopetsc main.cpp) > >> > > >> > ------------------------------------------------------------ > >> > CMake Warning at CMakeLists.txt:9 (find_package): > >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this project > >> has > >> > asked CMake to find a package configuration file provided by > "PETSC", > >> but > >> > CMake did not find one. > >> > > >> > Could not find a package configuration file provided by "PETSC" with > >> any of > >> > the following names: > >> > > >> > PETSCConfig.cmake > >> > petsc-config.cmake > >> > > >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set > >> > "PETSC_DIR" to a directory containing one of the above files. If > >> "PETSC" > >> > provides a separate development package or SDK, be sure it has been > >> > installed. > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 11 16:13:23 2018 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 May 2018 17:13:23 -0400 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: <874ljeau2t.fsf@jedbrown.org> Message-ID: On Fri, May 11, 2018 at 5:06 PM, Evan Um wrote: > Hi Jed, > > Thanks for the comment. I added the module but still saw errors (before I > arrived here, I had to do cp - r /home/evan/petsc//lib/petsc/ > conf/rules /home/evan/petsc//lib/petsc/conf/petscrules). > That is wrong. It means PETSC_ARCH is (null) instead of the correct string. Matt > They are supposed to be defined by FindPETSc.cmake. How could I solve > these errors? > > Could you also explain a little bit about how to use pkg-config to find > PETSc? Thanks! > > Evan > > ---------------- > cmake_minimum_required(VERSION 3.10) > > project(hellopetsc) > list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) > > SET(CMAKE_CXX_STANDARD 11) > SET(CMAKE_C_COMPILER mpicc) > SET(CMAKE_CXX_COMPILER mpicxx) > > find_package(PETSc COMPONENTS CXX) > > add_executable(hellopetsc main.cpp) > ----------------- > > ----------------- > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" > /home/evan/CLionProjects/hellopetsc > -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. > (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found > version "3.9.1") > -- Configuring done > -- Generating done > -- Build files have been written to: /home/evan/CLionProjects/hello > petsc/cmake-build-debug > > [Finished] > ----------------- > > > On Fri, May 11, 2018 at 12:08 PM, Jed Brown wrote: > >> Yes, it depends on this module from the same repository. >> >> Note that you can use pkg-config to find PETSc these days. >> >> Evan Um writes: >> >> > Hi Stefano, >> > >> > Thanks for your comment. 
Now, cmake was able to locate FindPETSc.cmake >> file >> > in my project directory, but I see a new error. >> > >> > Evan >> > >> > >> > Messages >> > ---------------------------------------------- >> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> -DCMAKE_BUILD_TYPE=Debug >> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >> > CMake Error at FindPETSc.cmake:123 (include): >> > include could not find load file: >> > >> > FindPackageMultipass >> > Call Stack (most recent call first): >> > CMakeLists.txt:10 (find_package) >> > >> > >> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >> > Unknown CMake command "find_package_multipass". >> > Call Stack (most recent call first): >> > CMakeLists.txt:10 (find_package) >> > >> > >> > -- Configuring incomplete, errors occurred! >> > See also >> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> Files/CMakeOutput.log". >> > See also >> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> Files/CMakeError.log". >> > ---------------------------------------------- >> > >> > CMakeLists.txt >> > ---------------------------------------------- >> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> -DCMAKE_BUILD_TYPE=Debug >> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >> > CMake Error at FindPETSc.cmake:123 (include): >> > include could not find load file: >> > >> > FindPackageMultipass >> > Call Stack (most recent call first): >> > CMakeLists.txt:10 (find_package) >> > >> > >> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >> > Unknown CMake command "find_package_multipass". >> > Call Stack (most recent call first): >> > CMakeLists.txt:10 (find_package) >> > >> > >> > -- Configuring incomplete, errors occurred! >> > See also >> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> Files/CMakeOutput.log". >> > See also >> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> Files/CMakeError.log". >> > >> > [Finished] >> > >> > >> > >> > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini < >> stefano.zampini at gmail.com >> >> wrote: >> > >> >> CMAKE is case sensitive on this. You should use find_package(PETSc ?.) >> >> >> >> >> >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: >> >> > >> >> > Hi, >> >> > >> >> > I would like to ask a question about FindPETSc.cmake. I place the >> cmake >> >> file in the same directory where main.cpp is placed. I also placed the >> file >> >> in /usr/share/cmake_xx/Modules. >> >> >> >> Actually, it can be put in any directory pointed by the variable >> >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ >> >> >> >> > Where should i put the file? What else should I do to use the file in >> >> cmake? Do I need any other lines in my cmakelists.txt except >> >> find_package(petsc)? Thanks for your comments. 
>> >> > >> >> > Evan >> >> > >> >> > ------------------------------------------------------------- >> >> > >> >> > cmake_minimum_required(VERSION 3.10) >> >> > >> >> > project(hellopetsc) >> >> > >> >> > SET(CMAKE_CXX_STANDARD 11) >> >> > SET(CMAKE_C_COMPILER mpicc) >> >> > SET(CMAKE_CXX_COMPILER mpicxx) >> >> > >> >> > find_package(PETSC COMPONENTS CXX) >> >> > >> >> > add_executable(hellopetsc main.cpp) >> >> > >> >> > ------------------------------------------------------------ >> >> > CMake Warning at CMakeLists.txt:9 (find_package): >> >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this >> project >> >> has >> >> > asked CMake to find a package configuration file provided by >> "PETSC", >> >> but >> >> > CMake did not find one. >> >> > >> >> > Could not find a package configuration file provided by "PETSC" >> with >> >> any of >> >> > the following names: >> >> > >> >> > PETSCConfig.cmake >> >> > petsc-config.cmake >> >> > >> >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set >> >> > "PETSC_DIR" to a directory containing one of the above files. If >> >> "PETSC" >> >> > provides a separate development package or SDK, be sure it has been >> >> > installed. >> >> >> >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From evanum at gmail.com Fri May 11 17:04:47 2018 From: evanum at gmail.com (Evan Um) Date: Fri, 11 May 2018 15:04:47 -0700 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: <874ljeau2t.fsf@jedbrown.org> Message-ID: Matt and Jed, Thanks for your comments. I still have a trouble in using FindPETSc.cmake. I removed the directory: (i.e. canceling cp - r /home/evan/petsc//lib/petsc/ conf/rules /home/evan/petsc//lib/petsc/conf/petscrules) and explicitly set SET (PETSC_ARCH "arch-linux2-c-debug all"). Now, FindPETSc.cmake does not recognize PETSC_DIR and PETSC_ARCH. When I read the comments inside FindPETSc.cmake, the twos should be refined by FindPETSc.cmake. BTW, I have my petsc at /home/evan/petsc. So all are defaults. Thanks for your kind help. Evan -------------- cmake_minimum_required(VERSION 3.10) project(hellopetsc) list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) SET(CMAKE_CXX_STANDARD 11) SET(CMAKE_C_COMPILER mpicc) SET(CMAKE_CXX_COMPILER mpicxx) SET (PETSC_ARCH "arch-linux2-c-debug all") find_package(PETSc COMPONENTS CXX) add_executable(hellopetsc main.cpp) -------------- /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc CMake Error at FindPETSc.cmake:140 (message): The pair PETSC_DIR=/home/evan/petsc PETSC_ARCH=arch-linux2-c-debug all do not specify a valid PETSc installation Call Stack (most recent call first): CMakeLists.txt:12 (find_package) -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found version "3.9.1") -- Configuring incomplete, errors occurred! See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". 
[Finished] -------------- On Fri, May 11, 2018 at 2:13 PM, Matthew Knepley wrote: > On Fri, May 11, 2018 at 5:06 PM, Evan Um wrote: > >> Hi Jed, >> >> Thanks for the comment. I added the module but still saw errors (before I >> arrived here, I had to do cp - r /home/evan/petsc//lib/petsc/ >> conf/rules /home/evan/petsc//lib/petsc/conf/petscrules). >> > > That is wrong. It means PETSC_ARCH is (null) instead of the correct string. > > Matt > > >> They are supposed to be defined by FindPETSc.cmake. How could I solve >> these errors? >> >> Could you also explain a little bit about how to use pkg-config to find >> PETSc? Thanks! >> >> Evan >> >> ---------------- >> cmake_minimum_required(VERSION 3.10) >> >> project(hellopetsc) >> list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) >> >> SET(CMAKE_CXX_STANDARD 11) >> SET(CMAKE_C_COMPILER mpicc) >> SET(CMAKE_CXX_COMPILER mpicxx) >> >> find_package(PETSc COMPONENTS CXX) >> >> add_executable(hellopetsc main.cpp) >> ----------------- >> >> ----------------- >> /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" >> /home/evan/CLionProjects/hellopetsc >> -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. >> (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found >> version "3.9.1") >> -- Configuring done >> -- Generating done >> -- Build files have been written to: /home/evan/CLionProjects/hello >> petsc/cmake-build-debug >> >> [Finished] >> ----------------- >> >> >> On Fri, May 11, 2018 at 12:08 PM, Jed Brown wrote: >> >>> Yes, it depends on this module from the same repository. >>> >>> Note that you can use pkg-config to find PETSc these days. >>> >>> Evan Um writes: >>> >>> > Hi Stefano, >>> > >>> > Thanks for your comment. Now, cmake was able to locate FindPETSc.cmake >>> file >>> > in my project directory, but I see a new error. >>> > >>> > Evan >>> > >>> > >>> > Messages >>> > ---------------------------------------------- >>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >>> -DCMAKE_BUILD_TYPE=Debug >>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >>> > CMake Error at FindPETSc.cmake:123 (include): >>> > include could not find load file: >>> > >>> > FindPackageMultipass >>> > Call Stack (most recent call first): >>> > CMakeLists.txt:10 (find_package) >>> > >>> > >>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >>> > Unknown CMake command "find_package_multipass". >>> > Call Stack (most recent call first): >>> > CMakeLists.txt:10 (find_package) >>> > >>> > >>> > -- Configuring incomplete, errors occurred! >>> > See also >>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>> Files/CMakeOutput.log". >>> > See also >>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>> Files/CMakeError.log". >>> > ---------------------------------------------- >>> > >>> > CMakeLists.txt >>> > ---------------------------------------------- >>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >>> -DCMAKE_BUILD_TYPE=Debug >>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >>> > CMake Error at FindPETSc.cmake:123 (include): >>> > include could not find load file: >>> > >>> > FindPackageMultipass >>> > Call Stack (most recent call first): >>> > CMakeLists.txt:10 (find_package) >>> > >>> > >>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >>> > Unknown CMake command "find_package_multipass". 
>>> > Call Stack (most recent call first): >>> > CMakeLists.txt:10 (find_package) >>> > >>> > >>> > -- Configuring incomplete, errors occurred! >>> > See also >>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>> Files/CMakeOutput.log". >>> > See also >>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>> Files/CMakeError.log". >>> > >>> > [Finished] >>> > >>> > >>> > >>> > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini < >>> stefano.zampini at gmail.com >>> >> wrote: >>> > >>> >> CMAKE is case sensitive on this. You should use find_package(PETSc ?.) >>> >> >>> >> >>> >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: >>> >> > >>> >> > Hi, >>> >> > >>> >> > I would like to ask a question about FindPETSc.cmake. I place the >>> cmake >>> >> file in the same directory where main.cpp is placed. I also placed >>> the file >>> >> in /usr/share/cmake_xx/Modules. >>> >> >>> >> Actually, it can be put in any directory pointed by the variable >>> >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ >>> >> >>> >> > Where should i put the file? What else should I do to use the file >>> in >>> >> cmake? Do I need any other lines in my cmakelists.txt except >>> >> find_package(petsc)? Thanks for your comments. >>> >> > >>> >> > Evan >>> >> > >>> >> > ------------------------------------------------------------- >>> >> > >>> >> > cmake_minimum_required(VERSION 3.10) >>> >> > >>> >> > project(hellopetsc) >>> >> > >>> >> > SET(CMAKE_CXX_STANDARD 11) >>> >> > SET(CMAKE_C_COMPILER mpicc) >>> >> > SET(CMAKE_CXX_COMPILER mpicxx) >>> >> > >>> >> > find_package(PETSC COMPONENTS CXX) >>> >> > >>> >> > add_executable(hellopetsc main.cpp) >>> >> > >>> >> > ------------------------------------------------------------ >>> >> > CMake Warning at CMakeLists.txt:9 (find_package): >>> >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this >>> project >>> >> has >>> >> > asked CMake to find a package configuration file provided by >>> "PETSC", >>> >> but >>> >> > CMake did not find one. >>> >> > >>> >> > Could not find a package configuration file provided by "PETSC" >>> with >>> >> any of >>> >> > the following names: >>> >> > >>> >> > PETSCConfig.cmake >>> >> > petsc-config.cmake >>> >> > >>> >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set >>> >> > "PETSC_DIR" to a directory containing one of the above files. If >>> >> "PETSC" >>> >> > provides a separate development package or SDK, be sure it has >>> been >>> >> > installed. >>> >> >>> >> >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri May 11 17:10:53 2018 From: jed at jedbrown.org (Jed Brown) Date: Fri, 11 May 2018 16:10:53 -0600 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: <874ljeau2t.fsf@jedbrown.org> Message-ID: <87wow9972q.fsf@jedbrown.org> Evan Um writes: > Matt and Jed, > > Thanks for your comments. > > I still have a trouble in using FindPETSc.cmake. I removed the directory: > (i.e. canceling cp - r /home/evan/petsc//lib/petsc/ > conf/rules /home/evan/petsc//lib/petsc/conf/petscrules) and explicitly set > SET (PETSC_ARCH "arch-linux2-c-debug all"). No " all". > Now, FindPETSc.cmake does not recognize PETSC_DIR and PETSC_ARCH. 
When I > read the comments inside FindPETSc.cmake, the twos should be refined by > FindPETSc.cmake. BTW, I have my petsc at /home/evan/petsc. So all are > defaults. > > Thanks for your kind help. > > Evan > > -------------- > cmake_minimum_required(VERSION 3.10) > > project(hellopetsc) > list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) > > SET(CMAKE_CXX_STANDARD 11) > SET(CMAKE_C_COMPILER mpicc) > SET(CMAKE_CXX_COMPILER mpicxx) > > SET (PETSC_ARCH "arch-linux2-c-debug all") > > find_package(PETSc COMPONENTS CXX) > > add_executable(hellopetsc main.cpp) > -------------- > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > CMake Error at FindPETSc.cmake:140 (message): > The pair PETSC_DIR=/home/evan/petsc PETSC_ARCH=arch-linux2-c-debug all do > not specify a valid PETSc installation > Call Stack (most recent call first): > CMakeLists.txt:12 (find_package) > > > -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. > (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found > version "3.9.1") > -- Configuring incomplete, errors occurred! > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". > > [Finished] > -------------- > > > On Fri, May 11, 2018 at 2:13 PM, Matthew Knepley wrote: > >> On Fri, May 11, 2018 at 5:06 PM, Evan Um wrote: >> >>> Hi Jed, >>> >>> Thanks for the comment. I added the module but still saw errors (before I >>> arrived here, I had to do cp - r /home/evan/petsc//lib/petsc/ >>> conf/rules /home/evan/petsc//lib/petsc/conf/petscrules). >>> >> >> That is wrong. It means PETSC_ARCH is (null) instead of the correct string. >> >> Matt >> >> >>> They are supposed to be defined by FindPETSc.cmake. How could I solve >>> these errors? >>> >>> Could you also explain a little bit about how to use pkg-config to find >>> PETSc? Thanks! >>> >>> Evan >>> >>> ---------------- >>> cmake_minimum_required(VERSION 3.10) >>> >>> project(hellopetsc) >>> list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) >>> >>> SET(CMAKE_CXX_STANDARD 11) >>> SET(CMAKE_C_COMPILER mpicc) >>> SET(CMAKE_CXX_COMPILER mpicxx) >>> >>> find_package(PETSc COMPONENTS CXX) >>> >>> add_executable(hellopetsc main.cpp) >>> ----------------- >>> >>> ----------------- >>> /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >>> -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" >>> /home/evan/CLionProjects/hellopetsc >>> -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. >>> (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found >>> version "3.9.1") >>> -- Configuring done >>> -- Generating done >>> -- Build files have been written to: /home/evan/CLionProjects/hello >>> petsc/cmake-build-debug >>> >>> [Finished] >>> ----------------- >>> >>> >>> On Fri, May 11, 2018 at 12:08 PM, Jed Brown wrote: >>> >>>> Yes, it depends on this module from the same repository. >>>> >>>> Note that you can use pkg-config to find PETSc these days. >>>> >>>> Evan Um writes: >>>> >>>> > Hi Stefano, >>>> > >>>> > Thanks for your comment. Now, cmake was able to locate FindPETSc.cmake >>>> file >>>> > in my project directory, but I see a new error. 
>>>> > >>>> > Evan >>>> > >>>> > >>>> > Messages >>>> > ---------------------------------------------- >>>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >>>> -DCMAKE_BUILD_TYPE=Debug >>>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >>>> > CMake Error at FindPETSc.cmake:123 (include): >>>> > include could not find load file: >>>> > >>>> > FindPackageMultipass >>>> > Call Stack (most recent call first): >>>> > CMakeLists.txt:10 (find_package) >>>> > >>>> > >>>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >>>> > Unknown CMake command "find_package_multipass". >>>> > Call Stack (most recent call first): >>>> > CMakeLists.txt:10 (find_package) >>>> > >>>> > >>>> > -- Configuring incomplete, errors occurred! >>>> > See also >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>>> Files/CMakeOutput.log". >>>> > See also >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>>> Files/CMakeError.log". >>>> > ---------------------------------------------- >>>> > >>>> > CMakeLists.txt >>>> > ---------------------------------------------- >>>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >>>> -DCMAKE_BUILD_TYPE=Debug >>>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >>>> > CMake Error at FindPETSc.cmake:123 (include): >>>> > include could not find load file: >>>> > >>>> > FindPackageMultipass >>>> > Call Stack (most recent call first): >>>> > CMakeLists.txt:10 (find_package) >>>> > >>>> > >>>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >>>> > Unknown CMake command "find_package_multipass". >>>> > Call Stack (most recent call first): >>>> > CMakeLists.txt:10 (find_package) >>>> > >>>> > >>>> > -- Configuring incomplete, errors occurred! >>>> > See also >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>>> Files/CMakeOutput.log". >>>> > See also >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >>>> Files/CMakeError.log". >>>> > >>>> > [Finished] >>>> > >>>> > >>>> > >>>> > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini < >>>> stefano.zampini at gmail.com >>>> >> wrote: >>>> > >>>> >> CMAKE is case sensitive on this. You should use find_package(PETSc ?.) >>>> >> >>>> >> >>>> >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: >>>> >> > >>>> >> > Hi, >>>> >> > >>>> >> > I would like to ask a question about FindPETSc.cmake. I place the >>>> cmake >>>> >> file in the same directory where main.cpp is placed. I also placed >>>> the file >>>> >> in /usr/share/cmake_xx/Modules. >>>> >> >>>> >> Actually, it can be put in any directory pointed by the variable >>>> >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ >>>> >> >>>> >> > Where should i put the file? What else should I do to use the file >>>> in >>>> >> cmake? Do I need any other lines in my cmakelists.txt except >>>> >> find_package(petsc)? Thanks for your comments. 
>>>> >> > >>>> >> > Evan >>>> >> > >>>> >> > ------------------------------------------------------------- >>>> >> > >>>> >> > cmake_minimum_required(VERSION 3.10) >>>> >> > >>>> >> > project(hellopetsc) >>>> >> > >>>> >> > SET(CMAKE_CXX_STANDARD 11) >>>> >> > SET(CMAKE_C_COMPILER mpicc) >>>> >> > SET(CMAKE_CXX_COMPILER mpicxx) >>>> >> > >>>> >> > find_package(PETSC COMPONENTS CXX) >>>> >> > >>>> >> > add_executable(hellopetsc main.cpp) >>>> >> > >>>> >> > ------------------------------------------------------------ >>>> >> > CMake Warning at CMakeLists.txt:9 (find_package): >>>> >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this >>>> project >>>> >> has >>>> >> > asked CMake to find a package configuration file provided by >>>> "PETSC", >>>> >> but >>>> >> > CMake did not find one. >>>> >> > >>>> >> > Could not find a package configuration file provided by "PETSC" >>>> with >>>> >> any of >>>> >> > the following names: >>>> >> > >>>> >> > PETSCConfig.cmake >>>> >> > petsc-config.cmake >>>> >> > >>>> >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or set >>>> >> > "PETSC_DIR" to a directory containing one of the above files. If >>>> >> "PETSC" >>>> >> > provides a separate development package or SDK, be sure it has >>>> been >>>> >> > installed. >>>> >> >>>> >> >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> From evanum at gmail.com Fri May 11 17:41:16 2018 From: evanum at gmail.com (Evan Um) Date: Fri, 11 May 2018 15:41:16 -0700 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: <87wow9972q.fsf@jedbrown.org> References: <874ljeau2t.fsf@jedbrown.org> <87wow9972q.fsf@jedbrown.org> Message-ID: Jed, After all was removed, I see a new error from FindPETSc.cmake. I added to cmake SET(CMAKE_CXX_COMPILER mpicxx) FIND_PACKAGE(MPI REQUIRED) INCLUDE_DIRECTORIES(${MPI_INCLUDE_PATH}) and tested but no luck. It seems that I need to let FindPETSc.cmake know a compiler location. Thanks for your help. Evan ------------ cmake_minimum_required(VERSION 3.10) project(hellopetsc) list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) SET(CMAKE_CXX_STANDARD 11) SET(CMAKE_C_COMPILER mpicc) SET(CMAKE_CXX_COMPILER mpicxx) SET (PETSC_ARCH "arch-linux2-c-debug") find_package(PETSc COMPONENTS CXX) add_executable(hellopetsc main.cpp) ------------- /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc CMake Error at FindPETSc.cmake:179 (include): include could not find load file: ResolveCompilerPaths Call Stack (most recent call first): CMakeLists.txt:12 (find_package) CMake Error at FindPETSc.cmake:181 (resolve_includes): Unknown CMake command "resolve_includes". Call Stack (most recent call first): CMakeLists.txt:12 (find_package) -- Configuring incomplete, errors occurred! See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". See also "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". [Finished] On Fri, May 11, 2018 at 3:10 PM, Jed Brown wrote: > Evan Um writes: > > > Matt and Jed, > > > > Thanks for your comments. > > > > I still have a trouble in using FindPETSc.cmake. I removed the directory: > > (i.e. 
canceling cp - r /home/evan/petsc//lib/petsc/ > > conf/rules /home/evan/petsc//lib/petsc/conf/petscrules) and explicitly > set > > SET (PETSC_ARCH "arch-linux2-c-debug all"). > > No " all". > > > Now, FindPETSc.cmake does not recognize PETSC_DIR and PETSC_ARCH. When I > > read the comments inside FindPETSc.cmake, the twos should be refined by > > FindPETSc.cmake. BTW, I have my petsc at /home/evan/petsc. So all are > > defaults. > > > > Thanks for your kind help. > > > > Evan > > > > -------------- > > cmake_minimum_required(VERSION 3.10) > > > > project(hellopetsc) > > list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) > > > > SET(CMAKE_CXX_STANDARD 11) > > SET(CMAKE_C_COMPILER mpicc) > > SET(CMAKE_CXX_COMPILER mpicxx) > > > > SET (PETSC_ARCH "arch-linux2-c-debug all") > > > > find_package(PETSc COMPONENTS CXX) > > > > add_executable(hellopetsc main.cpp) > > -------------- > > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > -DCMAKE_BUILD_TYPE=Debug > > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > > CMake Error at FindPETSc.cmake:140 (message): > > The pair PETSC_DIR=/home/evan/petsc PETSC_ARCH=arch-linux2-c-debug all > do > > not specify a valid PETSc installation > > Call Stack (most recent call first): > > CMakeLists.txt:12 (find_package) > > > > > > -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. > > (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found > > version "3.9.1") > > -- Configuring incomplete, errors occurred! > > See also > > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ > CMakeFiles/CMakeOutput.log". > > See also > > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ > CMakeFiles/CMakeError.log". > > > > [Finished] > > -------------- > > > > > > On Fri, May 11, 2018 at 2:13 PM, Matthew Knepley > wrote: > > > >> On Fri, May 11, 2018 at 5:06 PM, Evan Um wrote: > >> > >>> Hi Jed, > >>> > >>> Thanks for the comment. I added the module but still saw errors > (before I > >>> arrived here, I had to do cp - r /home/evan/petsc//lib/petsc/ > >>> conf/rules /home/evan/petsc//lib/petsc/conf/petscrules). > >>> > >> > >> That is wrong. It means PETSC_ARCH is (null) instead of the correct > string. > >> > >> Matt > >> > >> > >>> They are supposed to be defined by FindPETSc.cmake. How could I solve > >>> these errors? > >>> > >>> Could you also explain a little bit about how to use pkg-config to find > >>> PETSc? Thanks! > >>> > >>> Evan > >>> > >>> ---------------- > >>> cmake_minimum_required(VERSION 3.10) > >>> > >>> project(hellopetsc) > >>> list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) > >>> > >>> SET(CMAKE_CXX_STANDARD 11) > >>> SET(CMAKE_C_COMPILER mpicc) > >>> SET(CMAKE_CXX_COMPILER mpicxx) > >>> > >>> find_package(PETSc COMPONENTS CXX) > >>> > >>> add_executable(hellopetsc main.cpp) > >>> ----------------- > >>> > >>> ----------------- > >>> /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > >>> -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" > >>> /home/evan/CLionProjects/hellopetsc > >>> -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. 
> >>> (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found > >>> version "3.9.1") > >>> -- Configuring done > >>> -- Generating done > >>> -- Build files have been written to: /home/evan/CLionProjects/hello > >>> petsc/cmake-build-debug > >>> > >>> [Finished] > >>> ----------------- > >>> > >>> > >>> On Fri, May 11, 2018 at 12:08 PM, Jed Brown wrote: > >>> > >>>> Yes, it depends on this module from the same repository. > >>>> > >>>> Note that you can use pkg-config to find PETSc these days. > >>>> > >>>> Evan Um writes: > >>>> > >>>> > Hi Stefano, > >>>> > > >>>> > Thanks for your comment. Now, cmake was able to locate > FindPETSc.cmake > >>>> file > >>>> > in my project directory, but I see a new error. > >>>> > > >>>> > Evan > >>>> > > >>>> > > >>>> > Messages > >>>> > ---------------------------------------------- > >>>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > >>>> -DCMAKE_BUILD_TYPE=Debug > >>>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/ > hellopetsc > >>>> > CMake Error at FindPETSc.cmake:123 (include): > >>>> > include could not find load file: > >>>> > > >>>> > FindPackageMultipass > >>>> > Call Stack (most recent call first): > >>>> > CMakeLists.txt:10 (find_package) > >>>> > > >>>> > > >>>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): > >>>> > Unknown CMake command "find_package_multipass". > >>>> > Call Stack (most recent call first): > >>>> > CMakeLists.txt:10 (find_package) > >>>> > > >>>> > > >>>> > -- Configuring incomplete, errors occurred! > >>>> > See also > >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake > >>>> Files/CMakeOutput.log". > >>>> > See also > >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake > >>>> Files/CMakeError.log". > >>>> > ---------------------------------------------- > >>>> > > >>>> > CMakeLists.txt > >>>> > ---------------------------------------------- > >>>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake > >>>> -DCMAKE_BUILD_TYPE=Debug > >>>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/ > hellopetsc > >>>> > CMake Error at FindPETSc.cmake:123 (include): > >>>> > include could not find load file: > >>>> > > >>>> > FindPackageMultipass > >>>> > Call Stack (most recent call first): > >>>> > CMakeLists.txt:10 (find_package) > >>>> > > >>>> > > >>>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): > >>>> > Unknown CMake command "find_package_multipass". > >>>> > Call Stack (most recent call first): > >>>> > CMakeLists.txt:10 (find_package) > >>>> > > >>>> > > >>>> > -- Configuring incomplete, errors occurred! > >>>> > See also > >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake > >>>> Files/CMakeOutput.log". > >>>> > See also > >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake > >>>> Files/CMakeError.log". > >>>> > > >>>> > [Finished] > >>>> > > >>>> > > >>>> > > >>>> > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini < > >>>> stefano.zampini at gmail.com > >>>> >> wrote: > >>>> > > >>>> >> CMAKE is case sensitive on this. You should use find_package(PETSc > ?.) > >>>> >> > >>>> >> > >>>> >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: > >>>> >> > > >>>> >> > Hi, > >>>> >> > > >>>> >> > I would like to ask a question about FindPETSc.cmake. I place the > >>>> cmake > >>>> >> file in the same directory where main.cpp is placed. I also placed > >>>> the file > >>>> >> in /usr/share/cmake_xx/Modules. 
> >>>> >> > >>>> >> Actually, it can be put in any directory pointed by the variable > >>>> >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ > >>>> >> > >>>> >> > Where should i put the file? What else should I do to use the > file > >>>> in > >>>> >> cmake? Do I need any other lines in my cmakelists.txt except > >>>> >> find_package(petsc)? Thanks for your comments. > >>>> >> > > >>>> >> > Evan > >>>> >> > > >>>> >> > ------------------------------------------------------------- > >>>> >> > > >>>> >> > cmake_minimum_required(VERSION 3.10) > >>>> >> > > >>>> >> > project(hellopetsc) > >>>> >> > > >>>> >> > SET(CMAKE_CXX_STANDARD 11) > >>>> >> > SET(CMAKE_C_COMPILER mpicc) > >>>> >> > SET(CMAKE_CXX_COMPILER mpicxx) > >>>> >> > > >>>> >> > find_package(PETSC COMPONENTS CXX) > >>>> >> > > >>>> >> > add_executable(hellopetsc main.cpp) > >>>> >> > > >>>> >> > ------------------------------------------------------------ > >>>> >> > CMake Warning at CMakeLists.txt:9 (find_package): > >>>> >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this > >>>> project > >>>> >> has > >>>> >> > asked CMake to find a package configuration file provided by > >>>> "PETSC", > >>>> >> but > >>>> >> > CMake did not find one. > >>>> >> > > >>>> >> > Could not find a package configuration file provided by "PETSC" > >>>> with > >>>> >> any of > >>>> >> > the following names: > >>>> >> > > >>>> >> > PETSCConfig.cmake > >>>> >> > petsc-config.cmake > >>>> >> > > >>>> >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or > set > >>>> >> > "PETSC_DIR" to a directory containing one of the above files. > If > >>>> >> "PETSC" > >>>> >> > provides a separate development package or SDK, be sure it has > >>>> been > >>>> >> > installed. > >>>> >> > >>>> >> > >>>> > >>> > >>> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their > >> experiments is infinitely more interesting than any results to which > their > >> experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri May 11 17:51:54 2018 From: jed at jedbrown.org (Jed Brown) Date: Fri, 11 May 2018 16:51:54 -0600 Subject: [petsc-users] Placing FindPETSc.cmake In-Reply-To: References: <874ljeau2t.fsf@jedbrown.org> <87wow9972q.fsf@jedbrown.org> Message-ID: <87bmdl956d.fsf@jedbrown.org> Once again, see the repository from which you copied this script. Evan Um writes: > Jed, > > After all was removed, I see a new error from FindPETSc.cmake. > I added to cmake > SET(CMAKE_CXX_COMPILER mpicxx) > FIND_PACKAGE(MPI REQUIRED) > INCLUDE_DIRECTORIES(${MPI_INCLUDE_PATH}) > and tested but no luck. > > It seems that I need to let FindPETSc.cmake know a compiler location. > Thanks for your help. 
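The pkg-config route Jed mentions avoids FindPETSc.cmake and its helper modules altogether. A sketch, assuming a PETSc build that ships a pkg-config file under $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig (the exact .pc file name varies between PETSc versions, hence the two spellings searched below):

----------------------------------------------
# Sketch: locate PETSc through pkg-config instead of FindPETSc.cmake.
# Assumes something like
#   export PKG_CONFIG_PATH=$PETSC_DIR/$PETSC_ARCH/lib/pkgconfig
# in the environment before running cmake.
cmake_minimum_required(VERSION 3.10)
project(hellopetsc)

find_package(PkgConfig REQUIRED)
# Accept either spelling of the .pc file shipped by PETSc.
pkg_search_module(PETSC REQUIRED IMPORTED_TARGET PETSc petsc)

add_executable(hellopetsc main.cpp)
target_link_libraries(hellopetsc PRIVATE PkgConfig::PETSC)
----------------------------------------------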
> > Evan > > > ------------ > cmake_minimum_required(VERSION 3.10) > > project(hellopetsc) > list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) > > SET(CMAKE_CXX_STANDARD 11) > SET(CMAKE_C_COMPILER mpicc) > SET(CMAKE_CXX_COMPILER mpicxx) > > SET (PETSC_ARCH "arch-linux2-c-debug") > > find_package(PETSc COMPONENTS CXX) > > add_executable(hellopetsc main.cpp) > ------------- > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake -DCMAKE_BUILD_TYPE=Debug > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc > CMake Error at FindPETSc.cmake:179 (include): > include could not find load file: > > ResolveCompilerPaths > Call Stack (most recent call first): > CMakeLists.txt:12 (find_package) > > > CMake Error at FindPETSc.cmake:181 (resolve_includes): > Unknown CMake command "resolve_includes". > Call Stack (most recent call first): > CMakeLists.txt:12 (find_package) > > > -- Configuring incomplete, errors occurred! > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeOutput.log". > See also > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMakeFiles/CMakeError.log". > > [Finished] > > > On Fri, May 11, 2018 at 3:10 PM, Jed Brown wrote: > >> Evan Um writes: >> >> > Matt and Jed, >> > >> > Thanks for your comments. >> > >> > I still have a trouble in using FindPETSc.cmake. I removed the directory: >> > (i.e. canceling cp - r /home/evan/petsc//lib/petsc/ >> > conf/rules /home/evan/petsc//lib/petsc/conf/petscrules) and explicitly >> set >> > SET (PETSC_ARCH "arch-linux2-c-debug all"). >> >> No " all". >> >> > Now, FindPETSc.cmake does not recognize PETSC_DIR and PETSC_ARCH. When I >> > read the comments inside FindPETSc.cmake, the twos should be refined by >> > FindPETSc.cmake. BTW, I have my petsc at /home/evan/petsc. So all are >> > defaults. >> > >> > Thanks for your kind help. >> > >> > Evan >> > >> > -------------- >> > cmake_minimum_required(VERSION 3.10) >> > >> > project(hellopetsc) >> > list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) >> > >> > SET(CMAKE_CXX_STANDARD 11) >> > SET(CMAKE_C_COMPILER mpicc) >> > SET(CMAKE_CXX_COMPILER mpicxx) >> > >> > SET (PETSC_ARCH "arch-linux2-c-debug all") >> > >> > find_package(PETSc COMPONENTS CXX) >> > >> > add_executable(hellopetsc main.cpp) >> > -------------- >> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> -DCMAKE_BUILD_TYPE=Debug >> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/hellopetsc >> > CMake Error at FindPETSc.cmake:140 (message): >> > The pair PETSC_DIR=/home/evan/petsc PETSC_ARCH=arch-linux2-c-debug all >> do >> > not specify a valid PETSc installation >> > Call Stack (most recent call first): >> > CMakeLists.txt:12 (find_package) >> > >> > >> > -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. >> > (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found >> > version "3.9.1") >> > -- Configuring incomplete, errors occurred! >> > See also >> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ >> CMakeFiles/CMakeOutput.log". >> > See also >> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/ >> CMakeFiles/CMakeError.log". >> > >> > [Finished] >> > -------------- >> > >> > >> > On Fri, May 11, 2018 at 2:13 PM, Matthew Knepley >> wrote: >> > >> >> On Fri, May 11, 2018 at 5:06 PM, Evan Um wrote: >> >> >> >>> Hi Jed, >> >>> >> >>> Thanks for the comment. 
I added the module but still saw errors >> (before I >> >>> arrived here, I had to do cp - r /home/evan/petsc//lib/petsc/ >> >>> conf/rules /home/evan/petsc//lib/petsc/conf/petscrules). >> >>> >> >> >> >> That is wrong. It means PETSC_ARCH is (null) instead of the correct >> string. >> >> >> >> Matt >> >> >> >> >> >>> They are supposed to be defined by FindPETSc.cmake. How could I solve >> >>> these errors? >> >>> >> >>> Could you also explain a little bit about how to use pkg-config to find >> >>> PETSc? Thanks! >> >>> >> >>> Evan >> >>> >> >>> ---------------- >> >>> cmake_minimum_required(VERSION 3.10) >> >>> >> >>> project(hellopetsc) >> >>> list (APPEND CMAKE_MODULE_PATH /home/evan/CLionProjects/hellopetsc) >> >>> >> >>> SET(CMAKE_CXX_STANDARD 11) >> >>> SET(CMAKE_C_COMPILER mpicc) >> >>> SET(CMAKE_CXX_COMPILER mpicxx) >> >>> >> >>> find_package(PETSc COMPONENTS CXX) >> >>> >> >>> add_executable(hellopetsc main.cpp) >> >>> ----------------- >> >>> >> >>> ----------------- >> >>> /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> >>> -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" >> >>> /home/evan/CLionProjects/hellopetsc >> >>> -- PETSc could not be found. Be sure to set PETSC_DIR and PETSC_ARCH. >> >>> (missing: PETSC_INCLUDES PETSC_LIBRARIES PETSC_EXECUTABLE_RUNS) (found >> >>> version "3.9.1") >> >>> -- Configuring done >> >>> -- Generating done >> >>> -- Build files have been written to: /home/evan/CLionProjects/hello >> >>> petsc/cmake-build-debug >> >>> >> >>> [Finished] >> >>> ----------------- >> >>> >> >>> >> >>> On Fri, May 11, 2018 at 12:08 PM, Jed Brown wrote: >> >>> >> >>>> Yes, it depends on this module from the same repository. >> >>>> >> >>>> Note that you can use pkg-config to find PETSc these days. >> >>>> >> >>>> Evan Um writes: >> >>>> >> >>>> > Hi Stefano, >> >>>> > >> >>>> > Thanks for your comment. Now, cmake was able to locate >> FindPETSc.cmake >> >>>> file >> >>>> > in my project directory, but I see a new error. >> >>>> > >> >>>> > Evan >> >>>> > >> >>>> > >> >>>> > Messages >> >>>> > ---------------------------------------------- >> >>>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> >>>> -DCMAKE_BUILD_TYPE=Debug >> >>>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/ >> hellopetsc >> >>>> > CMake Error at FindPETSc.cmake:123 (include): >> >>>> > include could not find load file: >> >>>> > >> >>>> > FindPackageMultipass >> >>>> > Call Stack (most recent call first): >> >>>> > CMakeLists.txt:10 (find_package) >> >>>> > >> >>>> > >> >>>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >> >>>> > Unknown CMake command "find_package_multipass". >> >>>> > Call Stack (most recent call first): >> >>>> > CMakeLists.txt:10 (find_package) >> >>>> > >> >>>> > >> >>>> > -- Configuring incomplete, errors occurred! >> >>>> > See also >> >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> >>>> Files/CMakeOutput.log". >> >>>> > See also >> >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> >>>> Files/CMakeError.log". 
>> >>>> > ---------------------------------------------- >> >>>> > >> >>>> > CMakeLists.txt >> >>>> > ---------------------------------------------- >> >>>> > /home/evan/opt/clion-2018.1.2/bin/cmake/bin/cmake >> >>>> -DCMAKE_BUILD_TYPE=Debug >> >>>> > -G "CodeBlocks - Unix Makefiles" /home/evan/CLionProjects/ >> hellopetsc >> >>>> > CMake Error at FindPETSc.cmake:123 (include): >> >>>> > include could not find load file: >> >>>> > >> >>>> > FindPackageMultipass >> >>>> > Call Stack (most recent call first): >> >>>> > CMakeLists.txt:10 (find_package) >> >>>> > >> >>>> > >> >>>> > CMake Error at FindPETSc.cmake:124 (find_package_multipass): >> >>>> > Unknown CMake command "find_package_multipass". >> >>>> > Call Stack (most recent call first): >> >>>> > CMakeLists.txt:10 (find_package) >> >>>> > >> >>>> > >> >>>> > -- Configuring incomplete, errors occurred! >> >>>> > See also >> >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> >>>> Files/CMakeOutput.log". >> >>>> > See also >> >>>> > "/home/evan/CLionProjects/hellopetsc/cmake-build-debug/CMake >> >>>> Files/CMakeError.log". >> >>>> > >> >>>> > [Finished] >> >>>> > >> >>>> > >> >>>> > >> >>>> > On Fri, May 11, 2018 at 11:34 AM, Stefano Zampini < >> >>>> stefano.zampini at gmail.com >> >>>> >> wrote: >> >>>> > >> >>>> >> CMAKE is case sensitive on this. You should use find_package(PETSc >> ?.) >> >>>> >> >> >>>> >> >> >>>> >> > On May 11, 2018, at 9:28 PM, Evan Um wrote: >> >>>> >> > >> >>>> >> > Hi, >> >>>> >> > >> >>>> >> > I would like to ask a question about FindPETSc.cmake. I place the >> >>>> cmake >> >>>> >> file in the same directory where main.cpp is placed. I also placed >> >>>> the file >> >>>> >> in /usr/share/cmake_xx/Modules. >> >>>> >> >> >>>> >> Actually, it can be put in any directory pointed by the variable >> >>>> >> CMAKE_MODULE_PATH. If I were you, I would not modify /usr/share/ >> >>>> >> >> >>>> >> > Where should i put the file? What else should I do to use the >> file >> >>>> in >> >>>> >> cmake? Do I need any other lines in my cmakelists.txt except >> >>>> >> find_package(petsc)? Thanks for your comments. >> >>>> >> > >> >>>> >> > Evan >> >>>> >> > >> >>>> >> > ------------------------------------------------------------- >> >>>> >> > >> >>>> >> > cmake_minimum_required(VERSION 3.10) >> >>>> >> > >> >>>> >> > project(hellopetsc) >> >>>> >> > >> >>>> >> > SET(CMAKE_CXX_STANDARD 11) >> >>>> >> > SET(CMAKE_C_COMPILER mpicc) >> >>>> >> > SET(CMAKE_CXX_COMPILER mpicxx) >> >>>> >> > >> >>>> >> > find_package(PETSC COMPONENTS CXX) >> >>>> >> > >> >>>> >> > add_executable(hellopetsc main.cpp) >> >>>> >> > >> >>>> >> > ------------------------------------------------------------ >> >>>> >> > CMake Warning at CMakeLists.txt:9 (find_package): >> >>>> >> > By not providing "FindPETSC.cmake" in CMAKE_MODULE_PATH this >> >>>> project >> >>>> >> has >> >>>> >> > asked CMake to find a package configuration file provided by >> >>>> "PETSC", >> >>>> >> but >> >>>> >> > CMake did not find one. >> >>>> >> > >> >>>> >> > Could not find a package configuration file provided by "PETSC" >> >>>> with >> >>>> >> any of >> >>>> >> > the following names: >> >>>> >> > >> >>>> >> > PETSCConfig.cmake >> >>>> >> > petsc-config.cmake >> >>>> >> > >> >>>> >> > Add the installation prefix of "PETSC" to CMAKE_PREFIX_PATH or >> set >> >>>> >> > "PETSC_DIR" to a directory containing one of the above files. 
>> If >> >>>> >> "PETSC" >> >>>> >> > provides a separate development package or SDK, be sure it has >> >>>> been >> >>>> >> > installed. >> >>>> >> >> >>>> >> >> >>>> >> >>> >> >>> >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> >> experiments is infinitely more interesting than any results to which >> their >> >> experiments lead. >> >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> From ys453 at cam.ac.uk Fri May 11 17:58:45 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 11 May 2018 23:58:45 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> Message-ID: <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> Thank you for your reply. > For now, if you give the same Mat object back, it will do what you > expect. Sorry, I am confused here. The same Mat object, is it that I do not destroy the Mat at the end of iteration? Moreover, which function should I actually call to put the Mat object back? Sorry for being stupid on this. Kind Regards, Shidi On 2018-05-11 18:10, Matthew Knepley wrote: > On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: > >> Thank you very much for your reply, Barry. >> >>> This is a bug in PETSc. Since you are providing a new matrix >>> with >>> the same "state" value as the previous matrix the PC code the >>> following code >> So what you mean is that every time I change the value in the >> matrix, >> the PETSc only determines if the nonzero pattern change but not the >> values, and if it is unchanged neither of symbolic and numeric >> happens. > > No, that is not what Barry is saying. > > PETSc looks at the matrix. > If the structure has changed, it does symbolic and numeric > factorization. > If only values have changes, it does numeric factorization. > > HOWEVER, you gave it a new matrix with accidentally the same state > marker, > so it thought nothing had changed. We will fix this by also checking > the pointer. > For now, if you give the same Mat object back, it will do what you > expect. > > Matt > >> I found the following code: >> >> if (!pc->setupcalled) { >> ierr = PetscInfo(pc,"Setting up PC for first >> timen");CHKERRQ(ierr); >> pc->flag = DIFFERENT_NONZERO_PATTERN; >> } else if (matstate == pc->matstate) { >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >> since operator is unchangedn");CHKERRQ(ierr); >> PetscFunctionReturn(0); >> } else { >> if (matnonzerostate > pc->matnonzerostate) { >> ierr = PetscInfo(pc,"Setting up PC with different nonzero >> patternn");CHKERRQ(ierr); >> pc->flag = DIFFERENT_NONZERO_PATTERN; >> } else { >> ierr = PetscInfo(pc,"Setting up PC with same nonzero >> patternn");CHKERRQ(ierr); >> pc->flag = SAME_NONZERO_PATTERN; >> } >> } >> >> and I commend out "else if (matstate == pc->matstate){}", so it >> will do "Setting up PC with same nonzero patternn"; and it seems >> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >> subsequent iterations. But I am not quite sure, need some more >> tests. >> >> Thank you very much for your help indeed. >> >> Kind Regards, >> Shidi >> >> On 2018-05-11 16:13, Smith, Barry F. wrote: >> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >> >> Thank you for your reply. >> >> How are you changing the matrix? Do you remember to assemble? 
>> I use MatCreateMPIAIJWithArrays() to create the matrix, >> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). > > If you use MatCreateMPIAIJWithArrays() you don't need to call > MatAssemblyBegin() and MatAssemblyEnd(). > >> But I actually destroy the matrix at the end of each iteration >> and create the matrix at the beginning of each iteration. > > This is a bug in PETSc. Since you are providing a new matrix with > the same "state" value as the previous matrix the PC code the > following code > kicks in: > > ierr = > PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); > ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); > if (!pc->setupcalled) { > ierr = PetscInfo(pc,"Setting up PC for first > timen");CHKERRQ(ierr); > pc->flag = DIFFERENT_NONZERO_PATTERN; > } else if (matstate == pc->matstate) { > ierr = PetscInfo(pc,"Leaving PC with identical preconditioner > since operator is unchangedn");CHKERRQ(ierr); > PetscFunctionReturn(0); > > and it returns without refactoring. > > We need an additional check that the matrix also remains the same. > > We will also need a test example that reproduces the problem to > confirm that we have fixed it. > > Barry > >> Cheers, >> Shidi >> >> On 2018-05-11 12:59, Matthew Knepley wrote: >> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >> Dear Matt, >> Thank you for your help last time. >> I want to get more detail about the Petsc-MUMPS factorisation; >> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >> And I found the following functions are quite important to >> the question: >> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >> r,const MatFactorInfo *info); >> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >> MatFactorInfo *info); >> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >> I print some sentence to trace when these functions are called. >> Then I test my code; the values in the matrix is changing but the >> structure stays the same. Below is the output. >> We can see that at 0th step, all the symbolic, numeric and solve >> are called; in the subsequent steps only the solve stage is called, >> the numeric step is not called. >> How are you changing the matrix? Do you remember to assemble? >> Matt >> Iteration 0 Step 0.0005 Time 0.0005 >> [INFO]: Direct Solver setup >> MatCholeskyFactorSymbolic_MUMPS >> finish MatCholeskyFactorSymbolic_MUMPS >> MatFactorNumeric_MUMPS >> finish MatFactorNumeric_MUMPS >> MatSolve_MUMPS >> Iteration 1 Step 0.0005 Time 0.0005 >> MatSolve_MUMPS >> Iteration 2 Step 0.0005 Time 0.001 >> MatSolve_MUMPS >> [INFO]: End of program!!! >> I am wondering if there is any possibility to split the numeric >> and solve stage (as you mentioned using KSPSolve). >> Thank you very much indeed. >> Kind Regards, >> Shidi >> On 2018-05-04 21:10, Y. Shidi wrote: >> Thank you very much for your reply. >> That is really clear. >> Kind Regards, >> Shidi >> On 2018-05-04 21:05, Matthew Knepley wrote: >> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >> Dear Matt, >> Thank you very much for your reply! >> So what you mean is that I can just do the KSPSolve() every >> iteration >> once the MUMPS is set? >> Yes. >> That means inside the KSPSolve() the numerical factorization is >> performed. If that is the case, it seems that the ksp object is >> not changed when the values in the matrix are changed. >> Yes. >> Or do I need to call both KSPSetOperators() and KSPSolve()? >> If you do SetOperators, it will redo the factorization. 
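In outline, the pattern Matt describes (one Mat object kept for the whole run and refilled with new values each step) might look like the sketch below. It is untested; UpdateValues() is a placeholder for the application routine that rewrites the entries with MatSetValues() without touching the nonzero pattern.

----------------------------------------------
/* Sketch (untested): keep one Mat and one KSP for the whole run so the
 * symbolic factorization is done once and only the numerical factorization
 * is repeated.  UpdateValues() stands in for the user's code that refills
 * the matrix entries (same nonzero pattern) with MatSetValues(). */
#include <petscksp.h>

PetscErrorCode SolveManySteps(KSP ksp,Mat A,Vec b,Vec x,PetscInt nsteps,
                              PetscErrorCode (*UpdateValues)(Mat))
{
  PetscErrorCode ierr;
  PetscInt       step;

  PetscFunctionBeginUser;
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);          /* attach the matrix once   */
  for (step = 0; step < nsteps; step++) {
    ierr = (*UpdateValues)(A);CHKERRQ(ierr);              /* new values, same pattern */
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    /* Same Mat object, unchanged nonzero state: the symbolic step is skipped
     * and only the numerical factorization is redone before this solve. */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}
----------------------------------------------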
If you do >> not, >> it will look >> at the Mat object, determine that the structure has not changed, >> and >> just redo >> the numerical factorization. >> Thanks, >> Matt >> On 2018-05-04 14:44, Matthew Knepley wrote: >> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >> Dear PETSc users, >> I am currently using MUMPS to solve linear systems directly. >> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >> step and then solve the system. >> In my code, the values in the matrix is changed in each iteration, >> but the structure of the matrix stays the same, which means the >> performance can be improved if symbolic factorisation is only >> performed once. Hence, it is necessary to split the symbolic >> and numeric factorisation. However, I cannot find a specific step >> (control parameter) to perform the numeric factorisation. >> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >> it seems that the symbolic and numeric factorisation always perform >> together. >> If you use KSPSolve instead, it will automatically preserve the >> symbolic >> factorization. >> Thanks, >> Matt >> So I am wondering if anyone has an idea about it. >> Below is how I set up MUMPS solver: >> PC pc; >> PetscBool flg_mumps, flg_mumps_ch; >> flg_mumps = PETSC_FALSE; >> flg_mumps_ch = PETSC_FALSE; >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >> NULL); >> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >> NULL); >> if(flg_mumps ||flg_mumps_ch) >> { >> KSPSetType(_ksp, KSPPREONLY); >> PetscInt ival,icntl; >> PetscReal val; >> KSPGetPC(_ksp, &pc); >> /// Set preconditioner type >> if(flg_mumps) >> { >> PCSetType(pc, PCLU); >> } >> else if(flg_mumps_ch) >> { >> MatSetOption(A, MAT_SPD, PETSC_TRUE); >> PCSetType(pc, PCCHOLESKY); >> } >> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >> PCFactorSetUpMatSolverPackage(pc); >> PCFactorGetMatrix(pc, &_F); >> icntl = 7; ival = 0; >> MatMumpsSetIcntl( _F, icntl, ival ); >> MatMumpsSetIcntl(_F, 3, 6); >> MatMumpsSetIcntl(_F, 4, 2); >> } >> KSPSetUp(_ksp); >> Kind Regards, >> Shidi >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >> Links: >> ------ >> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ [1] [2] >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ [1] >> [2] http://www.caam.rice.edu/~mk51/ [2] > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [2] > > > Links: > ------ > [1] https://www.cse.buffalo.edu/~knepley/ > [2] http://www.caam.rice.edu/~mk51/ From bsmith at mcs.anl.gov Fri May 11 18:02:40 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 11 May 2018 23:02:40 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> Message-ID: <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> Shidi, You stated: >> and create the matrix at the beginning of each iteration. We never anticipated your use case where you keep the same KSP and make a new matrix for each KSPSolve so we have bugs in PETSc that do not handle that case. Barry > On May 11, 2018, at 5:58 PM, Y. Shidi wrote: > > Thank you for your reply. >> For now, if you give the same Mat object back, it will do what you >> expect. > Sorry, I am confused here. > The same Mat object, is it that I do not destroy the Mat at the > end of iteration? > Moreover, which function should I actually call to put the Mat > object back? > Sorry for being stupid on this. > > Kind Regards, > Shidi > > > > On 2018-05-11 18:10, Matthew Knepley wrote: >> On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: >>> Thank you very much for your reply, Barry. >>>> This is a bug in PETSc. Since you are providing a new matrix >>>> with >>>> the same "state" value as the previous matrix the PC code the >>>> following code >>> So what you mean is that every time I change the value in the >>> matrix, >>> the PETSc only determines if the nonzero pattern change but not the >>> values, and if it is unchanged neither of symbolic and numeric >>> happens. >> No, that is not what Barry is saying. >> PETSc looks at the matrix. >> If the structure has changed, it does symbolic and numeric >> factorization. >> If only values have changes, it does numeric factorization. >> HOWEVER, you gave it a new matrix with accidentally the same state >> marker, >> so it thought nothing had changed. We will fix this by also checking >> the pointer. >> For now, if you give the same Mat object back, it will do what you >> expect. >> Matt >>> I found the following code: >>> if (!pc->setupcalled) { >>> ierr = PetscInfo(pc,"Setting up PC for first >>> timen");CHKERRQ(ierr); >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> } else if (matstate == pc->matstate) { >>> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>> since operator is unchangedn");CHKERRQ(ierr); >>> PetscFunctionReturn(0); >>> } else { >>> if (matnonzerostate > pc->matnonzerostate) { >>> ierr = PetscInfo(pc,"Setting up PC with different nonzero >>> patternn");CHKERRQ(ierr); >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> } else { >>> ierr = PetscInfo(pc,"Setting up PC with same nonzero >>> patternn");CHKERRQ(ierr); >>> pc->flag = SAME_NONZERO_PATTERN; >>> } >>> } >>> and I commend out "else if (matstate == pc->matstate){}", so it >>> will do "Setting up PC with same nonzero patternn"; and it seems >>> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >>> subsequent iterations. But I am not quite sure, need some more >>> tests. >>> Thank you very much for your help indeed. >>> Kind Regards, >>> Shidi >>> On 2018-05-11 16:13, Smith, Barry F. wrote: >>> On May 11, 2018, at 8:14 AM, Y. 
Shidi wrote: >>> Thank you for your reply. >>> How are you changing the matrix? Do you remember to assemble? >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >> If you use MatCreateMPIAIJWithArrays() you don't need to call >> MatAssemblyBegin() and MatAssemblyEnd(). >>> But I actually destroy the matrix at the end of each iteration >>> and create the matrix at the beginning of each iteration. >> This is a bug in PETSc. Since you are providing a new matrix with >> the same "state" value as the previous matrix the PC code the >> following code >> kicks in: >> ierr = >> PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >> if (!pc->setupcalled) { >> ierr = PetscInfo(pc,"Setting up PC for first >> timen");CHKERRQ(ierr); >> pc->flag = DIFFERENT_NONZERO_PATTERN; >> } else if (matstate == pc->matstate) { >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >> since operator is unchangedn");CHKERRQ(ierr); >> PetscFunctionReturn(0); >> and it returns without refactoring. >> We need an additional check that the matrix also remains the same. >> We will also need a test example that reproduces the problem to >> confirm that we have fixed it. >> Barry >>> Cheers, >>> Shidi >>> On 2018-05-11 12:59, Matthew Knepley wrote: >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>> Dear Matt, >>> Thank you for your help last time. >>> I want to get more detail about the Petsc-MUMPS factorisation; >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>> And I found the following functions are quite important to >>> the question: >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>> r,const MatFactorInfo *info); >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>> MatFactorInfo *info); >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>> I print some sentence to trace when these functions are called. >>> Then I test my code; the values in the matrix is changing but the >>> structure stays the same. Below is the output. >>> We can see that at 0th step, all the symbolic, numeric and solve >>> are called; in the subsequent steps only the solve stage is called, >>> the numeric step is not called. >>> How are you changing the matrix? Do you remember to assemble? >>> Matt >>> Iteration 0 Step 0.0005 Time 0.0005 >>> [INFO]: Direct Solver setup >>> MatCholeskyFactorSymbolic_MUMPS >>> finish MatCholeskyFactorSymbolic_MUMPS >>> MatFactorNumeric_MUMPS >>> finish MatFactorNumeric_MUMPS >>> MatSolve_MUMPS >>> Iteration 1 Step 0.0005 Time 0.0005 >>> MatSolve_MUMPS >>> Iteration 2 Step 0.0005 Time 0.001 >>> MatSolve_MUMPS >>> [INFO]: End of program!!! >>> I am wondering if there is any possibility to split the numeric >>> and solve stage (as you mentioned using KSPSolve). >>> Thank you very much indeed. >>> Kind Regards, >>> Shidi >>> On 2018-05-04 21:10, Y. Shidi wrote: >>> Thank you very much for your reply. >>> That is really clear. >>> Kind Regards, >>> Shidi >>> On 2018-05-04 21:05, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>> Dear Matt, >>> Thank you very much for your reply! >>> So what you mean is that I can just do the KSPSolve() every >>> iteration >>> once the MUMPS is set? >>> Yes. >>> That means inside the KSPSolve() the numerical factorization is >>> performed. 
If that is the case, it seems that the ksp object is >>> not changed when the values in the matrix are changed. >>> Yes. >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>> If you do SetOperators, it will redo the factorization. If you do >>> not, >>> it will look >>> at the Mat object, determine that the structure has not changed, >>> and >>> just redo >>> the numerical factorization. >>> Thanks, >>> Matt >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>> Dear PETSc users, >>> I am currently using MUMPS to solve linear systems directly. >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>> step and then solve the system. >>> In my code, the values in the matrix is changed in each iteration, >>> but the structure of the matrix stays the same, which means the >>> performance can be improved if symbolic factorisation is only >>> performed once. Hence, it is necessary to split the symbolic >>> and numeric factorisation. However, I cannot find a specific step >>> (control parameter) to perform the numeric factorisation. >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>> it seems that the symbolic and numeric factorisation always perform >>> together. >>> If you use KSPSolve instead, it will automatically preserve the >>> symbolic >>> factorization. >>> Thanks, >>> Matt >>> So I am wondering if anyone has an idea about it. >>> Below is how I set up MUMPS solver: >>> PC pc; >>> PetscBool flg_mumps, flg_mumps_ch; >>> flg_mumps = PETSC_FALSE; >>> flg_mumps_ch = PETSC_FALSE; >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>> NULL); >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>> NULL); >>> if(flg_mumps ||flg_mumps_ch) >>> { >>> KSPSetType(_ksp, KSPPREONLY); >>> PetscInt ival,icntl; >>> PetscReal val; >>> KSPGetPC(_ksp, &pc); >>> /// Set preconditioner type >>> if(flg_mumps) >>> { >>> PCSetType(pc, PCLU); >>> } >>> else if(flg_mumps_ch) >>> { >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>> PCSetType(pc, PCCHOLESKY); >>> } >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>> PCFactorSetUpMatSolverPackage(pc); >>> PCFactorGetMatrix(pc, &_F); >>> icntl = 7; ival = 0; >>> MatMumpsSetIcntl( _F, icntl, ival ); >>> MatMumpsSetIcntl(_F, 3, 6); >>> MatMumpsSetIcntl(_F, 4, 2); >>> } >>> KSPSetUp(_ksp); >>> Kind Regards, >>> Shidi >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >>> Links: >>> ------ >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >>> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. 
>>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>> [2] http://www.caam.rice.edu/~mk51/ [2] >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ [2] >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ >> [2] http://www.caam.rice.edu/~mk51/ From adener at anl.gov Fri May 11 18:14:14 2018 From: adener at anl.gov (Dener, Alp) Date: Fri, 11 May 2018 23:14:14 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> Message-ID: On May 11, 2018 at 6:03:09 PM, Smith, Barry F. (bsmith at mcs.anl.gov) wrote: Shidi, You stated: >> and create the matrix at the beginning of each iteration. We never anticipated your use case where you keep the same KSP and make a new matrix for each KSPSolve so we have bugs in PETSc that do not handle that case. We do this in TAO for bounded Newton algorithms where the constraints are handled with an active-set method. We have to construct a new reduced Hessian with a different size every time the active variable indexes change. We don?t completely destroy the KSP solver and create a brand new one whenever this happens. We simply call KSPReset() followed by KSPSetOperators(). This preserves the solver type and other KSP options set in the beginning (e.g.: maximum iterations or various tolerances), but permits the matrix (and the preconditioner) to be completely different objects from one KSPSolve() to another. I don?t know if this behavior is universally supported for all KSP solvers, but it?s been working bug-free for us for STCG, NASH and GLTR solvers. I?ve also made it work with GMRES in a separate test. It may be worth it for Shidi to give it a try and see if it works. Barry > On May 11, 2018, at 5:58 PM, Y. Shidi wrote: > > Thank you for your reply. >> For now, if you give the same Mat object back, it will do what you >> expect. > Sorry, I am confused here. > The same Mat object, is it that I do not destroy the Mat at the > end of iteration? > Moreover, which function should I actually call to put the Mat > object back? > Sorry for being stupid on this. > > Kind Regards, > Shidi > > > > On 2018-05-11 18:10, Matthew Knepley wrote: >> On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: >>> Thank you very much for your reply, Barry. >>>> This is a bug in PETSc. Since you are providing a new matrix >>>> with >>>> the same "state" value as the previous matrix the PC code the >>>> following code >>> So what you mean is that every time I change the value in the >>> matrix, >>> the PETSc only determines if the nonzero pattern change but not the >>> values, and if it is unchanged neither of symbolic and numeric >>> happens. >> No, that is not what Barry is saying. >> PETSc looks at the matrix. >> If the structure has changed, it does symbolic and numeric >> factorization. >> If only values have changes, it does numeric factorization. >> HOWEVER, you gave it a new matrix with accidentally the same state >> marker, >> so it thought nothing had changed. 
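In outline, the KSPReset() pattern Alp describes might look like the sketch below (untested; BuildNewMatrix() stands in for the application code that assembles the new, possibly differently sized, operator, and b and x are assumed to have been resized to match). As Barry points out further down, this keeps the answers correct when the matrix is a brand-new object, but it redoes the symbolic factorization each time, so by itself it does not address the performance question.

----------------------------------------------
/* Sketch (untested) of the KSPReset()/KSPSetOperators() pattern described above.
 * BuildNewMatrix() is a placeholder for the code that assembles the new operator. */
#include <petscksp.h>

PetscErrorCode SolveWithFreshMatrix(KSP ksp,Vec b,Vec x,
                                    PetscErrorCode (*BuildNewMatrix)(Mat*))
{
  PetscErrorCode ierr;
  Mat            Anew;

  PetscFunctionBeginUser;
  ierr = (*BuildNewMatrix)(&Anew);CHKERRQ(ierr);
  ierr = KSPReset(ksp);CHKERRQ(ierr);                    /* drop old operators and PC data      */
  ierr = KSPSetOperators(ksp,Anew,Anew);CHKERRQ(ierr);   /* solver type and tolerances are kept */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);                /* full symbolic + numeric factorization */
  ierr = MatDestroy(&Anew);CHKERRQ(ierr);                /* the KSP holds its own reference     */
  PetscFunctionReturn(0);
}
----------------------------------------------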
We will fix this by also checking >> the pointer. >> For now, if you give the same Mat object back, it will do what you >> expect. >> Matt >>> I found the following code: >>> if (!pc->setupcalled) { >>> ierr = PetscInfo(pc,"Setting up PC for first >>> timen");CHKERRQ(ierr); >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> } else if (matstate == pc->matstate) { >>> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>> since operator is unchangedn");CHKERRQ(ierr); >>> PetscFunctionReturn(0); >>> } else { >>> if (matnonzerostate > pc->matnonzerostate) { >>> ierr = PetscInfo(pc,"Setting up PC with different nonzero >>> patternn");CHKERRQ(ierr); >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> } else { >>> ierr = PetscInfo(pc,"Setting up PC with same nonzero >>> patternn");CHKERRQ(ierr); >>> pc->flag = SAME_NONZERO_PATTERN; >>> } >>> } >>> and I commend out "else if (matstate == pc->matstate){}", so it >>> will do "Setting up PC with same nonzero patternn"; and it seems >>> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >>> subsequent iterations. But I am not quite sure, need some more >>> tests. >>> Thank you very much for your help indeed. >>> Kind Regards, >>> Shidi >>> On 2018-05-11 16:13, Smith, Barry F. wrote: >>> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >>> Thank you for your reply. >>> How are you changing the matrix? Do you remember to assemble? >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >> If you use MatCreateMPIAIJWithArrays() you don't need to call >> MatAssemblyBegin() and MatAssemblyEnd(). >>> But I actually destroy the matrix at the end of each iteration >>> and create the matrix at the beginning of each iteration. >> This is a bug in PETSc. Since you are providing a new matrix with >> the same "state" value as the previous matrix the PC code the >> following code >> kicks in: >> ierr = >> PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >> if (!pc->setupcalled) { >> ierr = PetscInfo(pc,"Setting up PC for first >> timen");CHKERRQ(ierr); >> pc->flag = DIFFERENT_NONZERO_PATTERN; >> } else if (matstate == pc->matstate) { >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >> since operator is unchangedn");CHKERRQ(ierr); >> PetscFunctionReturn(0); >> and it returns without refactoring. >> We need an additional check that the matrix also remains the same. >> We will also need a test example that reproduces the problem to >> confirm that we have fixed it. >> Barry >>> Cheers, >>> Shidi >>> On 2018-05-11 12:59, Matthew Knepley wrote: >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>> Dear Matt, >>> Thank you for your help last time. >>> I want to get more detail about the Petsc-MUMPS factorisation; >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>> And I found the following functions are quite important to >>> the question: >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>> r,const MatFactorInfo *info); >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>> MatFactorInfo *info); >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>> I print some sentence to trace when these functions are called. >>> Then I test my code; the values in the matrix is changing but the >>> structure stays the same. Below is the output. 
>>> We can see that at 0th step, all the symbolic, numeric and solve >>> are called; in the subsequent steps only the solve stage is called, >>> the numeric step is not called. >>> How are you changing the matrix? Do you remember to assemble? >>> Matt >>> Iteration 0 Step 0.0005 Time 0.0005 >>> [INFO]: Direct Solver setup >>> MatCholeskyFactorSymbolic_MUMPS >>> finish MatCholeskyFactorSymbolic_MUMPS >>> MatFactorNumeric_MUMPS >>> finish MatFactorNumeric_MUMPS >>> MatSolve_MUMPS >>> Iteration 1 Step 0.0005 Time 0.0005 >>> MatSolve_MUMPS >>> Iteration 2 Step 0.0005 Time 0.001 >>> MatSolve_MUMPS >>> [INFO]: End of program!!! >>> I am wondering if there is any possibility to split the numeric >>> and solve stage (as you mentioned using KSPSolve). >>> Thank you very much indeed. >>> Kind Regards, >>> Shidi >>> On 2018-05-04 21:10, Y. Shidi wrote: >>> Thank you very much for your reply. >>> That is really clear. >>> Kind Regards, >>> Shidi >>> On 2018-05-04 21:05, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>> Dear Matt, >>> Thank you very much for your reply! >>> So what you mean is that I can just do the KSPSolve() every >>> iteration >>> once the MUMPS is set? >>> Yes. >>> That means inside the KSPSolve() the numerical factorization is >>> performed. If that is the case, it seems that the ksp object is >>> not changed when the values in the matrix are changed. >>> Yes. >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>> If you do SetOperators, it will redo the factorization. If you do >>> not, >>> it will look >>> at the Mat object, determine that the structure has not changed, >>> and >>> just redo >>> the numerical factorization. >>> Thanks, >>> Matt >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>> Dear PETSc users, >>> I am currently using MUMPS to solve linear systems directly. >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>> step and then solve the system. >>> In my code, the values in the matrix is changed in each iteration, >>> but the structure of the matrix stays the same, which means the >>> performance can be improved if symbolic factorisation is only >>> performed once. Hence, it is necessary to split the symbolic >>> and numeric factorisation. However, I cannot find a specific step >>> (control parameter) to perform the numeric factorisation. >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>> it seems that the symbolic and numeric factorisation always perform >>> together. >>> If you use KSPSolve instead, it will automatically preserve the >>> symbolic >>> factorization. >>> Thanks, >>> Matt >>> So I am wondering if anyone has an idea about it. 
>>> Below is how I set up MUMPS solver: >>> PC pc; >>> PetscBool flg_mumps, flg_mumps_ch; >>> flg_mumps = PETSC_FALSE; >>> flg_mumps_ch = PETSC_FALSE; >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>> NULL); >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>> NULL); >>> if(flg_mumps ||flg_mumps_ch) >>> { >>> KSPSetType(_ksp, KSPPREONLY); >>> PetscInt ival,icntl; >>> PetscReal val; >>> KSPGetPC(_ksp, &pc); >>> /// Set preconditioner type >>> if(flg_mumps) >>> { >>> PCSetType(pc, PCLU); >>> } >>> else if(flg_mumps_ch) >>> { >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>> PCSetType(pc, PCCHOLESKY); >>> } >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>> PCFactorSetUpMatSolverPackage(pc); >>> PCFactorGetMatrix(pc, &_F); >>> icntl = 7; ival = 0; >>> MatMumpsSetIcntl( _F, icntl, ival ); >>> MatMumpsSetIcntl(_F, 3, 6); >>> MatMumpsSetIcntl(_F, 4, 2); >>> } >>> KSPSetUp(_ksp); >>> Kind Regards, >>> Shidi >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >>> Links: >>> ------ >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >>> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>> Links: >>> ------ >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>> [2] http://www.caam.rice.edu/~mk51/ [2] >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ [2] >> Links: >> ------ >> [1] https://www.cse.buffalo.edu/~knepley/ >> [2] http://www.caam.rice.edu/~mk51/ Alp Dener Postdoctoral Appointee Argonne National Laboratory Mathematics and Computer Science Division -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 11 18:37:34 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 11 May 2018 23:37:34 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> Message-ID: <3D26459B-C8DF-4261-96AF-058C579BA0F5@mcs.anl.gov> > On May 11, 2018, at 6:14 PM, Dener, Alp wrote: > > > On May 11, 2018 at 6:03:09 PM, Smith, Barry F. (bsmith at mcs.anl.gov) wrote: > >> >> Shidi, >> >> You stated: >> >> >> and create the matrix at the beginning of each iteration. >> >> We never anticipated your use case where you keep the same KSP and make a new matrix for each KSPSolve so we have bugs in PETSc that do not handle that case. 
> We do this in TAO for bounded Newton algorithms where the constraints are handled with an active-set method. We have to construct a new reduced Hessian with a different size every time the active variable indexes change. > > We don?t completely destroy the KSP solver and create a brand new one whenever this happens. We simply call KSPReset() followed by KSPSetOperators(). This preserves the solver type and other KSP options set in the beginning (e.g.: maximum iterations or various tolerances), but permits the matrix (and the preconditioner) to be completely different objects from one KSPSolve() to another. > > I don?t know if this behavior is universally supported for all KSP solvers, but it?s been working bug-free for us for STCG, NASH and GLTR solvers. I?ve also made it work with GMRES in a separate test. It may be worth it for Shidi to give it a try and see if it works. Alp, It should work to produce correct answers, unlike now when incorrect answers are produced but it doesn't serve Shidi's goal of reusing the matrix symbolic factorization. Shidi, Here is how you can get what you want. The first time in you use MatCreateMPIAIJWithArrays() to create the matrix. Do NOT destroy the matrix at the end of the loop. Instead you use MatSetValues() to transfer the values from your i,j,a arrays to the matrix with a simple loop for the second and ever other time you build the matrix. Here is a prototype (untested) of the routine you need. /* The A is the matrix obtained with MatCreateMPIAIJWithArrays(); the other arguments are the same as you pass to MatCreateMPIAIJWithArrays */ PetscErrorCode MatCopyValuesMPIAIJWithArrays(Mat A,m,i,j,a) { PetscErrorCode ierr; PetscInt i, row,rstart,nnz; ierr = MatGetOwnershipRange(A,&rstart,NULL);CHKERRQ(ierr); for (ii=0; ii >> >> >> Barry >> >> >> > On May 11, 2018, at 5:58 PM, Y. Shidi wrote: >> > >> > Thank you for your reply. >> >> For now, if you give the same Mat object back, it will do what you >> >> expect. >> > Sorry, I am confused here. >> > The same Mat object, is it that I do not destroy the Mat at the >> > end of iteration? >> > Moreover, which function should I actually call to put the Mat >> > object back? >> > Sorry for being stupid on this. >> > >> > Kind Regards, >> > Shidi >> > >> > >> > >> > On 2018-05-11 18:10, Matthew Knepley wrote: >> >> On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: >> >>> Thank you very much for your reply, Barry. >> >>>> This is a bug in PETSc. Since you are providing a new matrix >> >>>> with >> >>>> the same "state" value as the previous matrix the PC code the >> >>>> following code >> >>> So what you mean is that every time I change the value in the >> >>> matrix, >> >>> the PETSc only determines if the nonzero pattern change but not the >> >>> values, and if it is unchanged neither of symbolic and numeric >> >>> happens. >> >> No, that is not what Barry is saying. >> >> PETSc looks at the matrix. >> >> If the structure has changed, it does symbolic and numeric >> >> factorization. >> >> If only values have changes, it does numeric factorization. >> >> HOWEVER, you gave it a new matrix with accidentally the same state >> >> marker, >> >> so it thought nothing had changed. We will fix this by also checking >> >> the pointer. >> >> For now, if you give the same Mat object back, it will do what you >> >> expect. 
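The loop body of the prototype above appears truncated in the archive. A self-contained reconstruction of what the complete routine could look like, following the argument list and the comment in the truncated version, is sketched below; it is untested and the argument types are an assumption.

----------------------------------------------
/* Reconstruction (untested sketch) of the truncated prototype above.  A is the
 * matrix obtained once from MatCreateMPIAIJWithArrays(); m, i, j, a are the same
 * local CSR arguments passed to MatCreateMPIAIJWithArrays(), now holding the new
 * values. */
#include <petscmat.h>

PetscErrorCode MatCopyValuesMPIAIJWithArrays(Mat A,PetscInt m,const PetscInt i[],
                                             const PetscInt j[],const PetscScalar a[])
{
  PetscErrorCode ierr;
  PetscInt       ii,row,rstart,nnz;

  PetscFunctionBeginUser;
  ierr = MatGetOwnershipRange(A,&rstart,NULL);CHKERRQ(ierr);
  for (ii = 0; ii < m; ii++) {
    row  = rstart + ii;                  /* global index of this local row */
    nnz  = i[ii+1] - i[ii];              /* nonzeros in the row            */
    ierr = MatSetValues(A,1,&row,nnz,j+i[ii],a+i[ii],INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
----------------------------------------------

Called on the second and later iterations in place of MatCreateMPIAIJWithArrays(), with the matrix no longer destroyed at the end of each step, such a routine leaves the nonzero state of A unchanged, so the next KSPSolve() repeats only MatFactorNumeric_MUMPS() and reuses the earlier symbolic factorization.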
>> >> Matt >> >>> I found the following code: >> >>> if (!pc->setupcalled) { >> >>> ierr = PetscInfo(pc,"Setting up PC for first >> >>> timen");CHKERRQ(ierr); >> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >> >>> } else if (matstate == pc->matstate) { >> >>> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >> >>> since operator is unchangedn");CHKERRQ(ierr); >> >>> PetscFunctionReturn(0); >> >>> } else { >> >>> if (matnonzerostate > pc->matnonzerostate) { >> >>> ierr = PetscInfo(pc,"Setting up PC with different nonzero >> >>> patternn");CHKERRQ(ierr); >> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >> >>> } else { >> >>> ierr = PetscInfo(pc,"Setting up PC with same nonzero >> >>> patternn");CHKERRQ(ierr); >> >>> pc->flag = SAME_NONZERO_PATTERN; >> >>> } >> >>> } >> >>> and I commend out "else if (matstate == pc->matstate){}", so it >> >>> will do "Setting up PC with same nonzero patternn"; and it seems >> >>> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >> >>> subsequent iterations. But I am not quite sure, need some more >> >>> tests. >> >>> Thank you very much for your help indeed. >> >>> Kind Regards, >> >>> Shidi >> >>> On 2018-05-11 16:13, Smith, Barry F. wrote: >> >>> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >> >>> Thank you for your reply. >> >>> How are you changing the matrix? Do you remember to assemble? >> >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >> >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >> >> If you use MatCreateMPIAIJWithArrays() you don't need to call >> >> MatAssemblyBegin() and MatAssemblyEnd(). >> >>> But I actually destroy the matrix at the end of each iteration >> >>> and create the matrix at the beginning of each iteration. >> >> This is a bug in PETSc. Since you are providing a new matrix with >> >> the same "state" value as the previous matrix the PC code the >> >> following code >> >> kicks in: >> >> ierr = >> >> PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); >> >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >> >> if (!pc->setupcalled) { >> >> ierr = PetscInfo(pc,"Setting up PC for first >> >> timen");CHKERRQ(ierr); >> >> pc->flag = DIFFERENT_NONZERO_PATTERN; >> >> } else if (matstate == pc->matstate) { >> >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >> >> since operator is unchangedn");CHKERRQ(ierr); >> >> PetscFunctionReturn(0); >> >> and it returns without refactoring. >> >> We need an additional check that the matrix also remains the same. >> >> We will also need a test example that reproduces the problem to >> >> confirm that we have fixed it. >> >> Barry >> >>> Cheers, >> >>> Shidi >> >>> On 2018-05-11 12:59, Matthew Knepley wrote: >> >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >> >>> Dear Matt, >> >>> Thank you for your help last time. >> >>> I want to get more detail about the Petsc-MUMPS factorisation; >> >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >> >>> And I found the following functions are quite important to >> >>> the question: >> >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >> >>> r,const MatFactorInfo *info); >> >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >> >>> MatFactorInfo *info); >> >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >> >>> I print some sentence to trace when these functions are called. >> >>> Then I test my code; the values in the matrix is changing but the >> >>> structure stays the same. Below is the output. 
>> >>> We can see that at 0th step, all the symbolic, numeric and solve >> >>> are called; in the subsequent steps only the solve stage is called, >> >>> the numeric step is not called. >> >>> How are you changing the matrix? Do you remember to assemble? >> >>> Matt >> >>> Iteration 0 Step 0.0005 Time 0.0005 >> >>> [INFO]: Direct Solver setup >> >>> MatCholeskyFactorSymbolic_MUMPS >> >>> finish MatCholeskyFactorSymbolic_MUMPS >> >>> MatFactorNumeric_MUMPS >> >>> finish MatFactorNumeric_MUMPS >> >>> MatSolve_MUMPS >> >>> Iteration 1 Step 0.0005 Time 0.0005 >> >>> MatSolve_MUMPS >> >>> Iteration 2 Step 0.0005 Time 0.001 >> >>> MatSolve_MUMPS >> >>> [INFO]: End of program!!! >> >>> I am wondering if there is any possibility to split the numeric >> >>> and solve stage (as you mentioned using KSPSolve). >> >>> Thank you very much indeed. >> >>> Kind Regards, >> >>> Shidi >> >>> On 2018-05-04 21:10, Y. Shidi wrote: >> >>> Thank you very much for your reply. >> >>> That is really clear. >> >>> Kind Regards, >> >>> Shidi >> >>> On 2018-05-04 21:05, Matthew Knepley wrote: >> >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >> >>> Dear Matt, >> >>> Thank you very much for your reply! >> >>> So what you mean is that I can just do the KSPSolve() every >> >>> iteration >> >>> once the MUMPS is set? >> >>> Yes. >> >>> That means inside the KSPSolve() the numerical factorization is >> >>> performed. If that is the case, it seems that the ksp object is >> >>> not changed when the values in the matrix are changed. >> >>> Yes. >> >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >> >>> If you do SetOperators, it will redo the factorization. If you do >> >>> not, >> >>> it will look >> >>> at the Mat object, determine that the structure has not changed, >> >>> and >> >>> just redo >> >>> the numerical factorization. >> >>> Thanks, >> >>> Matt >> >>> On 2018-05-04 14:44, Matthew Knepley wrote: >> >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >> >>> Dear PETSc users, >> >>> I am currently using MUMPS to solve linear systems directly. >> >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >> >>> step and then solve the system. >> >>> In my code, the values in the matrix is changed in each iteration, >> >>> but the structure of the matrix stays the same, which means the >> >>> performance can be improved if symbolic factorisation is only >> >>> performed once. Hence, it is necessary to split the symbolic >> >>> and numeric factorisation. However, I cannot find a specific step >> >>> (control parameter) to perform the numeric factorisation. >> >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >> >>> it seems that the symbolic and numeric factorisation always perform >> >>> together. >> >>> If you use KSPSolve instead, it will automatically preserve the >> >>> symbolic >> >>> factorization. >> >>> Thanks, >> >>> Matt >> >>> So I am wondering if anyone has an idea about it. 
>> >>> Below is how I set up MUMPS solver: >> >>> PC pc; >> >>> PetscBool flg_mumps, flg_mumps_ch; >> >>> flg_mumps = PETSC_FALSE; >> >>> flg_mumps_ch = PETSC_FALSE; >> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >> >>> NULL); >> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >> >>> NULL); >> >>> if(flg_mumps ||flg_mumps_ch) >> >>> { >> >>> KSPSetType(_ksp, KSPPREONLY); >> >>> PetscInt ival,icntl; >> >>> PetscReal val; >> >>> KSPGetPC(_ksp, &pc); >> >>> /// Set preconditioner type >> >>> if(flg_mumps) >> >>> { >> >>> PCSetType(pc, PCLU); >> >>> } >> >>> else if(flg_mumps_ch) >> >>> { >> >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >> >>> PCSetType(pc, PCCHOLESKY); >> >>> } >> >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >> >>> PCFactorSetUpMatSolverPackage(pc); >> >>> PCFactorGetMatrix(pc, &_F); >> >>> icntl = 7; ival = 0; >> >>> MatMumpsSetIcntl( _F, icntl, ival ); >> >>> MatMumpsSetIcntl(_F, 3, 6); >> >>> MatMumpsSetIcntl(_F, 4, 2); >> >>> } >> >>> KSPSetUp(_ksp); >> >>> Kind Regards, >> >>> Shidi >> >>> -- >> >>> What most experimenters take for granted before they begin their >> >>> experiments is infinitely more interesting than any results to >> >>> which >> >>> their experiments lead. >> >>> -- Norbert Wiener >> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >> >>> Links: >> >>> ------ >> >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >> >>> -- >> >>> What most experimenters take for granted before they begin their >> >>> experiments is infinitely more interesting than any results to >> >>> which >> >>> their experiments lead. >> >>> -- Norbert Wiener >> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >> >>> Links: >> >>> ------ >> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >> >>> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >> >>> -- >> >>> What most experimenters take for granted before they begin their >> >>> experiments is infinitely more interesting than any results to >> >>> which >> >>> their experiments lead. >> >>> -- Norbert Wiener >> >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >> >>> Links: >> >>> ------ >> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >> >>> [2] http://www.caam.rice.edu/~mk51/ [2] >> >> -- >> >> What most experimenters take for granted before they begin their >> >> experiments is infinitely more interesting than any results to which >> >> their experiments lead. >> >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [2] >> >> Links: >> >> ------ >> >> [1] https://www.cse.buffalo.edu/~knepley/ >> >> [2] http://www.caam.rice.edu/~mk51/ > Alp Dener > Postdoctoral Appointee > Argonne National Laboratory > Mathematics and Computer Science Division > From ys453 at cam.ac.uk Fri May 11 19:00:59 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Sat, 12 May 2018 01:00:59 +0100 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <3D26459B-C8DF-4261-96AF-058C579BA0F5@mcs.anl.gov> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> <3D26459B-C8DF-4261-96AF-058C579BA0F5@mcs.anl.gov> Message-ID: <29fcb51b0becd64d72974fa95bd9eac4@cam.ac.uk> Alp, thank you for your suggestion. 
As Barry pointed out, I want to keep the symbolic factorization, because the matrix structure in my problem is unchanged for some iterations (doing some moving mesh) and this part is also sequential. Thank you very much for your help and time. Barry, I really appreciate your time and help. I will have a try on this. The basic idea here is keeping the Mat object, but I am using MatCreateMPIAIJWithArrays() instead, and this function always create a new Mat object. I think I get it, I have checked previously that if I do not destroy the Mat object created by MatCreateMPIAIJWithArrays() every time, the memory usage is keep going. Thank you very much for your help indeed. Kind Regards, Shidi On 2018-05-12 00:37, Smith, Barry F. wrote: >> On May 11, 2018, at 6:14 PM, Dener, Alp wrote: >> >> >> On May 11, 2018 at 6:03:09 PM, Smith, Barry F. (bsmith at mcs.anl.gov) >> wrote: >> >>> >>> Shidi, >>> >>> You stated: >>> >>> >> and create the matrix at the beginning of each iteration. >>> >>> We never anticipated your use case where you keep the same KSP and >>> make a new matrix for each KSPSolve so we have bugs in PETSc that do >>> not handle that case. >> We do this in TAO for bounded Newton algorithms where the constraints >> are handled with an active-set method. We have to construct a new >> reduced Hessian with a different size every time the active variable >> indexes change. >> >> We don?t completely destroy the KSP solver and create a brand new one >> whenever this happens. We simply call KSPReset() followed by >> KSPSetOperators(). This preserves the solver type and other KSP >> options set in the beginning (e.g.: maximum iterations or various >> tolerances), but permits the matrix (and the preconditioner) to be >> completely different objects from one KSPSolve() to another. >> >> I don?t know if this behavior is universally supported for all KSP >> solvers, but it?s been working bug-free for us for STCG, NASH and GLTR >> solvers. I?ve also made it work with GMRES in a separate test. It may >> be worth it for Shidi to give it a try and see if it works. > > Alp, > > It should work to produce correct answers, unlike now when incorrect > answers are produced but it doesn't serve Shidi's goal of reusing the > matrix symbolic factorization. > > Shidi, > > Here is how you can get what you want. The first time in you use > MatCreateMPIAIJWithArrays() to create the matrix. Do NOT destroy the > matrix at the end of the loop. Instead you use MatSetValues() to > transfer the values from your i,j,a arrays to the matrix with a simple > loop for the second and ever other time you build the matrix. Here is > a prototype (untested) of the routine you need. > > /* The A is the matrix obtained with MatCreateMPIAIJWithArrays(); the > other arguments are the same as you pass to MatCreateMPIAIJWithArrays > */ > > PetscErrorCode MatCopyValuesMPIAIJWithArrays(Mat A,m,i,j,a) > { > PetscErrorCode ierr; > PetscInt i, row,rstart,nnz; > > ierr = MatGetOwnershipRange(A,&rstart,NULL);CHKERRQ(ierr); > for (ii=0; ii row = ii + rstart; > nnz = i[ii+1]- i[ii]; > ierr = > MatSetValues(A,1,&row,nnz,j+i[ii],values+i[ii],INSERT_VALUES);CHKERRQ(ierr); > } > ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERR(ierr); > ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERR(ierr); > PetscFunctionReturn(0); > } > > Plus it is more efficient than your current code because it does not > have to build the A from scratch each time. > > Good luck > > Barry > > > > > > > >> >>> >>> >>> Barry >>> >>> >>> > On May 11, 2018, at 5:58 PM, Y. 
Shidi wrote: >>> > >>> > Thank you for your reply. >>> >> For now, if you give the same Mat object back, it will do what you >>> >> expect. >>> > Sorry, I am confused here. >>> > The same Mat object, is it that I do not destroy the Mat at the >>> > end of iteration? >>> > Moreover, which function should I actually call to put the Mat >>> > object back? >>> > Sorry for being stupid on this. >>> > >>> > Kind Regards, >>> > Shidi >>> > >>> > >>> > >>> > On 2018-05-11 18:10, Matthew Knepley wrote: >>> >> On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: >>> >>> Thank you very much for your reply, Barry. >>> >>>> This is a bug in PETSc. Since you are providing a new matrix >>> >>>> with >>> >>>> the same "state" value as the previous matrix the PC code the >>> >>>> following code >>> >>> So what you mean is that every time I change the value in the >>> >>> matrix, >>> >>> the PETSc only determines if the nonzero pattern change but not the >>> >>> values, and if it is unchanged neither of symbolic and numeric >>> >>> happens. >>> >> No, that is not what Barry is saying. >>> >> PETSc looks at the matrix. >>> >> If the structure has changed, it does symbolic and numeric >>> >> factorization. >>> >> If only values have changes, it does numeric factorization. >>> >> HOWEVER, you gave it a new matrix with accidentally the same state >>> >> marker, >>> >> so it thought nothing had changed. We will fix this by also checking >>> >> the pointer. >>> >> For now, if you give the same Mat object back, it will do what you >>> >> expect. >>> >> Matt >>> >>> I found the following code: >>> >>> if (!pc->setupcalled) { >>> >>> ierr = PetscInfo(pc,"Setting up PC for first >>> >>> timen");CHKERRQ(ierr); >>> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> >>> } else if (matstate == pc->matstate) { >>> >>> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>> >>> since operator is unchangedn");CHKERRQ(ierr); >>> >>> PetscFunctionReturn(0); >>> >>> } else { >>> >>> if (matnonzerostate > pc->matnonzerostate) { >>> >>> ierr = PetscInfo(pc,"Setting up PC with different nonzero >>> >>> patternn");CHKERRQ(ierr); >>> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> >>> } else { >>> >>> ierr = PetscInfo(pc,"Setting up PC with same nonzero >>> >>> patternn");CHKERRQ(ierr); >>> >>> pc->flag = SAME_NONZERO_PATTERN; >>> >>> } >>> >>> } >>> >>> and I commend out "else if (matstate == pc->matstate){}", so it >>> >>> will do "Setting up PC with same nonzero patternn"; and it seems >>> >>> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >>> >>> subsequent iterations. But I am not quite sure, need some more >>> >>> tests. >>> >>> Thank you very much for your help indeed. >>> >>> Kind Regards, >>> >>> Shidi >>> >>> On 2018-05-11 16:13, Smith, Barry F. wrote: >>> >>> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >>> >>> Thank you for your reply. >>> >>> How are you changing the matrix? Do you remember to assemble? >>> >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >>> >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >>> >> If you use MatCreateMPIAIJWithArrays() you don't need to call >>> >> MatAssemblyBegin() and MatAssemblyEnd(). >>> >>> But I actually destroy the matrix at the end of each iteration >>> >>> and create the matrix at the beginning of each iteration. >>> >> This is a bug in PETSc. 
Since you are providing a new matrix with >>> >> the same "state" value as the previous matrix the PC code the >>> >> following code >>> >> kicks in: >>> >> ierr = >>> >> PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); >>> >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >>> >> if (!pc->setupcalled) { >>> >> ierr = PetscInfo(pc,"Setting up PC for first >>> >> timen");CHKERRQ(ierr); >>> >> pc->flag = DIFFERENT_NONZERO_PATTERN; >>> >> } else if (matstate == pc->matstate) { >>> >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>> >> since operator is unchangedn");CHKERRQ(ierr); >>> >> PetscFunctionReturn(0); >>> >> and it returns without refactoring. >>> >> We need an additional check that the matrix also remains the same. >>> >> We will also need a test example that reproduces the problem to >>> >> confirm that we have fixed it. >>> >> Barry >>> >>> Cheers, >>> >>> Shidi >>> >>> On 2018-05-11 12:59, Matthew Knepley wrote: >>> >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>> >>> Dear Matt, >>> >>> Thank you for your help last time. >>> >>> I want to get more detail about the Petsc-MUMPS factorisation; >>> >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>> >>> And I found the following functions are quite important to >>> >>> the question: >>> >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>> >>> r,const MatFactorInfo *info); >>> >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>> >>> MatFactorInfo *info); >>> >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>> >>> I print some sentence to trace when these functions are called. >>> >>> Then I test my code; the values in the matrix is changing but the >>> >>> structure stays the same. Below is the output. >>> >>> We can see that at 0th step, all the symbolic, numeric and solve >>> >>> are called; in the subsequent steps only the solve stage is called, >>> >>> the numeric step is not called. >>> >>> How are you changing the matrix? Do you remember to assemble? >>> >>> Matt >>> >>> Iteration 0 Step 0.0005 Time 0.0005 >>> >>> [INFO]: Direct Solver setup >>> >>> MatCholeskyFactorSymbolic_MUMPS >>> >>> finish MatCholeskyFactorSymbolic_MUMPS >>> >>> MatFactorNumeric_MUMPS >>> >>> finish MatFactorNumeric_MUMPS >>> >>> MatSolve_MUMPS >>> >>> Iteration 1 Step 0.0005 Time 0.0005 >>> >>> MatSolve_MUMPS >>> >>> Iteration 2 Step 0.0005 Time 0.001 >>> >>> MatSolve_MUMPS >>> >>> [INFO]: End of program!!! >>> >>> I am wondering if there is any possibility to split the numeric >>> >>> and solve stage (as you mentioned using KSPSolve). >>> >>> Thank you very much indeed. >>> >>> Kind Regards, >>> >>> Shidi >>> >>> On 2018-05-04 21:10, Y. Shidi wrote: >>> >>> Thank you very much for your reply. >>> >>> That is really clear. >>> >>> Kind Regards, >>> >>> Shidi >>> >>> On 2018-05-04 21:05, Matthew Knepley wrote: >>> >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>> >>> Dear Matt, >>> >>> Thank you very much for your reply! >>> >>> So what you mean is that I can just do the KSPSolve() every >>> >>> iteration >>> >>> once the MUMPS is set? >>> >>> Yes. >>> >>> That means inside the KSPSolve() the numerical factorization is >>> >>> performed. If that is the case, it seems that the ksp object is >>> >>> not changed when the values in the matrix are changed. >>> >>> Yes. >>> >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>> >>> If you do SetOperators, it will redo the factorization. 
If you do >>> >>> not, >>> >>> it will look >>> >>> at the Mat object, determine that the structure has not changed, >>> >>> and >>> >>> just redo >>> >>> the numerical factorization. >>> >>> Thanks, >>> >>> Matt >>> >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>> >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>> >>> Dear PETSc users, >>> >>> I am currently using MUMPS to solve linear systems directly. >>> >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>> >>> step and then solve the system. >>> >>> In my code, the values in the matrix is changed in each iteration, >>> >>> but the structure of the matrix stays the same, which means the >>> >>> performance can be improved if symbolic factorisation is only >>> >>> performed once. Hence, it is necessary to split the symbolic >>> >>> and numeric factorisation. However, I cannot find a specific step >>> >>> (control parameter) to perform the numeric factorisation. >>> >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>> >>> it seems that the symbolic and numeric factorisation always perform >>> >>> together. >>> >>> If you use KSPSolve instead, it will automatically preserve the >>> >>> symbolic >>> >>> factorization. >>> >>> Thanks, >>> >>> Matt >>> >>> So I am wondering if anyone has an idea about it. >>> >>> Below is how I set up MUMPS solver: >>> >>> PC pc; >>> >>> PetscBool flg_mumps, flg_mumps_ch; >>> >>> flg_mumps = PETSC_FALSE; >>> >>> flg_mumps_ch = PETSC_FALSE; >>> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>> >>> NULL); >>> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>> >>> NULL); >>> >>> if(flg_mumps ||flg_mumps_ch) >>> >>> { >>> >>> KSPSetType(_ksp, KSPPREONLY); >>> >>> PetscInt ival,icntl; >>> >>> PetscReal val; >>> >>> KSPGetPC(_ksp, &pc); >>> >>> /// Set preconditioner type >>> >>> if(flg_mumps) >>> >>> { >>> >>> PCSetType(pc, PCLU); >>> >>> } >>> >>> else if(flg_mumps_ch) >>> >>> { >>> >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>> >>> PCSetType(pc, PCCHOLESKY); >>> >>> } >>> >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>> >>> PCFactorSetUpMatSolverPackage(pc); >>> >>> PCFactorGetMatrix(pc, &_F); >>> >>> icntl = 7; ival = 0; >>> >>> MatMumpsSetIcntl( _F, icntl, ival ); >>> >>> MatMumpsSetIcntl(_F, 3, 6); >>> >>> MatMumpsSetIcntl(_F, 4, 2); >>> >>> } >>> >>> KSPSetUp(_ksp); >>> >>> Kind Regards, >>> >>> Shidi >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> >>> experiments is infinitely more interesting than any results to >>> >>> which >>> >>> their experiments lead. >>> >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >>> >>> Links: >>> >>> ------ >>> >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> >>> experiments is infinitely more interesting than any results to >>> >>> which >>> >>> their experiments lead. >>> >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >>> >>> Links: >>> >>> ------ >>> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >>> >>> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >>> >>> -- >>> >>> What most experimenters take for granted before they begin their >>> >>> experiments is infinitely more interesting than any results to >>> >>> which >>> >>> their experiments lead. 
>>> >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>> >>> Links: >>> >>> ------ >>> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>> >>> [2] http://www.caam.rice.edu/~mk51/ [2] >>> >> -- >>> >> What most experimenters take for granted before they begin their >>> >> experiments is infinitely more interesting than any results to which >>> >> their experiments lead. >>> >> -- Norbert Wiener >>> >> https://www.cse.buffalo.edu/~knepley/ [2] >>> >> Links: >>> >> ------ >>> >> [1] https://www.cse.buffalo.edu/~knepley/ >>> >> [2] http://www.caam.rice.edu/~mk51/ >> Alp Dener >> Postdoctoral Appointee >> Argonne National Laboratory >> Mathematics and Computer Science Division >> From bsmith at mcs.anl.gov Fri May 11 19:05:55 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 12 May 2018 00:05:55 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <29fcb51b0becd64d72974fa95bd9eac4@cam.ac.uk> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> <3D26459B-C8DF-4261-96AF-058C579BA0F5@mcs.anl.gov> <29fcb51b0becd64d72974fa95bd9eac4@cam.ac.uk> Message-ID: > On May 11, 2018, at 7:00 PM, Y. Shidi wrote: > > Alp, > thank you for your suggestion. As Barry pointed out, I want to > keep the symbolic factorization, because the matrix structure > in my problem is unchanged for some iterations (doing some > moving mesh) and this part is also sequential. > Thank you very much for your help and time. > > Barry, > I really appreciate your time and help. > I will have a try on this. > The basic idea here is keeping the Mat object, but I am using > MatCreateMPIAIJWithArrays() instead, and this function always > create a new Mat object. I think I get it, I have checked > previously that if I do not destroy the Mat object created by > MatCreateMPIAIJWithArrays() every time, the memory usage is > keep going. > I understand exactly what you are doing. With my suggested approach you will not use more and more memory because you only CREATE the matrix once with MatCreateMPIAIJWithArrays(), everything after that you simply refill the same initial matrix with MatCopyValuesMPIAIJWithArrays(). Please let me know if you have any difficulties with my approach, Barry > Thank you very much for your help indeed. > > Kind Regards, > Shidi > > > On 2018-05-12 00:37, Smith, Barry F. wrote: >>> On May 11, 2018, at 6:14 PM, Dener, Alp wrote: >>> On May 11, 2018 at 6:03:09 PM, Smith, Barry F. (bsmith at mcs.anl.gov) wrote: >>>> Shidi, >>>> You stated: >>>> >> and create the matrix at the beginning of each iteration. >>>> We never anticipated your use case where you keep the same KSP and make a new matrix for each KSPSolve so we have bugs in PETSc that do not handle that case. >>> We do this in TAO for bounded Newton algorithms where the constraints are handled with an active-set method. We have to construct a new reduced Hessian with a different size every time the active variable indexes change. >>> We don?t completely destroy the KSP solver and create a brand new one whenever this happens. We simply call KSPReset() followed by KSPSetOperators(). 
This preserves the solver type and other KSP options set in the beginning (e.g.: maximum iterations or various tolerances), but permits the matrix (and the preconditioner) to be completely different objects from one KSPSolve() to another. >>> I don?t know if this behavior is universally supported for all KSP solvers, but it?s been working bug-free for us for STCG, NASH and GLTR solvers. I?ve also made it work with GMRES in a separate test. It may be worth it for Shidi to give it a try and see if it works. >> Alp, >> It should work to produce correct answers, unlike now when incorrect >> answers are produced but it doesn't serve Shidi's goal of reusing the >> matrix symbolic factorization. >> Shidi, >> Here is how you can get what you want. The first time in you use >> MatCreateMPIAIJWithArrays() to create the matrix. Do NOT destroy the >> matrix at the end of the loop. Instead you use MatSetValues() to >> transfer the values from your i,j,a arrays to the matrix with a simple >> loop for the second and ever other time you build the matrix. Here is >> a prototype (untested) of the routine you need. >> /* The A is the matrix obtained with MatCreateMPIAIJWithArrays(); the >> other arguments are the same as you pass to MatCreateMPIAIJWithArrays >> */ >> PetscErrorCode MatCopyValuesMPIAIJWithArrays(Mat A,m,i,j,a) >> { >> PetscErrorCode ierr; >> PetscInt i, row,rstart,nnz; >> ierr = MatGetOwnershipRange(A,&rstart,NULL);CHKERRQ(ierr); >> for (ii=0; ii> row = ii + rstart; >> nnz = i[ii+1]- i[ii]; >> ierr = >> MatSetValues(A,1,&row,nnz,j+i[ii],values+i[ii],INSERT_VALUES);CHKERRQ(ierr); >> } >> ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERR(ierr); >> ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERR(ierr); >> PetscFunctionReturn(0); >> } >> Plus it is more efficient than your current code because it does not >> have to build the A from scratch each time. >> Good luck >> Barry >>>> Barry >>>> > On May 11, 2018, at 5:58 PM, Y. Shidi wrote: >>>> > >>>> > Thank you for your reply. >>>> >> For now, if you give the same Mat object back, it will do what you >>>> >> expect. >>>> > Sorry, I am confused here. >>>> > The same Mat object, is it that I do not destroy the Mat at the >>>> > end of iteration? >>>> > Moreover, which function should I actually call to put the Mat >>>> > object back? >>>> > Sorry for being stupid on this. >>>> > >>>> > Kind Regards, >>>> > Shidi >>>> > >>>> > >>>> > >>>> > On 2018-05-11 18:10, Matthew Knepley wrote: >>>> >> On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: >>>> >>> Thank you very much for your reply, Barry. >>>> >>>> This is a bug in PETSc. Since you are providing a new matrix >>>> >>>> with >>>> >>>> the same "state" value as the previous matrix the PC code the >>>> >>>> following code >>>> >>> So what you mean is that every time I change the value in the >>>> >>> matrix, >>>> >>> the PETSc only determines if the nonzero pattern change but not the >>>> >>> values, and if it is unchanged neither of symbolic and numeric >>>> >>> happens. >>>> >> No, that is not what Barry is saying. >>>> >> PETSc looks at the matrix. >>>> >> If the structure has changed, it does symbolic and numeric >>>> >> factorization. >>>> >> If only values have changes, it does numeric factorization. >>>> >> HOWEVER, you gave it a new matrix with accidentally the same state >>>> >> marker, >>>> >> so it thought nothing had changed. We will fix this by also checking >>>> >> the pointer. >>>> >> For now, if you give the same Mat object back, it will do what you >>>> >> expect. 
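Put together, the calling pattern Barry describes above might look like the sketch below: the Mat and the KSP are created once, the values are refilled each time with the routine above, and the same objects are reused so MUMPS repeats only the numeric factorization. The function name solve_sequence() and the arguments nsteps, mlocal, icsr, jcsr, acsr, b, x are placeholders for the application's own data; the application is assumed to refresh acsr[] between solves.

PetscErrorCode solve_sequence(PetscInt nsteps,PetscInt mlocal,const PetscInt icsr[],const PetscInt jcsr[],PetscScalar acsr[],Vec b,Vec x)
{
  PetscErrorCode ierr;
  Mat            A;
  KSP            ksp;
  PetscInt       step;

  PetscFunctionBeginUser;
  /* create the matrix ONCE from the local CSR arrays */
  ierr = MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,mlocal,mlocal,PETSC_DETERMINE,PETSC_DETERMINE,icsr,jcsr,acsr,&A);CHKERRQ(ierr);
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* e.g. -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps */

  for (step=0; step<nsteps; step++) {
    if (step > 0) {
      /* values changed, nonzero pattern unchanged: refill the SAME Mat object */
      ierr = MatCopyValuesMPIAIJWithArrays(A,mlocal,icsr,jcsr,acsr);CHKERRQ(ierr);
    }
    /* symbolic factorization happens only on the first solve; afterwards the
       unchanged Mat object triggers only a new numeric factorization          */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  }
  ierr = MatDestroy(&A);CHKERRQ(ierr);   /* destroyed once, after the loop */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}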
>>>> >> Matt >>>> >>> I found the following code: >>>> >>> if (!pc->setupcalled) { >>>> >>> ierr = PetscInfo(pc,"Setting up PC for first >>>> >>> timen");CHKERRQ(ierr); >>>> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>>> >>> } else if (matstate == pc->matstate) { >>>> >>> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>>> >>> since operator is unchangedn");CHKERRQ(ierr); >>>> >>> PetscFunctionReturn(0); >>>> >>> } else { >>>> >>> if (matnonzerostate > pc->matnonzerostate) { >>>> >>> ierr = PetscInfo(pc,"Setting up PC with different nonzero >>>> >>> patternn");CHKERRQ(ierr); >>>> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>>> >>> } else { >>>> >>> ierr = PetscInfo(pc,"Setting up PC with same nonzero >>>> >>> patternn");CHKERRQ(ierr); >>>> >>> pc->flag = SAME_NONZERO_PATTERN; >>>> >>> } >>>> >>> } >>>> >>> and I commend out "else if (matstate == pc->matstate){}", so it >>>> >>> will do "Setting up PC with same nonzero patternn"; and it seems >>>> >>> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >>>> >>> subsequent iterations. But I am not quite sure, need some more >>>> >>> tests. >>>> >>> Thank you very much for your help indeed. >>>> >>> Kind Regards, >>>> >>> Shidi >>>> >>> On 2018-05-11 16:13, Smith, Barry F. wrote: >>>> >>> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >>>> >>> Thank you for your reply. >>>> >>> How are you changing the matrix? Do you remember to assemble? >>>> >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >>>> >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >>>> >> If you use MatCreateMPIAIJWithArrays() you don't need to call >>>> >> MatAssemblyBegin() and MatAssemblyEnd(). >>>> >>> But I actually destroy the matrix at the end of each iteration >>>> >>> and create the matrix at the beginning of each iteration. >>>> >> This is a bug in PETSc. Since you are providing a new matrix with >>>> >> the same "state" value as the previous matrix the PC code the >>>> >> following code >>>> >> kicks in: >>>> >> ierr = >>>> >> PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); >>>> >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >>>> >> if (!pc->setupcalled) { >>>> >> ierr = PetscInfo(pc,"Setting up PC for first >>>> >> timen");CHKERRQ(ierr); >>>> >> pc->flag = DIFFERENT_NONZERO_PATTERN; >>>> >> } else if (matstate == pc->matstate) { >>>> >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>>> >> since operator is unchangedn");CHKERRQ(ierr); >>>> >> PetscFunctionReturn(0); >>>> >> and it returns without refactoring. >>>> >> We need an additional check that the matrix also remains the same. >>>> >> We will also need a test example that reproduces the problem to >>>> >> confirm that we have fixed it. >>>> >> Barry >>>> >>> Cheers, >>>> >>> Shidi >>>> >>> On 2018-05-11 12:59, Matthew Knepley wrote: >>>> >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>>> >>> Dear Matt, >>>> >>> Thank you for your help last time. >>>> >>> I want to get more detail about the Petsc-MUMPS factorisation; >>>> >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". 
>>>> >>> And I found the following functions are quite important to >>>> >>> the question: >>>> >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>>> >>> r,const MatFactorInfo *info); >>>> >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>>> >>> MatFactorInfo *info); >>>> >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>>> >>> I print some sentence to trace when these functions are called. >>>> >>> Then I test my code; the values in the matrix is changing but the >>>> >>> structure stays the same. Below is the output. >>>> >>> We can see that at 0th step, all the symbolic, numeric and solve >>>> >>> are called; in the subsequent steps only the solve stage is called, >>>> >>> the numeric step is not called. >>>> >>> How are you changing the matrix? Do you remember to assemble? >>>> >>> Matt >>>> >>> Iteration 0 Step 0.0005 Time 0.0005 >>>> >>> [INFO]: Direct Solver setup >>>> >>> MatCholeskyFactorSymbolic_MUMPS >>>> >>> finish MatCholeskyFactorSymbolic_MUMPS >>>> >>> MatFactorNumeric_MUMPS >>>> >>> finish MatFactorNumeric_MUMPS >>>> >>> MatSolve_MUMPS >>>> >>> Iteration 1 Step 0.0005 Time 0.0005 >>>> >>> MatSolve_MUMPS >>>> >>> Iteration 2 Step 0.0005 Time 0.001 >>>> >>> MatSolve_MUMPS >>>> >>> [INFO]: End of program!!! >>>> >>> I am wondering if there is any possibility to split the numeric >>>> >>> and solve stage (as you mentioned using KSPSolve). >>>> >>> Thank you very much indeed. >>>> >>> Kind Regards, >>>> >>> Shidi >>>> >>> On 2018-05-04 21:10, Y. Shidi wrote: >>>> >>> Thank you very much for your reply. >>>> >>> That is really clear. >>>> >>> Kind Regards, >>>> >>> Shidi >>>> >>> On 2018-05-04 21:05, Matthew Knepley wrote: >>>> >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>>> >>> Dear Matt, >>>> >>> Thank you very much for your reply! >>>> >>> So what you mean is that I can just do the KSPSolve() every >>>> >>> iteration >>>> >>> once the MUMPS is set? >>>> >>> Yes. >>>> >>> That means inside the KSPSolve() the numerical factorization is >>>> >>> performed. If that is the case, it seems that the ksp object is >>>> >>> not changed when the values in the matrix are changed. >>>> >>> Yes. >>>> >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>>> >>> If you do SetOperators, it will redo the factorization. If you do >>>> >>> not, >>>> >>> it will look >>>> >>> at the Mat object, determine that the structure has not changed, >>>> >>> and >>>> >>> just redo >>>> >>> the numerical factorization. >>>> >>> Thanks, >>>> >>> Matt >>>> >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>>> >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>>> >>> Dear PETSc users, >>>> >>> I am currently using MUMPS to solve linear systems directly. >>>> >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>>> >>> step and then solve the system. >>>> >>> In my code, the values in the matrix is changed in each iteration, >>>> >>> but the structure of the matrix stays the same, which means the >>>> >>> performance can be improved if symbolic factorisation is only >>>> >>> performed once. Hence, it is necessary to split the symbolic >>>> >>> and numeric factorisation. However, I cannot find a specific step >>>> >>> (control parameter) to perform the numeric factorisation. >>>> >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>>> >>> it seems that the symbolic and numeric factorisation always perform >>>> >>> together. 
>>>> >>> If you use KSPSolve instead, it will automatically preserve the >>>> >>> symbolic >>>> >>> factorization. >>>> >>> Thanks, >>>> >>> Matt >>>> >>> So I am wondering if anyone has an idea about it. >>>> >>> Below is how I set up MUMPS solver: >>>> >>> PC pc; >>>> >>> PetscBool flg_mumps, flg_mumps_ch; >>>> >>> flg_mumps = PETSC_FALSE; >>>> >>> flg_mumps_ch = PETSC_FALSE; >>>> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>>> >>> NULL); >>>> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>>> >>> NULL); >>>> >>> if(flg_mumps ||flg_mumps_ch) >>>> >>> { >>>> >>> KSPSetType(_ksp, KSPPREONLY); >>>> >>> PetscInt ival,icntl; >>>> >>> PetscReal val; >>>> >>> KSPGetPC(_ksp, &pc); >>>> >>> /// Set preconditioner type >>>> >>> if(flg_mumps) >>>> >>> { >>>> >>> PCSetType(pc, PCLU); >>>> >>> } >>>> >>> else if(flg_mumps_ch) >>>> >>> { >>>> >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>>> >>> PCSetType(pc, PCCHOLESKY); >>>> >>> } >>>> >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>>> >>> PCFactorSetUpMatSolverPackage(pc); >>>> >>> PCFactorGetMatrix(pc, &_F); >>>> >>> icntl = 7; ival = 0; >>>> >>> MatMumpsSetIcntl( _F, icntl, ival ); >>>> >>> MatMumpsSetIcntl(_F, 3, 6); >>>> >>> MatMumpsSetIcntl(_F, 4, 2); >>>> >>> } >>>> >>> KSPSetUp(_ksp); >>>> >>> Kind Regards, >>>> >>> Shidi >>>> >>> -- >>>> >>> What most experimenters take for granted before they begin their >>>> >>> experiments is infinitely more interesting than any results to >>>> >>> which >>>> >>> their experiments lead. >>>> >>> -- Norbert Wiener >>>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >>>> >>> Links: >>>> >>> ------ >>>> >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >>>> >>> -- >>>> >>> What most experimenters take for granted before they begin their >>>> >>> experiments is infinitely more interesting than any results to >>>> >>> which >>>> >>> their experiments lead. >>>> >>> -- Norbert Wiener >>>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >>>> >>> Links: >>>> >>> ------ >>>> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >>>> >>> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >>>> >>> -- >>>> >>> What most experimenters take for granted before they begin their >>>> >>> experiments is infinitely more interesting than any results to >>>> >>> which >>>> >>> their experiments lead. >>>> >>> -- Norbert Wiener >>>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>>> >>> Links: >>>> >>> ------ >>>> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>>> >>> [2] http://www.caam.rice.edu/~mk51/ [2] >>>> >> -- >>>> >> What most experimenters take for granted before they begin their >>>> >> experiments is infinitely more interesting than any results to which >>>> >> their experiments lead. >>>> >> -- Norbert Wiener >>>> >> https://www.cse.buffalo.edu/~knepley/ [2] >>>> >> Links: >>>> >> ------ >>>> >> [1] https://www.cse.buffalo.edu/~knepley/ >>>> >> [2] http://www.caam.rice.edu/~mk51/ >>> Alp Dener >>> Postdoctoral Appointee >>> Argonne National Laboratory >>> Mathematics and Computer Science Division From dayedut123 at 163.com Sat May 12 07:08:41 2018 From: dayedut123 at 163.com (=?GBK?B?ztI=?=) Date: Sat, 12 May 2018 20:08:41 +0800 (CST) Subject: [petsc-users] Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11 In-Reply-To: References: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> Message-ID: <6a61277f.92a2.1635440c889.Coremail.dayedut123@163.com> Thanks for your reply! 
I'm confused about how to create a new matrix during the time step advancing. For better understand my problem, simple pseudo code (just contains the main functions) like this: ///////////////////////////////// KSP ksp; PC pc; KSPCreate; Mat A; for(int timestep=0; timestep<20; timestep++) { //for example if(timestep==2) { localsize change; } MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, localsize, localsize, m, m, istore, jstore, vstore, &A); KSPSolve; MatDestroy(&A); } ///////////////////////////// Thanks again! Daye At 2018-05-11 19:07:35, "Matthew Knepley" wrote: On Fri, May 11, 2018 at 4:23 AM, ? wrote: Hello all, I use the function MatCreateMPIAIJWithArrays to construct my matrix. But the number of local rows m and local columns n may change during the timestep advancing. When the local size changes, the error like "Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11" will appear. Any suggestions about it ? If the parallel layout changes, you need to create a new matrix. Thanks, Matt Thank you very much! Daye -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat May 12 09:00:16 2018 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 12 May 2018 10:00:16 -0400 Subject: [petsc-users] Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11 In-Reply-To: <6a61277f.92a2.1635440c889.Coremail.dayedut123@163.com> References: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> <6a61277f.92a2.1635440c889.Coremail.dayedut123@163.com> Message-ID: On Sat, May 12, 2018 at 8:08 AM, ? wrote: > > Thanks for your reply! I'm confused about how to create a new matrix > during the time step advancing. > For better understand my problem, simple pseudo code (just contains the > main functions) like this: > It looks like MatDestroy should come before the next Create. Matt > ///////////////////////////////// > KSP ksp; > PC pc; > KSPCreate; > Mat A; > for(int timestep=0; timestep<20; timestep++) > { > //for example > if(timestep==2) > { > localsize change; > } > MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, localsize, localsize, m, m, > istore, jstore, vstore, &A); > KSPSolve; > > MatDestroy(&A); > } > ///////////////////////////// > Thanks again! > Daye > > > > > At 2018-05-11 19:07:35, "Matthew Knepley" wrote: > > On Fri, May 11, 2018 at 4:23 AM, ? wrote: > >> Hello all, >> I use the function MatCreateMPIAIJWithArrays to construct my matrix. But >> the number of local rows m and local columns n may change during the >> timestep advancing. When the local size changes, the error like "Petsc >> error: cannot chang local size of Amat after use old sizes 10 10 new sizes >> 11 11" will appear. Any suggestions about it ? >> > > If the parallel layout changes, you need to create a new matrix. > > Thanks, > > Matt > > >> Thank you very much! >> Daye >> >> >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 12 11:30:28 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 12 May 2018 16:30:28 +0000 Subject: [petsc-users] Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11 In-Reply-To: <6a61277f.92a2.1635440c889.Coremail.dayedut123@163.com> References: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> <6a61277f.92a2.1635440c889.Coremail.dayedut123@163.com> Message-ID: <791306B3-325C-48D0-89EF-035BC2ADBD28@anl.gov> Also if you change the "local size" of the matrix/vectors you must call KSPReset() so that the KSP/PC can free up all of their (previously sized) work matrices and vectors. Barry > On May 12, 2018, at 7:08 AM, ? wrote: > > > Thanks for your reply! I'm confused about how to create a new matrix during the time step advancing. > For better understand my problem, simple pseudo code (just contains the main functions) like this: > ///////////////////////////////// > KSP ksp; > PC pc; > KSPCreate; > Mat A; > for(int timestep=0; timestep<20; timestep++) > { > //for example > if(timestep==2) > { > localsize change; > } > MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, localsize, localsize, m, m, istore, jstore, vstore, &A); > KSPSolve; > > MatDestroy(&A); > } > ///////////////////////////// > Thanks again! > Daye > > > > > At 2018-05-11 19:07:35, "Matthew Knepley" wrote: > On Fri, May 11, 2018 at 4:23 AM, ? wrote: > Hello all, > I use the function MatCreateMPIAIJWithArrays to construct my matrix. But the number of local rows m and local columns n may change during the timestep advancing. When the local size changes, the error like "Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11" will appear. Any suggestions about it ? > > If the parallel layout changes, you need to create a new matrix. > > Thanks, > > Matt > > Thank you very much! > Daye > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > From dayedut123 at 163.com Sat May 12 11:46:14 2018 From: dayedut123 at 163.com (dayedut123 at 163.com) Date: Sun, 13 May 2018 00:46:14 +0800 Subject: [petsc-users] Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11 In-Reply-To: References: <78cea4a9.f3cd.1634e4c31e8.Coremail.dayedut123@163.com> <6a61277f.92a2.1635440c889.Coremail.dayedut123@163.com> Message-ID: <3AFA2BED-95D3-43E1-A7B4-5E94E054620A@163.com> You mean I should put MatDestroy before Matcreate? But is it different from that if I put destroy just after create in current timestep? Thanks again Daye ???? iPhone > ? 2018?5?12????10:00?Matthew Knepley ??? > >> On Sat, May 12, 2018 at 8:08 AM, ? wrote: >> >> Thanks for your reply! I'm confused about how to create a new matrix during the time step advancing. >> For better understand my problem, simple pseudo code (just contains the main functions) like this: > > It looks like MatDestroy should come before the next Create. 
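For the case in this thread, where the local size really does change between time steps, a sketch combining the advice above (create a fresh matrix when the parallel layout changes, destroy the old one first, and call KSPReset() before KSPSetOperators()) might look like the following. The routine rebuild_system() and the variables localsize, m, istore, jstore, vstore, b, x stand in for the application's own data; the vectors must also be recreated with the matching layout whenever the local size changes.

KSP            ksp;
Mat            A = NULL;
Vec            b, x;                     /* recreated by the application when the layout changes */
PetscInt       localsize, m, *istore, *jstore;
PetscScalar    *vstore;
PetscInt       timestep;
PetscErrorCode ierr;

ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

for (timestep=0; timestep<20; timestep++) {
  rebuild_system(&localsize,&m,&istore,&jstore,&vstore,&b,&x);   /* may change localsize */

  /* destroy the previous matrix before creating the new one (no-op while A is NULL) */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,localsize,localsize,m,m,istore,jstore,vstore,&A);CHKERRQ(ierr);

  /* free the KSP/PC work objects sized for the old layout; strictly only
     required for the steps where the local sizes actually changed          */
  ierr = KSPReset(ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
}
ierr = MatDestroy(&A);CHKERRQ(ierr);
ierr = KSPDestroy(&ksp);CHKERRQ(ierr);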
> > Matt > >> ///////////////////////////////// >> KSP ksp; >> PC pc; >> KSPCreate; >> Mat A; >> for(int timestep=0; timestep<20; timestep++) >> { >> //for example >> if(timestep==2) >> { >> localsize change; >> } >> MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, localsize, localsize, m, m, istore, jstore, vstore, &A); >> KSPSolve; >> >> MatDestroy(&A); >> } >> ///////////////////////////// >> Thanks again! >> Daye >> >> >> >> >> At 2018-05-11 19:07:35, "Matthew Knepley" wrote: >>> On Fri, May 11, 2018 at 4:23 AM, ? wrote: >>> Hello all, >>> I use the function MatCreateMPIAIJWithArrays to construct my matrix. But the number of local rows m and local columns n may change during the timestep advancing. When the local size changes, the error like "Petsc error: cannot chang local size of Amat after use old sizes 10 10 new sizes 11 11" will appear. Any suggestions about it ? >> >> If the parallel layout changes, you need to create a new matrix. >> >> Thanks, >> >> Matt >> >>> Thank you very much! >>> Daye >>> >>> >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 12 12:14:52 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 12 May 2018 17:14:52 +0000 Subject: [petsc-users] PETSc-MUMPS interface, numeric and symbolic factorisation In-Reply-To: <49e0904f2a690f49be5fdd20e859f5a0@cam.ac.uk> References: <9f21d2b4ec2a8e1626ae551ac6cee099@cam.ac.uk> <82d6cf34146a00cfecbcb74e7853edb8@cam.ac.uk> <1b73263c6f02ba7faa24a580a82afbe8@cam.ac.uk> <6fa878a0caa3950ec018e04aa3a44000@cam.ac.uk> <6D91C20C-BD00-45D8-8E5B-A385CF3E50F9@mcs.anl.gov> <3D26459B-C8DF-4261-96AF-058C579BA0F5@mcs.anl.gov> <29fcb51b0becd64d72974fa95bd9eac4@cam.ac.uk> <49e0904f2a690f49be5fdd20e859f5a0@cam.ac.uk> Message-ID: <24FFD796-6C96-4392-A7B2-9B7B56101063@mcs.anl.gov> > On May 12, 2018, at 12:05 PM, Y. Shidi wrote: > > Dear Barry, > > Thank you very much for your help. > Your approach works. > Based on your idea, I have also tried to use MatDuplicate() > for creating a new matrix that defines > the linear system and used MatCopy() for the subsequent > iterations; and it also works. Thanks for the update. My approach should be more efficient since it requires one less matrix, no creation of a new matrix each solve and less copying of matrix entries between data-structures. Barry > > Kind Regards, > Shidi > > On 2018-05-12 01:05, Smith, Barry F. wrote: >>> On May 11, 2018, at 7:00 PM, Y. Shidi wrote: >>> Alp, >>> thank you for your suggestion. As Barry pointed out, I want to >>> keep the symbolic factorization, because the matrix structure >>> in my problem is unchanged for some iterations (doing some >>> moving mesh) and this part is also sequential. >>> Thank you very much for your help and time. >>> Barry, >>> I really appreciate your time and help. >>> I will have a try on this. >>> The basic idea here is keeping the Mat object, but I am using >>> MatCreateMPIAIJWithArrays() instead, and this function always >>> create a new Mat object. 
I think I get it, I have checked >>> previously that if I do not destroy the Mat object created by >>> MatCreateMPIAIJWithArrays() every time, the memory usage is >>> keep going. >> I understand exactly what you are doing. With my suggested approach >> you will not use more and more memory because you only CREATE the >> matrix once with MatCreateMPIAIJWithArrays(), everything after that >> you simply refill the same initial matrix with >> MatCopyValuesMPIAIJWithArrays(). >> Please let me know if you have any difficulties with my approach, >> Barry >>> Thank you very much for your help indeed. >>> Kind Regards, >>> Shidi >>> On 2018-05-12 00:37, Smith, Barry F. wrote: >>>>> On May 11, 2018, at 6:14 PM, Dener, Alp wrote: >>>>> On May 11, 2018 at 6:03:09 PM, Smith, Barry F. (bsmith at mcs.anl.gov) wrote: >>>>>> Shidi, >>>>>> You stated: >>>>>> >> and create the matrix at the beginning of each iteration. >>>>>> We never anticipated your use case where you keep the same KSP and make a new matrix for each KSPSolve so we have bugs in PETSc that do not handle that case. >>>>> We do this in TAO for bounded Newton algorithms where the constraints are handled with an active-set method. We have to construct a new reduced Hessian with a different size every time the active variable indexes change. >>>>> We don?t completely destroy the KSP solver and create a brand new one whenever this happens. We simply call KSPReset() followed by KSPSetOperators(). This preserves the solver type and other KSP options set in the beginning (e.g.: maximum iterations or various tolerances), but permits the matrix (and the preconditioner) to be completely different objects from one KSPSolve() to another. >>>>> I don?t know if this behavior is universally supported for all KSP solvers, but it?s been working bug-free for us for STCG, NASH and GLTR solvers. I?ve also made it work with GMRES in a separate test. It may be worth it for Shidi to give it a try and see if it works. >>>> Alp, >>>> It should work to produce correct answers, unlike now when incorrect >>>> answers are produced but it doesn't serve Shidi's goal of reusing the >>>> matrix symbolic factorization. >>>> Shidi, >>>> Here is how you can get what you want. The first time in you use >>>> MatCreateMPIAIJWithArrays() to create the matrix. Do NOT destroy the >>>> matrix at the end of the loop. Instead you use MatSetValues() to >>>> transfer the values from your i,j,a arrays to the matrix with a simple >>>> loop for the second and ever other time you build the matrix. Here is >>>> a prototype (untested) of the routine you need. >>>> /* The A is the matrix obtained with MatCreateMPIAIJWithArrays(); the >>>> other arguments are the same as you pass to MatCreateMPIAIJWithArrays >>>> */ >>>> PetscErrorCode MatCopyValuesMPIAIJWithArrays(Mat A,m,i,j,a) >>>> { >>>> PetscErrorCode ierr; >>>> PetscInt i, row,rstart,nnz; >>>> ierr = MatGetOwnershipRange(A,&rstart,NULL);CHKERRQ(ierr); >>>> for (ii=0; ii>>> row = ii + rstart; >>>> nnz = i[ii+1]- i[ii]; >>>> ierr = >>>> MatSetValues(A,1,&row,nnz,j+i[ii],values+i[ii],INSERT_VALUES);CHKERRQ(ierr); >>>> } >>>> ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERR(ierr); >>>> ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERR(ierr); >>>> PetscFunctionReturn(0); >>>> } >>>> Plus it is more efficient than your current code because it does not >>>> have to build the A from scratch each time. >>>> Good luck >>>> Barry >>>>>> Barry >>>>>> > On May 11, 2018, at 5:58 PM, Y. Shidi wrote: >>>>>> > >>>>>> > Thank you for your reply. 
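For comparison, the MatDuplicate()/MatCopy() variant Shidi reports above could look roughly like this sketch; ksp, b, x, localsize, M and the CSR arrays are placeholders, and as Barry notes it needs one extra matrix and an extra copy of the values compared with refilling A directly through MatSetValues().

Mat            Atmp, A;   /* Atmp is rebuilt from the arrays each time; A is the Mat the KSP keeps */
PetscErrorCode ierr;

/* first build: duplicate to obtain the persistent matrix handed to the KSP */
ierr = MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,localsize,localsize,M,M,istore,jstore,vstore,&Atmp);CHKERRQ(ierr);
ierr = MatDuplicate(Atmp,MAT_COPY_VALUES,&A);CHKERRQ(ierr);
ierr = MatDestroy(&Atmp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);

/* every later build: same nonzero pattern, so copy the new values into A */
ierr = MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,localsize,localsize,M,M,istore,jstore,vstore,&Atmp);CHKERRQ(ierr);
ierr = MatCopy(Atmp,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
ierr = MatDestroy(&Atmp);CHKERRQ(ierr);
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

Since the KSP always sees the same Mat object A, the symbolic factorization is still reused; the price is the temporary matrix Atmp and the value copy on every rebuild.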
>>>>>> >> For now, if you give the same Mat object back, it will do what you >>>>>> >> expect. >>>>>> > Sorry, I am confused here. >>>>>> > The same Mat object, is it that I do not destroy the Mat at the >>>>>> > end of iteration? >>>>>> > Moreover, which function should I actually call to put the Mat >>>>>> > object back? >>>>>> > Sorry for being stupid on this. >>>>>> > >>>>>> > Kind Regards, >>>>>> > Shidi >>>>>> > >>>>>> > >>>>>> > >>>>>> > On 2018-05-11 18:10, Matthew Knepley wrote: >>>>>> >> On Fri, May 11, 2018 at 1:02 PM, Y. Shidi wrote: >>>>>> >>> Thank you very much for your reply, Barry. >>>>>> >>>> This is a bug in PETSc. Since you are providing a new matrix >>>>>> >>>> with >>>>>> >>>> the same "state" value as the previous matrix the PC code the >>>>>> >>>> following code >>>>>> >>> So what you mean is that every time I change the value in the >>>>>> >>> matrix, >>>>>> >>> the PETSc only determines if the nonzero pattern change but not the >>>>>> >>> values, and if it is unchanged neither of symbolic and numeric >>>>>> >>> happens. >>>>>> >> No, that is not what Barry is saying. >>>>>> >> PETSc looks at the matrix. >>>>>> >> If the structure has changed, it does symbolic and numeric >>>>>> >> factorization. >>>>>> >> If only values have changes, it does numeric factorization. >>>>>> >> HOWEVER, you gave it a new matrix with accidentally the same state >>>>>> >> marker, >>>>>> >> so it thought nothing had changed. We will fix this by also checking >>>>>> >> the pointer. >>>>>> >> For now, if you give the same Mat object back, it will do what you >>>>>> >> expect. >>>>>> >> Matt >>>>>> >>> I found the following code: >>>>>> >>> if (!pc->setupcalled) { >>>>>> >>> ierr = PetscInfo(pc,"Setting up PC for first >>>>>> >>> timen");CHKERRQ(ierr); >>>>>> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>>>>> >>> } else if (matstate == pc->matstate) { >>>>>> >>> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>>>>> >>> since operator is unchangedn");CHKERRQ(ierr); >>>>>> >>> PetscFunctionReturn(0); >>>>>> >>> } else { >>>>>> >>> if (matnonzerostate > pc->matnonzerostate) { >>>>>> >>> ierr = PetscInfo(pc,"Setting up PC with different nonzero >>>>>> >>> patternn");CHKERRQ(ierr); >>>>>> >>> pc->flag = DIFFERENT_NONZERO_PATTERN; >>>>>> >>> } else { >>>>>> >>> ierr = PetscInfo(pc,"Setting up PC with same nonzero >>>>>> >>> patternn");CHKERRQ(ierr); >>>>>> >>> pc->flag = SAME_NONZERO_PATTERN; >>>>>> >>> } >>>>>> >>> } >>>>>> >>> and I commend out "else if (matstate == pc->matstate){}", so it >>>>>> >>> will do "Setting up PC with same nonzero patternn"; and it seems >>>>>> >>> work in my case, only "MatFactorNumeric_MUMPS()" is calling in the >>>>>> >>> subsequent iterations. But I am not quite sure, need some more >>>>>> >>> tests. >>>>>> >>> Thank you very much for your help indeed. >>>>>> >>> Kind Regards, >>>>>> >>> Shidi >>>>>> >>> On 2018-05-11 16:13, Smith, Barry F. wrote: >>>>>> >>> On May 11, 2018, at 8:14 AM, Y. Shidi wrote: >>>>>> >>> Thank you for your reply. >>>>>> >>> How are you changing the matrix? Do you remember to assemble? >>>>>> >>> I use MatCreateMPIAIJWithArrays() to create the matrix, >>>>>> >>> and after that I call MatAssemblyBegin() and MatAssemblyEnd(). >>>>>> >> If you use MatCreateMPIAIJWithArrays() you don't need to call >>>>>> >> MatAssemblyBegin() and MatAssemblyEnd(). >>>>>> >>> But I actually destroy the matrix at the end of each iteration >>>>>> >>> and create the matrix at the beginning of each iteration. >>>>>> >> This is a bug in PETSc. 
Since you are providing a new matrix with >>>>>> >> the same "state" value as the previous matrix the PC code the >>>>>> >> following code >>>>>> >> kicks in: >>>>>> >> ierr = >>>>>> >> PetscObjectStateGet((PetscObject)pc->pmat,&matstate);CHKERRQ(ierr); >>>>>> >> ierr = MatGetNonzeroState(pc->pmat,&matnonzerostate);CHKERRQ(ierr); >>>>>> >> if (!pc->setupcalled) { >>>>>> >> ierr = PetscInfo(pc,"Setting up PC for first >>>>>> >> timen");CHKERRQ(ierr); >>>>>> >> pc->flag = DIFFERENT_NONZERO_PATTERN; >>>>>> >> } else if (matstate == pc->matstate) { >>>>>> >> ierr = PetscInfo(pc,"Leaving PC with identical preconditioner >>>>>> >> since operator is unchangedn");CHKERRQ(ierr); >>>>>> >> PetscFunctionReturn(0); >>>>>> >> and it returns without refactoring. >>>>>> >> We need an additional check that the matrix also remains the same. >>>>>> >> We will also need a test example that reproduces the problem to >>>>>> >> confirm that we have fixed it. >>>>>> >> Barry >>>>>> >>> Cheers, >>>>>> >>> Shidi >>>>>> >>> On 2018-05-11 12:59, Matthew Knepley wrote: >>>>>> >>> On Fri, May 11, 2018 at 7:14 AM, Y. Shidi wrote: >>>>>> >>> Dear Matt, >>>>>> >>> Thank you for your help last time. >>>>>> >>> I want to get more detail about the Petsc-MUMPS factorisation; >>>>>> >>> so I go to look the code "/src/mat/impls/aij/mpi/mumps/mumps.c". >>>>>> >>> And I found the following functions are quite important to >>>>>> >>> the question: >>>>>> >>> PetscErrorCode MatCholeskyFactorSymbolic_MUMPS(Mat F,Mat A,IS >>>>>> >>> r,const MatFactorInfo *info); >>>>>> >>> PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const >>>>>> >>> MatFactorInfo *info); >>>>>> >>> PetscErrorCode MatSolve_MUMPS(Mat A,Vec b,Vec x); >>>>>> >>> I print some sentence to trace when these functions are called. >>>>>> >>> Then I test my code; the values in the matrix is changing but the >>>>>> >>> structure stays the same. Below is the output. >>>>>> >>> We can see that at 0th step, all the symbolic, numeric and solve >>>>>> >>> are called; in the subsequent steps only the solve stage is called, >>>>>> >>> the numeric step is not called. >>>>>> >>> How are you changing the matrix? Do you remember to assemble? >>>>>> >>> Matt >>>>>> >>> Iteration 0 Step 0.0005 Time 0.0005 >>>>>> >>> [INFO]: Direct Solver setup >>>>>> >>> MatCholeskyFactorSymbolic_MUMPS >>>>>> >>> finish MatCholeskyFactorSymbolic_MUMPS >>>>>> >>> MatFactorNumeric_MUMPS >>>>>> >>> finish MatFactorNumeric_MUMPS >>>>>> >>> MatSolve_MUMPS >>>>>> >>> Iteration 1 Step 0.0005 Time 0.0005 >>>>>> >>> MatSolve_MUMPS >>>>>> >>> Iteration 2 Step 0.0005 Time 0.001 >>>>>> >>> MatSolve_MUMPS >>>>>> >>> [INFO]: End of program!!! >>>>>> >>> I am wondering if there is any possibility to split the numeric >>>>>> >>> and solve stage (as you mentioned using KSPSolve). >>>>>> >>> Thank you very much indeed. >>>>>> >>> Kind Regards, >>>>>> >>> Shidi >>>>>> >>> On 2018-05-04 21:10, Y. Shidi wrote: >>>>>> >>> Thank you very much for your reply. >>>>>> >>> That is really clear. >>>>>> >>> Kind Regards, >>>>>> >>> Shidi >>>>>> >>> On 2018-05-04 21:05, Matthew Knepley wrote: >>>>>> >>> On Fri, May 4, 2018 at 3:54 PM, Y. Shidi wrote: >>>>>> >>> Dear Matt, >>>>>> >>> Thank you very much for your reply! >>>>>> >>> So what you mean is that I can just do the KSPSolve() every >>>>>> >>> iteration >>>>>> >>> once the MUMPS is set? >>>>>> >>> Yes. >>>>>> >>> That means inside the KSPSolve() the numerical factorization is >>>>>> >>> performed. 
If that is the case, it seems that the ksp object is >>>>>> >>> not changed when the values in the matrix are changed. >>>>>> >>> Yes. >>>>>> >>> Or do I need to call both KSPSetOperators() and KSPSolve()? >>>>>> >>> If you do SetOperators, it will redo the factorization. If you do >>>>>> >>> not, >>>>>> >>> it will look >>>>>> >>> at the Mat object, determine that the structure has not changed, >>>>>> >>> and >>>>>> >>> just redo >>>>>> >>> the numerical factorization. >>>>>> >>> Thanks, >>>>>> >>> Matt >>>>>> >>> On 2018-05-04 14:44, Matthew Knepley wrote: >>>>>> >>> On Fri, May 4, 2018 at 9:40 AM, Y. Shidi wrote: >>>>>> >>> Dear PETSc users, >>>>>> >>> I am currently using MUMPS to solve linear systems directly. >>>>>> >>> Generally, we use ICNTL(7) or ICNTL(29) to do the preprocessing >>>>>> >>> step and then solve the system. >>>>>> >>> In my code, the values in the matrix is changed in each iteration, >>>>>> >>> but the structure of the matrix stays the same, which means the >>>>>> >>> performance can be improved if symbolic factorisation is only >>>>>> >>> performed once. Hence, it is necessary to split the symbolic >>>>>> >>> and numeric factorisation. However, I cannot find a specific step >>>>>> >>> (control parameter) to perform the numeric factorisation. >>>>>> >>> I have used ICNTL(3) and ICNTL(4) to print the MUMPS information, >>>>>> >>> it seems that the symbolic and numeric factorisation always perform >>>>>> >>> together. >>>>>> >>> If you use KSPSolve instead, it will automatically preserve the >>>>>> >>> symbolic >>>>>> >>> factorization. >>>>>> >>> Thanks, >>>>>> >>> Matt >>>>>> >>> So I am wondering if anyone has an idea about it. >>>>>> >>> Below is how I set up MUMPS solver: >>>>>> >>> PC pc; >>>>>> >>> PetscBool flg_mumps, flg_mumps_ch; >>>>>> >>> flg_mumps = PETSC_FALSE; >>>>>> >>> flg_mumps_ch = PETSC_FALSE; >>>>>> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_lu", &flg_mumps, >>>>>> >>> NULL); >>>>>> >>> PetscOptionsGetBool(NULL, NULL, "-use_mumps_ch", &flg_mumps_ch, >>>>>> >>> NULL); >>>>>> >>> if(flg_mumps ||flg_mumps_ch) >>>>>> >>> { >>>>>> >>> KSPSetType(_ksp, KSPPREONLY); >>>>>> >>> PetscInt ival,icntl; >>>>>> >>> PetscReal val; >>>>>> >>> KSPGetPC(_ksp, &pc); >>>>>> >>> /// Set preconditioner type >>>>>> >>> if(flg_mumps) >>>>>> >>> { >>>>>> >>> PCSetType(pc, PCLU); >>>>>> >>> } >>>>>> >>> else if(flg_mumps_ch) >>>>>> >>> { >>>>>> >>> MatSetOption(A, MAT_SPD, PETSC_TRUE); >>>>>> >>> PCSetType(pc, PCCHOLESKY); >>>>>> >>> } >>>>>> >>> PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS); >>>>>> >>> PCFactorSetUpMatSolverPackage(pc); >>>>>> >>> PCFactorGetMatrix(pc, &_F); >>>>>> >>> icntl = 7; ival = 0; >>>>>> >>> MatMumpsSetIcntl( _F, icntl, ival ); >>>>>> >>> MatMumpsSetIcntl(_F, 3, 6); >>>>>> >>> MatMumpsSetIcntl(_F, 4, 2); >>>>>> >>> } >>>>>> >>> KSPSetUp(_ksp); >>>>>> >>> Kind Regards, >>>>>> >>> Shidi >>>>>> >>> -- >>>>>> >>> What most experimenters take for granted before they begin their >>>>>> >>> experiments is infinitely more interesting than any results to >>>>>> >>> which >>>>>> >>> their experiments lead. >>>>>> >>> -- Norbert Wiener >>>>>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [1] [1] >>>>>> >>> Links: >>>>>> >>> ------ >>>>>> >>> [1] http://www.caam.rice.edu/~mk51/ [2] [2] [2] >>>>>> >>> -- >>>>>> >>> What most experimenters take for granted before they begin their >>>>>> >>> experiments is infinitely more interesting than any results to >>>>>> >>> which >>>>>> >>> their experiments lead. 
>>>>>> >>> -- Norbert Wiener >>>>>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [1] [2] >>>>>> >>> Links: >>>>>> >>> ------ >>>>>> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] [1] >>>>>> >>> [2] http://www.caam.rice.edu/~mk51/ [2] [2] >>>>>> >>> -- >>>>>> >>> What most experimenters take for granted before they begin their >>>>>> >>> experiments is infinitely more interesting than any results to >>>>>> >>> which >>>>>> >>> their experiments lead. >>>>>> >>> -- Norbert Wiener >>>>>> >>> https://www.cse.buffalo.edu/~knepley/ [1] [2] >>>>>> >>> Links: >>>>>> >>> ------ >>>>>> >>> [1] https://www.cse.buffalo.edu/~knepley/ [1] >>>>>> >>> [2] http://www.caam.rice.edu/~mk51/ [2] >>>>>> >> -- >>>>>> >> What most experimenters take for granted before they begin their >>>>>> >> experiments is infinitely more interesting than any results to which >>>>>> >> their experiments lead. >>>>>> >> -- Norbert Wiener >>>>>> >> https://www.cse.buffalo.edu/~knepley/ [2] >>>>>> >> Links: >>>>>> >> ------ >>>>>> >> [1] https://www.cse.buffalo.edu/~knepley/ >>>>>> >> [2] http://www.caam.rice.edu/~mk51/ >>>>> Alp Dener >>>>> Postdoctoral Appointee >>>>> Argonne National Laboratory >>>>> Mathematics and Computer Science Division > From C.Klaij at marin.nl Mon May 14 04:45:15 2018 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 14 May 2018 09:45:15 +0000 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines Message-ID: <1526291115208.90319@marin.nl> With petsc-3.7.5, I had F90 code like this: CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif ----------------------------------------------^ error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif ----------------------------------------------^ What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? Chris dr. ir. Christiaan Klaij | Senior Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm From andreas.hauffe at tu-dresden.de Mon May 14 06:27:10 2018 From: andreas.hauffe at tu-dresden.de (Andreas Hauffe) Date: Mon, 14 May 2018 13:27:10 +0200 Subject: [petsc-users] PETSC without MPI but a direct solver Message-ID: <3c9cb05b-736d-5c56-b9ac-4229c52899bc@tu-dresden.de> Hi, we are using PETSC and as a direct solver MUMPS for some finite element tool. Right now there is a need for a sequantial code without mpi. In the documentation we read ("Installing without MPI") that it is possible to do so. But not in the case of MUMPS as direct solver. Is there any direct solver, where we can compile a sequential library without mpi? 
-- Regards Andreas Hauffe Leiter des Forschungsfeldes "Auslegungsmethoden f?r Luftfahrzeuge" ---------------------------------------------------------------------------------------------------- Technische Universit?t Dresden Institut f?r Luft- und Raumfahrttechnik / Institute of Aerospace Engineering Lehrstuhl f?r Luftfahrzeugtechnik / Chair of Aircraft Engineering D-01062 Dresden Germany phone : +49 (351) 463 38496 fax : +49 (351) 463 37263 mail : andreas.hauffe at tu-dresden.de Website : http://tu-dresden.de/mw/ilr/lft ---------------------------------------------------------------------------------------------------- Do you know our free laminate analysis code eLamX?? If not, please visit the following web address: http://www.elamx.de -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5247 bytes Desc: S/MIME Cryptographic Signature URL: From knepley at gmail.com Mon May 14 06:32:48 2018 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 14 May 2018 07:32:48 -0400 Subject: [petsc-users] PETSC without MPI but a direct solver In-Reply-To: <3c9cb05b-736d-5c56-b9ac-4229c52899bc@tu-dresden.de> References: <3c9cb05b-736d-5c56-b9ac-4229c52899bc@tu-dresden.de> Message-ID: On Mon, May 14, 2018 at 7:27 AM, Andreas Hauffe < andreas.hauffe at tu-dresden.de> wrote: > Hi, > > we are using PETSC and as a direct solver MUMPS for some finite element > tool. Right now there is a need for a sequantial code without mpi. In the > documentation we read ("Installing without MPI") that it is possible to do > so. But not in the case of MUMPS as direct solver. Is there any direct > solver, where we can compile a sequential library without mpi? > --download-superlu Thanks, Matt > -- > Regards > Andreas Hauffe > Leiter des Forschungsfeldes "Auslegungsmethoden f?r Luftfahrzeuge" > > ------------------------------------------------------------ > ---------------------------------------- > Technische Universit?t Dresden > Institut f?r Luft- und Raumfahrttechnik / Institute of Aerospace > Engineering > Lehrstuhl f?r Luftfahrzeugtechnik / Chair of Aircraft Engineering > > D-01062 Dresden > Germany > > phone : +49 (351) 463 38496 > fax : +49 (351) 463 37263 > mail : andreas.hauffe at tu-dresden.de > Website : http://tu-dresden.de/mw/ilr/lft > ------------------------------------------------------------ > ---------------------------------------- > Do you know our free laminate analysis code eLamX?? If not, please visit > the following web address: > http://www.elamx.de > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon May 14 07:48:24 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 May 2018 07:48:24 -0500 Subject: [petsc-users] PETSC without MPI but a direct solver In-Reply-To: <3c9cb05b-736d-5c56-b9ac-4229c52899bc@tu-dresden.de> References: <3c9cb05b-736d-5c56-b9ac-4229c52899bc@tu-dresden.de> Message-ID: You can use mumps: ./configure --with-mpi=0 --download-mumps=1 --with-mumps-serial=1 Satish On Mon, 14 May 2018, Andreas Hauffe wrote: > Hi, > > we are using PETSC and as a direct solver MUMPS for some finite element tool. > Right now there is a need for a sequantial code without mpi. 
In the > documentation we read ("Installing without MPI") that it is possible to do so. > But not in the case of MUMPS as direct solver. Is there any direct solver, > where we can compile a sequential library without mpi? > > From bsmith at mcs.anl.gov Mon May 14 10:40:58 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 14 May 2018 15:40:58 +0000 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines In-Reply-To: <1526291115208.90319@marin.nl> References: <1526291115208.90319@marin.nl> Message-ID: Chris, These arguments should never have been PETSC_NULL_INTEGER since they are integers (and not pointers or arrays), you should pass 0 for them. Barry > On May 14, 2018, at 4:45 AM, Klaij, Christiaan wrote: > > With petsc-3.7.5, I had F90 code like this: > > CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) > CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) > > which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: > > error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] > CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif > ----------------------------------------------^ > error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] > CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif > ----------------------------------------------^ > > What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? > > Chris > > > dr. ir. Christiaan Klaij | Senior Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm > From s_g at berkeley.edu Mon May 14 10:45:42 2018 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Mon, 14 May 2018 17:45:42 +0200 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines In-Reply-To: References: <1526291115208.90319@marin.nl> Message-ID: Barry, ? Is it then incorrect (and potentially dangerous) to use PETSC_NULL_INTEGER(1) for these arguments, even if that currently works? -sanjay On 5/14/18 5:40 PM, Smith, Barry F. wrote: > Chris, > > These arguments should never have been PETSC_NULL_INTEGER since they are integers (and not pointers or arrays), you should pass 0 for them. > > Barry > > >> On May 14, 2018, at 4:45 AM, Klaij, Christiaan wrote: >> >> With petsc-3.7.5, I had F90 code like this: >> >> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) >> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) >> >> which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: >> >> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >> ----------------------------------------------^ >> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. 
[PETSC_NULL_INTEGER] >> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >> ----------------------------------------------^ >> >> What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? >> >> Chris >> >> >> dr. ir. Christiaan Klaij | Senior Researcher | Research & Development >> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >> >> MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm >> From balay at mcs.anl.gov Mon May 14 10:45:59 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 May 2018 10:45:59 -0500 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines In-Reply-To: References: <1526291115208.90319@marin.nl> Message-ID: Its best to avoid passing 'constant' values directly to petsc fortran routines. For eg: src/ksp/ksp/examples/tests/ex16f.F90 PetscInt izero izero = 0 etc.. Satish On Mon, 14 May 2018, Smith, Barry F. wrote: > Chris, > > These arguments should never have been PETSC_NULL_INTEGER since they are integers (and not pointers or arrays), you should pass 0 for them. > > Barry > > > > On May 14, 2018, at 4:45 AM, Klaij, Christiaan wrote: > > > > With petsc-3.7.5, I had F90 code like this: > > > > CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) > > CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) > > > > which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: > > > > error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] > > CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif > > ----------------------------------------------^ > > error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] > > CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif > > ----------------------------------------------^ > > > > What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? > > > > Chris > > > > > > dr. ir. Christiaan Klaij | Senior Researcher | Research & Development > > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > > > MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm > > > > From bsmith at mcs.anl.gov Mon May 14 10:52:22 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 14 May 2018 15:52:22 +0000 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines In-Reply-To: References: <1526291115208.90319@marin.nl> Message-ID: > On May 14, 2018, at 10:45 AM, Sanjay Govindjee wrote: > > Barry, > Is it then incorrect (and potentially dangerous) to use PETSC_NULL_INTEGER(1) for these arguments, even if that > currently works? I thought you got warnings or errors because you are passing an array instead of a value? Anyways since Fortran is pass by reference passing either a PetscInt II or PetscInt II(1) does the same things but modern Fortran compilers require you to pass a scalar when expected (II) and an array when expected (II(1)) otherwise the compiler warns or errors. 
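For reference, a minimal C sketch of the same two calls (the helper name PreallocateSketch and the arrays d_nnz/o_nnz are illustrative, not taken from the code above); the C prototypes make the scalar nature of these arguments visible, and the 0 scalars are ignored whenever the nnz arrays are supplied:

#include <petscmat.h>

/* Sketch only: A is an AIJ matrix whose sizes are already set, and
   d_nnz[]/o_nnz[] hold the per-row diagonal/off-diagonal nonzero counts. */
static PetscErrorCode PreallocateSketch(Mat A, const PetscInt d_nnz[], const PetscInt o_nnz[])
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* The second (and fourth) argument is a plain PetscInt scalar, not an array;
     it is ignored here because the nnz array is provided, so 0 is fine. */
  ierr = MatSeqAIJSetPreallocation(A, 0, d_nnz);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Calling both routines is the usual portable idiom; the one that does not match the actual matrix type is a no-op.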
Better to pass what is expected, in this case a 0. Barry > -sanjay > > On 5/14/18 5:40 PM, Smith, Barry F. wrote: >> Chris, >> >> These arguments should never have been PETSC_NULL_INTEGER since they are integers (and not pointers or arrays), you should pass 0 for them. >> >> Barry >> >> >>> On May 14, 2018, at 4:45 AM, Klaij, Christiaan wrote: >>> >>> With petsc-3.7.5, I had F90 code like this: >>> >>> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) >>> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) >>> >>> which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: >>> >>> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >>> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >>> ----------------------------------------------^ >>> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >>> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >>> ----------------------------------------------^ >>> >>> What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? >>> >>> Chris >>> >>> >>> dr. ir. Christiaan Klaij | Senior Researcher | Research & Development >>> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >>> >>> MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm >>> > From s_g at berkeley.edu Mon May 14 12:04:36 2018 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Mon, 14 May 2018 19:04:36 +0200 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines In-Reply-To: References: <1526291115208.90319@marin.nl> Message-ID: Yes the error is shape mis-match since PETSC_NULL_INTEGER has been changed to be an array (whereas it used to just be an integer), and the subroutine is expecting an integer not an array. If 0 is what will always be expected, then I agree, just pass 0. But if there is a potential for this to change in the future, then it seems better to hide this under a named entity. -sanjay On 5/14/18 5:52 PM, Smith, Barry F. wrote: > >> On May 14, 2018, at 10:45 AM, Sanjay Govindjee wrote: >> >> Barry, >> Is it then incorrect (and potentially dangerous) to use PETSC_NULL_INTEGER(1) for these arguments, even if that >> currently works? > I thought you got warnings or errors because you are passing an array instead of a value? > > Anyways since Fortran is pass by reference passing either a PetscInt II or PetscInt II(1) does the same things but modern Fortran compilers require you to pass a scalar when expected (II) and an array when expected (II(1)) otherwise the compiler warns or errors. Better to pass what is expected, in this case a 0. > > Barry > >> -sanjay >> >> On 5/14/18 5:40 PM, Smith, Barry F. wrote: >>> Chris, >>> >>> These arguments should never have been PETSC_NULL_INTEGER since they are integers (and not pointers or arrays), you should pass 0 for them. 
>>> >>> Barry >>> >>> >>>> On May 14, 2018, at 4:45 AM, Klaij, Christiaan wrote: >>>> >>>> With petsc-3.7.5, I had F90 code like this: >>>> >>>> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) >>>> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) >>>> >>>> which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: >>>> >>>> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >>>> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >>>> ----------------------------------------------^ >>>> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >>>> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >>>> ----------------------------------------------^ >>>> >>>> What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? >>>> >>>> Chris >>>> >>>> >>>> dr. ir. Christiaan Klaij | Senior Researcher | Research & Development >>>> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >>>> >>>> MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm >>>> From bsmith at mcs.anl.gov Mon May 14 12:56:21 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 14 May 2018 17:56:21 +0000 Subject: [petsc-users] using PETSC_NULL_INTEGER in preallocation routines In-Reply-To: References: <1526291115208.90319@marin.nl> Message-ID: > On May 14, 2018, at 12:04 PM, Sanjay Govindjee wrote: > > Yes the error is shape mis-match since PETSC_NULL_INTEGER has been changed to be an array (whereas it used to just be an integer), and the subroutine is expecting an integer not an array. > > If 0 is what will always be expected, then I agree, just pass 0. But if there is a potential for this to change > in the future, then it seems better to hide this under a named entity. This argument is actually ignored if you provide the d/o_nnz argument so 0 will always work. Barry > > -sanjay > > On 5/14/18 5:52 PM, Smith, Barry F. wrote: >> >>> On May 14, 2018, at 10:45 AM, Sanjay Govindjee wrote: >>> >>> Barry, >>> Is it then incorrect (and potentially dangerous) to use PETSC_NULL_INTEGER(1) for these arguments, even if that >>> currently works? >> I thought you got warnings or errors because you are passing an array instead of a value? >> >> Anyways since Fortran is pass by reference passing either a PetscInt II or PetscInt II(1) does the same things but modern Fortran compilers require you to pass a scalar when expected (II) and an array when expected (II(1)) otherwise the compiler warns or errors. Better to pass what is expected, in this case a 0. >> >> Barry >> >>> -sanjay >>> >>> On 5/14/18 5:40 PM, Smith, Barry F. wrote: >>>> Chris, >>>> >>>> These arguments should never have been PETSC_NULL_INTEGER since they are integers (and not pointers or arrays), you should pass 0 for them. 
>>>> >>>> Barry >>>> >>>> >>>>> On May 14, 2018, at 4:45 AM, Klaij, Christiaan wrote: >>>>> >>>>> With petsc-3.7.5, I had F90 code like this: >>>>> >>>>> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); CHKERRQ(ierr) >>>>> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); CHKERRQ(ierr) >>>>> >>>>> which worked fine. Now, with petsc-3.8.4, the same code gives this compilation error: >>>>> >>>>> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >>>>> CALL MatSeqAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >>>>> ----------------------------------------------^ >>>>> error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [PETSC_NULL_INTEGER] >>>>> CALL MatMPIAIJSetPreallocation(aa_symmetric,PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,ierr); if (ierr .ne. 0) then ; call PetscErrorF(ierr); return; endif >>>>> ----------------------------------------------^ >>>>> >>>>> What's the intended usage now, simply 0 instead of PETSC_NULL_INTEGER? >>>>> >>>>> Chris >>>>> >>>>> >>>>> dr. ir. Christiaan Klaij | Senior Researcher | Research & Development >>>>> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >>>>> >>>>> MARIN news: http://www.marin.nl/web/News/News-items/120-papers-presented-at-NAV2018.htm >>>>> > From salazardetro1 at llnl.gov Mon May 14 16:45:31 2018 From: salazardetro1 at llnl.gov (Salazar De Troya, Miguel) Date: Mon, 14 May 2018 21:45:31 +0000 Subject: [petsc-users] Preconditioner for a elasticity problem with Robin boundary conditions Message-ID: <60C775A6-A75B-4227-B253-E229EA3C593B@llnl.gov> Hello, Up until now, I have been solving my elasticity problem using the gamg preconditioner with the following options: -ksp_monitor_true_residual -ksp_converged_reason -ksp_type cg -log_view -mg_levels_esteig_ksp_type cg -mg_levels_ksp_chebyshev_esteig_steps 10 -mg_levels_ksp_type chebyshev -mg_levels_pc_type sor -pc_type gamg -pc_gamg_verbose 7 -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_threshold 0.001 -snes_linesearch_type basic -snes_atol 1e-6 -ksp_atol 1e-7 -ksp_rtol 1e-9 -ksp_norm_type unpreconditioned It has worked great so far, even for my problem where I have two materials with more than 1e3 ratio. Now, when I add robin boundary conditions, the solver fails with DIVERGED_INDEFINITE_PC. Using -ksp_type bcgs makes the solver converge. I imagine it is because it can handle the indefinite PC. Why is the PC indefinite for the Robin problem? Is there any way to make it positive definite? I am also passing the rigid body modes of my mesh to the solver. I am not sure if these modes change with the Robin boundary conditions, they should not, should they? Thanks Miguel -- -------------- next part -------------- An HTML attachment was scrubbed... 
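For context, in standard notation (not taken from the code above), a Robin condition sigma(u) n + alpha u = g on a boundary part Gamma_R enters the weak form of linear elasticity as

a(u,v) = \int_\Omega \sigma(u) : \varepsilon(v)\, dx + \int_{\Gamma_R} \alpha\, u \cdot v \, ds,
\qquad
\ell(v) = \int_\Omega f \cdot v \, dx + \int_{\Gamma_R} g \cdot v \, ds .

The boundary integral is symmetric and nonnegative, so assembling directly from this form keeps the operator symmetric (and, with alpha > 0 on a patch of positive measure, positive definite); whether the assembled system actually has that property depends on how the condition is imposed in the code, which is the point taken up in the reply below.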
URL: From knepley at gmail.com Mon May 14 16:59:32 2018 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 14 May 2018 17:59:32 -0400 Subject: [petsc-users] Preconditioner for a elasticity problem with Robin boundary conditions In-Reply-To: <60C775A6-A75B-4227-B253-E229EA3C593B@llnl.gov> References: <60C775A6-A75B-4227-B253-E229EA3C593B@llnl.gov> Message-ID: On Mon, May 14, 2018 at 5:45 PM, Salazar De Troya, Miguel < salazardetro1 at llnl.gov> wrote: > Hello, > > > > Up until now, I have been solving my elasticity problem using the gamg > preconditioner with the following options: > > > > -ksp_monitor_true_residual > > -ksp_converged_reason > > -ksp_type cg > > -log_view > > -mg_levels_esteig_ksp_type cg > > -mg_levels_ksp_chebyshev_esteig_steps 10 > > -mg_levels_ksp_type chebyshev > > -mg_levels_pc_type sor > > -pc_type gamg > > -pc_gamg_verbose 7 > > -pc_gamg_type agg > > -pc_gamg_agg_nsmooths 1 > > -pc_gamg_threshold 0.001 > > -snes_linesearch_type basic > > -snes_atol 1e-6 > > -ksp_atol 1e-7 > > -ksp_rtol 1e-9 > > -ksp_norm_type unpreconditioned > > > > It has worked great so far, even for my problem where I have two materials > with more than 1e3 ratio. > > > > Now, when I add robin boundary conditions, the solver fails with > DIVERGED_INDEFINITE_PC. Using -ksp_type bcgs makes the solver converge. I > imagine it is because it can handle the indefinite PC. Why is the PC > indefinite for the Robin problem? > It matters how you are enforcing the conditions. Just write out the problem completely for 1 vertex constrained. It should be clear whether its symmetric or not. > Is there any way to make it positive definite? > It should be possible if the BC are imposed in the right way. > I am also passing the rigid body modes of my mesh to the solver. I am not > sure if these modes change with the Robin boundary conditions, they should > not, should they? > No, these are precisely the null modes of the symbol, and not affected by BC. Thanks, Matt > Thanks > > Miguel > > > > -- > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Tue May 15 04:25:01 2018 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 15 May 2018 09:25:01 +0000 Subject: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 In-Reply-To: References: <1428666513941.72745@marin.nl> <75D79823-7AE0-47A7-BE9E-15AB81C3581E@mcs.anl.gov> <1428671243078.94@marin.nl> <9EA3A2C1-5372-44A8-B0B7-4ADAF1D89819@mcs.anl.gov> <1484300795996.50804@marin.nl> <1484552853408.13052@marin.nl> <04C3073A-84AB-419F-A143-ACF10D97B2DE@mcs.anl.gov> <1484639126495.83463@marin.nl> <1484728820045.70097@marin.nl> <1110db5d-ddec-d295-2eba-b15182033b69@imperial.ac.uk> <1484736132149.87779@marin.nl>, Message-ID: <1526376301666.67638@marin.nl> Matt, Just a reminder. With petsc-3.8.4 the issue is still there. Chris dr. ir. 
Christiaan Klaij | Senior Researcher | Research & Development MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl [LinkedIn] [YouTube] [Twitter] [Facebook] MARIN news: MARIN at WindDays 2018, June 13 & 14, Rotterdam ________________________________ From: Matthew Knepley Sent: Wednesday, January 18, 2017 4:13 PM To: Klaij, Christiaan Cc: Lawrence Mitchell; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 On Wed, Jan 18, 2017 at 4:42 AM, Klaij, Christiaan > wrote: Thanks Lawrence, that nicely explains the unexpected behaviour! I guess in general there ought to be getters for the four ksp(A00)'s that occur in the full factorization. Yes, we will fix it. I think that the default retrieval should get the 00 block, not the inner as well. Matt Chris dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Verification-and-validation-exercises-for-flow-around-KVLCC2-tanker.htm ________________________________________ From: Lawrence Mitchell > Sent: Wednesday, January 18, 2017 10:59 AM To: petsc-users at mcs.anl.gov Cc: bsmith at mcs.anl.gov; Klaij, Christiaan Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 On 18/01/17 08:40, Klaij, Christiaan wrote: > Barry, > > I've managed to replicate the problem with 3.7.4 > snes/examples/tutorials/ex70.c. Basically I've added > KSPGetTotalIterations to main (file is attached): PCFieldSplitGetSubKSP returns, in the Schur case: MatSchurComplementGet(pc->schur, &ksp); in subksp[0] and pc->schur in subksp[1] In your case, subksp[0] is the (preonly) approximation to A^{-1} *inside* S = D - C A_inner^{-1} B And subksp[1] is the approximation to S^{-1}. Since each application of S to a vector (required in S^{-1}) requires one application of A^{-1}, because you use 225 iterations in total to invert S, you also use 225 applications of the KSP on A_inner. There doesn't appear to be a way to get the KSP used for A^{-1} if you've asked for different approximations to A^{-1} in the 0,0 block and inside S. Cheers, Lawrence > $ diff -u ex70.c.bak ex70.c > --- ex70.c.bak2017-01-18 09:25:46.286174830 +0100 > +++ ex70.c2017-01-18 09:03:40.904483434 +0100 > @@ -669,6 +669,10 @@ > KSP ksp; > PetscErrorCode ierr; > > + KSP *subksp; > + PC pc; > + PetscInt numsplit = 1, nusediter_vv, nusediter_pp; > + > ierr = PetscInitialize(&argc, &argv, NULL, help);CHKERRQ(ierr); > s.nx = 4; > s.ny = 6; > @@ -690,6 +694,13 @@ > ierr = StokesSetupPC(&s, ksp);CHKERRQ(ierr); > ierr = KSPSolve(ksp, s.b, s.x);CHKERRQ(ierr); > > + ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr); > + ierr = PCFieldSplitGetSubKSP(pc,&numsplit,&subksp); CHKERRQ(ierr); > + ierr = KSPGetTotalIterations(subksp[0],&nusediter_vv); CHKERRQ(ierr); > + ierr = KSPGetTotalIterations(subksp[1],&nusediter_pp); CHKERRQ(ierr); > + ierr = PetscPrintf(PETSC_COMM_WORLD," total u solves = %i\n", nusediter_vv); CHKERRQ(ierr); > + ierr = PetscPrintf(PETSC_COMM_WORLD," total p solves = %i\n", nusediter_pp); CHKERRQ(ierr); > + > /* don't trust, verify! 
*/ > ierr = StokesCalcResidual(&s);CHKERRQ(ierr); > ierr = StokesCalcError(&s);CHKERRQ(ierr); > > Now run as follows: > > $ mpirun -n 2 ./ex70 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -fieldsplit_0_ksp_type gmres -fieldsplit_0_pc_type bjacobi -fieldsplit_1_pc_type jacobi -fieldsplit_1_inner_ksp_type preonly -fieldsplit_1_inner_pc_type jacobi -fieldsplit_1_upper_ksp_type preonly -fieldsplit_1_upper_pc_type jacobi -fieldsplit_0_ksp_converged_reason -fieldsplit_1_ksp_converged_reason > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 14 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 14 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 16 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 16 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 17 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 18 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 20 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 21 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 23 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 22 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 22 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 5 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 22 > total u solves = 225 > total p solves = 225 > residual u = 9.67257e-06 > residual p = 5.42082e-07 > residual [u,p] = 9.68775e-06 > discretization error u = 0.0106464 > discretization error p = 1.85907 > discretization error [u,p] = 1.8591 > > So here again the total of 225 is correct for p, but for u it > should be 60. Hope this helps you find the problem. > > Chris > > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/Few-places-left-for-Offshore-and-Ship-hydrodynamics-courses.htm > > ________________________________________ > From: Klaij, Christiaan > Sent: Tuesday, January 17, 2017 8:45 AM > To: Barry Smith > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 > > Well, that's it, all the rest was hard coded. 
Here's the relevant part of the code: > > CALL PCSetType(pc_system,PCFIELDSPLIT,ierr); CHKERRQ(ierr) > CALL PCFieldSplitSetType(pc_system,PC_COMPOSITE_SCHUR,ierr); CHKERRQ(ierr) > CALL PCFieldSplitSetIS(pc_system,"0",isgs(1),ierr); CHKERRQ(ierr) > CALL PCFieldSplitSetIS(pc_system,"1",isgs(2),ierr); CHKERRQ(ierr) > CALL PCFieldSplitSetSchurFactType(pc_system,PC_FIELDSPLIT_SCHUR_FACT_FULL,ierr);CHKERRQ(ierr) > CALL PCFieldSplitSetSchurPre(pc_system,PC_FIELDSPLIT_SCHUR_PRE_SELFP,PETSC_NULL_OBJECT,ierr);CHKERRQ(ierr) > > CALL KSPSetTolerances(ksp_system,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_REAL,maxiter,ierr); CHKERRQ(ierr) > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_0_ksp_rtol","0.01",ierr); CHKERRQ(ierr) > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_1_ksp_rtol","0.01",ierr); CHKERRQ(ierr) > > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_0_ksp_pc_side","right",ierr); CHKERRQ(ierr) > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_1_ksp_pc_side","right",ierr); CHKERRQ(ierr) > > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_0_ksp_type","gmres",ierr); CHKERRQ(ierr) > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_1_upper_ksp_type","preonly",ierr); CHKERRQ(ierr) > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_1_upper_pc_type","jacobi",ierr); CHKERRQ(ierr) > > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_1_inner_ksp_type","preonly",ierr); CHKERRQ(ierr) > CALL PetscOptionsSetValue(PETSC_NULL_OBJECT,"-sys_fieldsplit_1_inner_pc_type","jacobi",ierr); CHKERRQ(ierr) > > ________________________________________ > From: Barry Smith > > Sent: Monday, January 16, 2017 9:28 PM > To: Klaij, Christiaan > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 > > Please send all the command line options you use. > > >> On Jan 16, 2017, at 1:47 AM, Klaij, Christiaan > wrote: >> >> Barry, >> >> Sure, here's the output with: >> >> -sys_ksp_view -sys_ksp_converged_reason -sys_fieldsplit_0_ksp_converged_reason -sys_fieldsplit_1_ksp_converged_reason >> >> (In my previous email, I rearranged 0 & 1 for easy summing.) 
>> >> Chris >> >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 1 >> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 22 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 1 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 2 >> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 2 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 3 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 2 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 2 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 2 >> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >> Linear sys_ solve converged due to CONVERGED_RTOL iterations 6 >> KSP Object:(sys_) 1 MPI processes >> type: fgmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=300, initial guess is zero >> tolerances: relative=0.01, absolute=1e-50, divergence=10000. >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object:(sys_) 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization FULL >> Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (sys_fieldsplit_0_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=0.01, absolute=1e-50, divergence=10000. >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: (sys_fieldsplit_0_) 1 MPI processes >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=9600, cols=9600 >> package used to perform factorization: petsc >> total: nonzeros=47280, allocated nonzeros=47280 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: (sys_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=9600, cols=9600 >> total: nonzeros=47280, allocated nonzeros=47280 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> KSP solver for upper A00 in upper triangular factor >> KSP Object: (sys_fieldsplit_1_upper_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sys_fieldsplit_1_upper_) 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Mat Object: (sys_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=9600, cols=9600 >> total: nonzeros=47280, allocated nonzeros=47280 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (sys_fieldsplit_1_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=0.01, absolute=1e-50, divergence=10000. >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: (sys_fieldsplit_1_) 1 MPI processes >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=3200, cols=3200 >> package used to perform factorization: petsc >> total: nonzeros=40404, allocated nonzeros=40404 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: (sys_fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=3200, cols=3200 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (sys_fieldsplit_1_) 1 MPI processes >> type: seqaij >> rows=3200, cols=3200 >> total: nonzeros=40404, allocated nonzeros=40404 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> A10 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=3200, cols=9600 >> total: nonzeros=47280, allocated nonzeros=47280 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> KSP of A00 >> KSP Object: (sys_fieldsplit_1_inner_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sys_fieldsplit_1_inner_) 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Mat Object: (sys_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=9600, cols=9600 >> total: nonzeros=47280, allocated nonzeros=47280 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> A01 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=9600, cols=3200 >> total: nonzeros=47280, allocated nonzeros=47280 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> Mat Object: 1 MPI processes >> type: seqaij >> rows=3200, cols=3200 >> total: nonzeros=40404, allocated nonzeros=40404 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: 1 MPI processes >> type: nest >> rows=12800, cols=12800 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : prefix="mom_", type=seqaij, rows=9600, cols=9600 >> (0,1) : prefix="grad_", type=seqaij, rows=9600, cols=3200 >> (1,0) : prefix="div_", type=seqaij, rows=3200, cols=9600 >> (1,1) : prefix="stab_", type=seqaij, rows=3200, cols=3200 >> Mat Object: 1 MPI processes >> type: nest >> rows=12800, cols=12800 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : prefix="sys_fieldsplit_0_", type=seqaij, rows=9600, cols=9600 >> (0,1) : type=seqaij, rows=9600, cols=3200 >> (1,0) : type=seqaij, rows=3200, cols=9600 >> (1,1) : prefix="sys_fieldsplit_1_", type=seqaij, rows=3200, cols=3200 >> nusediter_vv 37 >> nusediter_pp 37 >> >> >> >> dr. ir. Christiaan Klaij | CFD Researcher | Research & Development >> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >> >> MARIN news: http://www.marin.nl/web/News/News-items/The-Ocean-Cleanup-testing-continues.htm >> >> ________________________________________ >> From: Barry Smith > >> Sent: Friday, January 13, 2017 7:51 PM >> To: Klaij, Christiaan >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 >> >> Yes, I would have expected this to work. Could you send the output from -ksp_view in this case? >> >> >>> On Jan 13, 2017, at 3:46 AM, Klaij, Christiaan > wrote: >>> >>> Barry, >>> >>> It's been a while but I'm finally using this function in >>> 3.7.4. Is it supposed to work with fieldsplit? Here's why. 
>>> >>> I'm solving a Navier-Stokes system with fieldsplit (pc has one >>> velocity solve and one pressure solve) and trying to retrieve the >>> totals like this: >>> >>> CALL KSPSolve(ksp_system,rr_system,xx_system,ierr); CHKERRQ(ierr) >>> CALL PCFieldSplitGetSubKSP(pc_system,numsplit,subksp,ierr); CHKERRQ(ierr) >>> CALL KSPGetTotalIterations(subksp(1),nusediter_vv,ierr); CHKERRQ(ierr) >>> CALL KSPGetTotalIterations(subksp(2),nusediter_pp,ierr); CHKERRQ(ierr) >>> print *, 'nusediter_vv', nusediter_vv >>> print *, 'nusediter_pp', nusediter_pp >>> >>> Running the code shows this surprise: >>> >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 1 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 1 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 2 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 2 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >>> Linear sys_fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 8 >>> >>> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 22 >>> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 >>> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 3 >>> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 2 >>> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 2 >>> Linear sys_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 2 >>> >>> nusediter_vv 37 >>> nusediter_pp 37 >>> >>> So the value of nusediter_pp is indeed 37, but for nusediter_vv >>> it should be 66. Any idea what went wrong? >>> >>> Chris >>> >>> >>> >>> dr. ir. Christiaan Klaij | CFD Researcher | Research & Development >>> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >>> >>> MARIN news: http://www.marin.nl/web/News/News-items/MARIN-wishes-you-a-challenging-inspiring-2017.htm >>> >>> ________________________________________ >>> From: Barry Smith > >>> Sent: Saturday, April 11, 2015 12:27 AM >>> To: Klaij, Christiaan >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 >>> >>> Chris, >>> >>> I have added KSPGetTotalIterations() to the branch barry/add-ksp-total-iterations/master and next. After tests it will go into master >>> >>> Barry >>> >>>> On Apr 10, 2015, at 8:07 AM, Klaij, Christiaan > wrote: >>>> >>>> Barry, >>>> >>>> Sure, I can call PCFieldSplitGetSubKSP() to get the fieldsplit_0 >>>> ksp and then KSPGetIterationNumber, but what does this number >>>> mean? >>>> >>>> It appears to be the number of iterations of the last time that >>>> the subsystem was solved, right? If so, this corresponds to the >>>> last iteration of the coupled system, how about all the previous >>>> iterations? 
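For reference, the retrieval pattern discussed in this thread looks roughly as follows in C; this is a sketch only (the helper name ReportSplitIterations is illustrative), assuming a two-way fieldsplit and that the totals are read after KSPSolve():

#include <petscksp.h>

/* Sketch: ksp is the outer solver whose PC is a PCFIELDSPLIT. */
static PetscErrorCode ReportSplitIterations(KSP ksp)
{
  PC             pc;
  KSP            *subksp;
  PetscInt       nsplits, it0, it1;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCFieldSplitGetSubKSP(pc, &nsplits, &subksp);CHKERRQ(ierr);
  ierr = KSPGetTotalIterations(subksp[0], &it0);CHKERRQ(ierr);
  ierr = KSPGetTotalIterations(subksp[1], &it1);CHKERRQ(ierr);
  /* With a Schur factorization, subksp[0] can be the KSP applied inside
     S = A11 - A10 inv(A00) A01 rather than the 0,0-block solver itself,
     which is why both totals can come out equal, as reported above. */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"split 0 total iterations %D, split 1 total iterations %D\n",it0,it1);CHKERRQ(ierr);
  ierr = PetscFree(subksp);CHKERRQ(ierr);  /* caller frees the array returned by PCFieldSplitGetSubKSP */
  PetscFunctionReturn(0);
}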
>>>> >>>> Chris >>>> ________________________________________ >>>> From: Barry Smith > >>>> Sent: Friday, April 10, 2015 2:48 PM >>>> To: Klaij, Christiaan >>>> Cc: petsc-users at mcs.anl.gov >>>> Subject: Re: [petsc-users] monitoring the convergence of fieldsplit 0 and 1 >>>> >>>> Chris, >>>> >>>> It appears you should call PCFieldSplitGetSubKSP() and then get the information you want out of the individual KSPs. If this doesn't work please let us know. >>>> >>>> Barry >>>> >>>>> On Apr 10, 2015, at 6:48 AM, Klaij, Christiaan > wrote: >>>>> >>>>> A question when using PCFieldSplit: for each linear iteration of >>>>> the system, how many iterations for fielsplit 0 and 1? >>>>> >>>>> One way to find out is to run with -ksp_monitor, >>>>> -fieldsplit_0_ksp_monitor and -fieldsplit_0_ksp_monitor. This >>>>> gives the complete convergence history. >>>>> >>>>> Another way, suggested by Matt, is to use -ksp_monitor, >>>>> -fieldsplit_0_ksp_converged_reason and >>>>> -fieldsplit_1_ksp_converged_reason. This gives only the totals >>>>> for fieldsplit 0 and 1 (but without saying for which one). >>>>> >>>>> Both ways require to somehow process the output, which is a bit >>>>> inconvenient. Could KSPGetResidualHistory perhaps return (some) >>>>> information on the subsystems' convergence for processing inside >>>>> the code? >>>>> >>>>> Chris >>>>> >>>>> >>>>> dr. ir. Christiaan Klaij >>>>> CFD Researcher >>>>> Research & Development >>>>> E mailto:C.Klaij at marin.nl >>>>> T +31 317 49 33 44 >>>>> >>>>> >>>>> MARIN >>>>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >>>>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >>>>> >>>> >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagec0b128.PNG Type: image/png Size: 293 bytes Desc: imagec0b128.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageb21668.PNG Type: image/png Size: 331 bytes Desc: imageb21668.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagec00de8.PNG Type: image/png Size: 333 bytes Desc: imagec00de8.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagea2abab.PNG Type: image/png Size: 253 bytes Desc: imagea2abab.PNG URL: From rlmackie862 at gmail.com Tue May 15 10:05:20 2018 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 15 May 2018 08:05:20 -0700 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> Message-ID: <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> > On Apr 10, 2018, at 5:40 PM, Satish Balay wrote: > > On Tue, 10 Apr 2018, Jeff Hammond wrote: > >> Can you try on a non-KNL host? It's a bug either way but I want to >> determine if KNL host is the issue. > > Breaks on 'E5-2695 v4' aswell (bebop.lcrc) with '-axcore-avx2' and 'icc (ICC) 18.0.1 20171018' > >> Based only what I see below, Randy doesn't seem to be reporting a >> KNL-specific issue. Is that incorrect? > > Hardware details weren't mentioned in this thread. > >> Again, there is clearly a bug here, but it helps to localize the problem as >> much as possible. 
> >>>>>> On Thu, 5 Apr 2018, Randall Mackie wrote: > >>>>> so I assume this is an Intel bug, but before we submit a bug >>>>> report I wanted to see if anyone else had similar experiences? > > Randy, > > I'll leave this to you to file a report with Intel. > > Thanks, > Satish Hi Satish, As requested we filed a report with Intel, and this is their response: From: Intel Customer Support > Sent: 14 May 2018 19:56 Subject: Intel Developer Products Support - Update to Service Request#:03369230 Hello, An update was made to service request on May 14, 2018: Thank you for the additional information. Our engineering team investigated this case. Please see the following resolution: There is a bug in customer?s code: At line 157 of ?/src/dm/impls/da/f90-custom/zda1f90.c? *ierr = F90Array3dDestroy(&a,PETSC_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); Here ?&a? should be ?a? because ?a? was passed from call at line 79 of ?test.F90? as the 3rd arguments DMDAVecRestoreArrayF90(da1,vec1,ptr_v1,ierr). In fortran the array descriptor of assumed-shape array will be passed by address so when it is passed to another C function it shouldn?t be taken address again. There are other places in the code having the same error. After removing ?&? the code can be built and run without error. In addition, this issue was discussed 4 years ago https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2014-April/021232.html without noticing the bug in the code, more likely a user coding error. I am going to close out this case as not a compiler defect. You can reopen by posting a reply if you have any problem with the above resolution. Sign in to view and update your request or to get additional information. You can also reply to this email with questions or comments. Regards, Devorah Intel Developer Products Support Intel will use your personal information solely for the purpose it was collected. We will not use your personal information for a different purpose without first asking your permission. In order to fulfill the purpose, we may need to share your personal information within Intel Corporation, Intel subsidiaries worldwide, or with authorized third parties. Privacy ? Cookies Intel may contact you in order to obtain your feedback on the quality of the support you received. We give you many choices regarding our use of your personal information for quality assurance and marketing purposes. You may update and request access to your contact details and communication preferences by using one of the following methods: visit the specific product or service website; use the Contact Us form; or send a letter to the postal address below. Intel Corporation; Mailstop RNB4-145; 2200 Mission College Blvd.; Santa Clara, CA 95054 USA ? Intel Corporation ? Legal Information ? www.intel.com Intel is a registered trademark of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others. *********************PLEASE DO NOT DELETE********************* Thread ID: ref:_00DU0YT3c._5000PhEKLa:ref You must include this text in any reply to this email. Thank you. *********************PLEASE DO NOT DELETE********************* -------------- next part -------------- An HTML attachment was scrubbed... 
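To make the indirection point above concrete, here is a schematic, self-contained C sketch; it is not the PETSc source, and descriptor_destroy / wrapper_from_fortran are made-up names standing in for the real glue routines:

#include <stdio.h>

typedef void F90ArrayDesc;                      /* opaque descriptor type for the sketch */

static void descriptor_destroy(F90ArrayDesc *desc)
{
  printf("descriptor at %p\n", desc);           /* stand-in for the real cleanup */
}

/* Called from Fortran, which passes the address of the array descriptor,
   so the wrapper already holds a pointer to it. */
static void wrapper_from_fortran(F90ArrayDesc *a)
{
  descriptor_destroy(a);    /* correct: forward the pointer as received */
  descriptor_destroy(&a);   /* wrong: address of the local pointer variable,
                               one indirection too many; it still compiles
                               because everything is void*, which is how the
                               bug described in the reply above slipped through */
}

int main(void)
{
  int fake_descriptor = 0;                      /* stands in for a real descriptor */
  wrapper_from_fortran((F90ArrayDesc *)&fake_descriptor);
  return 0;
}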
URL: From jed at jedbrown.org Tue May 15 10:17:46 2018 From: jed at jedbrown.org (Jed Brown) Date: Tue, 15 May 2018 09:17:46 -0600 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> Message-ID: <87d0xxx811.fsf@jedbrown.org> Wow, how did this ever work with other compilers? To ensure everything gets fixed, I see we have #define F90Array3d void and then arguments are defined as F90Array3d *ptr We could make this type-safe by typedef struct { void *ptr; } F90Array3d; and use this for arguments: F90Array3d a Randall Mackie writes: >> On Apr 10, 2018, at 5:40 PM, Satish Balay wrote: >> >> On Tue, 10 Apr 2018, Jeff Hammond wrote: >> >>> Can you try on a non-KNL host? It's a bug either way but I want to >>> determine if KNL host is the issue. >> >> Breaks on 'E5-2695 v4' aswell (bebop.lcrc) with '-axcore-avx2' and 'icc (ICC) 18.0.1 20171018' >> >>> Based only what I see below, Randy doesn't seem to be reporting a >>> KNL-specific issue. Is that incorrect? >> >> Hardware details weren't mentioned in this thread. >> >>> Again, there is clearly a bug here, but it helps to localize the problem as >>> much as possible. >> >>>>>>> On Thu, 5 Apr 2018, Randall Mackie wrote: >> >>>>>> so I assume this is an Intel bug, but before we submit a bug >>>>>> report I wanted to see if anyone else had similar experiences? >> >> Randy, >> >> I'll leave this to you to file a report with Intel. >> >> Thanks, >> Satish > > > Hi Satish, > > As requested we filed a report with Intel, and this is their response: > > > > From: Intel Customer Support > > Sent: 14 May 2018 19:56 > Subject: Intel Developer Products Support - Update to Service Request#:03369230 > > > > > Hello, > An update was made to service request on May 14, 2018: > > Thank you for the additional information. Our engineering team investigated this case. Please see the following resolution: > > > > There is a bug in customer?s code: > > > > At line 157 of ?/src/dm/impls/da/f90-custom/zda1f90.c? > > > > *ierr = F90Array3dDestroy(&a,PETSC_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > > > Here ?&a? should be ?a? because ?a? was passed from call at line 79 of ?test.F90? as the 3rd arguments DMDAVecRestoreArrayF90(da1,vec1,ptr_v1,ierr). In fortran the array descriptor of assumed-shape array will be passed by address so when it is passed to another C function it shouldn?t be taken address again. There are other places in the code having the same error. After removing ?&? the code can be built and run without error. > > > > In addition, this issue was discussed 4 years ago https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2014-April/021232.html without noticing the bug in the code, more likely a user coding error. > > I am going to close out this case as not a compiler defect. You can reopen by posting a reply if you have any problem with the above resolution. > > Sign in to view and update your request or to get additional information. You can also reply to this email with questions or comments. > > Regards, > > Devorah > Intel Developer Products Support > > > Intel will use your personal information solely for the purpose it was collected. We will not use your personal information for a different purpose without first asking your permission. 
In order to fulfill the purpose, we may need to share your personal information within Intel Corporation, Intel subsidiaries worldwide, or with authorized third parties. > Privacy ? Cookies > Intel may contact you in order to obtain your feedback on the quality of the support you received. We give you many choices regarding our use of your personal information for quality assurance and marketing purposes. You may update and request access to your contact details and communication preferences by using one of the following methods: visit the specific product or service website; use the Contact Us form; or send a letter to the postal address below. > Intel Corporation; Mailstop RNB4-145; 2200 Mission College Blvd.; Santa Clara, CA 95054 USA > ? Intel Corporation ? Legal Information ? www.intel.com > Intel is a registered trademark of Intel Corporation or its subsidiaries in the United States and other countries. > *Other names and brands may be claimed as the property of others. > *********************PLEASE DO NOT DELETE********************* > Thread ID: ref:_00DU0YT3c._5000PhEKLa:ref > You must include this text in any reply to this email. Thank you. > *********************PLEASE DO NOT DELETE********************* From balay at mcs.anl.gov Tue May 15 10:38:21 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 May 2018 10:38:21 -0500 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: <87d0xxx811.fsf@jedbrown.org> References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> Message-ID: Looks like this issue got introduced with: http://bitbucket.org/petsc/petsc/commits/fcfd50eb5fb Satish On Tue, 15 May 2018, Jed Brown wrote: > Wow, how did this ever work with other compilers? > > To ensure everything gets fixed, I see we have > > #define F90Array3d void > > and then arguments are defined as > > F90Array3d *ptr > > We could make this type-safe by > > typedef struct { void *ptr; } F90Array3d; > > and use this for arguments: > > F90Array3d a > > > > Randall Mackie writes: > > >> On Apr 10, 2018, at 5:40 PM, Satish Balay wrote: > >> > >> On Tue, 10 Apr 2018, Jeff Hammond wrote: > >> > >>> Can you try on a non-KNL host? It's a bug either way but I want to > >>> determine if KNL host is the issue. > >> > >> Breaks on 'E5-2695 v4' aswell (bebop.lcrc) with '-axcore-avx2' and 'icc (ICC) 18.0.1 20171018' > >> > >>> Based only what I see below, Randy doesn't seem to be reporting a > >>> KNL-specific issue. Is that incorrect? > >> > >> Hardware details weren't mentioned in this thread. > >> > >>> Again, there is clearly a bug here, but it helps to localize the problem as > >>> much as possible. > >> > >>>>>>> On Thu, 5 Apr 2018, Randall Mackie wrote: > >> > >>>>>> so I assume this is an Intel bug, but before we submit a bug > >>>>>> report I wanted to see if anyone else had similar experiences? > >> > >> Randy, > >> > >> I'll leave this to you to file a report with Intel. > >> > >> Thanks, > >> Satish > > > > > > Hi Satish, > > > > As requested we filed a report with Intel, and this is their response: > > > > > > > > From: Intel Customer Support > > > Sent: 14 May 2018 19:56 > > Subject: Intel Developer Products Support - Update to Service Request#:03369230 > > > > > > > > > > Hello, > > An update was made to service request on May 14, 2018: > > > > Thank you for the additional information. Our engineering team investigated this case. 
Please see the following resolution: > > > > > > > > There is a bug in customer?s code: > > > > > > > > At line 157 of ?/src/dm/impls/da/f90-custom/zda1f90.c? > > > > > > > > *ierr = F90Array3dDestroy(&a,PETSC_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > > > > > > > Here ?&a? should be ?a? because ?a? was passed from call at line 79 of ?test.F90? as the 3rd arguments DMDAVecRestoreArrayF90(da1,vec1,ptr_v1,ierr). In fortran the array descriptor of assumed-shape array will be passed by address so when it is passed to another C function it shouldn?t be taken address again. There are other places in the code having the same error. After removing ?&? the code can be built and run without error. > > > > > > > > In addition, this issue was discussed 4 years ago https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2014-April/021232.html without noticing the bug in the code, more likely a user coding error. > > > > I am going to close out this case as not a compiler defect. You can reopen by posting a reply if you have any problem with the above resolution. > > > > Sign in to view and update your request or to get additional information. You can also reply to this email with questions or comments. > > > > Regards, > > > > Devorah > > Intel Developer Products Support > > > > > > Intel will use your personal information solely for the purpose it was collected. We will not use your personal information for a different purpose without first asking your permission. In order to fulfill the purpose, we may need to share your personal information within Intel Corporation, Intel subsidiaries worldwide, or with authorized third parties. > > Privacy ? Cookies > > Intel may contact you in order to obtain your feedback on the quality of the support you received. We give you many choices regarding our use of your personal information for quality assurance and marketing purposes. You may update and request access to your contact details and communication preferences by using one of the following methods: visit the specific product or service website; use the Contact Us form; or send a letter to the postal address below. > > Intel Corporation; Mailstop RNB4-145; 2200 Mission College Blvd.; Santa Clara, CA 95054 USA > > ? Intel Corporation ? Legal Information ? www.intel.com > > Intel is a registered trademark of Intel Corporation or its subsidiaries in the United States and other countries. > > *Other names and brands may be claimed as the property of others. > > *********************PLEASE DO NOT DELETE********************* > > Thread ID: ref:_00DU0YT3c._5000PhEKLa:ref > > You must include this text in any reply to this email. Thank you. > > *********************PLEASE DO NOT DELETE********************* > From balay at mcs.anl.gov Tue May 15 11:14:01 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 May 2018 11:14:01 -0500 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: <87d0xxx811.fsf@jedbrown.org> References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> Message-ID: On Tue, 15 May 2018, Jed Brown wrote: > Wow, how did this ever work with other compilers? > > To ensure everything gets fixed, I see we have > > #define F90Array3d void > > and then arguments are defined as > > F90Array3d *ptr > > We could make this type-safe by > > typedef struct { void *ptr; } F90Array3d; The following appears to be sufficient. [void* more appropriate than void - as these are pointers anyway?] 
-#define F90Array1d void -#define F90Array2d void -#define F90Array3d void -#define F90Array4d void +typedef void* F90Array1d; +typedef void* F90Array2d; +typedef void* F90Array3d; +typedef void* F90Array4d; > > and use this for arguments: > > F90Array3d a Don't need this change.. Attaching my current patch.. Satish -------------- next part -------------- diff --git a/include/petsc/private/f90impl.h b/include/petsc/private/f90impl.h index a35efb76bd..4f26c8ffea 100644 --- a/include/petsc/private/f90impl.h +++ b/include/petsc/private/f90impl.h @@ -15,11 +15,10 @@ #endif #if defined(PETSC_USING_F90) - -#define F90Array1d void -#define F90Array2d void -#define F90Array3d void -#define F90Array4d void +typedef void* F90Array1d; +typedef void* F90Array2d; +typedef void* F90Array3d; +typedef void* F90Array4d; PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); PETSC_EXTERN PetscErrorCode F90Array1dAccess(F90Array1d*,MPI_Datatype,void** PETSC_F90_2PTR_PROTO_NOVAR); diff --git a/src/dm/impls/composite/f90-custom/zfddaf90.c b/src/dm/impls/composite/f90-custom/zfddaf90.c index abc0a27a01..2633c9e86a 100644 --- a/src/dm/impls/composite/f90-custom/zfddaf90.c +++ b/src/dm/impls/composite/f90-custom/zfddaf90.c @@ -24,8 +24,8 @@ PETSC_EXTERN void PETSC_STDCALL dmcompositegetaccessvpvp_(DM *dm,Vec *v,Vec *v1, PETSC_EXTERN void PETSC_STDCALL dmcompositerestoreaccessvpvp_(DM *dm,Vec *v,Vec *v1,F90Array1d *p1,Vec *v2,F90Array1d *p2,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd1) PETSC_F90_2PTR_PROTO(ptrd2)) { *ierr = DMCompositeRestoreAccess(*dm,*v,v1,0,v2,0); - *ierr = F90Array1dDestroy(&p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); - *ierr = F90Array1dDestroy(&p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); + *ierr = F90Array1dDestroy(p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); + *ierr = F90Array1dDestroy(p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); } PETSC_EXTERN void PETSC_STDCALL dmcompositegetentriesarray_(DM *dm, DM *dmarray, PetscErrorCode *ierr) diff --git a/src/dm/impls/da/f90-custom/zda1f90.c b/src/dm/impls/da/f90-custom/zda1f90.c index 082027725f..41cc58534f 100644 --- a/src/dm/impls/da/f90-custom/zda1f90.c +++ b/src/dm/impls/da/f90-custom/zda1f90.c @@ -74,7 +74,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf901_(DM *da,Vec *v,F90Array1 PetscScalar *fa; *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -113,7 +113,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf902_(DM *da,Vec *v,F90Array2 PetscScalar *fa; *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -154,7 +154,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf903_(DM *da,Vec *v,F90Array3 PetscScalar *fa; *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; - *ierr = 
F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -190,7 +190,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf904_(DM *da,Vec *v,F90Array4 */ *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -223,7 +223,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf901_(DM *da,Vec *v,F90Ar const PetscScalar *fa; *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -262,7 +262,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf902_(DM *da,Vec *v,F90Ar const PetscScalar *fa; *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -303,7 +303,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf903_(DM *da,Vec *v,F90Ar const PetscScalar *fa; *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; - *ierr = F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) @@ -339,5 +339,5 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf904_(DM *da,Vec *v,F90Ar */ *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); } From jed at jedbrown.org Tue May 15 11:20:36 2018 From: jed at jedbrown.org (Jed Brown) Date: Tue, 15 May 2018 10:20:36 -0600 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> Message-ID: <87zi10x54b.fsf@jedbrown.org> Satish Balay writes: > On Tue, 15 May 2018, Jed Brown wrote: > >> Wow, how did this ever work with other compilers? >> >> To ensure everything gets fixed, I see we have >> >> #define F90Array3d void >> >> and then arguments are defined as >> >> F90Array3d *ptr >> >> We could make this type-safe by >> >> typedef struct { void *ptr; } F90Array3d; > > The following appears to be sufficient. 
[void* more appropriate than void - as these are pointers anyway?] They're declared as pointers, so it's "pointer to void" versus "pointer to void*". But the point of my suggestion was to get stronger type checking, including between arrays of different dimensions. > -#define F90Array1d void > -#define F90Array2d void > -#define F90Array3d void > -#define F90Array4d void > +typedef void* F90Array1d; > +typedef void* F90Array2d; > +typedef void* F90Array3d; > +typedef void* F90Array4d; > >> >> and use this for arguments: >> >> F90Array3d a > > Don't need this change.. > > Attaching my current patch.. > > Satish > diff --git a/include/petsc/private/f90impl.h b/include/petsc/private/f90impl.h > index a35efb76bd..4f26c8ffea 100644 > --- a/include/petsc/private/f90impl.h > +++ b/include/petsc/private/f90impl.h > @@ -15,11 +15,10 @@ > #endif > > #if defined(PETSC_USING_F90) > - > -#define F90Array1d void > -#define F90Array2d void > -#define F90Array3d void > -#define F90Array4d void > +typedef void* F90Array1d; > +typedef void* F90Array2d; > +typedef void* F90Array3d; > +typedef void* F90Array4d; > > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); > PETSC_EXTERN PetscErrorCode F90Array1dAccess(F90Array1d*,MPI_Datatype,void** PETSC_F90_2PTR_PROTO_NOVAR); > diff --git a/src/dm/impls/composite/f90-custom/zfddaf90.c b/src/dm/impls/composite/f90-custom/zfddaf90.c > index abc0a27a01..2633c9e86a 100644 > --- a/src/dm/impls/composite/f90-custom/zfddaf90.c > +++ b/src/dm/impls/composite/f90-custom/zfddaf90.c > @@ -24,8 +24,8 @@ PETSC_EXTERN void PETSC_STDCALL dmcompositegetaccessvpvp_(DM *dm,Vec *v,Vec *v1, > PETSC_EXTERN void PETSC_STDCALL dmcompositerestoreaccessvpvp_(DM *dm,Vec *v,Vec *v1,F90Array1d *p1,Vec *v2,F90Array1d *p2,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd1) PETSC_F90_2PTR_PROTO(ptrd2)) > { > *ierr = DMCompositeRestoreAccess(*dm,*v,v1,0,v2,0); > - *ierr = F90Array1dDestroy(&p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); > - *ierr = F90Array1dDestroy(&p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); > + *ierr = F90Array1dDestroy(p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); > + *ierr = F90Array1dDestroy(p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); > } > > PETSC_EXTERN void PETSC_STDCALL dmcompositegetentriesarray_(DM *dm, DM *dmarray, PetscErrorCode *ierr) > diff --git a/src/dm/impls/da/f90-custom/zda1f90.c b/src/dm/impls/da/f90-custom/zda1f90.c > index 082027725f..41cc58534f 100644 > --- a/src/dm/impls/da/f90-custom/zda1f90.c > +++ b/src/dm/impls/da/f90-custom/zda1f90.c > @@ -74,7 +74,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf901_(DM *da,Vec *v,F90Array1 > PetscScalar *fa; > *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -113,7 +113,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf902_(DM *da,Vec *v,F90Array2 > PetscScalar *fa; > *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN 
void PETSC_STDCALL dmdavecgetarrayf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -154,7 +154,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf903_(DM *da,Vec *v,F90Array3 > PetscScalar *fa; > *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > - *ierr = F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -190,7 +190,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf904_(DM *da,Vec *v,F90Array4 > */ > *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -223,7 +223,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf901_(DM *da,Vec *v,F90Ar > const PetscScalar *fa; > *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -262,7 +262,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf902_(DM *da,Vec *v,F90Ar > const PetscScalar *fa; > *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -303,7 +303,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf903_(DM *da,Vec *v,F90Ar > const PetscScalar *fa; > *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > - *ierr = F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > @@ -339,5 +339,5 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf904_(DM *da,Vec *v,F90Ar > */ > *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > } From balay at mcs.anl.gov Tue May 15 11:33:21 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 May 2018 11:33:21 -0500 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: <87zi10x54b.fsf@jedbrown.org> References: 
<4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> <87zi10x54b.fsf@jedbrown.org> Message-ID: On Tue, 15 May 2018, Jed Brown wrote: > Satish Balay writes: > > > On Tue, 15 May 2018, Jed Brown wrote: > > > >> Wow, how did this ever work with other compilers? > >> > >> To ensure everything gets fixed, I see we have > >> > >> #define F90Array3d void > >> > >> and then arguments are defined as > >> > >> F90Array3d *ptr > >> > >> We could make this type-safe by > >> > >> typedef struct { void *ptr; } F90Array3d; > > > > The following appears to be sufficient. [void* more appropriate than void - as these are pointers anyway?] > > They're declared as pointers, so it's "pointer to void" versus "pointer > to void*". Your notation requires the following change PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); to: PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d PETSC_F90_2PTR_PROTO_NOVAR); [this doesn't look right.] So I guess 'pointer to address (void*)' is the correct notation - as we pass these back to fortran calling routine wrt dmdavecgetarrayf901_() and an output argument for F90Array1dCreate(). Satish > But the point of my suggestion was to get stronger type > checking, including between arrays of different dimensions. > > > -#define F90Array1d void > > -#define F90Array2d void > > -#define F90Array3d void > > -#define F90Array4d void > > +typedef void* F90Array1d; > > +typedef void* F90Array2d; > > +typedef void* F90Array3d; > > +typedef void* F90Array4d; > > > >> > >> and use this for arguments: > >> > >> F90Array3d a > > > > Don't need this change.. > > > > Attaching my current patch.. 
> > > > Satish > > diff --git a/include/petsc/private/f90impl.h b/include/petsc/private/f90impl.h > > index a35efb76bd..4f26c8ffea 100644 > > --- a/include/petsc/private/f90impl.h > > +++ b/include/petsc/private/f90impl.h > > @@ -15,11 +15,10 @@ > > #endif > > > > #if defined(PETSC_USING_F90) > > - > > -#define F90Array1d void > > -#define F90Array2d void > > -#define F90Array3d void > > -#define F90Array4d void > > +typedef void* F90Array1d; > > +typedef void* F90Array2d; > > +typedef void* F90Array3d; > > +typedef void* F90Array4d; > > > > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); > > PETSC_EXTERN PetscErrorCode F90Array1dAccess(F90Array1d*,MPI_Datatype,void** PETSC_F90_2PTR_PROTO_NOVAR); > > diff --git a/src/dm/impls/composite/f90-custom/zfddaf90.c b/src/dm/impls/composite/f90-custom/zfddaf90.c > > index abc0a27a01..2633c9e86a 100644 > > --- a/src/dm/impls/composite/f90-custom/zfddaf90.c > > +++ b/src/dm/impls/composite/f90-custom/zfddaf90.c > > @@ -24,8 +24,8 @@ PETSC_EXTERN void PETSC_STDCALL dmcompositegetaccessvpvp_(DM *dm,Vec *v,Vec *v1, > > PETSC_EXTERN void PETSC_STDCALL dmcompositerestoreaccessvpvp_(DM *dm,Vec *v,Vec *v1,F90Array1d *p1,Vec *v2,F90Array1d *p2,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd1) PETSC_F90_2PTR_PROTO(ptrd2)) > > { > > *ierr = DMCompositeRestoreAccess(*dm,*v,v1,0,v2,0); > > - *ierr = F90Array1dDestroy(&p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); > > - *ierr = F90Array1dDestroy(&p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); > > + *ierr = F90Array1dDestroy(p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); > > + *ierr = F90Array1dDestroy(p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmcompositegetentriesarray_(DM *dm, DM *dmarray, PetscErrorCode *ierr) > > diff --git a/src/dm/impls/da/f90-custom/zda1f90.c b/src/dm/impls/da/f90-custom/zda1f90.c > > index 082027725f..41cc58534f 100644 > > --- a/src/dm/impls/da/f90-custom/zda1f90.c > > +++ b/src/dm/impls/da/f90-custom/zda1f90.c > > @@ -74,7 +74,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf901_(DM *da,Vec *v,F90Array1 > > PetscScalar *fa; > > *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > > - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -113,7 +113,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf902_(DM *da,Vec *v,F90Array2 > > PetscScalar *fa; > > *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > > - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -154,7 +154,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf903_(DM *da,Vec *v,F90Array3 > > PetscScalar *fa; > > *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > > - *ierr = F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = 
F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -190,7 +190,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf904_(DM *da,Vec *v,F90Array4 > > */ > > *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; > > - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -223,7 +223,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf901_(DM *da,Vec *v,F90Ar > > const PetscScalar *fa; > > *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > > - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -262,7 +262,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf902_(DM *da,Vec *v,F90Ar > > const PetscScalar *fa; > > *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > > - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -303,7 +303,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf903_(DM *da,Vec *v,F90Ar > > const PetscScalar *fa; > > *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > > - *ierr = F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > @@ -339,5 +339,5 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf904_(DM *da,Vec *v,F90Ar > > */ > > *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); > > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; > > - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); > > } > From jed at jedbrown.org Tue May 15 11:48:46 2018 From: jed at jedbrown.org (Jed Brown) Date: Tue, 15 May 2018 10:48:46 -0600 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> <87zi10x54b.fsf@jedbrown.org> Message-ID: <87sh6sx3td.fsf@jedbrown.org> Satish Balay writes: > On Tue, 15 May 2018, Jed Brown wrote: > >> Satish Balay writes: >> >> > On Tue, 15 May 2018, Jed Brown wrote: >> > >> >> Wow, how did this ever work with other compilers? 
>> >> >> >> To ensure everything gets fixed, I see we have >> >> >> >> #define F90Array3d void >> >> >> >> and then arguments are defined as >> >> >> >> F90Array3d *ptr >> >> >> >> We could make this type-safe by >> >> >> >> typedef struct { void *ptr; } F90Array3d; >> > >> > The following appears to be sufficient. [void* more appropriate than void - as these are pointers anyway?] >> >> They're declared as pointers, so it's "pointer to void" versus "pointer >> to void*". > > Your notation requires the following change > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); > > to: > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d PETSC_F90_2PTR_PROTO_NOVAR); Yes. > [this doesn't look right.] I don't know why not, but I don't care about this narrow bit of compiler-enforced type safety enough to push for the more intrusive fix. > So I guess 'pointer to address (void*)' is the correct notation - as > we pass these back to fortran calling routine wrt > dmdavecgetarrayf901_() and an output argument for F90Array1dCreate(). It's actually just the pointer that we have. The thing it points to is of some type that is not known statically in F90Array1dCreate/Destroy. > Satish > >> But the point of my suggestion was to get stronger type >> checking, including between arrays of different dimensions. >> >> > -#define F90Array1d void >> > -#define F90Array2d void >> > -#define F90Array3d void >> > -#define F90Array4d void >> > +typedef void* F90Array1d; >> > +typedef void* F90Array2d; >> > +typedef void* F90Array3d; >> > +typedef void* F90Array4d; >> > >> >> >> >> and use this for arguments: >> >> >> >> F90Array3d a >> > >> > Don't need this change.. >> > >> > Attaching my current patch.. 
>> > >> > Satish >> > diff --git a/include/petsc/private/f90impl.h b/include/petsc/private/f90impl.h >> > index a35efb76bd..4f26c8ffea 100644 >> > --- a/include/petsc/private/f90impl.h >> > +++ b/include/petsc/private/f90impl.h >> > @@ -15,11 +15,10 @@ >> > #endif >> > >> > #if defined(PETSC_USING_F90) >> > - >> > -#define F90Array1d void >> > -#define F90Array2d void >> > -#define F90Array3d void >> > -#define F90Array4d void >> > +typedef void* F90Array1d; >> > +typedef void* F90Array2d; >> > +typedef void* F90Array3d; >> > +typedef void* F90Array4d; >> > >> > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); >> > PETSC_EXTERN PetscErrorCode F90Array1dAccess(F90Array1d*,MPI_Datatype,void** PETSC_F90_2PTR_PROTO_NOVAR); >> > diff --git a/src/dm/impls/composite/f90-custom/zfddaf90.c b/src/dm/impls/composite/f90-custom/zfddaf90.c >> > index abc0a27a01..2633c9e86a 100644 >> > --- a/src/dm/impls/composite/f90-custom/zfddaf90.c >> > +++ b/src/dm/impls/composite/f90-custom/zfddaf90.c >> > @@ -24,8 +24,8 @@ PETSC_EXTERN void PETSC_STDCALL dmcompositegetaccessvpvp_(DM *dm,Vec *v,Vec *v1, >> > PETSC_EXTERN void PETSC_STDCALL dmcompositerestoreaccessvpvp_(DM *dm,Vec *v,Vec *v1,F90Array1d *p1,Vec *v2,F90Array1d *p2,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd1) PETSC_F90_2PTR_PROTO(ptrd2)) >> > { >> > *ierr = DMCompositeRestoreAccess(*dm,*v,v1,0,v2,0); >> > - *ierr = F90Array1dDestroy(&p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); >> > - *ierr = F90Array1dDestroy(&p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); >> > + *ierr = F90Array1dDestroy(p1,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd1)); >> > + *ierr = F90Array1dDestroy(p2,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd2)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmcompositegetentriesarray_(DM *dm, DM *dmarray, PetscErrorCode *ierr) >> > diff --git a/src/dm/impls/da/f90-custom/zda1f90.c b/src/dm/impls/da/f90-custom/zda1f90.c >> > index 082027725f..41cc58534f 100644 >> > --- a/src/dm/impls/da/f90-custom/zda1f90.c >> > +++ b/src/dm/impls/da/f90-custom/zda1f90.c >> > @@ -74,7 +74,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf901_(DM *da,Vec *v,F90Array1 >> > PetscScalar *fa; >> > *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -113,7 +113,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf902_(DM *da,Vec *v,F90Array2 >> > PetscScalar *fa; >> > *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -154,7 +154,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf903_(DM *da,Vec *v,F90Array3 >> > PetscScalar *fa; >> > *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; >> > - *ierr = 
F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -190,7 +190,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayf904_(DM *da,Vec *v,F90Array4 >> > */ >> > *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArray(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -223,7 +223,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf901_(DM *da,Vec *v,F90Ar >> > const PetscScalar *fa; >> > *ierr = F90Array1dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array1dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array1dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf902_(DM *da,Vec *v,F90Array2d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -262,7 +262,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf902_(DM *da,Vec *v,F90Ar >> > const PetscScalar *fa; >> > *ierr = F90Array2dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array2dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array2dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf903_(DM *da,Vec *v,F90Array3d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -303,7 +303,7 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf903_(DM *da,Vec *v,F90Ar >> > const PetscScalar *fa; >> > *ierr = F90Array3dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array3dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array3dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayreadf904_(DM *da,Vec *v,F90Array4d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > @@ -339,5 +339,5 @@ PETSC_EXTERN void PETSC_STDCALL dmdavecrestorearrayreadf904_(DM *da,Vec *v,F90Ar >> > */ >> > *ierr = F90Array4dAccess(a,MPIU_SCALAR,(void**)&fa PETSC_F90_2PTR_PARAM(ptrd)); >> > *ierr = VecRestoreArrayRead(*v,&fa);if (*ierr) return; >> > - *ierr = F90Array4dDestroy(&a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > + *ierr = F90Array4dDestroy(a,MPIU_SCALAR PETSC_F90_2PTR_PARAM(ptrd)); >> > } >> From balay at mcs.anl.gov Tue May 15 16:26:35 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 May 2018 16:26:35 -0500 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: <87sh6sx3td.fsf@jedbrown.org> References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> <87zi10x54b.fsf@jedbrown.org> <87sh6sx3td.fsf@jedbrown.org> Message-ID: On Tue, 15 May 2018, Jed Brown wrote: > > Your notation requires the following change > > 
> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); > > > > to: > > > > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d PETSC_F90_2PTR_PROTO_NOVAR); > > Yes. > > > [this doesn't look right.] > > I don't know why not, output argument without a explicit pointer in definition is weird for c - and can result in future bugs [when this code is changed] > but I don't care about this narrow bit of > compiler-enforced type safety enough to push for the more intrusive fix. I'm not sure I understand what extra type safety is obtained by 'F90Array1d*' vs 'F90Array1d' usage Or does 'typedef struct { void *ptr; } F90Array3d;' give the extra safety? I could change this part of typedef. > > > So I guess 'pointer to address (void*)' is the correct notation - as > > we pass these back to fortran calling routine wrt > > dmdavecgetarrayf901_() and an output argument for F90Array1dCreate(). > > It's actually just the pointer that we have. The thing it points to is > of some type that is not known statically in F90Array1dCreate/Destroy. Yes. we get something opaque - and we pass it down to fortran utility routine or up to the calling fortran routine - unmodified. My current patch is at balay/fix-F90Array__Destroy/maint Satish From jed at jedbrown.org Tue May 15 17:21:56 2018 From: jed at jedbrown.org (Jed Brown) Date: Tue, 15 May 2018 16:21:56 -0600 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> <87zi10x54b.fsf@jedbrown.org> <87sh6sx3td.fsf@jedbrown.org> Message-ID: <87mux0v9tn.fsf@jedbrown.org> Satish Balay writes: > On Tue, 15 May 2018, Jed Brown wrote: > >> > Your notation requires the following change >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); >> > >> > to: >> > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) >> > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d PETSC_F90_2PTR_PROTO_NOVAR); >> >> Yes. >> >> > [this doesn't look right.] >> >> I don't know why not, > > output argument without a explicit pointer in definition is weird for c - and can result in future bugs [when this code is changed] The pointer is inside a typedef for all the basic PETSc types. >> but I don't care about this narrow bit of >> compiler-enforced type safety enough to push for the more intrusive fix. > > I'm not sure I understand what extra type safety is obtained by 'F90Array1d*' vs 'F90Array1d' usage > > Or does 'typedef struct { void *ptr; } F90Array3d;' give the extra safety? I could change this part of typedef. Yeah, that's an alternative way to get a strong typedef, and you could keep the other code unmodified. 
typedef struct { char dummy; } F90Array1d; typedef struct { char dummy; } F90Array2d; typedef struct { char dummy; } F90Array3d; These are all distinct types. Better to use char for the dummy because we don't have reason to know that the pointer is word aligned, but the standard says that any data pointer can be cast to a char*. >> >> > So I guess 'pointer to address (void*)' is the correct notation - as >> > we pass these back to fortran calling routine wrt >> > dmdavecgetarrayf901_() and an output argument for F90Array1dCreate(). >> >> It's actually just the pointer that we have. The thing it points to is >> of some type that is not known statically in F90Array1dCreate/Destroy. > > Yes. we get something opaque - and we pass it down to fortran utility routine or up to the calling fortran routine - unmodified. > > My current patch is at balay/fix-F90Array__Destroy/maint > > Satish From balay at mcs.anl.gov Tue May 15 17:58:55 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 May 2018 17:58:55 -0500 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: <87mux0v9tn.fsf@jedbrown.org> References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> <87zi10x54b.fsf@jedbrown.org> <87sh6sx3td.fsf@jedbrown.org> <87mux0v9tn.fsf@jedbrown.org> Message-ID: On Tue, 15 May 2018, Jed Brown wrote: > Satish Balay writes: > > > On Tue, 15 May 2018, Jed Brown wrote: > > > >> > Your notation requires the following change > >> > > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d *a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > >> > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d* PETSC_F90_2PTR_PROTO_NOVAR); > >> > > >> > to: > >> > > >> > PETSC_EXTERN void PETSC_STDCALL dmdavecgetarrayf901_(DM *da,Vec *v,F90Array1d a,PetscErrorCode *ierr PETSC_F90_2PTR_PROTO(ptrd)) > >> > PETSC_EXTERN PetscErrorCode F90Array1dCreate(void*,MPI_Datatype,PetscInt,PetscInt,F90Array1d PETSC_F90_2PTR_PROTO_NOVAR); > >> > >> Yes. > >> > >> > [this doesn't look right.] > >> > >> I don't know why not, > > > > output argument without a explicit pointer in definition is weird for c - and can result in future bugs [when this code is changed] > > The pointer is inside a typedef for all the basic PETSc types. Sure - but we have: 'MatCreate(MPI_Comm,Mat*)' and not 'MatCreate(MPI_Comm,Mat)' - which the above prototype would suggest. > > >> but I don't care about this narrow bit of > >> compiler-enforced type safety enough to push for the more intrusive fix. > > > > I'm not sure I understand what extra type safety is obtained by 'F90Array1d*' vs 'F90Array1d' usage > > > > Or does 'typedef struct { void *ptr; } F90Array3d;' give the extra safety? I could change this part of typedef. > > Yeah, that's an alternative way to get a strong typedef, and you could > keep the other code unmodified. > > typedef struct { char dummy; } F90Array1d; > typedef struct { char dummy; } F90Array2d; > typedef struct { char dummy; } F90Array3d; > > These are all distinct types. I'm not sure if they need to be distinct. Perhaps they are collapsible into a single type. [I suspect its a design we used to keep it in sync with our old f90 interface code. We had both versions for a while] For now - I've updated the commit to use the above typedef. 
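To show what the one-member structs buy over the plain void typedef, here is a stand-alone sketch; Array1d, Array2d and Array1dDestroy are placeholder names for illustration, not the PETSc symbols:

/* Strong-typedef sketch: distinct struct tags make the pointer types incompatible,
   so the compiler diagnoses both the mixing of array dimensions and the extra
   address-of that caused the original failure. */
typedef struct { char dummy; } Array1d;
typedef struct { char dummy; } Array2d;

static int Array1dDestroy(Array1d *ptr) { (void)ptr; return 0; }

int main(void)
{
  Array1d *a1 = 0;
  Array2d *a2 = 0;
  (void)a2;
  Array1dDestroy(a1);        /* fine */
  /* Array1dDestroy(a2);  -- incompatible pointer types: diagnosed at compile time */
  /* Array1dDestroy(&a1); -- Array1d** is not Array1d*: also diagnosed */
  return 0;
}

With the old '#define F90Array1d void' both commented-out calls go through silently, since any object pointer converts to void* without a diagnostic.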
Satish > Better to use char for the dummy because > we don't have reason to know that the pointer is word aligned, but the > standard says that any data pointer can be cast to a char*. > > >> > >> > So I guess 'pointer to address (void*)' is the correct notation - as > >> > we pass these back to fortran calling routine wrt > >> > dmdavecgetarrayf901_() and an output argument for F90Array1dCreate(). > >> > >> It's actually just the pointer that we have. The thing it points to is > >> of some type that is not known statically in F90Array1dCreate/Destroy. > > > > Yes. we get something opaque - and we pass it down to fortran utility routine or up to the calling fortran routine - unmodified. > > > > My current patch is at balay/fix-F90Array__Destroy/maint > > > > Satish > From jed at jedbrown.org Tue May 15 18:03:19 2018 From: jed at jedbrown.org (Jed Brown) Date: Tue, 15 May 2018 17:03:19 -0600 Subject: [petsc-users] Problems with DMDAVecGetArrayF90 + Intel In-Reply-To: References: <4B547C08-A6D3-428B-89B7-48AA59BA697F@gmail.com> <78A9CCB4-70F1-4CA0-A7BF-9C94A093B90F@gmail.com> <87d0xxx811.fsf@jedbrown.org> <87zi10x54b.fsf@jedbrown.org> <87sh6sx3td.fsf@jedbrown.org> <87mux0v9tn.fsf@jedbrown.org> Message-ID: <87zi10ttc8.fsf@jedbrown.org> Satish Balay writes: >> typedef struct { char dummy; } F90Array1d; >> typedef struct { char dummy; } F90Array2d; >> typedef struct { char dummy; } F90Array3d; >> >> These are all distinct types. > > I'm not sure if they need to be distinct. Perhaps they are collapsible > into a single type. [I suspect its a design we used to keep it in sync > with our old f90 interface code. We had both versions for a while] If our current code does not mix them, then the types should be distinct so that they aren't accidentally mixed. > For now - I've updated the commit to use the above typedef. From rlwalker at usc.edu Tue May 15 19:11:03 2018 From: rlwalker at usc.edu (Robert Walker) Date: Tue, 15 May 2018 17:11:03 -0700 Subject: [petsc-users] PETSc DS Auxiliary Variable Question Message-ID: Hello, I am writing a program using PETScDS along with PETScFE and PETScTS in a similar vein to TS examples 45 - 48. I seek to use the auxiliary variables accessed by the pointwise functions in a manner similar to how I understand that MOOSE handles auxiliary variables, that is that in the following: f0(PetscInt dim, PetscInt Nf, PetscInt NfAux, const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], PetscReal t, const PetscReal x[], PetscScalar f0[]) some indices of the a[] array may hold values generated using the primary variable values of the prior iteration, or may simply point to the primary variables themselves. What is the best way to go about this? Apologies if this is either quite obvious or has been extensively addressed before. Thank you Robert L. Walker MS Petroleum Engineering Mork Family Department of Chemicals and Materials Sciences University of Southern California ---------------------------------------------- Mobile US: +1 (213) - 290 -7101 Mobile EU: +34 62 274 66 40 rlwalker at usc.edu -------------- next part -------------- An HTML attachment was scrubbed... 
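For reference, a hedged sketch of how a pointwise residual typically consumes the auxiliary data, following the f0() prototype quoted above. The function name, the physics (u_t + kappa*u, with kappa a scalar coefficient carried as auxiliary field 0), and the field layout are invented for illustration, and the pointwise prototype in an installed PETSc may additionally carry the constants array:

#include <petscds.h>

/* Sketch only: field 0 is the solution, auxiliary field 0 is a coefficient kappa
   evaluated at the same quadrature point. */
static void f0_reaction(PetscInt dim, PetscInt Nf, PetscInt NfAux,
                        const PetscInt uOff[], const PetscInt uOff_x[],
                        const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[],
                        const PetscInt aOff[], const PetscInt aOff_x[],
                        const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[],
                        PetscReal t, const PetscReal x[], PetscScalar f0[])
{
  const PetscScalar kappa = a[aOff[0]];      /* value of auxiliary field 0 at this point */
  f0[0] = u_t[uOff[0]] + kappa*u[uOff[0]];   /* u_t + kappa*u */
}

The auxiliary vector itself is discretized and attached to the DM the same way SNES ex12/ex69 do it, and the pointwise function is registered for its field with PetscDSSetResidual().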
URL: From knepley at gmail.com Tue May 15 19:21:21 2018 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 May 2018 20:21:21 -0400 Subject: [petsc-users] PETSc DS Auxiliary Variable Question In-Reply-To: References: Message-ID: On Tue, May 15, 2018 at 8:11 PM, Robert Walker wrote: > Hello, > > I am writing a program using PETScDS along with PETScFE and PETScTS in a > similar vein to TS examples 45 - 48. I seek to use the auxiliary variables > accessed by the pointwise functions in a manner similar to how I understand > that MOOSE handles auxiliary variables, that is that in the following: > > f0(PetscInt dim, PetscInt Nf, PetscInt NfAux, > > const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], > > const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], > > PetscReal t, const PetscReal x[], PetscScalar f0[]) > > > some indices of the a[] array may hold values generated using the primary > variable values of the prior iteration, or may simply point to the primary > variables themselves. > > What is the best way to go about this? Apologies if this is either quite > obvious or has been extensively addressed before. > Not sure where I need to start. First, do you understand how the auxiliary variables are being used in SNES ex12 or ex69? Thanks, Matt > Thank you > > > Robert L. Walker > MS Petroleum Engineering > Mork Family Department of Chemicals and Materials Sciences > University of Southern California > ---------------------------------------------- > Mobile US: +1 (213) - 290 -7101 > Mobile EU: +34 62 274 66 40 > rlwalker at usc.edu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlwalker at usc.edu Tue May 15 19:25:23 2018 From: rlwalker at usc.edu (Robert Walker) Date: Tue, 15 May 2018 17:25:23 -0700 Subject: [petsc-users] PETSc DS Auxiliary Variable Question In-Reply-To: References: Message-ID: Sounds like a lovely place to look again. sent from an overpriced pocket calculator On Tue, May 15, 2018, 17:21 Matthew Knepley wrote: > On Tue, May 15, 2018 at 8:11 PM, Robert Walker wrote: > >> Hello, >> >> I am writing a program using PETScDS along with PETScFE and PETScTS in a >> similar vein to TS examples 45 - 48. I seek to use the auxiliary variables >> accessed by the pointwise functions in a manner similar to how I understand >> that MOOSE handles auxiliary variables, that is that in the following: >> >> f0(PetscInt dim, PetscInt Nf, PetscInt NfAux, >> >> const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], >> >> const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], >> >> PetscReal t, const PetscReal x[], PetscScalar f0[]) >> >> >> some indices of the a[] array may hold values generated using the primary >> variable values of the prior iteration, or may simply point to the primary >> variables themselves. >> >> What is the best way to go about this? Apologies if this is either quite >> obvious or has been extensively addressed before. >> > > Not sure where I need to start. First, do you understand how the auxiliary > variables are being used in SNES ex12 or ex69? 
> > Thanks, > > Matt > > >> Thank you >> >> >> Robert L. Walker >> MS Petroleum Engineering >> Mork Family Department of Chemicals and Materials Sciences >> University of Southern California >> ---------------------------------------------- >> Mobile US: +1 (213) - 290 -7101 >> Mobile EU: +34 62 274 66 40 >> rlwalker at usc.edu >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlwalker at usc.edu Tue May 15 19:53:24 2018 From: rlwalker at usc.edu (Robert Walker) Date: Tue, 15 May 2018 17:53:24 -0700 Subject: [petsc-users] PETSc DS Auxiliary Variable Question In-Reply-To: References: Message-ID: OK, I may as well just say that this is something that has never been quite clear to me. From what it looks, the auxiliary terms tend to denote material properties, or similar spatial varying terms. However, judging from the existence of both the 'constants' vector, as well as the derivative vectors of the auxiliary variables with respect to time and space in the callback call, I am going to guess that it can be used with time dependent terms as well. What I am looking to do is find some way to take the gradient of a first temporal derivative of a variable (displacement), and access this result within one of the residual callback functions that provides input to PetscDSSetResidual. Maybe there is another way to do this (probably.) Any thoughts would be appreciated. Robert L. Walker MS Petroleum Engineering Mork Family Department of Chemicals and Materials Sciences University of Southern California ---------------------------------------------- Mobile US: +1 (213) - 290 -7101 Mobile EU: +34 62 274 66 40 rlwalker at usc.edu On Tue, May 15, 2018 at 5:25 PM, Robert Walker wrote: > Sounds like a lovely place to look again. > > sent from an overpriced pocket calculator > > On Tue, May 15, 2018, 17:21 Matthew Knepley wrote: > >> On Tue, May 15, 2018 at 8:11 PM, Robert Walker wrote: >> >>> Hello, >>> >>> I am writing a program using PETScDS along with PETScFE and PETScTS in a >>> similar vein to TS examples 45 - 48. I seek to use the auxiliary variables >>> accessed by the pointwise functions in a manner similar to how I understand >>> that MOOSE handles auxiliary variables, that is that in the following: >>> >>> f0(PetscInt dim, PetscInt Nf, PetscInt NfAux, >>> >>> const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], >>> >>> const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], >>> >>> PetscReal t, const PetscReal x[], PetscScalar f0[]) >>> >>> >>> some indices of the a[] array may hold values generated using the >>> primary variable values of the prior iteration, or may simply point to the >>> primary variables themselves. >>> >>> What is the best way to go about this? Apologies if this is either quite >>> obvious or has been extensively addressed before. >>> >> >> Not sure where I need to start. First, do you understand how the >> auxiliary variables are being used in SNES ex12 or ex69? >> >> Thanks, >> >> Matt >> >> >>> Thank you >>> >>> >>> Robert L. 
Walker >>> MS Petroleum Engineering >>> Mork Family Department of Chemicals and Materials Sciences >>> University of Southern California >>> ---------------------------------------------- >>> Mobile US: +1 (213) - 290 -7101 >>> Mobile EU: +34 62 274 66 40 >>> rlwalker at usc.edu >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 15 20:08:47 2018 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 May 2018 21:08:47 -0400 Subject: [petsc-users] PETSc DS Auxiliary Variable Question In-Reply-To: References: Message-ID: On Tue, May 15, 2018 at 8:53 PM, Robert Walker wrote: > OK, I may as well just say that this is something that has never been > quite clear to me. From what it looks like, the auxiliary terms tend to denote > material properties, or similar spatially varying terms. However, judging > from the existence of both the 'constants' vector, as well as the > derivative vectors of the auxiliary variables with respect to time and > space in the callback call, I am going to guess that it can be used with > time-dependent terms as well. > The auxiliary fields are really for FIELDS, meaning stuff discretized just like the solution vectors, which can also be time/space dependent. For constant properties, use the constants[] array. > What I am looking to do is find some way to take the gradient of a first > temporal derivative of a variable (displacement), and access this result > within one of the residual callback functions that provides input to > PetscDSSetResidual. Maybe there is another way to do this (probably). Any > thoughts would be appreciated. > You are right, that is not easy. The information is there, I just have not made a nice way to access it. If we did everything covariantly in 4D, it would just be there. I will think about it, and get it done. However, it will probably take me until the end of the month since I have a bunch of proposal deadlines. You could conceivably do it yourself using the discretization tools. Basically, you feed in the u_t Vec to the EvaluateFieldJet() internal function, which would give derivatives at a point. Thanks, Matt
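The mechanics behind the pointer to SNES ex12/ex69 are, roughly: a second DM carries the auxiliary discretization, a local vector on that DM holds the auxiliary values, and both are attached to the solution DM so that the pointwise kernels receive a[], a_t[] and a_x[]. A rough sketch under those assumptions follows; the composed-object names "dmAux" and "A" are the ones the examples of this vintage appear to query for (verify against the example source for your PETSc version), dm is the already-configured solution DM, and the PetscFE setup and the routine that fills the vector are left as placeholders.

    DM             dmAux;
    Vec            auxVec;
    PetscErrorCode ierr;

    /* Same mesh, separate discretization for the auxiliary field(s) */
    ierr = DMClone(dm, &dmAux);CHKERRQ(ierr);
    /* ... attach a PetscFE for each auxiliary field to dmAux, as the example
           does for the solution fields (omitted here) ... */
    ierr = DMCreateLocalVector(dmAux, &auxVec);CHKERRQ(ierr);
    /* ... fill auxVec, e.g. by projecting a function of space/time or by
           copying values derived from the previous solution or time step ... */

    /* Attach both objects to the solution DM so residual/Jacobian evaluation
       can look them up and hand the values to the pointwise kernels */
    ierr = PetscObjectCompose((PetscObject) dm, "dmAux", (PetscObject) dmAux);CHKERRQ(ierr);
    ierr = PetscObjectCompose((PetscObject) dm, "A",     (PetscObject) auxVec);CHKERRQ(ierr);

Because the user rebuilds or refills auxVec, this is also a natural place to stash quantities computed from the previous iteration, which is essentially the MOOSE-style usage asked about above.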
> Robert L. Walker > MS Petroleum Engineering > Mork Family Department of Chemicals and Materials Sciences > University of Southern California > ---------------------------------------------- > Mobile US: +1 (213) - 290 -7101 > Mobile EU: +34 62 274 66 40 > rlwalker at usc.edu > > On Tue, May 15, 2018 at 5:25 PM, Robert Walker wrote: > >> Sounds like a lovely place to look again. >> >> sent from an overpriced pocket calculator >> >> On Tue, May 15, 2018, 17:21 Matthew Knepley wrote: >> >>> On Tue, May 15, 2018 at 8:11 PM, Robert Walker wrote: >>> >>>> Hello, >>>> >>>> I am writing a program using PETScDS along with PETScFE and PETScTS in >>>> a similar vein to TS examples 45 - 48. I seek to use the auxiliary >>>> variables accessed by the pointwise functions in a manner similar to how I >>>> understand that MOOSE handles auxiliary variables, that is that in the >>>> following: >>>> >>>> f0(PetscInt dim, PetscInt Nf, PetscInt NfAux, >>>> >>>> const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], >>>> >>>> const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], >>>> >>>> PetscReal t, const PetscReal x[], PetscScalar f0[]) >>>> >>>> >>>> some indices of the a[] array may hold values generated using the >>>> primary variable values of the prior iteration, or may simply point to the >>>> primary variables themselves. >>>> >>>> What is the best way to go about this? Apologies if this is either >>>> quite obvious or has been extensively addressed before. >>>> >>> >>> Not sure where I need to start. First, do you understand how the >>> auxiliary variables are being used in SNES ex12 or ex69? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you >>>> >>>> >>>> Robert L. Walker >>>> MS Petroleum Engineering >>>> Mork Family Department of Chemicals and Materials Sciences >>>> University of Southern California >>>> ---------------------------------------------- >>>> Mobile US: +1 (213) - 290 -7101 >>>> Mobile EU: +34 62 274 66 40 >>>> rlwalker at usc.edu >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpraveen at gmail.com Thu May 17 03:50:24 2018 From: cpraveen at gmail.com (Praveen C) Date: Thu, 17 May 2018 14:20:24 +0530 Subject: [petsc-users] Duplicate vector without memory allocation Message-ID: <103F5151-9CBE-4D7D-AAAC-F183CE0E7F3E@gmail.com> Dear all I have a vector created with VecCreateGhostBlockWithArray. Is it possible to duplicate this vector without allocating memory, so that I can then place an existing array into it? Thanks praveen From knepley at gmail.com Thu May 17 04:17:35 2018 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 17 May 2018 05:17:35 -0400 Subject: [petsc-users] Duplicate vector without memory allocation In-Reply-To: <103F5151-9CBE-4D7D-AAAC-F183CE0E7F3E@gmail.com> References: <103F5151-9CBE-4D7D-AAAC-F183CE0E7F3E@gmail.com> Message-ID: On Thu, May 17, 2018 at 4:50 AM, Praveen C wrote: > Dear all > > I have a vector created with VecCreateGhostBlockWithArray. Is it possible > to duplicate this vector without allocating memory, so that I can then > place an existing array into it? > No, you just have to create it the same way again. Thanks, Matt > Thanks > praveen -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
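The answer above amounts to calling VecCreateGhostBlockWithArray a second time with the same layout arguments and the new storage, rather than VecDuplicate (which always allocates its own array). A sketch, assuming the block size, local/global sizes and ghost index list used for the original vector are still at hand; all names below are placeholders.

    Vec            w;
    PetscErrorCode ierr;

    /* bs, nlocal, Nglobal, nghost and ghosts[] are exactly the arguments used
       to create the first vector; newarray is user-provided storage sized as
       documented for VecCreateGhostBlockWithArray (local plus ghost part). */
    ierr = VecCreateGhostBlockWithArray(PETSC_COMM_WORLD, bs, nlocal, Nglobal,
                                        nghost, ghosts, newarray, &w);CHKERRQ(ierr);
    /* w now wraps newarray; VecGhostUpdateBegin/End and VecGhostGetLocalForm
       behave just as they do for the original vector */
    ierr = VecDestroy(&w);CHKERRQ(ierr);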
From ys453 at cam.ac.uk Fri May 18 04:54:54 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Fri, 18 May 2018 10:54:54 +0100 Subject: [petsc-users] Linear iterative solver cannot converge Message-ID: <7bab65849037a08d487af6f860bc2658@cam.ac.uk> Hello all, I do not have much knowledge of linear iterative solvers, so when I use the PETSc Krylov solvers, I try several combinations (e.g. cg+jacobi, gmres+hypre, etc.). But I cannot get a converged solution. I have used 'preonly' and 'lu' to check that the system is correctly constructed. Below is the KSP log output with 500 iterations. 0 KSP preconditioned resid norm 7.082525933041e-04 true resid norm 6.326394271849e-01 ||r(i)||/||b|| 1.000000000000e+00 0 KSP Residual norm 7.082525933041e-04 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 1 KSP preconditioned resid norm 7.826722253922e-05 true resid norm 1.564088399639e+00 ||r(i)||/||b|| 2.472322040690e+00 1 KSP Residual norm 7.826722253922e-05 % max 9.487098101782e-01 min 9.487098101782e-01 max/min 1.000000000000e+00 2 KSP preconditioned resid norm 1.160749405043e-05 true resid norm 6.895674436780e+00 ||r(i)||/||b|| 1.089984933039e+01 2 KSP Residual norm 1.160749405043e-05 % max 1.047419854967e+00 min 8.139171190265e-01 max/min 1.286887608679e+00 3 KSP preconditioned resid norm 2.302649108076e-06 true resid norm 3.399636614710e+01 ||r(i)||/||b|| 5.373734972285e+01 3 KSP Residual norm 2.302649108076e-06 % max 1.095412865033e+00 min 7.524989114835e-01 max/min 1.455700265231e+00 4 KSP preconditioned resid norm 1.515706794298e-06 true resid norm 7.915135424703e+00 ||r(i)||/||b|| 1.251129013556e+01 4 KSP Residual norm 1.515706794298e-06 % max 4.259813886522e+00 min 7.476020986409e-01 max/min 5.697969406808e+00 5 KSP preconditioned resid norm 1.062659539539e-06 true resid norm 1.291636850619e+01 ||r(i)||/||b|| 2.041663537106e+01 5 KSP Residual norm 1.062659539539e-06 % max 6.979478925762e+00 min 7.421457344352e-01 max/min 9.404458722751e+00 6 KSP preconditioned resid norm 4.248015453585e-07 true resid norm 7.273048463349e+00 ||r(i)||/||b|| 1.149635661456e+01 6 KSP Residual norm 4.248015453585e-07 % max 8.197134494997e+00 min 7.249135442935e-01 max/min 1.130774084651e+01 7 KSP preconditioned resid norm 1.844877736368e-07 true resid norm 1.800911822307e+00 ||r(i)||/||b|| 2.846663905095e+00 7 KSP Residual norm 1.844877736368e-07 % max 8.617693175971e+00 min 7.171590601865e-01 max/min 1.201643213394e+01 8 KSP preconditioned resid norm 1.343329736018e-07 true resid norm 8.308030162167e-01 ||r(i)||/||b|| 1.313233068501e+00 8 KSP Residual norm 1.343329736018e-07 % max 8.797419513003e+00 min 7.151810153432e-01 max/min 1.230096901941e+01 9 KSP preconditioned resid norm 8.435244868263e-08 true resid norm 1.039165561694e+00 ||r(i)||/||b|| 1.642587415580e+00 9 KSP Residual norm 8.435244868263e-08 % max 8.954896895684e+00 min 7.106664887897e-01 max/min 1.260070235046e+01 10 KSP preconditioned resid norm 3.966928154020e-08 true resid norm 5.935564227055e-01 ||r(i)||/||b|| 9.382223067360e-01 10 KSP Residual norm 3.966928154020e-08 % max 9.031210516030e+00 min 7.020732475337e-01 max/min 1.286363003826e+01 11 KSP preconditioned resid norm 2.465535486554e-08 true resid norm 4.008826554548e-01 ||r(i)||/||b|| 6.336668854779e-01 11 KSP Residual norm 2.465535486554e-08 % max 9.085166722582e+00 min 6.977273003895e-01 max/min 1.302108534023e+01 12 KSP preconditioned resid norm 2.030859749117e-08 true resid norm 3.813432885130e-01 ||r(i)||/||b|| 6.027814140670e-01 12 KSP Residual norm 2.030859749117e-08 % max 9.108740437496e+00 min 6.953332846189e-01 max/min 1.309981938012e+01 13 KSP
preconditioned resid norm 1.822679962077e-08 true resid norm 3.872430711288e-01 ||r(i)||/||b|| 6.121070778847e-01 13 KSP Residual norm 1.822679962077e-08 % max 9.136924983475e+00 min 6.787423686921e-01 max/min 1.346155095796e+01 14 KSP preconditioned resid norm 1.695370146944e-08 true resid norm 3.791469790845e-01 ||r(i)||/||b|| 5.993097533799e-01 14 KSP Residual norm 1.695370146944e-08 % max 9.150838240684e+00 min 4.836016572082e-01 max/min 1.892226402513e+01 15 KSP preconditioned resid norm 1.670891990697e-08 true resid norm 3.742448806985e-01 ||r(i)||/||b|| 5.915611082981e-01 15 KSP Residual norm 1.670891990697e-08 % max 9.168112723795e+00 min 3.589247137480e-01 max/min 2.554327515667e+01 16 KSP preconditioned resid norm 1.664236372933e-08 true resid norm 3.722478007586e-01 ||r(i)||/||b|| 5.884043655247e-01 16 KSP Residual norm 1.664236372933e-08 % max 9.178525399198e+00 min 2.321963324507e-01 max/min 3.952915751220e+01 17 KSP preconditioned resid norm 1.663949610587e-08 true resid norm 3.719568951825e-01 ||r(i)||/||b|| 5.879445371238e-01 17 KSP Residual norm 1.663949610587e-08 % max 9.184089312561e+00 min 1.131861415164e-01 max/min 8.114146475459e+01 18 KSP preconditioned resid norm 1.663337572879e-08 true resid norm 3.719017039287e-01 ||r(i)||/||b|| 5.878572974555e-01 18 KSP Residual norm 1.663337572879e-08 % max 9.189859567814e+00 min 5.648232511467e-02 max/min 1.627032801705e+02 19 KSP preconditioned resid norm 1.662539368980e-08 true resid norm 3.720300901295e-01 ||r(i)||/||b|| 5.880602348560e-01 19 KSP Residual norm 1.662539368980e-08 % max 9.194407242900e+00 min 3.409368279927e-02 max/min 2.696806706695e+02 20 KSP preconditioned resid norm 1.660752961666e-08 true resid norm 3.720476648171e-01 ||r(i)||/||b|| 5.880880147997e-01 20 KSP Residual norm 1.660752961666e-08 % max 9.194602239844e+00 min 2.365991044206e-02 max/min 3.886152596545e+02 21 KSP preconditioned resid norm 1.660447026728e-08 true resid norm 3.719370961482e-01 ||r(i)||/||b|| 5.879132412016e-01 21 KSP Residual norm 1.660447026728e-08 % max 9.198469035476e+00 min 1.563110293107e-02 max/min 5.884721683453e+02 22 KSP preconditioned resid norm 1.660446294321e-08 true resid norm 3.719253643667e-01 ||r(i)||/||b|| 5.878946970183e-01 22 KSP Residual norm 1.660446294321e-08 % max 9.198477237204e+00 min 1.078305273545e-02 max/min 8.530494529592e+02 23 KSP preconditioned resid norm 1.660330040466e-08 true resid norm 3.721217486420e-01 ||r(i)||/||b|| 5.882051175626e-01 23 KSP Residual norm 1.660330040466e-08 % max 9.202933332869e+00 min 9.205686060935e-03 max/min 9.997009752398e+02 24 KSP preconditioned resid norm 1.660327295939e-08 true resid norm 3.721432241387e-01 ||r(i)||/||b|| 5.882390634340e-01 24 KSP Residual norm 1.660327295939e-08 % max 9.203266640610e+00 min 8.725099522239e-03 max/min 1.054803629134e+03 25 KSP preconditioned resid norm 1.660322661301e-08 true resid norm 3.721299294797e-01 ||r(i)||/||b|| 5.882180488428e-01 25 KSP Residual norm 1.660322661301e-08 % max 9.203403735734e+00 min 8.464109615459e-03 max/min 1.087344582462e+03 26 KSP preconditioned resid norm 1.660321583344e-08 true resid norm 3.721333122477e-01 ||r(i)||/||b|| 5.882233959139e-01 26 KSP Residual norm 1.660321583344e-08 % max 9.203842444616e+00 min 8.185670305944e-03 max/min 1.124384699190e+03 27 KSP preconditioned resid norm 1.659716957149e-08 true resid norm 3.722220312184e-01 ||r(i)||/||b|| 5.883636321479e-01 27 KSP Residual norm 1.659716957149e-08 % max 9.203846515427e+00 min 7.896672970975e-03 max/min 1.165534719401e+03 28 KSP preconditioned 
resid norm 1.659174589261e-08 true resid norm 3.723499490782e-01 ||r(i)||/||b|| 5.885658292514e-01 28 KSP Residual norm 1.659174589261e-08 % max 9.204684093512e+00 min 7.626673798499e-03 max/min 1.206906750794e+03 29 KSP preconditioned resid norm 1.658497009797e-08 true resid norm 3.724237904064e-01 ||r(i)||/||b|| 5.886825487050e-01 29 KSP Residual norm 1.658497009797e-08 % max 9.204686109552e+00 min 7.248769500402e-03 max/min 1.269827397470e+03 30 KSP preconditioned resid norm 1.658420583638e-08 true resid norm 3.724329227626e-01 ||r(i)||/||b|| 5.886969840306e-01 30 KSP Residual norm 1.658420583638e-08 % max 9.205127707991e+00 min 6.913923297310e-03 max/min 1.331389908762e+03 31 KSP preconditioned resid norm 1.658394997001e-08 true resid norm 3.724106820061e-01 ||r(i)||/||b|| 5.886618285289e-01 31 KSP Residual norm 1.658394997001e-08 % max 9.205128428478e+00 min 6.644030177187e-03 max/min 1.385473603068e+03 32 KSP preconditioned resid norm 1.658394544037e-08 true resid norm 3.724013538135e-01 ||r(i)||/||b|| 5.886470836487e-01 32 KSP Residual norm 1.658394544037e-08 % max 9.205133034961e+00 min 6.415376763387e-03 max/min 1.434854627322e+03 33 KSP preconditioned resid norm 1.658256686670e-08 true resid norm 3.725160103788e-01 ||r(i)||/||b|| 5.888283188996e-01 33 KSP Residual norm 1.658256686670e-08 % max 9.205322724899e+00 min 6.129032159846e-03 max/min 1.501921100236e+03 34 KSP preconditioned resid norm 1.658198817115e-08 true resid norm 3.725196207066e-01 ||r(i)||/||b|| 5.888340256696e-01 34 KSP Residual norm 1.658198817115e-08 % max 9.205896491877e+00 min 5.825724791452e-03 max/min 1.580214792396e+03 35 KSP preconditioned resid norm 1.658171734588e-08 true resid norm 3.725087580299e-01 ||r(i)||/||b|| 5.888168552622e-01 35 KSP Residual norm 1.658171734588e-08 % max 9.205945192996e+00 min 5.605660067611e-03 max/min 1.642258910095e+03 36 KSP preconditioned resid norm 1.658129588997e-08 true resid norm 3.724965685568e-01 ||r(i)||/||b|| 5.887975876153e-01 36 KSP Residual norm 1.658129588997e-08 % max 9.206318976719e+00 min 5.485984644166e-03 max/min 1.678152523906e+03 37 KSP preconditioned resid norm 1.658089864613e-08 true resid norm 3.725244893116e-01 ||r(i)||/||b|| 5.888417213724e-01 37 KSP Residual norm 1.658089864613e-08 % max 9.206643625524e+00 min 5.409333758549e-03 max/min 1.701992155868e+03 38 KSP preconditioned resid norm 1.658010080967e-08 true resid norm 3.725552388317e-01 ||r(i)||/||b|| 5.888903265001e-01 38 KSP Residual norm 1.658010080967e-08 % max 9.222413265296e+00 min 5.329794244440e-03 max/min 1.730350711928e+03 39 KSP preconditioned resid norm 1.657990108602e-08 true resid norm 3.725613932319e-01 ||r(i)||/||b|| 5.889000546327e-01 39 KSP Residual norm 1.657990108602e-08 % max 9.265183049042e+00 min 5.255818039127e-03 max/min 1.762843192072e+03 40 KSP preconditioned resid norm 1.657960808250e-08 true resid norm 3.725537447923e-01 ||r(i)||/||b|| 5.888879649031e-01 40 KSP Residual norm 1.657960808250e-08 % max 9.295348987310e+00 min 5.172817833368e-03 max/min 1.796960435635e+03 41 KSP preconditioned resid norm 1.657958687100e-08 true resid norm 3.725506176790e-01 ||r(i)||/||b|| 5.888830219400e-01 41 KSP Residual norm 1.657958687100e-08 % max 9.313105724221e+00 min 5.071725615663e-03 max/min 1.836279489462e+03 42 KSP preconditioned resid norm 1.657921742619e-08 true resid norm 3.725582418972e-01 ||r(i)||/||b|| 5.888950733833e-01 42 KSP Residual norm 1.657921742619e-08 % max 9.322955878206e+00 min 4.963104015890e-03 max/min 1.878452647447e+03 43 KSP preconditioned resid norm 
1.657684473798e-08 true resid norm 3.725895709563e-01 ||r(i)||/||b|| 5.889445945762e-01 43 KSP Residual norm 1.657684473798e-08 % max 9.329772550486e+00 min 4.863208011283e-03 max/min 1.918439953389e+03 44 KSP preconditioned resid norm 1.657659296508e-08 true resid norm 3.725990953050e-01 ||r(i)||/||b|| 5.889596495163e-01 44 KSP Residual norm 1.657659296508e-08 % max 9.338931498127e+00 min 4.781973618640e-03 max/min 1.952945006163e+03 45 KSP preconditioned resid norm 1.657642406464e-08 true resid norm 3.726031709829e-01 ||r(i)||/||b|| 5.889660918557e-01 45 KSP Residual norm 1.657642406464e-08 % max 9.367754233355e+00 min 4.692163450780e-03 max/min 1.996468011318e+03 46 KSP preconditioned resid norm 1.657554485219e-08 true resid norm 3.725847080302e-01 ||r(i)||/||b|| 5.889369078499e-01 46 KSP Residual norm 1.657554485219e-08 % max 9.525984789796e+00 min 4.602804961008e-03 max/min 2.069604267505e+03 47 KSP preconditioned resid norm 1.657502492412e-08 true resid norm 3.726033324164e-01 ||r(i)||/||b|| 5.889663470303e-01 47 KSP Residual norm 1.657502492412e-08 % max 9.649033152731e+00 min 4.528495247014e-03 max/min 2.130737171270e+03 48 KSP preconditioned resid norm 1.657431464097e-08 true resid norm 3.726349240386e-01 ||r(i)||/||b|| 5.890162832513e-01 48 KSP Residual norm 1.657431464097e-08 % max 9.706832936743e+00 min 4.483584175077e-03 max/min 2.164971718542e+03 49 KSP preconditioned resid norm 1.657430235052e-08 true resid norm 3.726384684009e-01 ||r(i)||/||b|| 5.890218857511e-01 49 KSP Residual norm 1.657430235052e-08 % max 9.727915700790e+00 min 4.452026131090e-03 max/min 2.185053594555e+03 50 KSP preconditioned resid norm 1.657409874525e-08 true resid norm 3.726369422798e-01 ||r(i)||/||b|| 5.890194734432e-01 50 KSP Residual norm 1.657409874525e-08 % max 9.739354480687e+00 min 4.417847429271e-03 max/min 2.204547494365e+03 51 KSP preconditioned resid norm 1.657402591687e-08 true resid norm 3.726432094382e-01 ||r(i)||/||b|| 5.890293798102e-01 51 KSP Residual norm 1.657402591687e-08 % max 9.745116733535e+00 min 4.373866876163e-03 max/min 2.228032313156e+03 52 KSP preconditioned resid norm 1.657322689294e-08 true resid norm 3.726746392101e-01 ||r(i)||/||b|| 5.890790601978e-01 52 KSP Residual norm 1.657322689294e-08 % max 9.748668028325e+00 min 4.324537557867e-03 max/min 2.254268323000e+03 53 KSP preconditioned resid norm 1.657285683191e-08 true resid norm 3.726927767005e-01 ||r(i)||/||b|| 5.891077297521e-01 53 KSP Residual norm 1.657285683191e-08 % max 9.752597336655e+00 min 4.266519999816e-03 max/min 2.285843576750e+03 54 KSP preconditioned resid norm 1.657191968374e-08 true resid norm 3.727170337719e-01 ||r(i)||/||b|| 5.891460724009e-01 54 KSP Residual norm 1.657191968374e-08 % max 9.754248539341e+00 min 4.180253974132e-03 max/min 2.333410505606e+03 55 KSP preconditioned resid norm 1.657127209230e-08 true resid norm 3.727281752171e-01 ||r(i)||/||b|| 5.891636834519e-01 55 KSP Residual norm 1.657127209230e-08 % max 9.755352416617e+00 min 4.058383922487e-03 max/min 2.403752972351e+03 56 KSP preconditioned resid norm 1.657120137140e-08 true resid norm 3.727397473887e-01 ||r(i)||/||b|| 5.891819753432e-01 56 KSP Residual norm 1.657120137140e-08 % max 9.756044425698e+00 min 3.878156481600e-03 max/min 2.515639704583e+03 57 KSP preconditioned resid norm 1.657119611240e-08 true resid norm 3.727422879145e-01 ||r(i)||/||b|| 5.891859910995e-01 57 KSP Residual norm 1.657119611240e-08 % max 9.756250547216e+00 min 3.666552757137e-03 max/min 2.660878267257e+03 58 KSP preconditioned resid norm 
1.657111258731e-08 true resid norm 3.727518818995e-01 ||r(i)||/||b|| 5.892011561122e-01 58 KSP Residual norm 1.657111258731e-08 % max 9.756370020073e+00 min 3.468718193201e-03 max/min 2.812673003877e+03 59 KSP preconditioned resid norm 1.657094596592e-08 true resid norm 3.727705204449e-01 ||r(i)||/||b|| 5.892306176738e-01 59 KSP Residual norm 1.657094596592e-08 % max 9.756371302765e+00 min 3.286906229078e-03 max/min 2.968253616867e+03 60 KSP preconditioned resid norm 1.657094109276e-08 true resid norm 3.727697067525e-01 ||r(i)||/||b|| 5.892293314871e-01 60 KSP Residual norm 1.657094109276e-08 % max 9.756462036212e+00 min 3.176741626112e-03 max/min 3.071216732270e+03 61 KSP preconditioned resid norm 1.657074458504e-08 true resid norm 3.727879098160e-01 ||r(i)||/||b|| 5.892581046914e-01 61 KSP Residual norm 1.657074458504e-08 % max 9.756473123113e+00 min 3.125891922048e-03 max/min 3.121180567471e+03 62 KSP preconditioned resid norm 1.657036224624e-08 true resid norm 3.728005388067e-01 ||r(i)||/||b|| 5.892780670746e-01 62 KSP Residual norm 1.657036224624e-08 % max 9.756473207720e+00 min 3.096484756596e-03 max/min 3.150822295164e+03 63 KSP preconditioned resid norm 1.657008856448e-08 true resid norm 3.728158759902e-01 ||r(i)||/||b|| 5.893023102419e-01 63 KSP Residual norm 1.657008856448e-08 % max 9.756531519156e+00 min 3.069291385183e-03 max/min 3.178757014161e+03 64 KSP preconditioned resid norm 1.656933678892e-08 true resid norm 3.728618516341e-01 ||r(i)||/||b|| 5.893749829871e-01 64 KSP Residual norm 1.656933678892e-08 % max 9.756534097563e+00 min 3.030347211416e-03 max/min 3.219609311041e+03 65 KSP preconditioned resid norm 1.656929382536e-08 true resid norm 3.728620592926e-01 ||r(i)||/||b|| 5.893753112286e-01 65 KSP Residual norm 1.656929382536e-08 % max 9.756542682563e+00 min 2.980473130456e-03 max/min 3.273487884479e+03 66 KSP preconditioned resid norm 1.656925076679e-08 true resid norm 3.728588689809e-01 ||r(i)||/||b|| 5.893702683692e-01 66 KSP Residual norm 1.656925076679e-08 % max 9.756768124123e+00 min 2.899706071757e-03 max/min 3.364743833574e+03 67 KSP preconditioned resid norm 1.656919627951e-08 true resid norm 3.728539364727e-01 ||r(i)||/||b|| 5.893624716560e-01 67 KSP Residual norm 1.656919627951e-08 % max 9.756860985294e+00 min 2.765552951215e-03 max/min 3.527996446790e+03 68 KSP preconditioned resid norm 1.656919067597e-08 true resid norm 3.728560345126e-01 ||r(i)||/||b|| 5.893657879842e-01 68 KSP Residual norm 1.656919067597e-08 % max 9.756861143601e+00 min 2.585147683296e-03 max/min 3.774198745644e+03 69 KSP preconditioned resid norm 1.656919023495e-08 true resid norm 3.728555414719e-01 ||r(i)||/||b|| 5.893650086449e-01 69 KSP Residual norm 1.656919023495e-08 % max 9.756912528517e+00 min 2.353080610226e-03 max/min 4.146442109171e+03 70 KSP preconditioned resid norm 1.656817007273e-08 true resid norm 3.728377336313e-01 ||r(i)||/||b|| 5.893368601612e-01 70 KSP Residual norm 1.656817007273e-08 % max 9.756913153950e+00 min 2.015051177738e-03 max/min 4.842017543646e+03 71 KSP preconditioned resid norm 1.656803992911e-08 true resid norm 3.728311872625e-01 ||r(i)||/||b|| 5.893265124520e-01 71 KSP Residual norm 1.656803992911e-08 % max 9.756913653213e+00 min 1.699244792218e-03 max/min 5.741911758620e+03 72 KSP preconditioned resid norm 1.656718546555e-08 true resid norm 3.728063898356e-01 ||r(i)||/||b|| 5.892873156745e-01 72 KSP Residual norm 1.656718546555e-08 % max 9.756940076063e+00 min 1.425792038708e-03 max/min 6.843171943156e+03 73 KSP preconditioned resid norm 
1.656661764862e-08 true resid norm 3.727786721034e-01 ||r(i)||/||b|| 5.892435028310e-01 73 KSP Residual norm 1.656661764862e-08 % max 9.756965888951e+00 min 1.158185964332e-03 max/min 8.424351692586e+03 74 KSP preconditioned resid norm 1.656636420620e-08 true resid norm 3.727518398420e-01 ||r(i)||/||b|| 5.892010896329e-01 74 KSP Residual norm 1.656636420620e-08 % max 9.757003390346e+00 min 9.438943306994e-04 max/min 1.033696577361e+04 75 KSP preconditioned resid norm 1.656520179659e-08 true resid norm 3.726868430624e-01 ||r(i)||/||b|| 5.890983505736e-01 75 KSP Residual norm 1.656520179659e-08 % max 9.757006096169e+00 min 7.903213567346e-04 max/min 1.234561866895e+04 76 KSP preconditioned resid norm 1.656424480034e-08 true resid norm 3.726173157679e-01 ||r(i)||/||b|| 5.889884502235e-01 76 KSP Residual norm 1.656424480034e-08 % max 9.757008722883e+00 min 6.669508794331e-04 max/min 1.462927634367e+04 77 KSP preconditioned resid norm 1.656223746236e-08 true resid norm 3.725149253621e-01 ||r(i)||/||b|| 5.888266038361e-01 77 KSP Residual norm 1.656223746236e-08 % max 9.757108556440e+00 min 5.550476575711e-04 max/min 1.757886628896e+04 78 KSP preconditioned resid norm 1.655570645923e-08 true resid norm 3.723230681871e-01 ||r(i)||/||b|| 5.885233391852e-01 78 KSP Residual norm 1.655570645923e-08 % max 9.757706826959e+00 min 4.399820830439e-04 max/min 2.217750950096e+04 79 KSP preconditioned resid norm 1.654641744591e-08 true resid norm 3.720276670884e-01 ||r(i)||/||b|| 5.880564048053e-01 79 KSP Residual norm 1.654641744591e-08 % max 9.794972447526e+00 min 3.582518455199e-04 max/min 2.734102439392e+04 80 KSP preconditioned resid norm 1.653478342204e-08 true resid norm 3.716527795590e-01 ||r(i)||/||b|| 5.874638278755e-01 80 KSP Residual norm 1.653478342204e-08 % max 9.836802284188e+00 min 3.026308703663e-04 max/min 3.250429234890e+04 81 KSP preconditioned resid norm 1.652174198034e-08 true resid norm 3.711834567643e-01 ||r(i)||/||b|| 5.867219790838e-01 81 KSP Residual norm 1.652174198034e-08 % max 9.856229987881e+00 min 2.557127305560e-04 max/min 3.854415056477e+04 82 KSP preconditioned resid norm 1.649703927678e-08 true resid norm 3.702808970873e-01 ||r(i)||/||b|| 5.852953217522e-01 82 KSP Residual norm 1.649703927678e-08 % max 9.865801022673e+00 min 2.113312927870e-04 max/min 4.668405181535e+04 83 KSP preconditioned resid norm 1.647272703733e-08 true resid norm 3.691356976009e-01 ||r(i)||/||b|| 5.834851287146e-01 83 KSP Residual norm 1.647272703733e-08 % max 9.870514331986e+00 min 1.772107156780e-04 max/min 5.569930855605e+04 84 KSP preconditioned resid norm 1.644833370624e-08 true resid norm 3.678694491788e-01 ||r(i)||/||b|| 5.814835961391e-01 84 KSP Residual norm 1.644833370624e-08 % max 9.875179985385e+00 min 1.531203216170e-04 max/min 6.449294176698e+04 85 KSP preconditioned resid norm 1.641503406416e-08 true resid norm 3.663991314975e-01 ||r(i)||/||b|| 5.791594955248e-01 85 KSP Residual norm 1.641503406416e-08 % max 9.878712452345e+00 min 1.354454765823e-04 max/min 7.293497502916e+04 86 KSP preconditioned resid norm 1.638631902117e-08 true resid norm 3.650020055391e-01 ||r(i)||/||b|| 5.769510875465e-01 86 KSP Residual norm 1.638631902117e-08 % max 9.884296885865e+00 min 1.217649759756e-04 max/min 8.117520499364e+04 87 KSP preconditioned resid norm 1.635837174633e-08 true resid norm 3.637826153071e-01 ||r(i)||/||b|| 5.750236227386e-01 87 KSP Residual norm 1.635837174633e-08 % max 9.888595738642e+00 min 1.129410821511e-04 max/min 8.755534788848e+04 88 KSP preconditioned resid norm 
1.633085122612e-08 true resid norm 3.627922390431e-01 ||r(i)||/||b|| 5.734581555524e-01 88 KSP Residual norm 1.633085122612e-08 % max 9.892636665772e+00 min 1.069969658571e-04 max/min 9.245717003771e+04 89 KSP preconditioned resid norm 1.630349445652e-08 true resid norm 3.617701380674e-01 ||r(i)||/||b|| 5.718425417733e-01 89 KSP Residual norm 1.630349445652e-08 % max 9.895013763062e+00 min 1.017466157066e-04 max/min 9.725152718198e+04 90 KSP preconditioned resid norm 1.627361015004e-08 true resid norm 3.606744164091e-01 ||r(i)||/||b|| 5.701105573107e-01 90 KSP Residual norm 1.627361015004e-08 % max 9.896088200035e+00 min 9.710482756506e-05 max/min 1.019113925454e+05 91 KSP preconditioned resid norm 1.624399979693e-08 true resid norm 3.595515169218e-01 ||r(i)||/||b|| 5.683356134185e-01 91 KSP Residual norm 1.624399979693e-08 % max 9.898456960605e+00 min 9.277670680089e-05 max/min 1.066911868498e+05 92 KSP preconditioned resid norm 1.621187417344e-08 true resid norm 3.581993711865e-01 ||r(i)||/||b|| 5.661983047444e-01 92 KSP Residual norm 1.621187417344e-08 % max 9.900133841601e+00 min 8.846853223877e-05 max/min 1.119057091948e+05 93 KSP preconditioned resid norm 1.617899333556e-08 true resid norm 3.567416069788e-01 ||r(i)||/||b|| 5.638940471450e-01 93 KSP Residual norm 1.617899333556e-08 % max 9.903319235139e+00 min 8.477730005240e-05 max/min 1.168156951096e+05 94 KSP preconditioned resid norm 1.613800415846e-08 true resid norm 3.548922412073e-01 ||r(i)||/||b|| 5.609707930891e-01 94 KSP Residual norm 1.613800415846e-08 % max 9.906628476064e+00 min 8.053311577407e-05 max/min 1.230131031296e+05 95 KSP preconditioned resid norm 1.605124840571e-08 true resid norm 3.510853485394e-01 ||r(i)||/||b|| 5.549533169339e-01 95 KSP Residual norm 1.605124840571e-08 % max 9.909345975071e+00 min 7.312347368914e-05 max/min 1.355152521500e+05 96 KSP preconditioned resid norm 1.588326672478e-08 true resid norm 3.439633921057e-01 ||r(i)||/||b|| 5.436957883518e-01 96 KSP Residual norm 1.588326672478e-08 % max 9.911160469884e+00 min 6.360279448537e-05 max/min 1.558290095597e+05 97 KSP preconditioned resid norm 1.563012116603e-08 true resid norm 3.340016328594e-01 ||r(i)||/||b|| 5.279494424584e-01 97 KSP Residual norm 1.563012116603e-08 % max 9.912448943622e+00 min 5.488437852775e-05 max/min 1.806060159470e+05 98 KSP preconditioned resid norm 1.532053893409e-08 true resid norm 3.219757242443e-01 ||r(i)||/||b|| 5.089403385385e-01 98 KSP Residual norm 1.532053893409e-08 % max 9.912462355327e+00 min 4.767224242002e-05 max/min 2.079294334005e+05 99 KSP preconditioned resid norm 1.497377114129e-08 true resid norm 3.092597005441e-01 ||r(i)||/||b|| 4.888403840403e-01 99 KSP Residual norm 1.497377114129e-08 % max 9.912980027078e+00 min 4.250681360104e-05 max/min 2.332092007677e+05 100 KSP preconditioned resid norm 1.467327275303e-08 true resid norm 2.986174217023e-01 ||r(i)||/||b|| 4.720183549595e-01 100 KSP Residual norm 1.467327275303e-08 % max 9.912991792085e+00 min 3.923846350091e-05 max/min 2.526345556791e+05 101 KSP preconditioned resid norm 1.442486143582e-08 true resid norm 2.895399487479e-01 ||r(i)||/||b|| 4.576697820373e-01 101 KSP Residual norm 1.442486143582e-08 % max 9.913176892700e+00 min 3.690617907051e-05 max/min 2.686048012112e+05 102 KSP preconditioned resid norm 1.420878540463e-08 true resid norm 2.818595095401e-01 ||r(i)||/||b|| 4.455294713363e-01 102 KSP Residual norm 1.420878540463e-08 % max 9.913184584644e+00 min 3.525622450935e-05 max/min 2.811754441266e+05 103 KSP preconditioned resid norm 
1.394867857384e-08 true resid norm 2.728700074720e-01 ||r(i)||/||b|| 4.313199521665e-01 103 KSP Residual norm 1.394867857384e-08 % max 9.913184754083e+00 min 3.359065177627e-05 max/min 2.951173683711e+05 104 KSP preconditioned resid norm 1.361967798602e-08 true resid norm 2.618731154865e-01 ||r(i)||/||b|| 4.139373934561e-01 104 KSP Residual norm 1.361967798602e-08 % max 9.913195023405e+00 min 3.179302216209e-05 max/min 3.118041113822e+05 105 KSP preconditioned resid norm 1.325728782686e-08 true resid norm 2.505110349419e-01 ||r(i)||/||b|| 3.959775887769e-01 105 KSP Residual norm 1.325728782686e-08 % max 9.913373149107e+00 min 3.017902902755e-05 max/min 3.284854903734e+05 106 KSP preconditioned resid norm 1.274686747953e-08 true resid norm 2.359244406376e-01 ||r(i)||/||b|| 3.729208621843e-01 106 KSP Residual norm 1.274686747953e-08 % max 9.913956033293e+00 min 2.841281476492e-05 max/min 3.489255153112e+05 107 KSP preconditioned resid norm 1.226462735794e-08 true resid norm 2.229402253412e-01 ||r(i)||/||b|| 3.523969828015e-01 107 KSP Residual norm 1.226462735794e-08 % max 9.914139670727e+00 min 2.702981177754e-05 max/min 3.667853758037e+05 108 KSP preconditioned resid norm 1.178822925406e-08 true resid norm 2.110417020056e-01 ||r(i)||/||b|| 3.335892341467e-01 108 KSP Residual norm 1.178822925406e-08 % max 9.914145602125e+00 min 2.587391774620e-05 max/min 3.831714122064e+05 109 KSP preconditioned resid norm 1.124711770554e-08 true resid norm 1.987259952756e-01 ||r(i)||/||b|| 3.141220523670e-01 109 KSP Residual norm 1.124711770554e-08 % max 9.914354741568e+00 min 2.476929124966e-05 max/min 4.002680028927e+05 110 KSP preconditioned resid norm 1.066970872129e-08 true resid norm 1.861895786153e-01 ||r(i)||/||b|| 2.943059989856e-01 110 KSP Residual norm 1.066970872129e-08 % max 9.914357883390e+00 min 2.374270452184e-05 max/min 4.175749175613e+05 111 KSP preconditioned resid norm 1.012599205842e-08 true resid norm 1.757986089120e-01 ||r(i)||/||b|| 2.778812090393e-01 111 KSP Residual norm 1.012599205842e-08 % max 9.914370592801e+00 min 2.292071171331e-05 max/min 4.325507303965e+05 112 KSP preconditioned resid norm 9.672164972334e-09 true resid norm 1.683403272714e-01 ||r(i)||/||b|| 2.660920581894e-01 112 KSP Residual norm 9.672164972334e-09 % max 9.914736313673e+00 min 2.232161659891e-05 max/min 4.441764452741e+05 113 KSP preconditioned resid norm 9.299198823689e-09 true resid norm 1.629158622006e-01 ||r(i)||/||b|| 2.575177189407e-01 113 KSP Residual norm 9.299198823689e-09 % max 9.914935691241e+00 min 2.188325766675e-05 max/min 4.530831671515e+05 114 KSP preconditioned resid norm 9.034077005430e-09 true resid norm 1.595873484943e-01 ||r(i)||/||b|| 2.522564064722e-01 114 KSP Residual norm 9.034077005430e-09 % max 9.914996347981e+00 min 2.160694766564e-05 max/min 4.588800094031e+05 115 KSP preconditioned resid norm 8.888865878467e-09 true resid norm 1.580113210268e-01 ||r(i)||/||b|| 2.497652125950e-01 115 KSP Residual norm 8.888865878467e-09 % max 9.915030824457e+00 min 2.147558458967e-05 max/min 4.616885180964e+05 116 KSP preconditioned resid norm 8.771135905853e-09 true resid norm 1.568621931485e-01 ||r(i)||/||b|| 2.479488100300e-01 116 KSP Residual norm 8.771135905853e-09 % max 9.915048169257e+00 min 2.137210965771e-05 max/min 4.639246348655e+05 117 KSP preconditioned resid norm 8.659171467993e-09 true resid norm 1.558355526791e-01 ||r(i)||/||b|| 2.463260207675e-01 117 KSP Residual norm 8.659171467993e-09 % max 9.915307398147e+00 min 2.127637945733e-05 max/min 4.660241850844e+05 118 KSP 
preconditioned resid norm 8.575179217480e-09 true resid norm 1.550972317778e-01 ||r(i)||/||b|| 2.451589722568e-01 118 KSP Residual norm 8.575179217480e-09 % max 9.916072938441e+00 min 2.121014916472e-05 max/min 4.675154739098e+05 119 KSP preconditioned resid norm 8.518079720132e-09 true resid norm 1.544593367397e-01 ||r(i)||/||b|| 2.441506648218e-01 119 KSP Residual norm 8.518079720132e-09 % max 9.916686583796e+00 min 2.115333030509e-05 max/min 4.688002522897e+05 120 KSP preconditioned resid norm 8.478238591051e-09 true resid norm 1.539772069403e-01 ||r(i)||/||b|| 2.433885722637e-01 120 KSP Residual norm 8.478238591051e-09 % max 9.916686588373e+00 min 2.111527934504e-05 max/min 4.696450577957e+05 121 KSP preconditioned resid norm 8.444941671542e-09 true resid norm 1.535936004591e-01 ||r(i)||/||b|| 2.427822134680e-01 121 KSP Residual norm 8.444941671542e-09 % max 9.916688169159e+00 min 2.108428257901e-05 max/min 4.703355749478e+05 122 KSP preconditioned resid norm 8.427676435678e-09 true resid norm 1.533605832162e-01 ||r(i)||/||b|| 2.424138879530e-01 122 KSP Residual norm 8.427676435678e-09 % max 9.916689320407e+00 min 2.106414747089e-05 max/min 4.707852209122e+05 123 KSP preconditioned resid norm 8.415410922025e-09 true resid norm 1.532305930189e-01 ||r(i)||/||b|| 2.422084151485e-01 123 KSP Residual norm 8.415410922025e-09 % max 9.917057372015e+00 min 2.105353113559e-05 max/min 4.710400981264e+05 124 KSP preconditioned resid norm 8.408154019827e-09 true resid norm 1.531806523846e-01 ||r(i)||/||b|| 2.421294750254e-01 124 KSP Residual norm 8.408154019827e-09 % max 9.917120919215e+00 min 2.104834192197e-05 max/min 4.711592464612e+05 125 KSP preconditioned resid norm 8.393802244658e-09 true resid norm 1.530793493932e-01 ||r(i)||/||b|| 2.419693474913e-01 125 KSP Residual norm 8.393802244658e-09 % max 9.917132534500e+00 min 2.103675625602e-05 max/min 4.714192822224e+05 126 KSP preconditioned resid norm 8.379753801792e-09 true resid norm 1.529688114683e-01 ||r(i)||/||b|| 2.417946224898e-01 126 KSP Residual norm 8.379753801792e-09 % max 9.917431785146e+00 min 2.102585361587e-05 max/min 4.716779621095e+05 127 KSP preconditioned resid norm 8.371571974323e-09 true resid norm 1.528853003941e-01 ||r(i)||/||b|| 2.416626182697e-01 127 KSP Residual norm 8.371571974323e-09 % max 9.917505129146e+00 min 2.101791298538e-05 max/min 4.718596530514e+05 128 KSP preconditioned resid norm 8.367539460475e-09 true resid norm 1.528216413116e-01 ||r(i)||/||b|| 2.415619936803e-01 128 KSP Residual norm 8.367539460475e-09 % max 9.917526993547e+00 min 2.101234223878e-05 max/min 4.719857920096e+05 129 KSP preconditioned resid norm 8.356446303986e-09 true resid norm 1.526541189646e-01 ||r(i)||/||b|| 2.412971945866e-01 129 KSP Residual norm 8.356446303986e-09 % max 9.917537226333e+00 min 2.099928136381e-05 max/min 4.722798392247e+05 130 KSP preconditioned resid norm 8.331561543348e-09 true resid norm 1.523523830474e-01 ||r(i)||/||b|| 2.408202468906e-01 130 KSP Residual norm 8.331561543348e-09 % max 9.917571522876e+00 min 2.097331245945e-05 max/min 4.728662457134e+05 131 KSP preconditioned resid norm 8.300069240991e-09 true resid norm 1.520236092506e-01 ||r(i)||/||b|| 2.403005609800e-01 131 KSP Residual norm 8.300069240991e-09 % max 9.917634344282e+00 min 2.094526077576e-05 max/min 4.735025479253e+05 132 KSP preconditioned resid norm 8.273052072454e-09 true resid norm 1.517967993701e-01 ||r(i)||/||b|| 2.399420473136e-01 132 KSP Residual norm 8.273052072454e-09 % max 9.917715776764e+00 min 2.092540881867e-05 max/min 
4.739556518445e+05 133 KSP preconditioned resid norm 8.251281894079e-09 true resid norm 1.516304506087e-01 ||r(i)||/||b|| 2.396791032823e-01 133 KSP Residual norm 8.251281894079e-09 % max 9.917715781180e+00 min 2.090929722571e-05 max/min 4.743208570868e+05 134 KSP preconditioned resid norm 8.226033728737e-09 true resid norm 1.514349562547e-01 ||r(i)||/||b|| 2.393700894181e-01 134 KSP Residual norm 8.226033728737e-09 % max 9.917731867345e+00 min 2.089015001649e-05 max/min 4.747563736744e+05 135 KSP preconditioned resid norm 8.199286605060e-09 true resid norm 1.512447576528e-01 ||r(i)||/||b|| 2.390694464393e-01 135 KSP Residual norm 8.199286605060e-09 % max 9.917748489841e+00 min 2.087043434554e-05 max/min 4.752056581880e+05 136 KSP preconditioned resid norm 8.180296594388e-09 true resid norm 1.510479857878e-01 ||r(i)||/||b|| 2.387584132402e-01 136 KSP Residual norm 8.180296594388e-09 % max 9.917785029298e+00 min 2.085325140445e-05 max/min 4.755989767228e+05 137 KSP preconditioned resid norm 8.169264503348e-09 true resid norm 1.509040950747e-01 ||r(i)||/||b|| 2.385309681792e-01 137 KSP Residual norm 8.169264503348e-09 % max 9.917785730737e+00 min 2.084031855443e-05 max/min 4.758941522334e+05 138 KSP preconditioned resid norm 8.156637560480e-09 true resid norm 1.507559867333e-01 ||r(i)||/||b|| 2.382968564007e-01 138 KSP Residual norm 8.156637560480e-09 % max 9.917796791468e+00 min 2.082708164521e-05 max/min 4.761971437199e+05 139 KSP preconditioned resid norm 8.146852248061e-09 true resid norm 1.506511059489e-01 ||r(i)||/||b|| 2.381310735237e-01 139 KSP Residual norm 8.146852248061e-09 % max 9.917821676165e+00 min 2.081598847234e-05 max/min 4.764521122473e+05 140 KSP preconditioned resid norm 8.141191465029e-09 true resid norm 1.505854905603e-01 ||r(i)||/||b|| 2.380273566420e-01 140 KSP Residual norm 8.141191465029e-09 % max 9.917845903968e+00 min 2.080955754602e-05 max/min 4.766005179128e+05 141 KSP preconditioned resid norm 8.138923553302e-09 true resid norm 1.505722123545e-01 ||r(i)||/||b|| 2.380063680578e-01 141 KSP Residual norm 8.138923553302e-09 % max 9.917872635574e+00 min 2.080833075673e-05 max/min 4.766299013373e+05 142 KSP preconditioned resid norm 8.133951815022e-09 true resid norm 1.505617094662e-01 ||r(i)||/||b|| 2.379897663606e-01 142 KSP Residual norm 8.133951815022e-09 % max 9.917872952600e+00 min 2.080597338078e-05 max/min 4.766839200977e+05 143 KSP preconditioned resid norm 8.130475958447e-09 true resid norm 1.505390404850e-01 ||r(i)||/||b|| 2.379539339729e-01 143 KSP Residual norm 8.130475958447e-09 % max 9.917874128718e+00 min 2.080394570826e-05 max/min 4.767304369950e+05 144 KSP preconditioned resid norm 8.128953889987e-09 true resid norm 1.505364291420e-01 ||r(i)||/||b|| 2.379498062773e-01 144 KSP Residual norm 8.128953889987e-09 % max 9.917915384229e+00 min 2.080337902646e-05 max/min 4.767454061966e+05 145 KSP preconditioned resid norm 8.128697515577e-09 true resid norm 1.505360974540e-01 ||r(i)||/||b|| 2.379492819850e-01 145 KSP Residual norm 8.128697515577e-09 % max 9.917928714401e+00 min 2.080337467554e-05 max/min 4.767461466750e+05 146 KSP preconditioned resid norm 8.128385333914e-09 true resid norm 1.505304085835e-01 ||r(i)||/||b|| 2.379402897056e-01 146 KSP Residual norm 8.128385333914e-09 % max 9.918146741049e+00 min 2.080294082776e-05 max/min 4.767665698407e+05 147 KSP preconditioned resid norm 8.127949324855e-09 true resid norm 1.505226583248e-01 ||r(i)||/||b|| 2.379280390325e-01 147 KSP Residual norm 8.127949324855e-09 % max 9.918156051888e+00 min 
2.080250556219e-05 max/min 4.767769931482e+05 148 KSP preconditioned resid norm 8.126892921741e-09 true resid norm 1.505086679169e-01 ||r(i)||/||b|| 2.379059246855e-01 148 KSP Residual norm 8.126892921741e-09 % max 9.918171317513e+00 min 2.080184752072e-05 max/min 4.767928092749e+05 149 KSP preconditioned resid norm 8.125786411061e-09 true resid norm 1.504901877800e-01 ||r(i)||/||b|| 2.378767135170e-01 149 KSP Residual norm 8.125786411061e-09 % max 9.918205450251e+00 min 2.080029672887e-05 max/min 4.768299981263e+05 150 KSP preconditioned resid norm 8.121865705516e-09 true resid norm 1.504379984469e-01 ||r(i)||/||b|| 2.377942189223e-01 150 KSP Residual norm 8.121865705516e-09 % max 9.918295492207e+00 min 2.079653971654e-05 max/min 4.769204698183e+05 151 KSP preconditioned resid norm 8.114591200267e-09 true resid norm 1.503744219252e-01 ||r(i)||/||b|| 2.376937248352e-01 151 KSP Residual norm 8.114591200267e-09 % max 9.918302707337e+00 min 2.079103745413e-05 max/min 4.770470318867e+05 152 KSP preconditioned resid norm 8.108201162281e-09 true resid norm 1.503020846718e-01 ||r(i)||/||b|| 2.375793828416e-01 152 KSP Residual norm 8.108201162281e-09 % max 9.918303753789e+00 min 2.078516129771e-05 max/min 4.771819478198e+05 153 KSP preconditioned resid norm 8.102490482870e-09 true resid norm 1.502240682865e-01 ||r(i)||/||b|| 2.374560639621e-01 153 KSP Residual norm 8.102490482870e-09 % max 9.918303965254e+00 min 2.077946249056e-05 max/min 4.773128260541e+05 154 KSP preconditioned resid norm 8.087174026596e-09 true resid norm 1.500412418392e-01 ||r(i)||/||b|| 2.371670739948e-01 154 KSP Residual norm 8.087174026596e-09 % max 9.918315546883e+00 min 2.076640155913e-05 max/min 4.776135874404e+05 155 KSP preconditioned resid norm 8.057393656065e-09 true resid norm 1.496423828383e-01 ||r(i)||/||b|| 2.365366058580e-01 155 KSP Residual norm 8.057393656065e-09 % max 9.918465324870e+00 min 2.073545587686e-05 max/min 4.783336032625e+05 156 KSP preconditioned resid norm 8.011693187532e-09 true resid norm 1.490284191223e-01 ||r(i)||/||b|| 2.355661261668e-01 156 KSP Residual norm 8.011693187532e-09 % max 9.918466288675e+00 min 2.068954861804e-05 max/min 4.793950062316e+05 157 KSP preconditioned resid norm 7.959013340300e-09 true resid norm 1.483574761823e-01 ||r(i)||/||b|| 2.345055805998e-01 157 KSP Residual norm 7.959013340300e-09 % max 9.918561932305e+00 min 2.064150405350e-05 max/min 4.805154656656e+05 158 KSP preconditioned resid norm 7.894240455844e-09 true resid norm 1.476088146004e-01 ||r(i)||/||b|| 2.333221867900e-01 158 KSP Residual norm 7.894240455844e-09 % max 9.918657124651e+00 min 2.058238758689e-05 max/min 4.819002208941e+05 159 KSP preconditioned resid norm 7.812904666749e-09 true resid norm 1.467208127766e-01 ||r(i)||/||b|| 2.319185407546e-01 159 KSP Residual norm 7.812904666749e-09 % max 9.919093811117e+00 min 2.051599215020e-05 max/min 4.834810687437e+05 160 KSP preconditioned resid norm 7.741530539562e-09 true resid norm 1.459917281627e-01 ||r(i)||/||b|| 2.307660918517e-01 160 KSP Residual norm 7.741530539562e-09 % max 9.919095332651e+00 min 2.045886641493e-05 max/min 4.848311304977e+05 161 KSP preconditioned resid norm 7.676609426916e-09 true resid norm 1.453051987717e-01 ||r(i)||/||b|| 2.296809091053e-01 161 KSP Residual norm 7.676609426916e-09 % max 9.919292102450e+00 min 2.040917630477e-05 max/min 4.860211874465e+05 162 KSP preconditioned resid norm 7.583530085636e-09 true resid norm 1.443036525464e-01 ||r(i)||/||b|| 2.280977857933e-01 162 KSP Residual norm 7.583530085636e-09 % max 
9.919308484025e+00 min 2.033498476484e-05 max/min 4.877952257517e+05 163 KSP preconditioned resid norm 7.467692143507e-09 true resid norm 1.429522123636e-01 ||r(i)||/||b|| 2.259615923713e-01 163 KSP Residual norm 7.467692143507e-09 % max 9.919855464029e+00 min 2.024129556919e-05 max/min 4.900800657804e+05 164 KSP preconditioned resid norm 7.327851148024e-09 true resid norm 1.414046163109e-01 ||r(i)||/||b|| 2.235153394408e-01 164 KSP Residual norm 7.327851148024e-09 % max 9.920253012186e+00 min 2.012547547329e-05 max/min 4.929201809594e+05 165 KSP preconditioned resid norm 7.152579465113e-09 true resid norm 1.394666900679e-01 ||r(i)||/||b|| 2.204520996873e-01 165 KSP Residual norm 7.152579465113e-09 % max 9.920271087623e+00 min 1.998671358111e-05 max/min 4.963432856214e+05 166 KSP preconditioned resid norm 6.938769833629e-09 true resid norm 1.370973027073e-01 ||r(i)||/||b|| 2.167068583084e-01 166 KSP Residual norm 6.938769833629e-09 % max 9.920293566357e+00 min 1.981527794098e-05 max/min 5.006386282295e+05 167 KSP preconditioned resid norm 6.741207402281e-09 true resid norm 1.351840392321e-01 ||r(i)||/||b|| 2.136826024797e-01 167 KSP Residual norm 6.741207402281e-09 % max 9.920596954025e+00 min 1.967349814668e-05 max/min 5.042619711075e+05 168 KSP preconditioned resid norm 6.565143103123e-09 true resid norm 1.336623889818e-01 ||r(i)||/||b|| 2.112773615401e-01 168 KSP Residual norm 6.565143103123e-09 % max 9.921114523994e+00 min 1.956018369412e-05 max/min 5.072096806011e+05 169 KSP preconditioned resid norm 6.429632478813e-09 true resid norm 1.325320084241e-01 ||r(i)||/||b|| 2.094905924751e-01 169 KSP Residual norm 6.429632478813e-09 % max 9.921249706269e+00 min 1.947508804310e-05 max/min 5.094328551589e+05 170 KSP preconditioned resid norm 6.351340021846e-09 true resid norm 1.319352335272e-01 ||r(i)||/||b|| 2.085472827931e-01 170 KSP Residual norm 6.351340021846e-09 % max 9.921400860428e+00 min 1.943209479700e-05 max/min 5.105677470224e+05 171 KSP preconditioned resid norm 6.305273930969e-09 true resid norm 1.315826886900e-01 ||r(i)||/||b|| 2.079900224928e-01 171 KSP Residual norm 6.305273930969e-09 % max 9.921731787878e+00 min 1.940477171363e-05 max/min 5.113037109789e+05 172 KSP preconditioned resid norm 6.279015387005e-09 true resid norm 1.313487942570e-01 ||r(i)||/||b|| 2.076203104216e-01 172 KSP Residual norm 6.279015387005e-09 % max 9.922029289988e+00 min 1.939038385409e-05 max/min 5.116984462324e+05 173 KSP preconditioned resid norm 6.254268607814e-09 true resid norm 1.311057590476e-01 ||r(i)||/||b|| 2.072361497149e-01 173 KSP Residual norm 6.254268607814e-09 % max 9.922081473522e+00 min 1.937353844196e-05 max/min 5.121460647596e+05 174 KSP preconditioned resid norm 6.237354310290e-09 true resid norm 1.308928887713e-01 ||r(i)||/||b|| 2.068996700913e-01 174 KSP Residual norm 6.237354310290e-09 % max 9.922086586645e+00 min 1.935976905538e-05 max/min 5.125105861678e+05 175 KSP preconditioned resid norm 6.228697463260e-09 true resid norm 1.307776586248e-01 ||r(i)||/||b|| 2.067175281925e-01 175 KSP Residual norm 6.228697463260e-09 % max 9.922153120984e+00 min 1.935153803598e-05 max/min 5.127320165733e+05 176 KSP preconditioned resid norm 6.224066688220e-09 true resid norm 1.307217422786e-01 ||r(i)||/||b|| 2.066291423857e-01 176 KSP Residual norm 6.224066688220e-09 % max 9.922168914276e+00 min 1.934716877667e-05 max/min 5.128486254920e+05 177 KSP preconditioned resid norm 6.218867118158e-09 true resid norm 1.306556581492e-01 ||r(i)||/||b|| 2.065246845753e-01 177 KSP Residual norm 
6.218867118158e-09 % max 9.922226850895e+00 min 1.934272085613e-05 max/min 5.129695519413e+05 178 KSP preconditioned resid norm 6.214644508175e-09 true resid norm 1.306129149611e-01 ||r(i)||/||b|| 2.064571213057e-01 178 KSP Residual norm 6.214644508175e-09 % max 9.922355708265e+00 min 1.933886017403e-05 max/min 5.130786209205e+05 179 KSP preconditioned resid norm 6.212855197040e-09 true resid norm 1.305891120840e-01 ||r(i)||/||b|| 2.064194965924e-01 179 KSP Residual norm 6.212855197040e-09 % max 9.922451375570e+00 min 1.933749030032e-05 max/min 5.131199148117e+05 180 KSP preconditioned resid norm 6.212625281595e-09 true resid norm 1.305842539484e-01 ||r(i)||/||b|| 2.064118174383e-01 180 KSP Residual norm 6.212625281595e-09 % max 9.922736461793e+00 min 1.933715169437e-05 max/min 5.131436427984e+05 181 KSP preconditioned resid norm 6.212400733084e-09 true resid norm 1.305772830113e-01 ||r(i)||/||b|| 2.064007986230e-01 181 KSP Residual norm 6.212400733084e-09 % max 9.923063598858e+00 min 1.933684249949e-05 max/min 5.131687657444e+05 182 KSP preconditioned resid norm 6.212071952354e-09 true resid norm 1.305684442926e-01 ||r(i)||/||b|| 2.063868274438e-01 182 KSP Residual norm 6.212071952354e-09 % max 9.923063744763e+00 min 1.933594544036e-05 max/min 5.131925809044e+05 183 KSP preconditioned resid norm 6.211566094656e-09 true resid norm 1.305591570573e-01 ||r(i)||/||b|| 2.063721473039e-01 183 KSP Residual norm 6.211566094656e-09 % max 9.923141089945e+00 min 1.933537478778e-05 max/min 5.132117271509e+05 184 KSP preconditioned resid norm 6.210461787001e-09 true resid norm 1.305375085549e-01 ||r(i)||/||b|| 2.063379279660e-01 184 KSP Residual norm 6.210461787001e-09 % max 9.923201283329e+00 min 1.933367383203e-05 max/min 5.132599923606e+05 185 KSP preconditioned resid norm 6.209143778669e-09 true resid norm 1.305127223575e-01 ||r(i)||/||b|| 2.062987489387e-01 185 KSP Residual norm 6.209143778669e-09 % max 9.923212917908e+00 min 1.933144232370e-05 max/min 5.133198419313e+05 186 KSP preconditioned resid norm 6.208771445688e-09 true resid norm 1.305097304722e-01 ||r(i)||/||b|| 2.062940197276e-01 186 KSP Residual norm 6.208771445688e-09 % max 9.923281135224e+00 min 1.933117841583e-05 max/min 5.133303786126e+05 187 KSP preconditioned resid norm 6.208512520426e-09 true resid norm 1.305036180825e-01 ||r(i)||/||b|| 2.062843580003e-01 187 KSP Residual norm 6.208512520426e-09 % max 9.923314434905e+00 min 1.933056167219e-05 max/min 5.133484791174e+05 188 KSP preconditioned resid norm 6.207066981762e-09 true resid norm 1.304777614033e-01 ||r(i)||/||b|| 2.062434868846e-01 188 KSP Residual norm 6.207066981762e-09 % max 9.923347740518e+00 min 1.932866808571e-05 max/min 5.134004938424e+05 189 KSP preconditioned resid norm 6.204934149804e-09 true resid norm 1.304392715020e-01 ||r(i)||/||b|| 2.061826466973e-01 189 KSP Residual norm 6.204934149804e-09 % max 9.923362596625e+00 min 1.932574598338e-05 max/min 5.134788900339e+05 190 KSP preconditioned resid norm 6.204155842095e-09 true resid norm 1.304126326631e-01 ||r(i)||/||b|| 2.061405392380e-01 190 KSP Residual norm 6.204155842095e-09 % max 9.923514343512e+00 min 1.932412287621e-05 max/min 5.135298718127e+05 191 KSP preconditioned resid norm 6.203518614933e-09 true resid norm 1.303882235745e-01 ||r(i)||/||b|| 2.061019562988e-01 191 KSP Residual norm 6.203518614933e-09 % max 9.923630839564e+00 min 1.932179761687e-05 max/min 5.135977012253e+05 192 KSP preconditioned resid norm 6.201048863957e-09 true resid norm 1.303177921103e-01 ||r(i)||/||b|| 2.059906267464e-01 192 
KSP Residual norm 6.201048863957e-09 % max 9.923652649804e+00 min 1.931812341897e-05 max/min 5.136965136096e+05 193 KSP preconditioned resid norm 6.192716632256e-09 true resid norm 1.301249902732e-01 ||r(i)||/||b|| 2.056858688878e-01 193 KSP Residual norm 6.192716632256e-09 % max 9.923855558469e+00 min 1.930707285429e-05 max/min 5.140010416578e+05 194 KSP preconditioned resid norm 6.181798841032e-09 true resid norm 1.298629857002e-01 ||r(i)||/||b|| 2.052717237022e-01 194 KSP Residual norm 6.181798841032e-09 % max 9.924185498149e+00 min 1.929298985847e-05 max/min 5.143933403247e+05 195 KSP preconditioned resid norm 6.171531561212e-09 true resid norm 1.295856208773e-01 ||r(i)||/||b|| 2.048332988886e-01 195 KSP Residual norm 6.171531561212e-09 % max 9.924254204639e+00 min 1.928113063039e-05 max/min 5.147132911902e+05 196 KSP preconditioned resid norm 6.153444479612e-09 true resid norm 1.291447908864e-01 ||r(i)||/||b|| 2.041364880798e-01 196 KSP Residual norm 6.153444479612e-09 % max 9.924316134490e+00 min 1.926027722186e-05 max/min 5.152737948769e+05 197 KSP preconditioned resid norm 6.119723945157e-09 true resid norm 1.283308296537e-01 ||r(i)||/||b|| 2.028498764687e-01 197 KSP Residual norm 6.119723945157e-09 % max 9.924316422607e+00 min 1.922390011138e-05 max/min 5.162488550767e+05 198 KSP preconditioned resid norm 6.067997285763e-09 true resid norm 1.271271480671e-01 ||r(i)||/||b|| 2.009472419903e-01 198 KSP Residual norm 6.067997285763e-09 % max 9.924442950178e+00 min 1.916694880677e-05 max/min 5.177894014447e+05 199 KSP preconditioned resid norm 6.004429863992e-09 true resid norm 1.256627003698e-01 ||r(i)||/||b|| 1.986324199378e-01 199 KSP Residual norm 6.004429863992e-09 % max 9.924866956836e+00 min 1.909406330276e-05 max/min 5.197881037403e+05 200 KSP preconditioned resid norm 5.934094254933e-09 true resid norm 1.240250400661e-01 ||r(i)||/||b|| 1.960438043167e-01 200 KSP Residual norm 5.934094254933e-09 % max 9.924973873650e+00 min 1.900999030530e-05 max/min 5.220925268374e+05 201 KSP preconditioned resid norm 5.851144210814e-09 true resid norm 1.220942283818e-01 ||r(i)||/||b|| 1.929918103984e-01 201 KSP Residual norm 5.851144210814e-09 % max 9.925050713110e+00 min 1.890466927565e-05 max/min 5.250052549660e+05 202 KSP preconditioned resid norm 5.765248380990e-09 true resid norm 1.201835481423e-01 ||r(i)||/||b|| 1.899716378366e-01 202 KSP Residual norm 5.765248380990e-09 % max 9.925088925004e+00 min 1.879800779681e-05 max/min 5.279862117458e+05 203 KSP preconditioned resid norm 5.682453748888e-09 true resid norm 1.183002353836e-01 ||r(i)||/||b|| 1.869947244831e-01 203 KSP Residual norm 5.682453748888e-09 % max 9.925585994328e+00 min 1.868987652967e-05 max/min 5.310674994866e+05 204 KSP preconditioned resid norm 5.601209052384e-09 true resid norm 1.165424983140e-01 ||r(i)||/||b|| 1.842163060127e-01 204 KSP Residual norm 5.601209052384e-09 % max 9.926813616373e+00 min 1.859493580288e-05 max/min 5.338450060600e+05 205 KSP preconditioned resid norm 5.518585052549e-09 true resid norm 1.147552536083e-01 ||r(i)||/||b|| 1.813912454350e-01 205 KSP Residual norm 5.518585052549e-09 % max 9.927547139863e+00 min 1.849785740830e-05 max/min 5.366863264612e+05 206 KSP preconditioned resid norm 5.403946681825e-09 true resid norm 1.122288671650e-01 ||r(i)||/||b|| 1.773978388675e-01 206 KSP Residual norm 5.403946681825e-09 % max 9.927598607359e+00 min 1.836227794968e-05 max/min 5.406517990068e+05 207 KSP preconditioned resid norm 5.279599429018e-09 true resid norm 1.096186442675e-01 ||r(i)||/||b|| 
1.732719137586e-01
207 KSP Residual norm 5.279599429018e-09 % max 9.928135298387e+00 min 1.822407087542e-05 max/min 5.447814248670e+05
208 KSP preconditioned resid norm 5.196372449667e-09 true resid norm 1.079356390651e-01 ||r(i)||/||b|| 1.706116223982e-01
208 KSP Residual norm 5.196372449667e-09 % max 9.929022382947e+00 min 1.813084812731e-05 max/min 5.476314352879e+05
[... monitor output for iterations 209 through 499 omitted: the preconditioned residual norm decreases slowly from about 5.1e-09 to 7.9e-10 while the true residual norm stalls near 7.70e-02 (||r(i)||/||b|| ~ 1.22e-01) ...]
2.249543323810e+02 min 1.539107539327e-05 max/min 1.461589438249e+07 491 KSP preconditioned resid norm 8.017807965749e-10 true resid norm 7.697181733368e-02 ||r(i)||/||b|| 1.216677526347e-01 491 KSP Residual norm 8.017807965749e-10 % max 2.258094810186e+02 min 1.539084872474e-05 max/min 1.467167178737e+07 492 KSP preconditioned resid norm 8.003082099158e-10 true resid norm 7.697125710155e-02 ||r(i)||/||b|| 1.216668670874e-01 492 KSP Residual norm 8.003082099158e-10 % max 2.266646302229e+02 min 1.539062371415e-05 max/min 1.472744928554e+07 493 KSP preconditioned resid norm 7.988437074224e-10 true resid norm 7.697070145112e-02 ||r(i)||/||b|| 1.216659887823e-01 493 KSP Residual norm 7.988437074224e-10 % max 2.275197799868e+02 min 1.539040036644e-05 max/min 1.478322685373e+07 494 KSP preconditioned resid norm 7.973872153972e-10 true resid norm 7.697015032934e-02 ||r(i)||/||b|| 1.216651176356e-01 494 KSP Residual norm 7.973872153972e-10 % max 2.283749303036e+02 min 1.539017865071e-05 max/min 1.483900450324e+07 495 KSP preconditioned resid norm 7.959386610801e-10 true resid norm 7.696960368018e-02 ||r(i)||/||b|| 1.216642535586e-01 495 KSP Residual norm 7.959386610801e-10 % max 2.292300811662e+02 min 1.538995854689e-05 max/min 1.489478223530e+07 496 KSP preconditioned resid norm 7.944979726328e-10 true resid norm 7.696906145072e-02 ||r(i)||/||b|| 1.216633964678e-01 496 KSP Residual norm 7.944979726328e-10 % max 2.300852325680e+02 min 1.538974004078e-05 max/min 1.495056004574e+07 497 KSP preconditioned resid norm 7.930650791238e-10 true resid norm 7.696852358851e-02 ||r(i)||/||b|| 1.216625462801e-01 497 KSP Residual norm 7.930650791238e-10 % max 2.309403845025e+02 min 1.538952311570e-05 max/min 1.500633793304e+07 498 KSP preconditioned resid norm 7.916399105142e-10 true resid norm 7.696799004048e-02 ||r(i)||/||b|| 1.216617029118e-01 498 KSP Residual norm 7.916399105142e-10 % max 2.317955369631e+02 min 1.538930774757e-05 max/min 1.506211590315e+07 499 KSP preconditioned resid norm 7.902223976426e-10 true resid norm 7.696746075603e-02 ||r(i)||/||b|| 1.216608662829e-01 499 KSP Residual norm 7.902223976426e-10 % max 2.326506899435e+02 min 1.538909393078e-05 max/min 1.511789394424e+07 500 KSP preconditioned resid norm 7.888124722116e-10 true resid norm 7.696693568410e-02 ||r(i)||/||b|| 1.216600363126e-01 500 KSP Residual norm 7.888124722116e-10 % max 2.335058434373e+02 min 1.538888164219e-05 max/min 1.517367206185e+07
Linear solve did not converge due to DIVERGED_ITS iterations 500
KSP Object: 4 MPI processes
  type: gmres
    restart=500, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=500, initial guess is zero
  tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: jacobi
  linear system matrix = precond matrix:
  Mat Object: 4 MPI processes
    type: mpiaij
    rows=14924, cols=14924
    total: nonzeros=1393670, allocated nonzeros=1393670
    total number of mallocs used during MatSetValues calls =0
      not using I-node (on process 0) routines

Krylov solver did not converge in 500 iters; Petsc reason: DIVERGED_ITS, with residual norm 7.88812e-10
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 14460 on node rook exiting
improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did.
This can cause a job to hang indefinitely while it waits for all processes
to call "init". By rule, if one process calls "init", then ALL processes
must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

You can avoid this message by specifying -quiet on the mpirun command line.
--------------------------------------------------------------------------

Thank you very much for your time and help.

Kind Regards,
Shidi

From knepley at gmail.com  Fri May 18 04:58:22 2018
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 18 May 2018 05:58:22 -0400
Subject: [petsc-users] Linear iterative solver cannot converge
In-Reply-To: <7bab65849037a08d487af6f860bc2658@cam.ac.uk>
References: <7bab65849037a08d487af6f860bc2658@cam.ac.uk>
Message-ID:

On Fri, May 18, 2018 at 5:54 AM, Y. Shidi wrote:

> Hello all,
>
> I do not have much knowledge of linear iterative solvers, so when I use
> the PETSc Krylov solvers I try several combinations (e.g. cg+jacobi,
> gmres+hypre, etc.), but I cannot get a converged solution. I have used
> 'preonly' and 'lu' to check that the system is correctly constructed.
>
> Below is the ksp log output with 500 iterations.
>

This is only 411 iterations, and we cannot see what solver was used. In
general, the iterative solver is very sensitive to the system. We recommend
checking the literature and starting from a solver that has already worked
for someone on a similar problem.
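For example, a minimal petsc4py sketch of trying a few such combinations on
an already assembled system might look like the following (illustration only:
it assumes a Mat A and Vec b like those in the earlier example, and the
'hypre' preconditioner is only available if PETSc was configured with hypre):

# Sketch: compare a few KSP/PC combinations on the same assembled system.
# Assumes A (PETSc Mat) and b (PETSc Vec) already exist.
from petsc4py import PETSc

combos = [('cg', 'jacobi'),      # cg assumes a symmetric positive definite system
          ('gmres', 'bjacobi'),
          ('gmres', 'hypre')]    # needs PETSc built with hypre
for ksp_type, pc_type in combos:
    ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
    ksp.setOperators(A)
    ksp.setType(ksp_type)
    ksp.getPC().setType(pc_type)
    ksp.setTolerances(rtol=1e-8, max_it=500)
    ksp.setFromOptions()         # still honours -ksp_monitor, -ksp_view, ...
    x = A.createVecRight()
    ksp.solve(b, x)
    print('%s + %s: reason %d after %d iterations'
          % (ksp_type, pc_type, ksp.getConvergedReason(), ksp.getIterationNumber()))

Running each candidate with -ksp_monitor_true_residual and
-ksp_converged_reason, as in the log you posted, makes the comparison easy
to read.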
Thanks, Matt > 0 KSP preconditioned resid norm 7.082525933041e-04 true resid norm > 6.326394271849e-01 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP Residual norm 7.082525933041e-04 % max 1.000000000000e+00 min > 1.000000000000e+00 max/min 1.000000000000e+00 > 1 KSP preconditioned resid norm 7.826722253922e-05 true resid norm > 1.564088399639e+00 ||r(i)||/||b|| 2.472322040690e+00 > 1 KSP Residual norm 7.826722253922e-05 % max 9.487098101782e-01 min > 9.487098101782e-01 max/min 1.000000000000e+00 > 2 KSP preconditioned resid norm 1.160749405043e-05 true resid norm > 6.895674436780e+00 ||r(i)||/||b|| 1.089984933039e+01 > 2 KSP Residual norm 1.160749405043e-05 % max 1.047419854967e+00 min > 8.139171190265e-01 max/min 1.286887608679e+00 > 3 KSP preconditioned resid norm 2.302649108076e-06 true resid norm > 3.399636614710e+01 ||r(i)||/||b|| 5.373734972285e+01 > 3 KSP Residual norm 2.302649108076e-06 % max 1.095412865033e+00 min > 7.524989114835e-01 max/min 1.455700265231e+00 > 4 KSP preconditioned resid norm 1.515706794298e-06 true resid norm > 7.915135424703e+00 ||r(i)||/||b|| 1.251129013556e+01 > 4 KSP Residual norm 1.515706794298e-06 % max 4.259813886522e+00 min > 7.476020986409e-01 max/min 5.697969406808e+00 > 5 KSP preconditioned resid norm 1.062659539539e-06 true resid norm > 1.291636850619e+01 ||r(i)||/||b|| 2.041663537106e+01 > 5 KSP Residual norm 1.062659539539e-06 % max 6.979478925762e+00 min > 7.421457344352e-01 max/min 9.404458722751e+00 > 6 KSP preconditioned resid norm 4.248015453585e-07 true resid norm > 7.273048463349e+00 ||r(i)||/||b|| 1.149635661456e+01 > 6 KSP Residual norm 4.248015453585e-07 % max 8.197134494997e+00 min > 7.249135442935e-01 max/min 1.130774084651e+01 > 7 KSP preconditioned resid norm 1.844877736368e-07 true resid norm > 1.800911822307e+00 ||r(i)||/||b|| 2.846663905095e+00 > 7 KSP Residual norm 1.844877736368e-07 % max 8.617693175971e+00 min > 7.171590601865e-01 max/min 1.201643213394e+01 > 8 KSP preconditioned resid norm 1.343329736018e-07 true resid norm > 8.308030162167e-01 ||r(i)||/||b|| 1.313233068501e+00 > 8 KSP Residual norm 1.343329736018e-07 % max 8.797419513003e+00 min > 7.151810153432e-01 max/min 1.230096901941e+01 > 9 KSP preconditioned resid norm 8.435244868263e-08 true resid norm > 1.039165561694e+00 ||r(i)||/||b|| 1.642587415580e+00 > 9 KSP Residual norm 8.435244868263e-08 % max 8.954896895684e+00 min > 7.106664887897e-01 max/min 1.260070235046e+01 > 10 KSP preconditioned resid norm 3.966928154020e-08 true resid norm > 5.935564227055e-01 ||r(i)||/||b|| 9.382223067360e-01 > 10 KSP Residual norm 3.966928154020e-08 % max 9.031210516030e+00 min > 7.020732475337e-01 max/min 1.286363003826e+01 > 11 KSP preconditioned resid norm 2.465535486554e-08 true resid norm > 4.008826554548e-01 ||r(i)||/||b|| 6.336668854779e-01 > 11 KSP Residual norm 2.465535486554e-08 % max 9.085166722582e+00 min > 6.977273003895e-01 max/min 1.302108534023e+01 > 12 KSP preconditioned resid norm 2.030859749117e-08 true resid norm > 3.813432885130e-01 ||r(i)||/||b|| 6.027814140670e-01 > 12 KSP Residual norm 2.030859749117e-08 % max 9.108740437496e+00 min > 6.953332846189e-01 max/min 1.309981938012e+01 > 13 KSP preconditioned resid norm 1.822679962077e-08 true resid norm > 3.872430711288e-01 ||r(i)||/||b|| 6.121070778847e-01 > 13 KSP Residual norm 1.822679962077e-08 % max 9.136924983475e+00 min > 6.787423686921e-01 max/min 1.346155095796e+01 > 14 KSP preconditioned resid norm 1.695370146944e-08 true resid norm > 3.791469790845e-01 ||r(i)||/||b|| 5.993097533799e-01 > 14 KSP Residual 
norm 1.695370146944e-08 % max 9.150838240684e+00 min > 4.836016572082e-01 max/min 1.892226402513e+01 > 15 KSP preconditioned resid norm 1.670891990697e-08 true resid norm > 3.742448806985e-01 ||r(i)||/||b|| 5.915611082981e-01 > 15 KSP Residual norm 1.670891990697e-08 % max 9.168112723795e+00 min > 3.589247137480e-01 max/min 2.554327515667e+01 > 16 KSP preconditioned resid norm 1.664236372933e-08 true resid norm > 3.722478007586e-01 ||r(i)||/||b|| 5.884043655247e-01 > 16 KSP Residual norm 1.664236372933e-08 % max 9.178525399198e+00 min > 2.321963324507e-01 max/min 3.952915751220e+01 > 17 KSP preconditioned resid norm 1.663949610587e-08 true resid norm > 3.719568951825e-01 ||r(i)||/||b|| 5.879445371238e-01 > 17 KSP Residual norm 1.663949610587e-08 % max 9.184089312561e+00 min > 1.131861415164e-01 max/min 8.114146475459e+01 > 18 KSP preconditioned resid norm 1.663337572879e-08 true resid norm > 3.719017039287e-01 ||r(i)||/||b|| 5.878572974555e-01 > 18 KSP Residual norm 1.663337572879e-08 % max 9.189859567814e+00 min > 5.648232511467e-02 max/min 1.627032801705e+02 > 19 KSP preconditioned resid norm 1.662539368980e-08 true resid norm > 3.720300901295e-01 ||r(i)||/||b|| 5.880602348560e-01 > 19 KSP Residual norm 1.662539368980e-08 % max 9.194407242900e+00 min > 3.409368279927e-02 max/min 2.696806706695e+02 > 20 KSP preconditioned resid norm 1.660752961666e-08 true resid norm > 3.720476648171e-01 ||r(i)||/||b|| 5.880880147997e-01 > 20 KSP Residual norm 1.660752961666e-08 % max 9.194602239844e+00 min > 2.365991044206e-02 max/min 3.886152596545e+02 > 21 KSP preconditioned resid norm 1.660447026728e-08 true resid norm > 3.719370961482e-01 ||r(i)||/||b|| 5.879132412016e-01 > 21 KSP Residual norm 1.660447026728e-08 % max 9.198469035476e+00 min > 1.563110293107e-02 max/min 5.884721683453e+02 > 22 KSP preconditioned resid norm 1.660446294321e-08 true resid norm > 3.719253643667e-01 ||r(i)||/||b|| 5.878946970183e-01 > 22 KSP Residual norm 1.660446294321e-08 % max 9.198477237204e+00 min > 1.078305273545e-02 max/min 8.530494529592e+02 > 23 KSP preconditioned resid norm 1.660330040466e-08 true resid norm > 3.721217486420e-01 ||r(i)||/||b|| 5.882051175626e-01 > 23 KSP Residual norm 1.660330040466e-08 % max 9.202933332869e+00 min > 9.205686060935e-03 max/min 9.997009752398e+02 > 24 KSP preconditioned resid norm 1.660327295939e-08 true resid norm > 3.721432241387e-01 ||r(i)||/||b|| 5.882390634340e-01 > 24 KSP Residual norm 1.660327295939e-08 % max 9.203266640610e+00 min > 8.725099522239e-03 max/min 1.054803629134e+03 > 25 KSP preconditioned resid norm 1.660322661301e-08 true resid norm > 3.721299294797e-01 ||r(i)||/||b|| 5.882180488428e-01 > 25 KSP Residual norm 1.660322661301e-08 % max 9.203403735734e+00 min > 8.464109615459e-03 max/min 1.087344582462e+03 > 26 KSP preconditioned resid norm 1.660321583344e-08 true resid norm > 3.721333122477e-01 ||r(i)||/||b|| 5.882233959139e-01 > 26 KSP Residual norm 1.660321583344e-08 % max 9.203842444616e+00 min > 8.185670305944e-03 max/min 1.124384699190e+03 > 27 KSP preconditioned resid norm 1.659716957149e-08 true resid norm > 3.722220312184e-01 ||r(i)||/||b|| 5.883636321479e-01 > 27 KSP Residual norm 1.659716957149e-08 % max 9.203846515427e+00 min > 7.896672970975e-03 max/min 1.165534719401e+03 > 28 KSP preconditioned resid norm 1.659174589261e-08 true resid norm > 3.723499490782e-01 ||r(i)||/||b|| 5.885658292514e-01 > 28 KSP Residual norm 1.659174589261e-08 % max 9.204684093512e+00 min > 7.626673798499e-03 max/min 1.206906750794e+03 > 29 KSP preconditioned resid norm 
1.658497009797e-08 true resid norm > 3.724237904064e-01 ||r(i)||/||b|| 5.886825487050e-01 > 29 KSP Residual norm 1.658497009797e-08 % max 9.204686109552e+00 min > 7.248769500402e-03 max/min 1.269827397470e+03 > 30 KSP preconditioned resid norm 1.658420583638e-08 true resid norm > 3.724329227626e-01 ||r(i)||/||b|| 5.886969840306e-01 > 30 KSP Residual norm 1.658420583638e-08 % max 9.205127707991e+00 min > 6.913923297310e-03 max/min 1.331389908762e+03 > 31 KSP preconditioned resid norm 1.658394997001e-08 true resid norm > 3.724106820061e-01 ||r(i)||/||b|| 5.886618285289e-01 > 31 KSP Residual norm 1.658394997001e-08 % max 9.205128428478e+00 min > 6.644030177187e-03 max/min 1.385473603068e+03 > 32 KSP preconditioned resid norm 1.658394544037e-08 true resid norm > 3.724013538135e-01 ||r(i)||/||b|| 5.886470836487e-01 > 32 KSP Residual norm 1.658394544037e-08 % max 9.205133034961e+00 min > 6.415376763387e-03 max/min 1.434854627322e+03 > 33 KSP preconditioned resid norm 1.658256686670e-08 true resid norm > 3.725160103788e-01 ||r(i)||/||b|| 5.888283188996e-01 > 33 KSP Residual norm 1.658256686670e-08 % max 9.205322724899e+00 min > 6.129032159846e-03 max/min 1.501921100236e+03 > 34 KSP preconditioned resid norm 1.658198817115e-08 true resid norm > 3.725196207066e-01 ||r(i)||/||b|| 5.888340256696e-01 > 34 KSP Residual norm 1.658198817115e-08 % max 9.205896491877e+00 min > 5.825724791452e-03 max/min 1.580214792396e+03 > 35 KSP preconditioned resid norm 1.658171734588e-08 true resid norm > 3.725087580299e-01 ||r(i)||/||b|| 5.888168552622e-01 > 35 KSP Residual norm 1.658171734588e-08 % max 9.205945192996e+00 min > 5.605660067611e-03 max/min 1.642258910095e+03 > 36 KSP preconditioned resid norm 1.658129588997e-08 true resid norm > 3.724965685568e-01 ||r(i)||/||b|| 5.887975876153e-01 > 36 KSP Residual norm 1.658129588997e-08 % max 9.206318976719e+00 min > 5.485984644166e-03 max/min 1.678152523906e+03 > 37 KSP preconditioned resid norm 1.658089864613e-08 true resid norm > 3.725244893116e-01 ||r(i)||/||b|| 5.888417213724e-01 > 37 KSP Residual norm 1.658089864613e-08 % max 9.206643625524e+00 min > 5.409333758549e-03 max/min 1.701992155868e+03 > 38 KSP preconditioned resid norm 1.658010080967e-08 true resid norm > 3.725552388317e-01 ||r(i)||/||b|| 5.888903265001e-01 > 38 KSP Residual norm 1.658010080967e-08 % max 9.222413265296e+00 min > 5.329794244440e-03 max/min 1.730350711928e+03 > 39 KSP preconditioned resid norm 1.657990108602e-08 true resid norm > 3.725613932319e-01 ||r(i)||/||b|| 5.889000546327e-01 > 39 KSP Residual norm 1.657990108602e-08 % max 9.265183049042e+00 min > 5.255818039127e-03 max/min 1.762843192072e+03 > 40 KSP preconditioned resid norm 1.657960808250e-08 true resid norm > 3.725537447923e-01 ||r(i)||/||b|| 5.888879649031e-01 > 40 KSP Residual norm 1.657960808250e-08 % max 9.295348987310e+00 min > 5.172817833368e-03 max/min 1.796960435635e+03 > 41 KSP preconditioned resid norm 1.657958687100e-08 true resid norm > 3.725506176790e-01 ||r(i)||/||b|| 5.888830219400e-01 > 41 KSP Residual norm 1.657958687100e-08 % max 9.313105724221e+00 min > 5.071725615663e-03 max/min 1.836279489462e+03 > 42 KSP preconditioned resid norm 1.657921742619e-08 true resid norm > 3.725582418972e-01 ||r(i)||/||b|| 5.888950733833e-01 > 42 KSP Residual norm 1.657921742619e-08 % max 9.322955878206e+00 min > 4.963104015890e-03 max/min 1.878452647447e+03 > 43 KSP preconditioned resid norm 1.657684473798e-08 true resid norm > 3.725895709563e-01 ||r(i)||/||b|| 5.889445945762e-01 > 43 KSP Residual norm 1.657684473798e-08 % max 
9.329772550486e+00 min > 4.863208011283e-03 max/min 1.918439953389e+03 > 44 KSP preconditioned resid norm 1.657659296508e-08 true resid norm > 3.725990953050e-01 ||r(i)||/||b|| 5.889596495163e-01 > 44 KSP Residual norm 1.657659296508e-08 % max 9.338931498127e+00 min > 4.781973618640e-03 max/min 1.952945006163e+03 > 45 KSP preconditioned resid norm 1.657642406464e-08 true resid norm > 3.726031709829e-01 ||r(i)||/||b|| 5.889660918557e-01 > 45 KSP Residual norm 1.657642406464e-08 % max 9.367754233355e+00 min > 4.692163450780e-03 max/min 1.996468011318e+03 > 46 KSP preconditioned resid norm 1.657554485219e-08 true resid norm > 3.725847080302e-01 ||r(i)||/||b|| 5.889369078499e-01 > 46 KSP Residual norm 1.657554485219e-08 % max 9.525984789796e+00 min > 4.602804961008e-03 max/min 2.069604267505e+03 > 47 KSP preconditioned resid norm 1.657502492412e-08 true resid norm > 3.726033324164e-01 ||r(i)||/||b|| 5.889663470303e-01 > 47 KSP Residual norm 1.657502492412e-08 % max 9.649033152731e+00 min > 4.528495247014e-03 max/min 2.130737171270e+03 > 48 KSP preconditioned resid norm 1.657431464097e-08 true resid norm > 3.726349240386e-01 ||r(i)||/||b|| 5.890162832513e-01 > 48 KSP Residual norm 1.657431464097e-08 % max 9.706832936743e+00 min > 4.483584175077e-03 max/min 2.164971718542e+03 > 49 KSP preconditioned resid norm 1.657430235052e-08 true resid norm > 3.726384684009e-01 ||r(i)||/||b|| 5.890218857511e-01 > 49 KSP Residual norm 1.657430235052e-08 % max 9.727915700790e+00 min > 4.452026131090e-03 max/min 2.185053594555e+03 > 50 KSP preconditioned resid norm 1.657409874525e-08 true resid norm > 3.726369422798e-01 ||r(i)||/||b|| 5.890194734432e-01 > 50 KSP Residual norm 1.657409874525e-08 % max 9.739354480687e+00 min > 4.417847429271e-03 max/min 2.204547494365e+03 > 51 KSP preconditioned resid norm 1.657402591687e-08 true resid norm > 3.726432094382e-01 ||r(i)||/||b|| 5.890293798102e-01 > 51 KSP Residual norm 1.657402591687e-08 % max 9.745116733535e+00 min > 4.373866876163e-03 max/min 2.228032313156e+03 > 52 KSP preconditioned resid norm 1.657322689294e-08 true resid norm > 3.726746392101e-01 ||r(i)||/||b|| 5.890790601978e-01 > 52 KSP Residual norm 1.657322689294e-08 % max 9.748668028325e+00 min > 4.324537557867e-03 max/min 2.254268323000e+03 > 53 KSP preconditioned resid norm 1.657285683191e-08 true resid norm > 3.726927767005e-01 ||r(i)||/||b|| 5.891077297521e-01 > 53 KSP Residual norm 1.657285683191e-08 % max 9.752597336655e+00 min > 4.266519999816e-03 max/min 2.285843576750e+03 > 54 KSP preconditioned resid norm 1.657191968374e-08 true resid norm > 3.727170337719e-01 ||r(i)||/||b|| 5.891460724009e-01 > 54 KSP Residual norm 1.657191968374e-08 % max 9.754248539341e+00 min > 4.180253974132e-03 max/min 2.333410505606e+03 > 55 KSP preconditioned resid norm 1.657127209230e-08 true resid norm > 3.727281752171e-01 ||r(i)||/||b|| 5.891636834519e-01 > 55 KSP Residual norm 1.657127209230e-08 % max 9.755352416617e+00 min > 4.058383922487e-03 max/min 2.403752972351e+03 > 56 KSP preconditioned resid norm 1.657120137140e-08 true resid norm > 3.727397473887e-01 ||r(i)||/||b|| 5.891819753432e-01 > 56 KSP Residual norm 1.657120137140e-08 % max 9.756044425698e+00 min > 3.878156481600e-03 max/min 2.515639704583e+03 > 57 KSP preconditioned resid norm 1.657119611240e-08 true resid norm > 3.727422879145e-01 ||r(i)||/||b|| 5.891859910995e-01 > 57 KSP Residual norm 1.657119611240e-08 % max 9.756250547216e+00 min > 3.666552757137e-03 max/min 2.660878267257e+03 > 58 KSP preconditioned resid norm 1.657111258731e-08 true resid 
norm > 3.727518818995e-01 ||r(i)||/||b|| 5.892011561122e-01 > 58 KSP Residual norm 1.657111258731e-08 % max 9.756370020073e+00 min > 3.468718193201e-03 max/min 2.812673003877e+03 > 59 KSP preconditioned resid norm 1.657094596592e-08 true resid norm > 3.727705204449e-01 ||r(i)||/||b|| 5.892306176738e-01 > 59 KSP Residual norm 1.657094596592e-08 % max 9.756371302765e+00 min > 3.286906229078e-03 max/min 2.968253616867e+03 > 60 KSP preconditioned resid norm 1.657094109276e-08 true resid norm > 3.727697067525e-01 ||r(i)||/||b|| 5.892293314871e-01 > 60 KSP Residual norm 1.657094109276e-08 % max 9.756462036212e+00 min > 3.176741626112e-03 max/min 3.071216732270e+03 > 61 KSP preconditioned resid norm 1.657074458504e-08 true resid norm > 3.727879098160e-01 ||r(i)||/||b|| 5.892581046914e-01 > 61 KSP Residual norm 1.657074458504e-08 % max 9.756473123113e+00 min > 3.125891922048e-03 max/min 3.121180567471e+03 > 62 KSP preconditioned resid norm 1.657036224624e-08 true resid norm > 3.728005388067e-01 ||r(i)||/||b|| 5.892780670746e-01 > 62 KSP Residual norm 1.657036224624e-08 % max 9.756473207720e+00 min > 3.096484756596e-03 max/min 3.150822295164e+03 > 63 KSP preconditioned resid norm 1.657008856448e-08 true resid norm > 3.728158759902e-01 ||r(i)||/||b|| 5.893023102419e-01 > 63 KSP Residual norm 1.657008856448e-08 % max 9.756531519156e+00 min > 3.069291385183e-03 max/min 3.178757014161e+03 > 64 KSP preconditioned resid norm 1.656933678892e-08 true resid norm > 3.728618516341e-01 ||r(i)||/||b|| 5.893749829871e-01 > 64 KSP Residual norm 1.656933678892e-08 % max 9.756534097563e+00 min > 3.030347211416e-03 max/min 3.219609311041e+03 > 65 KSP preconditioned resid norm 1.656929382536e-08 true resid norm > 3.728620592926e-01 ||r(i)||/||b|| 5.893753112286e-01 > 65 KSP Residual norm 1.656929382536e-08 % max 9.756542682563e+00 min > 2.980473130456e-03 max/min 3.273487884479e+03 > 66 KSP preconditioned resid norm 1.656925076679e-08 true resid norm > 3.728588689809e-01 ||r(i)||/||b|| 5.893702683692e-01 > 66 KSP Residual norm 1.656925076679e-08 % max 9.756768124123e+00 min > 2.899706071757e-03 max/min 3.364743833574e+03 > 67 KSP preconditioned resid norm 1.656919627951e-08 true resid norm > 3.728539364727e-01 ||r(i)||/||b|| 5.893624716560e-01 > 67 KSP Residual norm 1.656919627951e-08 % max 9.756860985294e+00 min > 2.765552951215e-03 max/min 3.527996446790e+03 > 68 KSP preconditioned resid norm 1.656919067597e-08 true resid norm > 3.728560345126e-01 ||r(i)||/||b|| 5.893657879842e-01 > 68 KSP Residual norm 1.656919067597e-08 % max 9.756861143601e+00 min > 2.585147683296e-03 max/min 3.774198745644e+03 > 69 KSP preconditioned resid norm 1.656919023495e-08 true resid norm > 3.728555414719e-01 ||r(i)||/||b|| 5.893650086449e-01 > 69 KSP Residual norm 1.656919023495e-08 % max 9.756912528517e+00 min > 2.353080610226e-03 max/min 4.146442109171e+03 > 70 KSP preconditioned resid norm 1.656817007273e-08 true resid norm > 3.728377336313e-01 ||r(i)||/||b|| 5.893368601612e-01 > 70 KSP Residual norm 1.656817007273e-08 % max 9.756913153950e+00 min > 2.015051177738e-03 max/min 4.842017543646e+03 > 71 KSP preconditioned resid norm 1.656803992911e-08 true resid norm > 3.728311872625e-01 ||r(i)||/||b|| 5.893265124520e-01 > 71 KSP Residual norm 1.656803992911e-08 % max 9.756913653213e+00 min > 1.699244792218e-03 max/min 5.741911758620e+03 > 72 KSP preconditioned resid norm 1.656718546555e-08 true resid norm > 3.728063898356e-01 ||r(i)||/||b|| 5.892873156745e-01 > 72 KSP Residual norm 1.656718546555e-08 % max 9.756940076063e+00 min > 
1.425792038708e-03 max/min 6.843171943156e+03 > 73 KSP preconditioned resid norm 1.656661764862e-08 true resid norm > 3.727786721034e-01 ||r(i)||/||b|| 5.892435028310e-01 > 73 KSP Residual norm 1.656661764862e-08 % max 9.756965888951e+00 min > 1.158185964332e-03 max/min 8.424351692586e+03 > 74 KSP preconditioned resid norm 1.656636420620e-08 true resid norm > 3.727518398420e-01 ||r(i)||/||b|| 5.892010896329e-01 > 74 KSP Residual norm 1.656636420620e-08 % max 9.757003390346e+00 min > 9.438943306994e-04 max/min 1.033696577361e+04 > 75 KSP preconditioned resid norm 1.656520179659e-08 true resid norm > 3.726868430624e-01 ||r(i)||/||b|| 5.890983505736e-01 > 75 KSP Residual norm 1.656520179659e-08 % max 9.757006096169e+00 min > 7.903213567346e-04 max/min 1.234561866895e+04 > 76 KSP preconditioned resid norm 1.656424480034e-08 true resid norm > 3.726173157679e-01 ||r(i)||/||b|| 5.889884502235e-01 > 76 KSP Residual norm 1.656424480034e-08 % max 9.757008722883e+00 min > 6.669508794331e-04 max/min 1.462927634367e+04 > 77 KSP preconditioned resid norm 1.656223746236e-08 true resid norm > 3.725149253621e-01 ||r(i)||/||b|| 5.888266038361e-01 > 77 KSP Residual norm 1.656223746236e-08 % max 9.757108556440e+00 min > 5.550476575711e-04 max/min 1.757886628896e+04 > 78 KSP preconditioned resid norm 1.655570645923e-08 true resid norm > 3.723230681871e-01 ||r(i)||/||b|| 5.885233391852e-01 > 78 KSP Residual norm 1.655570645923e-08 % max 9.757706826959e+00 min > 4.399820830439e-04 max/min 2.217750950096e+04 > 79 KSP preconditioned resid norm 1.654641744591e-08 true resid norm > 3.720276670884e-01 ||r(i)||/||b|| 5.880564048053e-01 > 79 KSP Residual norm 1.654641744591e-08 % max 9.794972447526e+00 min > 3.582518455199e-04 max/min 2.734102439392e+04 > 80 KSP preconditioned resid norm 1.653478342204e-08 true resid norm > 3.716527795590e-01 ||r(i)||/||b|| 5.874638278755e-01 > 80 KSP Residual norm 1.653478342204e-08 % max 9.836802284188e+00 min > 3.026308703663e-04 max/min 3.250429234890e+04 > 81 KSP preconditioned resid norm 1.652174198034e-08 true resid norm > 3.711834567643e-01 ||r(i)||/||b|| 5.867219790838e-01 > 81 KSP Residual norm 1.652174198034e-08 % max 9.856229987881e+00 min > 2.557127305560e-04 max/min 3.854415056477e+04 > 82 KSP preconditioned resid norm 1.649703927678e-08 true resid norm > 3.702808970873e-01 ||r(i)||/||b|| 5.852953217522e-01 > 82 KSP Residual norm 1.649703927678e-08 % max 9.865801022673e+00 min > 2.113312927870e-04 max/min 4.668405181535e+04 > 83 KSP preconditioned resid norm 1.647272703733e-08 true resid norm > 3.691356976009e-01 ||r(i)||/||b|| 5.834851287146e-01 > 83 KSP Residual norm 1.647272703733e-08 % max 9.870514331986e+00 min > 1.772107156780e-04 max/min 5.569930855605e+04 > 84 KSP preconditioned resid norm 1.644833370624e-08 true resid norm > 3.678694491788e-01 ||r(i)||/||b|| 5.814835961391e-01 > 84 KSP Residual norm 1.644833370624e-08 % max 9.875179985385e+00 min > 1.531203216170e-04 max/min 6.449294176698e+04 > 85 KSP preconditioned resid norm 1.641503406416e-08 true resid norm > 3.663991314975e-01 ||r(i)||/||b|| 5.791594955248e-01 > 85 KSP Residual norm 1.641503406416e-08 % max 9.878712452345e+00 min > 1.354454765823e-04 max/min 7.293497502916e+04 > 86 KSP preconditioned resid norm 1.638631902117e-08 true resid norm > 3.650020055391e-01 ||r(i)||/||b|| 5.769510875465e-01 > 86 KSP Residual norm 1.638631902117e-08 % max 9.884296885865e+00 min > 1.217649759756e-04 max/min 8.117520499364e+04 > 87 KSP preconditioned resid norm 1.635837174633e-08 true resid norm > 3.637826153071e-01 
||r(i)||/||b|| 5.750236227386e-01 > 87 KSP Residual norm 1.635837174633e-08 % max 9.888595738642e+00 min > 1.129410821511e-04 max/min 8.755534788848e+04 > 88 KSP preconditioned resid norm 1.633085122612e-08 true resid norm > 3.627922390431e-01 ||r(i)||/||b|| 5.734581555524e-01 > 88 KSP Residual norm 1.633085122612e-08 % max 9.892636665772e+00 min > 1.069969658571e-04 max/min 9.245717003771e+04 > 89 KSP preconditioned resid norm 1.630349445652e-08 true resid norm > 3.617701380674e-01 ||r(i)||/||b|| 5.718425417733e-01 > 89 KSP Residual norm 1.630349445652e-08 % max 9.895013763062e+00 min > 1.017466157066e-04 max/min 9.725152718198e+04 > 90 KSP preconditioned resid norm 1.627361015004e-08 true resid norm > 3.606744164091e-01 ||r(i)||/||b|| 5.701105573107e-01 > 90 KSP Residual norm 1.627361015004e-08 % max 9.896088200035e+00 min > 9.710482756506e-05 max/min 1.019113925454e+05 > 91 KSP preconditioned resid norm 1.624399979693e-08 true resid norm > 3.595515169218e-01 ||r(i)||/||b|| 5.683356134185e-01 > 91 KSP Residual norm 1.624399979693e-08 % max 9.898456960605e+00 min > 9.277670680089e-05 max/min 1.066911868498e+05 > 92 KSP preconditioned resid norm 1.621187417344e-08 true resid norm > 3.581993711865e-01 ||r(i)||/||b|| 5.661983047444e-01 > 92 KSP Residual norm 1.621187417344e-08 % max 9.900133841601e+00 min > 8.846853223877e-05 max/min 1.119057091948e+05 > 93 KSP preconditioned resid norm 1.617899333556e-08 true resid norm > 3.567416069788e-01 ||r(i)||/||b|| 5.638940471450e-01 > 93 KSP Residual norm 1.617899333556e-08 % max 9.903319235139e+00 min > 8.477730005240e-05 max/min 1.168156951096e+05 > 94 KSP preconditioned resid norm 1.613800415846e-08 true resid norm > 3.548922412073e-01 ||r(i)||/||b|| 5.609707930891e-01 > 94 KSP Residual norm 1.613800415846e-08 % max 9.906628476064e+00 min > 8.053311577407e-05 max/min 1.230131031296e+05 > 95 KSP preconditioned resid norm 1.605124840571e-08 true resid norm > 3.510853485394e-01 ||r(i)||/||b|| 5.549533169339e-01 > 95 KSP Residual norm 1.605124840571e-08 % max 9.909345975071e+00 min > 7.312347368914e-05 max/min 1.355152521500e+05 > 96 KSP preconditioned resid norm 1.588326672478e-08 true resid norm > 3.439633921057e-01 ||r(i)||/||b|| 5.436957883518e-01 > 96 KSP Residual norm 1.588326672478e-08 % max 9.911160469884e+00 min > 6.360279448537e-05 max/min 1.558290095597e+05 > 97 KSP preconditioned resid norm 1.563012116603e-08 true resid norm > 3.340016328594e-01 ||r(i)||/||b|| 5.279494424584e-01 > 97 KSP Residual norm 1.563012116603e-08 % max 9.912448943622e+00 min > 5.488437852775e-05 max/min 1.806060159470e+05 > 98 KSP preconditioned resid norm 1.532053893409e-08 true resid norm > 3.219757242443e-01 ||r(i)||/||b|| 5.089403385385e-01 > 98 KSP Residual norm 1.532053893409e-08 % max 9.912462355327e+00 min > 4.767224242002e-05 max/min 2.079294334005e+05 > 99 KSP preconditioned resid norm 1.497377114129e-08 true resid norm > 3.092597005441e-01 ||r(i)||/||b|| 4.888403840403e-01 > 99 KSP Residual norm 1.497377114129e-08 % max 9.912980027078e+00 min > 4.250681360104e-05 max/min 2.332092007677e+05 > 100 KSP preconditioned resid norm 1.467327275303e-08 true resid norm > 2.986174217023e-01 ||r(i)||/||b|| 4.720183549595e-01 > 100 KSP Residual norm 1.467327275303e-08 % max 9.912991792085e+00 min > 3.923846350091e-05 max/min 2.526345556791e+05 > 101 KSP preconditioned resid norm 1.442486143582e-08 true resid norm > 2.895399487479e-01 ||r(i)||/||b|| 4.576697820373e-01 > 101 KSP Residual norm 1.442486143582e-08 % max 9.913176892700e+00 min > 3.690617907051e-05 max/min 
2.686048012112e+05 > 102 KSP preconditioned resid norm 1.420878540463e-08 true resid norm > 2.818595095401e-01 ||r(i)||/||b|| 4.455294713363e-01 > 102 KSP Residual norm 1.420878540463e-08 % max 9.913184584644e+00 min > 3.525622450935e-05 max/min 2.811754441266e+05 > 103 KSP preconditioned resid norm 1.394867857384e-08 true resid norm > 2.728700074720e-01 ||r(i)||/||b|| 4.313199521665e-01 > 103 KSP Residual norm 1.394867857384e-08 % max 9.913184754083e+00 min > 3.359065177627e-05 max/min 2.951173683711e+05 > 104 KSP preconditioned resid norm 1.361967798602e-08 true resid norm > 2.618731154865e-01 ||r(i)||/||b|| 4.139373934561e-01 > 104 KSP Residual norm 1.361967798602e-08 % max 9.913195023405e+00 min > 3.179302216209e-05 max/min 3.118041113822e+05 > 105 KSP preconditioned resid norm 1.325728782686e-08 true resid norm > 2.505110349419e-01 ||r(i)||/||b|| 3.959775887769e-01 > 105 KSP Residual norm 1.325728782686e-08 % max 9.913373149107e+00 min > 3.017902902755e-05 max/min 3.284854903734e+05 > 106 KSP preconditioned resid norm 1.274686747953e-08 true resid norm > 2.359244406376e-01 ||r(i)||/||b|| 3.729208621843e-01 > 106 KSP Residual norm 1.274686747953e-08 % max 9.913956033293e+00 min > 2.841281476492e-05 max/min 3.489255153112e+05 > 107 KSP preconditioned resid norm 1.226462735794e-08 true resid norm > 2.229402253412e-01 ||r(i)||/||b|| 3.523969828015e-01 > 107 KSP Residual norm 1.226462735794e-08 % max 9.914139670727e+00 min > 2.702981177754e-05 max/min 3.667853758037e+05 > 108 KSP preconditioned resid norm 1.178822925406e-08 true resid norm > 2.110417020056e-01 ||r(i)||/||b|| 3.335892341467e-01 > 108 KSP Residual norm 1.178822925406e-08 % max 9.914145602125e+00 min > 2.587391774620e-05 max/min 3.831714122064e+05 > 109 KSP preconditioned resid norm 1.124711770554e-08 true resid norm > 1.987259952756e-01 ||r(i)||/||b|| 3.141220523670e-01 > 109 KSP Residual norm 1.124711770554e-08 % max 9.914354741568e+00 min > 2.476929124966e-05 max/min 4.002680028927e+05 > 110 KSP preconditioned resid norm 1.066970872129e-08 true resid norm > 1.861895786153e-01 ||r(i)||/||b|| 2.943059989856e-01 > 110 KSP Residual norm 1.066970872129e-08 % max 9.914357883390e+00 min > 2.374270452184e-05 max/min 4.175749175613e+05 > 111 KSP preconditioned resid norm 1.012599205842e-08 true resid norm > 1.757986089120e-01 ||r(i)||/||b|| 2.778812090393e-01 > 111 KSP Residual norm 1.012599205842e-08 % max 9.914370592801e+00 min > 2.292071171331e-05 max/min 4.325507303965e+05 > 112 KSP preconditioned resid norm 9.672164972334e-09 true resid norm > 1.683403272714e-01 ||r(i)||/||b|| 2.660920581894e-01 > 112 KSP Residual norm 9.672164972334e-09 % max 9.914736313673e+00 min > 2.232161659891e-05 max/min 4.441764452741e+05 > 113 KSP preconditioned resid norm 9.299198823689e-09 true resid norm > 1.629158622006e-01 ||r(i)||/||b|| 2.575177189407e-01 > 113 KSP Residual norm 9.299198823689e-09 % max 9.914935691241e+00 min > 2.188325766675e-05 max/min 4.530831671515e+05 > 114 KSP preconditioned resid norm 9.034077005430e-09 true resid norm > 1.595873484943e-01 ||r(i)||/||b|| 2.522564064722e-01 > 114 KSP Residual norm 9.034077005430e-09 % max 9.914996347981e+00 min > 2.160694766564e-05 max/min 4.588800094031e+05 > 115 KSP preconditioned resid norm 8.888865878467e-09 true resid norm > 1.580113210268e-01 ||r(i)||/||b|| 2.497652125950e-01 > 115 KSP Residual norm 8.888865878467e-09 % max 9.915030824457e+00 min > 2.147558458967e-05 max/min 4.616885180964e+05 > 116 KSP preconditioned resid norm 8.771135905853e-09 true resid norm > 1.568621931485e-01 
||r(i)||/||b|| 2.479488100300e-01 > 116 KSP Residual norm 8.771135905853e-09 % max 9.915048169257e+00 min > 2.137210965771e-05 max/min 4.639246348655e+05 > 117 KSP preconditioned resid norm 8.659171467993e-09 true resid norm > 1.558355526791e-01 ||r(i)||/||b|| 2.463260207675e-01 > 117 KSP Residual norm 8.659171467993e-09 % max 9.915307398147e+00 min > 2.127637945733e-05 max/min 4.660241850844e+05 > 118 KSP preconditioned resid norm 8.575179217480e-09 true resid norm > 1.550972317778e-01 ||r(i)||/||b|| 2.451589722568e-01 > 118 KSP Residual norm 8.575179217480e-09 % max 9.916072938441e+00 min > 2.121014916472e-05 max/min 4.675154739098e+05 > 119 KSP preconditioned resid norm 8.518079720132e-09 true resid norm > 1.544593367397e-01 ||r(i)||/||b|| 2.441506648218e-01 > 119 KSP Residual norm 8.518079720132e-09 % max 9.916686583796e+00 min > 2.115333030509e-05 max/min 4.688002522897e+05 > 120 KSP preconditioned resid norm 8.478238591051e-09 true resid norm > 1.539772069403e-01 ||r(i)||/||b|| 2.433885722637e-01 > 120 KSP Residual norm 8.478238591051e-09 % max 9.916686588373e+00 min > 2.111527934504e-05 max/min 4.696450577957e+05 > 121 KSP preconditioned resid norm 8.444941671542e-09 true resid norm > 1.535936004591e-01 ||r(i)||/||b|| 2.427822134680e-01 > 121 KSP Residual norm 8.444941671542e-09 % max 9.916688169159e+00 min > 2.108428257901e-05 max/min 4.703355749478e+05 > 122 KSP preconditioned resid norm 8.427676435678e-09 true resid norm > 1.533605832162e-01 ||r(i)||/||b|| 2.424138879530e-01 > 122 KSP Residual norm 8.427676435678e-09 % max 9.916689320407e+00 min > 2.106414747089e-05 max/min 4.707852209122e+05 > 123 KSP preconditioned resid norm 8.415410922025e-09 true resid norm > 1.532305930189e-01 ||r(i)||/||b|| 2.422084151485e-01 > 123 KSP Residual norm 8.415410922025e-09 % max 9.917057372015e+00 min > 2.105353113559e-05 max/min 4.710400981264e+05 > 124 KSP preconditioned resid norm 8.408154019827e-09 true resid norm > 1.531806523846e-01 ||r(i)||/||b|| 2.421294750254e-01 > 124 KSP Residual norm 8.408154019827e-09 % max 9.917120919215e+00 min > 2.104834192197e-05 max/min 4.711592464612e+05 > 125 KSP preconditioned resid norm 8.393802244658e-09 true resid norm > 1.530793493932e-01 ||r(i)||/||b|| 2.419693474913e-01 > 125 KSP Residual norm 8.393802244658e-09 % max 9.917132534500e+00 min > 2.103675625602e-05 max/min 4.714192822224e+05 > 126 KSP preconditioned resid norm 8.379753801792e-09 true resid norm > 1.529688114683e-01 ||r(i)||/||b|| 2.417946224898e-01 > 126 KSP Residual norm 8.379753801792e-09 % max 9.917431785146e+00 min > 2.102585361587e-05 max/min 4.716779621095e+05 > 127 KSP preconditioned resid norm 8.371571974323e-09 true resid norm > 1.528853003941e-01 ||r(i)||/||b|| 2.416626182697e-01 > 127 KSP Residual norm 8.371571974323e-09 % max 9.917505129146e+00 min > 2.101791298538e-05 max/min 4.718596530514e+05 > 128 KSP preconditioned resid norm 8.367539460475e-09 true resid norm > 1.528216413116e-01 ||r(i)||/||b|| 2.415619936803e-01 > 128 KSP Residual norm 8.367539460475e-09 % max 9.917526993547e+00 min > 2.101234223878e-05 max/min 4.719857920096e+05 > 129 KSP preconditioned resid norm 8.356446303986e-09 true resid norm > 1.526541189646e-01 ||r(i)||/||b|| 2.412971945866e-01 > 129 KSP Residual norm 8.356446303986e-09 % max 9.917537226333e+00 min > 2.099928136381e-05 max/min 4.722798392247e+05 > 130 KSP preconditioned resid norm 8.331561543348e-09 true resid norm > 1.523523830474e-01 ||r(i)||/||b|| 2.408202468906e-01 > 130 KSP Residual norm 8.331561543348e-09 % max 9.917571522876e+00 min > 
2.097331245945e-05 max/min 4.728662457134e+05 > 131 KSP preconditioned resid norm 8.300069240991e-09 true resid norm > 1.520236092506e-01 ||r(i)||/||b|| 2.403005609800e-01 > 131 KSP Residual norm 8.300069240991e-09 % max 9.917634344282e+00 min > 2.094526077576e-05 max/min 4.735025479253e+05 > 132 KSP preconditioned resid norm 8.273052072454e-09 true resid norm > 1.517967993701e-01 ||r(i)||/||b|| 2.399420473136e-01 > 132 KSP Residual norm 8.273052072454e-09 % max 9.917715776764e+00 min > 2.092540881867e-05 max/min 4.739556518445e+05 > 133 KSP preconditioned resid norm 8.251281894079e-09 true resid norm > 1.516304506087e-01 ||r(i)||/||b|| 2.396791032823e-01 > 133 KSP Residual norm 8.251281894079e-09 % max 9.917715781180e+00 min > 2.090929722571e-05 max/min 4.743208570868e+05 > 134 KSP preconditioned resid norm 8.226033728737e-09 true resid norm > 1.514349562547e-01 ||r(i)||/||b|| 2.393700894181e-01 > 134 KSP Residual norm 8.226033728737e-09 % max 9.917731867345e+00 min > 2.089015001649e-05 max/min 4.747563736744e+05 > 135 KSP preconditioned resid norm 8.199286605060e-09 true resid norm > 1.512447576528e-01 ||r(i)||/||b|| 2.390694464393e-01 > 135 KSP Residual norm 8.199286605060e-09 % max 9.917748489841e+00 min > 2.087043434554e-05 max/min 4.752056581880e+05 > 136 KSP preconditioned resid norm 8.180296594388e-09 true resid norm > 1.510479857878e-01 ||r(i)||/||b|| 2.387584132402e-01 > 136 KSP Residual norm 8.180296594388e-09 % max 9.917785029298e+00 min > 2.085325140445e-05 max/min 4.755989767228e+05 > 137 KSP preconditioned resid norm 8.169264503348e-09 true resid norm > 1.509040950747e-01 ||r(i)||/||b|| 2.385309681792e-01 > 137 KSP Residual norm 8.169264503348e-09 % max 9.917785730737e+00 min > 2.084031855443e-05 max/min 4.758941522334e+05 > 138 KSP preconditioned resid norm 8.156637560480e-09 true resid norm > 1.507559867333e-01 ||r(i)||/||b|| 2.382968564007e-01 > 138 KSP Residual norm 8.156637560480e-09 % max 9.917796791468e+00 min > 2.082708164521e-05 max/min 4.761971437199e+05 > 139 KSP preconditioned resid norm 8.146852248061e-09 true resid norm > 1.506511059489e-01 ||r(i)||/||b|| 2.381310735237e-01 > 139 KSP Residual norm 8.146852248061e-09 % max 9.917821676165e+00 min > 2.081598847234e-05 max/min 4.764521122473e+05 > 140 KSP preconditioned resid norm 8.141191465029e-09 true resid norm > 1.505854905603e-01 ||r(i)||/||b|| 2.380273566420e-01 > 140 KSP Residual norm 8.141191465029e-09 % max 9.917845903968e+00 min > 2.080955754602e-05 max/min 4.766005179128e+05 > 141 KSP preconditioned resid norm 8.138923553302e-09 true resid norm > 1.505722123545e-01 ||r(i)||/||b|| 2.380063680578e-01 > 141 KSP Residual norm 8.138923553302e-09 % max 9.917872635574e+00 min > 2.080833075673e-05 max/min 4.766299013373e+05 > 142 KSP preconditioned resid norm 8.133951815022e-09 true resid norm > 1.505617094662e-01 ||r(i)||/||b|| 2.379897663606e-01 > 142 KSP Residual norm 8.133951815022e-09 % max 9.917872952600e+00 min > 2.080597338078e-05 max/min 4.766839200977e+05 > 143 KSP preconditioned resid norm 8.130475958447e-09 true resid norm > 1.505390404850e-01 ||r(i)||/||b|| 2.379539339729e-01 > 143 KSP Residual norm 8.130475958447e-09 % max 9.917874128718e+00 min > 2.080394570826e-05 max/min 4.767304369950e+05 > 144 KSP preconditioned resid norm 8.128953889987e-09 true resid norm > 1.505364291420e-01 ||r(i)||/||b|| 2.379498062773e-01 > 144 KSP Residual norm 8.128953889987e-09 % max 9.917915384229e+00 min > 2.080337902646e-05 max/min 4.767454061966e+05 > 145 KSP preconditioned resid norm 8.128697515577e-09 true 
resid norm > 1.505360974540e-01 ||r(i)||/||b|| 2.379492819850e-01 > 145 KSP Residual norm 8.128697515577e-09 % max 9.917928714401e+00 min > 2.080337467554e-05 max/min 4.767461466750e+05 > 146 KSP preconditioned resid norm 8.128385333914e-09 true resid norm > 1.505304085835e-01 ||r(i)||/||b|| 2.379402897056e-01 > 146 KSP Residual norm 8.128385333914e-09 % max 9.918146741049e+00 min > 2.080294082776e-05 max/min 4.767665698407e+05 > 147 KSP preconditioned resid norm 8.127949324855e-09 true resid norm > 1.505226583248e-01 ||r(i)||/||b|| 2.379280390325e-01 > 147 KSP Residual norm 8.127949324855e-09 % max 9.918156051888e+00 min > 2.080250556219e-05 max/min 4.767769931482e+05 > 148 KSP preconditioned resid norm 8.126892921741e-09 true resid norm > 1.505086679169e-01 ||r(i)||/||b|| 2.379059246855e-01 > 148 KSP Residual norm 8.126892921741e-09 % max 9.918171317513e+00 min > 2.080184752072e-05 max/min 4.767928092749e+05 > 149 KSP preconditioned resid norm 8.125786411061e-09 true resid norm > 1.504901877800e-01 ||r(i)||/||b|| 2.378767135170e-01 > 149 KSP Residual norm 8.125786411061e-09 % max 9.918205450251e+00 min > 2.080029672887e-05 max/min 4.768299981263e+05 > 150 KSP preconditioned resid norm 8.121865705516e-09 true resid norm > 1.504379984469e-01 ||r(i)||/||b|| 2.377942189223e-01 > 150 KSP Residual norm 8.121865705516e-09 % max 9.918295492207e+00 min > 2.079653971654e-05 max/min 4.769204698183e+05 > 151 KSP preconditioned resid norm 8.114591200267e-09 true resid norm > 1.503744219252e-01 ||r(i)||/||b|| 2.376937248352e-01 > 151 KSP Residual norm 8.114591200267e-09 % max 9.918302707337e+00 min > 2.079103745413e-05 max/min 4.770470318867e+05 > 152 KSP preconditioned resid norm 8.108201162281e-09 true resid norm > 1.503020846718e-01 ||r(i)||/||b|| 2.375793828416e-01 > 152 KSP Residual norm 8.108201162281e-09 % max 9.918303753789e+00 min > 2.078516129771e-05 max/min 4.771819478198e+05 > 153 KSP preconditioned resid norm 8.102490482870e-09 true resid norm > 1.502240682865e-01 ||r(i)||/||b|| 2.374560639621e-01 > 153 KSP Residual norm 8.102490482870e-09 % max 9.918303965254e+00 min > 2.077946249056e-05 max/min 4.773128260541e+05 > 154 KSP preconditioned resid norm 8.087174026596e-09 true resid norm > 1.500412418392e-01 ||r(i)||/||b|| 2.371670739948e-01 > 154 KSP Residual norm 8.087174026596e-09 % max 9.918315546883e+00 min > 2.076640155913e-05 max/min 4.776135874404e+05 > 155 KSP preconditioned resid norm 8.057393656065e-09 true resid norm > 1.496423828383e-01 ||r(i)||/||b|| 2.365366058580e-01 > 155 KSP Residual norm 8.057393656065e-09 % max 9.918465324870e+00 min > 2.073545587686e-05 max/min 4.783336032625e+05 > 156 KSP preconditioned resid norm 8.011693187532e-09 true resid norm > 1.490284191223e-01 ||r(i)||/||b|| 2.355661261668e-01 > 156 KSP Residual norm 8.011693187532e-09 % max 9.918466288675e+00 min > 2.068954861804e-05 max/min 4.793950062316e+05 > 157 KSP preconditioned resid norm 7.959013340300e-09 true resid norm > 1.483574761823e-01 ||r(i)||/||b|| 2.345055805998e-01 > 157 KSP Residual norm 7.959013340300e-09 % max 9.918561932305e+00 min > 2.064150405350e-05 max/min 4.805154656656e+05 > 158 KSP preconditioned resid norm 7.894240455844e-09 true resid norm > 1.476088146004e-01 ||r(i)||/||b|| 2.333221867900e-01 > 158 KSP Residual norm 7.894240455844e-09 % max 9.918657124651e+00 min > 2.058238758689e-05 max/min 4.819002208941e+05 > 159 KSP preconditioned resid norm 7.812904666749e-09 true resid norm > 1.467208127766e-01 ||r(i)||/||b|| 2.319185407546e-01 > 159 KSP Residual norm 7.812904666749e-09 % 
max 9.919093811117e+00 min > 2.051599215020e-05 max/min 4.834810687437e+05 > 160 KSP preconditioned resid norm 7.741530539562e-09 true resid norm > 1.459917281627e-01 ||r(i)||/||b|| 2.307660918517e-01 > 160 KSP Residual norm 7.741530539562e-09 % max 9.919095332651e+00 min > 2.045886641493e-05 max/min 4.848311304977e+05 > 161 KSP preconditioned resid norm 7.676609426916e-09 true resid norm > 1.453051987717e-01 ||r(i)||/||b|| 2.296809091053e-01 > 161 KSP Residual norm 7.676609426916e-09 % max 9.919292102450e+00 min > 2.040917630477e-05 max/min 4.860211874465e+05 > 162 KSP preconditioned resid norm 7.583530085636e-09 true resid norm > 1.443036525464e-01 ||r(i)||/||b|| 2.280977857933e-01 > 162 KSP Residual norm 7.583530085636e-09 % max 9.919308484025e+00 min > 2.033498476484e-05 max/min 4.877952257517e+05 > 163 KSP preconditioned resid norm 7.467692143507e-09 true resid norm > 1.429522123636e-01 ||r(i)||/||b|| 2.259615923713e-01 > 163 KSP Residual norm 7.467692143507e-09 % max 9.919855464029e+00 min > 2.024129556919e-05 max/min 4.900800657804e+05 > 164 KSP preconditioned resid norm 7.327851148024e-09 true resid norm > 1.414046163109e-01 ||r(i)||/||b|| 2.235153394408e-01 > 164 KSP Residual norm 7.327851148024e-09 % max 9.920253012186e+00 min > 2.012547547329e-05 max/min 4.929201809594e+05 > 165 KSP preconditioned resid norm 7.152579465113e-09 true resid norm > 1.394666900679e-01 ||r(i)||/||b|| 2.204520996873e-01 > 165 KSP Residual norm 7.152579465113e-09 % max 9.920271087623e+00 min > 1.998671358111e-05 max/min 4.963432856214e+05 > 166 KSP preconditioned resid norm 6.938769833629e-09 true resid norm > 1.370973027073e-01 ||r(i)||/||b|| 2.167068583084e-01 > 166 KSP Residual norm 6.938769833629e-09 % max 9.920293566357e+00 min > 1.981527794098e-05 max/min 5.006386282295e+05 > 167 KSP preconditioned resid norm 6.741207402281e-09 true resid norm > 1.351840392321e-01 ||r(i)||/||b|| 2.136826024797e-01 > 167 KSP Residual norm 6.741207402281e-09 % max 9.920596954025e+00 min > 1.967349814668e-05 max/min 5.042619711075e+05 > 168 KSP preconditioned resid norm 6.565143103123e-09 true resid norm > 1.336623889818e-01 ||r(i)||/||b|| 2.112773615401e-01 > 168 KSP Residual norm 6.565143103123e-09 % max 9.921114523994e+00 min > 1.956018369412e-05 max/min 5.072096806011e+05 > 169 KSP preconditioned resid norm 6.429632478813e-09 true resid norm > 1.325320084241e-01 ||r(i)||/||b|| 2.094905924751e-01 > 169 KSP Residual norm 6.429632478813e-09 % max 9.921249706269e+00 min > 1.947508804310e-05 max/min 5.094328551589e+05 > 170 KSP preconditioned resid norm 6.351340021846e-09 true resid norm > 1.319352335272e-01 ||r(i)||/||b|| 2.085472827931e-01 > 170 KSP Residual norm 6.351340021846e-09 % max 9.921400860428e+00 min > 1.943209479700e-05 max/min 5.105677470224e+05 > 171 KSP preconditioned resid norm 6.305273930969e-09 true resid norm > 1.315826886900e-01 ||r(i)||/||b|| 2.079900224928e-01 > 171 KSP Residual norm 6.305273930969e-09 % max 9.921731787878e+00 min > 1.940477171363e-05 max/min 5.113037109789e+05 > 172 KSP preconditioned resid norm 6.279015387005e-09 true resid norm > 1.313487942570e-01 ||r(i)||/||b|| 2.076203104216e-01 > 172 KSP Residual norm 6.279015387005e-09 % max 9.922029289988e+00 min > 1.939038385409e-05 max/min 5.116984462324e+05 > 173 KSP preconditioned resid norm 6.254268607814e-09 true resid norm > 1.311057590476e-01 ||r(i)||/||b|| 2.072361497149e-01 > 173 KSP Residual norm 6.254268607814e-09 % max 9.922081473522e+00 min > 1.937353844196e-05 max/min 5.121460647596e+05 > 174 KSP preconditioned resid norm 
6.237354310290e-09 true resid norm > 1.308928887713e-01 ||r(i)||/||b|| 2.068996700913e-01 > 174 KSP Residual norm 6.237354310290e-09 % max 9.922086586645e+00 min > 1.935976905538e-05 max/min 5.125105861678e+05 > 175 KSP preconditioned resid norm 6.228697463260e-09 true resid norm > 1.307776586248e-01 ||r(i)||/||b|| 2.067175281925e-01 > 175 KSP Residual norm 6.228697463260e-09 % max 9.922153120984e+00 min > 1.935153803598e-05 max/min 5.127320165733e+05 > 176 KSP preconditioned resid norm 6.224066688220e-09 true resid norm > 1.307217422786e-01 ||r(i)||/||b|| 2.066291423857e-01 > 176 KSP Residual norm 6.224066688220e-09 % max 9.922168914276e+00 min > 1.934716877667e-05 max/min 5.128486254920e+05 > 177 KSP preconditioned resid norm 6.218867118158e-09 true resid norm > 1.306556581492e-01 ||r(i)||/||b|| 2.065246845753e-01 > 177 KSP Residual norm 6.218867118158e-09 % max 9.922226850895e+00 min > 1.934272085613e-05 max/min 5.129695519413e+05 > 178 KSP preconditioned resid norm 6.214644508175e-09 true resid norm > 1.306129149611e-01 ||r(i)||/||b|| 2.064571213057e-01 > 178 KSP Residual norm 6.214644508175e-09 % max 9.922355708265e+00 min > 1.933886017403e-05 max/min 5.130786209205e+05 > 179 KSP preconditioned resid norm 6.212855197040e-09 true resid norm > 1.305891120840e-01 ||r(i)||/||b|| 2.064194965924e-01 > 179 KSP Residual norm 6.212855197040e-09 % max 9.922451375570e+00 min > 1.933749030032e-05 max/min 5.131199148117e+05 > 180 KSP preconditioned resid norm 6.212625281595e-09 true resid norm > 1.305842539484e-01 ||r(i)||/||b|| 2.064118174383e-01 > 180 KSP Residual norm 6.212625281595e-09 % max 9.922736461793e+00 min > 1.933715169437e-05 max/min 5.131436427984e+05 > 181 KSP preconditioned resid norm 6.212400733084e-09 true resid norm > 1.305772830113e-01 ||r(i)||/||b|| 2.064007986230e-01 > 181 KSP Residual norm 6.212400733084e-09 % max 9.923063598858e+00 min > 1.933684249949e-05 max/min 5.131687657444e+05 > 182 KSP preconditioned resid norm 6.212071952354e-09 true resid norm > 1.305684442926e-01 ||r(i)||/||b|| 2.063868274438e-01 > 182 KSP Residual norm 6.212071952354e-09 % max 9.923063744763e+00 min > 1.933594544036e-05 max/min 5.131925809044e+05 > 183 KSP preconditioned resid norm 6.211566094656e-09 true resid norm > 1.305591570573e-01 ||r(i)||/||b|| 2.063721473039e-01 > 183 KSP Residual norm 6.211566094656e-09 % max 9.923141089945e+00 min > 1.933537478778e-05 max/min 5.132117271509e+05 > 184 KSP preconditioned resid norm 6.210461787001e-09 true resid norm > 1.305375085549e-01 ||r(i)||/||b|| 2.063379279660e-01 > 184 KSP Residual norm 6.210461787001e-09 % max 9.923201283329e+00 min > 1.933367383203e-05 max/min 5.132599923606e+05 > 185 KSP preconditioned resid norm 6.209143778669e-09 true resid norm > 1.305127223575e-01 ||r(i)||/||b|| 2.062987489387e-01 > 185 KSP Residual norm 6.209143778669e-09 % max 9.923212917908e+00 min > 1.933144232370e-05 max/min 5.133198419313e+05 > 186 KSP preconditioned resid norm 6.208771445688e-09 true resid norm > 1.305097304722e-01 ||r(i)||/||b|| 2.062940197276e-01 > 186 KSP Residual norm 6.208771445688e-09 % max 9.923281135224e+00 min > 1.933117841583e-05 max/min 5.133303786126e+05 > 187 KSP preconditioned resid norm 6.208512520426e-09 true resid norm > 1.305036180825e-01 ||r(i)||/||b|| 2.062843580003e-01 > 187 KSP Residual norm 6.208512520426e-09 % max 9.923314434905e+00 min > 1.933056167219e-05 max/min 5.133484791174e+05 > 188 KSP preconditioned resid norm 6.207066981762e-09 true resid norm > 1.304777614033e-01 ||r(i)||/||b|| 2.062434868846e-01 > 188 KSP Residual 
norm 6.207066981762e-09 % max 9.923347740518e+00 min > 1.932866808571e-05 max/min 5.134004938424e+05 > 189 KSP preconditioned resid norm 6.204934149804e-09 true resid norm > 1.304392715020e-01 ||r(i)||/||b|| 2.061826466973e-01 > 189 KSP Residual norm 6.204934149804e-09 % max 9.923362596625e+00 min > 1.932574598338e-05 max/min 5.134788900339e+05 > 190 KSP preconditioned resid norm 6.204155842095e-09 true resid norm > 1.304126326631e-01 ||r(i)||/||b|| 2.061405392380e-01 > 190 KSP Residual norm 6.204155842095e-09 % max 9.923514343512e+00 min > 1.932412287621e-05 max/min 5.135298718127e+05 > 191 KSP preconditioned resid norm 6.203518614933e-09 true resid norm > 1.303882235745e-01 ||r(i)||/||b|| 2.061019562988e-01 > 191 KSP Residual norm 6.203518614933e-09 % max 9.923630839564e+00 min > 1.932179761687e-05 max/min 5.135977012253e+05 > 192 KSP preconditioned resid norm 6.201048863957e-09 true resid norm > 1.303177921103e-01 ||r(i)||/||b|| 2.059906267464e-01 > 192 KSP Residual norm 6.201048863957e-09 % max 9.923652649804e+00 min > 1.931812341897e-05 max/min 5.136965136096e+05 > 193 KSP preconditioned resid norm 6.192716632256e-09 true resid norm > 1.301249902732e-01 ||r(i)||/||b|| 2.056858688878e-01 > 193 KSP Residual norm 6.192716632256e-09 % max 9.923855558469e+00 min > 1.930707285429e-05 max/min 5.140010416578e+05 > 194 KSP preconditioned resid norm 6.181798841032e-09 true resid norm > 1.298629857002e-01 ||r(i)||/||b|| 2.052717237022e-01 > 194 KSP Residual norm 6.181798841032e-09 % max 9.924185498149e+00 min > 1.929298985847e-05 max/min 5.143933403247e+05 > 195 KSP preconditioned resid norm 6.171531561212e-09 true resid norm > 1.295856208773e-01 ||r(i)||/||b|| 2.048332988886e-01 > 195 KSP Residual norm 6.171531561212e-09 % max 9.924254204639e+00 min > 1.928113063039e-05 max/min 5.147132911902e+05 > 196 KSP preconditioned resid norm 6.153444479612e-09 true resid norm > 1.291447908864e-01 ||r(i)||/||b|| 2.041364880798e-01 > 196 KSP Residual norm 6.153444479612e-09 % max 9.924316134490e+00 min > 1.926027722186e-05 max/min 5.152737948769e+05 > 197 KSP preconditioned resid norm 6.119723945157e-09 true resid norm > 1.283308296537e-01 ||r(i)||/||b|| 2.028498764687e-01 > 197 KSP Residual norm 6.119723945157e-09 % max 9.924316422607e+00 min > 1.922390011138e-05 max/min 5.162488550767e+05 > 198 KSP preconditioned resid norm 6.067997285763e-09 true resid norm > 1.271271480671e-01 ||r(i)||/||b|| 2.009472419903e-01 > 198 KSP Residual norm 6.067997285763e-09 % max 9.924442950178e+00 min > 1.916694880677e-05 max/min 5.177894014447e+05 > 199 KSP preconditioned resid norm 6.004429863992e-09 true resid norm > 1.256627003698e-01 ||r(i)||/||b|| 1.986324199378e-01 > 199 KSP Residual norm 6.004429863992e-09 % max 9.924866956836e+00 min > 1.909406330276e-05 max/min 5.197881037403e+05 > 200 KSP preconditioned resid norm 5.934094254933e-09 true resid norm > 1.240250400661e-01 ||r(i)||/||b|| 1.960438043167e-01 > 200 KSP Residual norm 5.934094254933e-09 % max 9.924973873650e+00 min > 1.900999030530e-05 max/min 5.220925268374e+05 > 201 KSP preconditioned resid norm 5.851144210814e-09 true resid norm > 1.220942283818e-01 ||r(i)||/||b|| 1.929918103984e-01 > 201 KSP Residual norm 5.851144210814e-09 % max 9.925050713110e+00 min > 1.890466927565e-05 max/min 5.250052549660e+05 > 202 KSP preconditioned resid norm 5.765248380990e-09 true resid norm > 1.201835481423e-01 ||r(i)||/||b|| 1.899716378366e-01 > 202 KSP Residual norm 5.765248380990e-09 % max 9.925088925004e+00 min > 1.879800779681e-05 max/min 5.279862117458e+05 > 203 KSP 
preconditioned resid norm 5.682453748888e-09 true resid norm > 1.183002353836e-01 ||r(i)||/||b|| 1.869947244831e-01 > 203 KSP Residual norm 5.682453748888e-09 % max 9.925585994328e+00 min > 1.868987652967e-05 max/min 5.310674994866e+05 > 204 KSP preconditioned resid norm 5.601209052384e-09 true resid norm > 1.165424983140e-01 ||r(i)||/||b|| 1.842163060127e-01 > 204 KSP Residual norm 5.601209052384e-09 % max 9.926813616373e+00 min > 1.859493580288e-05 max/min 5.338450060600e+05 > 205 KSP preconditioned resid norm 5.518585052549e-09 true resid norm > 1.147552536083e-01 ||r(i)||/||b|| 1.813912454350e-01 > 205 KSP Residual norm 5.518585052549e-09 % max 9.927547139863e+00 min > 1.849785740830e-05 max/min 5.366863264612e+05 > 206 KSP preconditioned resid norm 5.403946681825e-09 true resid norm > 1.122288671650e-01 ||r(i)||/||b|| 1.773978388675e-01 > 206 KSP Residual norm 5.403946681825e-09 % max 9.927598607359e+00 min > 1.836227794968e-05 max/min 5.406517990068e+05 > 207 KSP preconditioned resid norm 5.279599429018e-09 true resid norm > 1.096186442675e-01 ||r(i)||/||b|| 1.732719137586e-01 > 207 KSP Residual norm 5.279599429018e-09 % max 9.928135298387e+00 min > 1.822407087542e-05 max/min 5.447814248670e+05 > 208 KSP preconditioned resid norm 5.196372449667e-09 true resid norm > 1.079356390651e-01 ||r(i)||/||b|| 1.706116223982e-01 > 208 KSP Residual norm 5.196372449667e-09 % max 9.929022382947e+00 min > 1.813084812731e-05 max/min 5.476314352879e+05 > 209 KSP preconditioned resid norm 5.126109765538e-09 true resid norm > 1.065191875250e-01 ||r(i)||/||b|| 1.683726668744e-01 > 209 KSP Residual norm 5.126109765538e-09 % max 9.929797496117e+00 min > 1.805154321094e-05 max/min 5.500802551939e+05 > 210 KSP preconditioned resid norm 5.077403106475e-09 true resid norm > 1.055994890106e-01 ||r(i)||/||b|| 1.669189185387e-01 > 210 KSP Residual norm 5.077403106475e-09 % max 9.930393445658e+00 min > 1.799779398678e-05 max/min 5.517561459451e+05 > 211 KSP preconditioned resid norm 5.044178538989e-09 true resid norm > 1.049755475719e-01 ||r(i)||/||b|| 1.659326672683e-01 > 211 KSP Residual norm 5.044178538989e-09 % max 9.930465924340e+00 min > 1.796091021169e-05 max/min 5.528932446795e+05 > 212 KSP preconditioned resid norm 5.020571485059e-09 true resid norm > 1.045258936180e-01 ||r(i)||/||b|| 1.652219086046e-01 > 212 KSP Residual norm 5.020571485059e-09 % max 9.930491351737e+00 min > 1.793154500597e-05 max/min 5.538000963347e+05 > 213 KSP preconditioned resid norm 5.006825891577e-09 true resid norm > 1.042955061868e-01 ||r(i)||/||b|| 1.648577399783e-01 > 213 KSP Residual norm 5.006825891577e-09 % max 9.930514884346e+00 min > 1.791642205643e-05 max/min 5.542688631173e+05 > 214 KSP preconditioned resid norm 4.994301057584e-09 true resid norm > 1.040932063678e-01 ||r(i)||/||b|| 1.645379688570e-01 > 214 KSP Residual norm 4.994301057584e-09 % max 9.930647907797e+00 min > 1.790225819785e-05 max/min 5.547148185467e+05 > 215 KSP preconditioned resid norm 4.977855514222e-09 true resid norm > 1.038198654095e-01 ||r(i)||/||b|| 1.641059044826e-01 > 215 KSP Residual norm 4.977855514222e-09 % max 9.930666016381e+00 min > 1.788110391402e-05 max/min 5.553720879947e+05 > 216 KSP preconditioned resid norm 4.964348988837e-09 true resid norm > 1.036080455604e-01 ||r(i)||/||b|| 1.637710852474e-01 > 216 KSP Residual norm 4.964348988837e-09 % max 9.930679455140e+00 min > 1.786718726800e-05 max/min 5.558054161622e+05 > 217 KSP preconditioned resid norm 4.956200770500e-09 true resid norm > 1.035069953035e-01 ||r(i)||/||b|| 
1.636113572057e-01 > 217 KSP Residual norm 4.956200770500e-09 % max 9.930689869541e+00 min > 1.785815278053e-05 max/min 5.560871827891e+05 > 218 KSP preconditioned resid norm 4.950534860249e-09 true resid norm > 1.034473981952e-01 ||r(i)||/||b|| 1.635171532945e-01 > 218 KSP Residual norm 4.950534860249e-09 % max 9.930984291908e+00 min > 1.785402053178e-05 max/min 5.562323777005e+05 > 219 KSP preconditioned resid norm 4.942975355172e-09 true resid norm > 1.033459997641e-01 ||r(i)||/||b|| 1.633568749011e-01 > 219 KSP Residual norm 4.942975355172e-09 % max 9.931699409621e+00 min > 1.784719695142e-05 max/min 5.564851128529e+05 > 220 KSP preconditioned resid norm 4.931006016804e-09 true resid norm > 1.031436045505e-01 ||r(i)||/||b|| 1.630369529915e-01 > 220 KSP Residual norm 4.931006016804e-09 % max 9.931710539777e+00 min > 1.783542753853e-05 max/min 5.568529556311e+05 > 221 KSP preconditioned resid norm 4.918043165591e-09 true resid norm > 1.029101140458e-01 ||r(i)||/||b|| 1.626678794013e-01 > 221 KSP Residual norm 4.918043165591e-09 % max 9.931745055666e+00 min > 1.782321540888e-05 max/min 5.572364372995e+05 > 222 KSP preconditioned resid norm 4.897513457279e-09 true resid norm > 1.025741649722e-01 ||r(i)||/||b|| 1.621368516797e-01 > 222 KSP Residual norm 4.897513457279e-09 % max 9.931805660966e+00 min > 1.780346830739e-05 max/min 5.578579122612e+05 > 223 KSP preconditioned resid norm 4.860092454927e-09 true resid norm > 1.019662024304e-01 ||r(i)||/||b|| 1.611758579198e-01 > 223 KSP Residual norm 4.860092454927e-09 % max 9.932464226880e+00 min > 1.776986772785e-05 max/min 5.589498120639e+05 > 224 KSP preconditioned resid norm 4.769160522417e-09 true resid norm > 1.004448198639e-01 ||r(i)||/||b|| 1.587710401024e-01 > 224 KSP Residual norm 4.769160522417e-09 % max 9.933920635368e+00 min > 1.767636423838e-05 max/min 5.619889079792e+05 > 225 KSP preconditioned resid norm 4.634557449537e-09 true resid norm > 9.826212202717e-02 ||r(i)||/||b|| 1.553208949755e-01 > 225 KSP Residual norm 4.634557449537e-09 % max 9.935080165032e+00 min > 1.754021072797e-05 max/min 5.664173777106e+05 > 226 KSP preconditioned resid norm 4.491782226584e-09 true resid norm > 9.619719506576e-02 ||r(i)||/||b|| 1.520569078248e-01 > 226 KSP Residual norm 4.491782226584e-09 % max 9.937326276961e+00 min > 1.739963797630e-05 max/min 5.711225883262e+05 > 227 KSP preconditioned resid norm 4.362133112512e-09 true resid norm > 9.441389210582e-02 ||r(i)||/||b|| 1.492380778826e-01 > 227 KSP Residual norm 4.362133112512e-09 % max 9.946120307410e+00 min > 1.727701416548e-05 max/min 5.756851393504e+05 > 228 KSP preconditioned resid norm 4.225957340088e-09 true resid norm > 9.265275503217e-02 ||r(i)||/||b|| 1.464542850964e-01 > 228 KSP Residual norm 4.225957340088e-09 % max 9.985645498795e+00 min > 1.715055176814e-05 max/min 5.822346495782e+05 > 229 KSP preconditioned resid norm 4.092962778807e-09 true resid norm > 9.106715798430e-02 ||r(i)||/||b|| 1.439479647823e-01 > 229 KSP Residual norm 4.092962778807e-09 % max 1.012393782405e+01 min > 1.703438299138e-05 max/min 5.943237174583e+05 > 230 KSP preconditioned resid norm 3.953159424001e-09 true resid norm > 8.949933715671e-02 ||r(i)||/||b|| 1.414697429703e-01 > 230 KSP Residual norm 3.953159424001e-09 % max 1.034103229999e+01 min > 1.691398442562e-05 max/min 6.113894892986e+05 > 231 KSP preconditioned resid norm 3.819537372036e-09 true resid norm > 8.813823667573e-02 ||r(i)||/||b|| 1.393182797157e-01 > 231 KSP Residual norm 3.819537372036e-09 % max 1.059220161526e+01 min > 
1.680354842788e-05 max/min 6.303550503468e+05 > 232 KSP preconditioned resid norm 3.693790673013e-09 true resid norm > 8.693612541781e-02 ||r(i)||/||b|| 1.374181274232e-01 > 232 KSP Residual norm 3.693790673013e-09 % max 1.086546283828e+01 min > 1.670426209107e-05 max/min 6.504605099612e+05 > 233 KSP preconditioned resid norm 3.570890771949e-09 true resid norm > 8.586106485480e-02 ||r(i)||/||b|| 1.357188015247e-01 > 233 KSP Residual norm 3.570890771949e-09 % max 1.117129763635e+01 min > 1.660960311863e-05 max/min 6.725806484697e+05 > 234 KSP preconditioned resid norm 3.457471830300e-09 true resid norm > 8.494167552889e-02 ||r(i)||/||b|| 1.342655419168e-01 > 234 KSP Residual norm 3.457471830300e-09 % max 1.149605765262e+01 min > 1.652670056090e-05 max/min 6.956051276091e+05 > 235 KSP preconditioned resid norm 3.349671536188e-09 true resid norm > 8.413008092747e-02 ||r(i)||/||b|| 1.329826711905e-01 > 235 KSP Residual norm 3.349671536188e-09 % max 1.186263512521e+01 min > 1.644950134682e-05 max/min 7.211546949117e+05 > 236 KSP preconditioned resid norm 3.250181079812e-09 true resid norm > 8.343835240331e-02 ||r(i)||/||b|| 1.318892702824e-01 > 236 KSP Residual norm 3.250181079812e-09 % max 1.224121618785e+01 min > 1.638169066079e-05 max/min 7.472498682418e+05 > 237 KSP preconditioned resid norm 3.157573520911e-09 true resid norm > 8.283558144539e-02 ||r(i)||/||b|| 1.309364827513e-01 > 237 KSP Residual norm 3.157573520911e-09 % max 1.266651789872e+01 min > 1.631989107866e-05 max/min 7.761398552030e+05 > 238 KSP preconditioned resid norm 3.071923111478e-09 true resid norm > 8.231741040511e-02 ||r(i)||/||b|| 1.301174205525e-01 > 238 KSP Residual norm 3.071923111478e-09 % max 1.310242143029e+01 min > 1.626505408103e-05 max/min 8.055565856109e+05 > 239 KSP preconditioned resid norm 2.992453638560e-09 true resid norm > 8.186535317118e-02 ||r(i)||/||b|| 1.294028630739e-01 > 239 KSP Residual norm 2.992453638560e-09 % max 1.358095904032e+01 min > 1.621534964344e-05 max/min 8.375372310160e+05 > 240 KSP preconditioned resid norm 2.918686673178e-09 true resid norm > 8.147203530886e-02 ||r(i)||/||b|| 1.287811536998e-01 > 240 KSP Residual norm 2.918686673178e-09 % max 1.407595144572e+01 min > 1.617073516012e-05 max/min 8.704583499968e+05 > 241 KSP preconditioned resid norm 2.850021677599e-09 true resid norm > 8.112645247485e-02 ||r(i)||/||b|| 1.282348980933e-01 > 241 KSP Residual norm 2.850021677599e-09 % max 1.460780048328e+01 min > 1.613020605157e-05 max/min 9.056177234546e+05 > 242 KSP preconditioned resid norm 2.785952071661e-09 true resid norm > 8.082199238068e-02 ||r(i)||/||b|| 1.277536443473e-01 > 242 KSP Residual norm 2.785952071661e-09 % max 1.516164202801e+01 min > 1.609343254723e-05 max/min 9.421011946032e+05 > 243 KSP preconditioned resid norm 2.726003754945e-09 true resid norm > 8.055198108373e-02 ||r(i)||/||b|| 1.273268430995e-01 > 243 KSP Residual norm 2.726003754945e-09 % max 1.574736003328e+01 min > 1.605983429005e-05 max/min 9.805431207371e+05 > 244 KSP preconditioned resid norm 2.669760211669e-09 true resid norm > 8.031144058846e-02 ||r(i)||/||b|| 1.269466257357e-01 > 244 KSP Residual norm 2.669760211669e-09 % max 1.635815972012e+01 min > 1.602907156577e-05 max/min 1.020530705911e+06 > 245 KSP preconditioned resid norm 2.616859721135e-09 true resid norm > 8.009610298041e-02 ||r(i)||/||b|| 1.266062460521e-01 > 245 KSP Residual norm 2.616859721135e-09 % max 1.699685923744e+01 min > 1.600077930749e-05 max/min 1.062251963533e+06 > 246 KSP preconditioned resid norm 2.566982964804e-09 true 
resid norm > 7.990246732814e-02 ||r(i)||/||b|| 1.263001701991e-01 > 246 KSP Residual norm 2.566982964804e-09 % max 1.766055495648e+01 min > 1.597468378646e-05 max/min 1.105533930596e+06 > 247 KSP preconditioned resid norm 2.519853945818e-09 true resid norm > 7.972764043767e-02 ||r(i)||/||b|| 1.260238249653e-01 > 247 KSP Residual norm 2.519853945818e-09 % max 1.834829832678e+01 min > 1.595053475726e-05 max/min 1.150324964398e+06 > 248 KSP preconditioned resid norm 2.475228628479e-09 true resid norm > 7.956917868632e-02 ||r(i)||/||b|| 1.257733477668e-01 > 248 KSP Residual norm 2.475228628479e-09 % max 1.905813218687e+01 min > 1.592812497001e-05 max/min 1.196508203116e+06 > 249 KSP preconditioned resid norm 2.432893339181e-09 true resid norm > 7.942504309496e-02 ||r(i)||/||b|| 1.255455156318e-01 > 249 KSP Residual norm 2.432893339181e-09 % max 1.978773059579e+01 min > 1.590727268262e-05 max/min 1.243942377213e+06 > 250 KSP preconditioned resid norm 2.392658478629e-09 true resid norm > 7.929349666938e-02 ||r(i)||/||b|| 1.253375829297e-01 > 250 KSP Residual norm 2.392658478629e-09 % max 2.053507536526e+01 min > 1.588782169373e-05 max/min 1.292504143180e+06 > 251 KSP preconditioned resid norm 2.354356018003e-09 true resid norm > 7.917306646215e-02 ||r(i)||/||b|| 1.251472214030e-01 > 251 KSP Residual norm 2.354356018003e-09 % max 2.129774752042e+01 min > 1.586963551752e-05 max/min 1.342043898671e+06 > 252 KSP preconditioned resid norm 2.317836022639e-09 true resid norm > 7.906248909205e-02 ||r(i)||/||b|| 1.249724340512e-01 > 252 KSP Residual norm 2.317836022639e-09 % max 2.207376104408e+01 min > 1.585259496597e-05 max/min 1.392438341575e+06 > 253 KSP preconditioned resid norm 2.282964417342e-09 true resid norm > 7.896067936127e-02 ||r(i)||/||b|| 1.248115055248e-01 > 253 KSP Residual norm 2.282964417342e-09 % max 2.286109489220e+01 min > 1.583659527831e-05 max/min 1.443561225784e+06 > 254 KSP preconditioned resid norm 2.249620807694e-09 true resid norm > 7.886669910772e-02 ||r(i)||/||b|| 1.246629528903e-01 > 254 KSP Residual norm 2.249620807694e-09 % max 2.365806660683e+01 min > 1.582154408902e-05 max/min 1.495307061922e+06 > 255 KSP preconditioned resid norm 2.217696771416e-09 true resid norm > 7.877973447027e-02 ||r(i)||/||b|| 1.245254896945e-01 > 255 KSP Residual norm 2.217696771416e-09 % max 2.446314901661e+01 min > 1.580735961621e-05 max/min 1.547579710373e+06 > 256 KSP preconditioned resid norm 2.187094355924e-09 true resid norm > 7.869907621975e-02 ||r(i)||/||b|| 1.243979948735e-01 > 256 KSP Residual norm 2.187094355924e-09 % max 2.527506655730e+01 min > 1.579396921088e-05 max/min 1.600298583581e+06 > 257 KSP preconditioned resid norm 2.157724818661e-09 true resid norm > 7.862410399905e-02 ||r(i)||/||b|| 1.242794878418e-01 > 257 KSP Residual norm 2.157724818661e-09 % max 2.609272696980e+01 min > 1.578130810546e-05 max/min 1.653394433176e+06 > 258 KSP preconditioned resid norm 2.129507540860e-09 true resid norm > 7.855427316399e-02 ||r(i)||/||b|| 1.241691076915e-01 > 258 KSP Residual norm 2.129507540860e-09 % max 2.691522199241e+01 min > 1.576931837984e-05 max/min 1.706809472933e+06 > 259 KSP preconditioned resid norm 2.102369096407e-09 true resid norm > 7.848910387083e-02 ||r(i)||/||b|| 1.240660959436e-01 > 259 KSP Residual norm 2.102369096407e-09 % max 2.774179262325e+01 min > 1.575794808171e-05 max/min 1.760495242109e+06 > 260 KSP preconditioned resid norm 2.076242447040e-09 true resid norm > 7.842817196590e-02 ||r(i)||/||b|| 1.239697821473e-01 > 260 KSP Residual norm 2.076242447040e-09 % 
max 2.857180990120e+01 min > 1.574715047563e-05 max/min 1.814411435607e+06 > 261 KSP preconditioned resid norm 2.051066245234e-09 true resid norm > 7.837110133459e-02 ||r(i)||/||b|| 1.238795717860e-01 > 261 KSP Residual norm 2.051066245234e-09 % max 2.940475189721e+01 min > 1.573688340909e-05 max/min 1.868524480535e+06 > 262 KSP preconditioned resid norm 2.026784227773e-09 true resid norm > 7.831755746963e-02 ||r(i)||/||b|| 1.237949361110e-01 > 262 KSP Residual norm 2.026784227773e-09 % max 3.024018579353e+01 min > 1.572710876006e-05 max/min 1.922806426464e+06 > 263 KSP preconditioned resid norm 2.003344686512e-09 true resid norm > 7.826724200522e-02 ||r(i)||/||b|| 1.237154035016e-01 > 263 KSP Residual norm 2.003344686512e-09 % max 3.107775189605e+01 min > 1.571779197010e-05 max/min 1.977233949601e+06 > 264 KSP preconditioned resid norm 1.980700004942e-09 true resid norm > 7.821988809140e-02 ||r(i)||/||b|| 1.236405521538e-01 > 264 KSP Residual norm 1.980700004942e-09 % max 3.191715058147e+01 min > 1.570890163917e-05 max/min 2.031787537703e+06 > 265 KSP preconditioned resid norm 1.958806251109e-09 true resid norm > 7.817525644637e-02 ||r(i)||/||b|| 1.235700038397e-01 > 265 KSP Residual norm 1.958806251109e-09 % max 3.275813144817e+01 min > 1.570040916795e-05 max/min 2.086450811425e+06 > 266 KSP preconditioned resid norm 1.937622818970e-09 true resid norm > 7.813313196624e-02 ||r(i)||/||b|| 1.235034185490e-01 > 266 KSP Residual norm 1.937622818970e-09 % max 3.360048447534e+01 min > 1.569228845921e-05 max/min 2.141209968366e+06 > 267 KSP preconditioned resid norm 1.917112111503e-09 true resid norm > 7.809332083185e-02 ||r(i)||/||b|| 1.234404899160e-01 > 267 KSP Residual norm 1.917112111503e-09 % max 3.444403281522e+01 min > 1.568451564582e-05 max/min 2.196053330113e+06 > 268 KSP preconditioned resid norm 1.897239259969e-09 true resid norm > 7.805564799222e-02 ||r(i)||/||b|| 1.233809412410e-01 > 268 KSP Residual norm 1.897239259969e-09 % max 3.528862694058e+01 min > 1.567706885891e-05 max/min 2.250970972838e+06 > 269 KSP preconditioned resid norm 1.877971874523e-09 true resid norm > 7.801995501184e-02 ||r(i)||/||b|| 1.233245220884e-01 > 269 KSP Residual norm 1.877971874523e-09 % max 3.613413989135e+01 min > 1.566992802358e-05 max/min 2.305954426656e+06 > 270 KSP preconditioned resid norm 1.859279822146e-09 true resid norm > 7.798609818486e-02 ||r(i)||/||b|| 1.232710053053e-01 > 270 KSP Residual norm 1.859279822146e-09 % max 3.698046341191e+01 min > 1.566307468025e-05 max/min 2.360996430576e+06 > 271 KSP preconditioned resid norm 1.841135028406e-09 true resid norm > 7.795394691239e-02 ||r(i)||/||b|| 1.232201844568e-01 > 271 KSP Residual norm 1.841135028406e-09 % max 3.782750480653e+01 min > 1.565649181982e-05 max/min 2.416090733599e+06 > 272 KSP preconditioned resid norm 1.823511300102e-09 true resid norm > 7.792338226282e-02 ||r(i)||/||b|| 1.231718715502e-01 > 272 KSP Residual norm 1.823511300102e-09 % max 3.867518437327e+01 min > 1.565016375351e-05 max/min 2.471231929736e+06 > 273 KSP preconditioned resid norm 1.806384166231e-09 true resid norm > 7.789429574400e-02 ||r(i)||/||b|| 1.231258950942e-01 > 273 KSP Residual norm 1.806384166231e-09 % max 3.952343330397e+01 min > 1.564407597706e-05 max/min 2.526415325643e+06 > 274 KSP preconditioned resid norm 1.789730735084e-09 true resid norm > 7.786658820055e-02 ||r(i)||/||b|| 1.230820983558e-01 > 274 KSP Residual norm 1.789730735084e-09 % max 4.037219195996e+01 min > 1.563821506843e-05 max/min 2.581636828966e+06 > 275 KSP preconditioned resid norm 
1.773529565569e-09 true resid norm > 7.784016885169e-02 ||r(i)||/||b|| 1.230403378399e-01 > 275 KSP Residual norm 1.773529565569e-09 % max 4.122140845150e+01 min > 1.563256858590e-05 max/min 2.636892857690e+06 > 276 KSP preconditioned resid norm 1.757760551130e-09 true resid norm > 7.781495445520e-02 ||r(i)||/||b|| 1.230004819672e-01 > 276 KSP Residual norm 1.757760551130e-09 % max 4.207103746307e+01 min > 1.562712498178e-05 max/min 2.692180264260e+06 > 277 KSP preconditioned resid norm 1.742404814808e-09 true resid norm > 7.779086855347e-02 ||r(i)||/||b|| 1.229624098827e-01 > 277 KSP Residual norm 1.742404814808e-09 % max 4.292103927837e+01 min > 1.562187351950e-05 max/min 2.747496273402e+06 > 278 KSP preconditioned resid norm 1.727444614226e-09 true resid norm > 7.776784081058e-02 ||r(i)||/||b|| 1.229260104079e-01 > 278 KSP Residual norm 1.727444614226e-09 % max 4.377137896812e+01 min > 1.561680420878e-05 max/min 2.802838428588e+06 > 279 KSP preconditioned resid norm 1.712863255388e-09 true resid norm > 7.774580642352e-02 ||r(i)||/||b|| 1.228911811100e-01 > 279 KSP Residual norm 1.712863255388e-09 % max 4.462202571076e+01 min > 1.561190773954e-05 max/min 2.858204548427e+06 > 280 KSP preconditioned resid norm 1.698645014357e-09 true resid norm > 7.772470561649e-02 ||r(i)||/||b|| 1.228578275027e-01 > 280 KSP Residual norm 1.698645014357e-09 % max 4.547295222235e+01 min > 1.560717542407e-05 max/min 2.913592689695e+06 > 281 KSP preconditioned resid norm 1.684775065960e-09 true resid norm > 7.770448314865e-02 ||r(i)||/||b|| 1.228258622679e-01 > 281 KSP Residual norm 1.684775065960e-09 % max 4.632413427620e+01 min > 1.560259914800e-05 max/min 2.969001115571e+06 > 282 KSP preconditioned resid norm 1.671239418801e-09 true resid norm > 7.768508792437e-02 ||r(i)||/||b|| 1.227952046398e-01 > 282 KSP Residual norm 1.671239418801e-09 % max 4.717555029673e+01 min > 1.559817132342e-05 max/min 3.024428269094e+06 > 283 KSP preconditioned resid norm 1.658024855925e-09 true resid norm > 7.766647260718e-02 ||r(i)||/||b|| 1.227657797946e-01 > 283 KSP Residual norm 1.658024855925e-09 % max 4.802718101487e+01 min > 1.559388484496e-05 max/min 3.079872750914e+06 > 284 KSP preconditioned resid norm 1.645118880556e-09 true resid norm > 7.764859329326e-02 ||r(i)||/||b|| 1.227375183346e-01 > 284 KSP Residual norm 1.645118880556e-09 % max 4.887900917455e+01 min > 1.558973305319e-05 max/min 3.135333299665e+06 > 285 KSP preconditioned resid norm 1.632509666424e-09 true resid norm > 7.763140921182e-02 ||r(i)||/||b|| 1.227103558140e-01 > 285 KSP Residual norm 1.632509666424e-09 % max 4.973101928201e+01 min > 1.558570970162e-05 max/min 3.190808775095e+06 > 286 KSP preconditioned resid norm 1.620186012203e-09 true resid norm > 7.761488245375e-02 ||r(i)||/||b|| 1.226842323109e-01 > 286 KSP Residual norm 1.620186012203e-09 % max 5.058319739082e+01 min > 1.558180892380e-05 max/min 3.246298144084e+06 > 287 KSP preconditioned resid norm 1.608137299684e-09 true resid norm > 7.759897773695e-02 ||r(i)||/||b|| 1.226590920554e-01 > 287 KSP Residual norm 1.608137299684e-09 % max 5.143553091712e+01 min > 1.557802520420e-05 max/min 3.301800468474e+06 > 288 KSP preconditioned resid norm 1.596353455318e-09 true resid norm > 7.758366217724e-02 ||r(i)||/||b|| 1.226348830683e-01 > 288 KSP Residual norm 1.596353455318e-09 % max 5.228800848011e+01 min > 1.557435335439e-05 max/min 3.357314893935e+06 > 289 KSP preconditioned resid norm 1.584824914814e-09 true resid norm > 7.756890510550e-02 ||r(i)||/||b|| 1.226115568716e-01 > 289 KSP Residual 
norm 1.584824914814e-09 % max 5.314061976413e+01 min > 1.557078848607e-05 max/min 3.412840641416e+06 > 290 KSP preconditioned resid norm 1.573542590509e-09 true resid norm > 7.755467787456e-02 ||r(i)||/||b|| 1.225890681832e-01 > 290 KSP Residual norm 1.573542590509e-09 % max 5.399335539895e+01 min > 1.556732599304e-05 max/min 3.468376998278e+06 > 291 KSP preconditioned resid norm 1.562497841257e-09 true resid norm > 7.754095371195e-02 ||r(i)||/||b|| 1.225673746845e-01 > 291 KSP Residual norm 1.562497841257e-09 % max 5.484620685555e+01 min > 1.556396153017e-05 max/min 3.523923311507e+06 > 292 KSP preconditioned resid norm 1.551682444612e-09 true resid norm > 7.752770755925e-02 ||r(i)||/||b|| 1.225464367661e-01 > 292 KSP Residual norm 1.551682444612e-09 % max 5.569916635525e+01 min > 1.556069099359e-05 max/min 3.579478981892e+06 > 293 KSP preconditioned resid norm 1.541088571101e-09 true resid norm > 7.751491595478e-02 ||r(i)||/||b|| 1.225262173427e-01 > 293 KSP Residual norm 1.541088571101e-09 % max 5.655222679014e+01 min > 1.555751050459e-05 max/min 3.635043458494e+06 > 294 KSP preconditioned resid norm 1.530708760400e-09 true resid norm > 7.750255689784e-02 ||r(i)||/||b|| 1.225066816381e-01 > 294 KSP Residual norm 1.530708760400e-09 % max 5.740538165332e+01 min > 1.555441639733e-05 max/min 3.690616233161e+06 > 295 KSP preconditioned resid norm 1.520535899256e-09 true resid norm > 7.749060974081e-02 ||r(i)||/||b|| 1.224877970152e-01 > 295 KSP Residual norm 1.520535899256e-09 % max 5.825862497755e+01 min > 1.555140519978e-05 max/min 3.746196837465e+06 > 296 KSP preconditioned resid norm 1.510563200994e-09 true resid norm > 7.747905509827e-02 ||r(i)||/||b|| 1.224695328317e-01 > 296 KSP Residual norm 1.510563200994e-09 % max 5.911195128112e+01 min > 1.554847362481e-05 max/min 3.801784838019e+06 > 297 KSP preconditioned resid norm 1.500784186497e-09 true resid norm > 7.746787474440e-02 ||r(i)||/||b|| 1.224518602786e-01 > 297 KSP Residual norm 1.500784186497e-09 % max 5.996535552011e+01 min > 1.554561855686e-05 max/min 3.857379833473e+06 > 298 KSP preconditioned resid norm 1.491192666507e-09 true resid norm > 7.745705153527e-02 ||r(i)||/||b|| 1.224347522568e-01 > 298 KSP Residual norm 1.491192666507e-09 % max 6.081883304605e+01 min > 1.554283704130e-05 max/min 3.912981451485e+06 > 299 KSP preconditioned resid norm 1.481782725168e-09 true resid norm > 7.744656933035e-02 ||r(i)||/||b|| 1.224181832532e-01 > 299 KSP Residual norm 1.481782725168e-09 % max 6.167237956835e+01 min > 1.554012627155e-05 max/min 3.968589346744e+06 > 300 KSP preconditioned resid norm 1.472548704690e-09 true resid norm > 7.743641291307e-02 ||r(i)||/||b|| 1.224021292155e-01 > 300 KSP Residual norm 1.472548704690e-09 % max 6.252599112088e+01 min > 1.553748358279e-05 max/min 4.024203197881e+06 > 301 KSP preconditioned resid norm 1.463485191052e-09 true resid norm > 7.742656794078e-02 ||r(i)||/||b|| 1.223865674723e-01 > 301 KSP Residual norm 1.463485191052e-09 % max 6.337966403224e+01 min > 1.553490644415e-05 max/min 4.079822705085e+06 > 302 KSP preconditioned resid norm 1.454587000664e-09 true resid norm > 7.741702087545e-02 ||r(i)||/||b|| 1.223714766244e-01 > 302 KSP Residual norm 1.454587000664e-09 % max 6.423339489912e+01 min > 1.553239244746e-05 max/min 4.135447589055e+06 > 303 KSP preconditioned resid norm 1.445849167908e-09 true resid norm > 7.740775893208e-02 ||r(i)||/||b|| 1.223568364630e-01 > 303 KSP Residual norm 1.445849167908e-09 % max 6.508718056257e+01 min > 1.552993929984e-05 max/min 4.191077589287e+06 > 304 KSP 
preconditioned resid norm 1.437266933493e-09 true resid norm > 7.739877002315e-02 ||r(i)||/||b|| 1.223426278813e-01 > 304 KSP Residual norm 1.437266933493e-09 % max 6.594101808663e+01 min > 1.552754481805e-05 max/min 4.246712462229e+06 > 305 KSP preconditioned resid norm 1.428835733563e-09 true resid norm > 7.739004272067e-02 ||r(i)||/||b|| 1.223288328156e-01 > 305 KSP Residual norm 1.428835733563e-09 % max 6.679490473918e+01 min > 1.552520692457e-05 max/min 4.302351979186e+06 > 306 KSP preconditioned resid norm 1.420551189495e-09 true resid norm > 7.738156621021e-02 ||r(i)||/||b|| 1.223154341716e-01 > 306 KSP Residual norm 1.420551189495e-09 % max 6.764883797474e+01 min > 1.552292363761e-05 max/min 4.357995926157e+06 > 307 KSP preconditioned resid norm 1.412409098347e-09 true resid norm > 7.737333024472e-02 ||r(i)||/||b|| 1.223024157521e-01 > 307 KSP Residual norm 1.412409098347e-09 % max 6.850281541896e+01 min > 1.552069306562e-05 max/min 4.413644102704e+06 > 308 KSP preconditioned resid norm 1.404405423899e-09 true resid norm > 7.736532511275e-02 ||r(i)||/||b|| 1.222897622063e-01 > 308 KSP Residual norm 1.404405423899e-09 % max 6.935683485452e+01 min > 1.551851340462e-05 max/min 4.469296320217e+06 > 309 KSP preconditioned resid norm 1.396536288240e-09 true resid norm > 7.735754161515e-02 ||r(i)||/||b|| 1.222774589933e-01 > 309 KSP Residual norm 1.396536288240e-09 % max 7.021089420858e+01 min > 1.551638293250e-05 max/min 4.524952401215e+06 > 310 KSP preconditioned resid norm 1.388797963878e-09 true resid norm > 7.734997100748e-02 ||r(i)||/||b|| 1.222654922910e-01 > 310 KSP Residual norm 1.388797963878e-09 % max 7.106499154121e+01 min > 1.551430000199e-05 max/min 4.580612179221e+06 > 311 KSP preconditioned resid norm 1.381186866318e-09 true resid norm > 7.734260499993e-02 ||r(i)||/||b|| 1.222538489959e-01 > 311 KSP Residual norm 1.381186866318e-09 % max 7.191912503503e+01 min > 1.551226304067e-05 max/min 4.636275496779e+06 > 312 KSP preconditioned resid norm 1.373699547088e-09 true resid norm > 7.733543571137e-02 ||r(i)||/||b|| 1.222425166504e-01 > 312 KSP Residual norm 1.373699547088e-09 % max 7.277329298577e+01 min > 1.551027054238e-05 max/min 4.691942206096e+06 > 313 KSP preconditioned resid norm 1.366332687169e-09 true resid norm > 7.732845565140e-02 ||r(i)||/||b|| 1.222314834147e-01 > 313 KSP Residual norm 1.366332687169e-09 % max 7.362749379365e+01 min > 1.550832106753e-05 max/min 4.747612167238e+06 > 314 KSP preconditioned resid norm 1.359083090818e-09 true resid norm > 7.732165769144e-02 ||r(i)||/||b|| 1.222207380206e-01 > 314 KSP Residual norm 1.359083090818e-09 % max 7.448172595555e+01 min > 1.550641323816e-05 max/min 4.803285247957e+06 > 315 KSP preconditioned resid norm 1.351947679741e-09 true resid norm > 7.731503505485e-02 ||r(i)||/||b|| 1.222102697565e-01 > 315 KSP Residual norm 1.351947679741e-09 % max 7.533598805789e+01 min > 1.550454573403e-05 max/min 4.858961323358e+06 > 316 KSP preconditioned resid norm 1.344923487600e-09 true resid norm > 7.730858128088e-02 ||r(i)||/||b|| 1.222000684101e-01 > 316 KSP Residual norm 1.344923487600e-09 % max 7.619027877011e+01 min > 1.550271729250e-05 max/min 4.914640274512e+06 > 317 KSP preconditioned resid norm 1.338007654833e-09 true resid norm > 7.730229022367e-02 ||r(i)||/||b|| 1.221901242666e-01 > 317 KSP Residual norm 1.338007654833e-09 % max 7.704459683868e+01 min > 1.550092669784e-05 max/min 4.970321990454e+06 > 318 KSP preconditioned resid norm 1.331197423761e-09 true resid norm > 7.729615601478e-02 ||r(i)||/||b|| 
1.221804280500e-01 > 318 KSP Residual norm 1.331197423761e-09 % max 7.789894108168e+01 min > 1.549917278716e-05 max/min 5.026006365076e+06 > 319 KSP preconditioned resid norm 1.324490133971e-09 true resid norm > 7.729017306553e-02 ||r(i)||/||b|| 1.221709709264e-01 > 319 KSP Residual norm 1.324490133971e-09 % max 7.875331038375e+01 min > 1.549745444738e-05 max/min 5.081693296865e+06 > 320 KSP preconditioned resid norm 1.317883217946e-09 true resid norm > 7.728433604124e-02 ||r(i)||/||b|| 1.221617444634e-01 > 320 KSP Residual norm 1.317883217946e-09 % max 7.960770369153e+01 min > 1.549577060563e-05 max/min 5.137382690902e+06 > 321 KSP preconditioned resid norm 1.311374196936e-09 true resid norm > 7.727863984637e-02 ||r(i)||/||b|| 1.221527406065e-01 > 321 KSP Residual norm 1.311374196936e-09 % max 8.046212000946e+01 min > 1.549412023217e-05 max/min 5.193074456876e+06 > 322 KSP preconditioned resid norm 1.304960677053e-09 true resid norm > 7.727307961732e-02 ||r(i)||/||b|| 1.221439516680e-01 > 322 KSP Residual norm 1.304960677053e-09 % max 8.131655839582e+01 min > 1.549250234145e-05 max/min 5.248768507734e+06 > 323 KSP preconditioned resid norm 1.298640345569e-09 true resid norm > 7.726765070255e-02 ||r(i)||/||b|| 1.221353702952e-01 > 323 KSP Residual norm 1.298640345569e-09 % max 8.217101795924e+01 min > 1.549091598505e-05 max/min 5.304464761060e+06 > 324 KSP preconditioned resid norm 1.292410967416e-09 true resid norm > 7.726234866242e-02 ||r(i)||/||b|| 1.221269894705e-01 > 324 KSP Residual norm 1.292410967416e-09 % max 8.302549785536e+01 min > 1.548936024742e-05 max/min 5.360163139676e+06 > 325 KSP preconditioned resid norm 1.286270381860e-09 true resid norm > 7.725716924438e-02 ||r(i)||/||b|| 1.221188024720e-01 > 325 KSP Residual norm 1.286270381860e-09 % max 8.387999728379e+01 min > 1.548783425086e-05 max/min 5.415863569118e+06 > 326 KSP preconditioned resid norm 1.280216499357e-09 true resid norm > 7.725210838122e-02 ||r(i)||/||b|| 1.221108028707e-01 > 326 KSP Residual norm 1.280216499357e-09 % max 8.473451548533e+01 min > 1.548633715258e-05 max/min 5.471565977834e+06 > 327 KSP preconditioned resid norm 1.274247298561e-09 true resid norm > 7.724716218140e-02 ||r(i)||/||b|| 1.221029845154e-01 > 327 KSP Residual norm 1.274247298561e-09 % max 8.558905173932e+01 min > 1.548486813770e-05 max/min 5.527270298862e+06 > 328 KSP preconditioned resid norm 1.268360823492e-09 true resid norm > 7.724232691489e-02 ||r(i)||/||b|| 1.220953415101e-01 > 328 KSP Residual norm 1.268360823492e-09 % max 8.644360536128e+01 min > 1.548342642511e-05 max/min 5.582976467087e+06 > 329 KSP preconditioned resid norm 1.262555180844e-09 true resid norm > 7.723759900898e-02 ||r(i)||/||b|| 1.220878682074e-01 > 329 KSP Residual norm 1.262555180844e-09 % max 8.729817570067e+01 min > 1.548201126003e-05 max/min 5.638684421192e+06 > 330 KSP preconditioned resid norm 1.256828537422e-09 true resid norm > 7.723297504110e-02 ||r(i)||/||b|| 1.220805591975e-01 > 330 KSP Residual norm 1.256828537422e-09 % max 8.815276213883e+01 min > 1.548062191472e-05 max/min 5.694394102799e+06 > 331 KSP preconditioned resid norm 1.251179117715e-09 true resid norm > 7.722845172784e-02 ||r(i)||/||b|| 1.220734092902e-01 > 331 KSP Residual norm 1.251179117715e-09 % max 8.900736408702e+01 min > 1.547925768954e-05 max/min 5.750105455454e+06 > 332 KSP preconditioned resid norm 1.245605201579e-09 true resid norm > 7.722402591733e-02 ||r(i)||/||b|| 1.220664135034e-01 > 332 KSP Residual norm 1.245605201579e-09 % max 8.986198098471e+01 min > 
1.547791791175e-05 max/min 5.805818424484e+06 > 333 KSP preconditioned resid norm 1.240105122040e-09 true resid norm > 7.721969459269e-02 ||r(i)||/||b|| 1.220595670686e-01 > 333 KSP Residual norm 1.240105122040e-09 % max 9.071661229787e+01 min > 1.547660192837e-05 max/min 5.861532959091e+06 > 334 KSP preconditioned resid norm 1.234677263197e-09 true resid norm > 7.721545484810e-02 ||r(i)||/||b|| 1.220528653924e-01 > 334 KSP Residual norm 1.234677263197e-09 % max 9.157125751743e+01 min > 1.547530911137e-05 max/min 5.917249009919e+06 > 335 KSP preconditioned resid norm 1.229320058231e-09 true resid norm > 7.721130389656e-02 ||r(i)||/||b|| 1.220463040695e-01 > 335 KSP Residual norm 1.229320058231e-09 % max 9.242591615789e+01 min > 1.547403885462e-05 max/min 5.972966529699e+06 > 336 KSP preconditioned resid norm 1.224031987499e-09 true resid norm > 7.720723906513e-02 ||r(i)||/||b|| 1.220398788749e-01 > 336 KSP Residual norm 1.224031987499e-09 % max 9.328058775593e+01 min > 1.547279057027e-05 max/min 6.028685474176e+06 > 337 KSP preconditioned resid norm 1.218811576730e-09 true resid norm > 7.720325777856e-02 ||r(i)||/||b|| 1.220335857379e-01 > 337 KSP Residual norm 1.218811576730e-09 % max 9.413527186918e+01 min > 1.547156369733e-05 max/min 6.084405798326e+06 > 338 KSP preconditioned resid norm 1.213657395293e-09 true resid norm > 7.719935756389e-02 ||r(i)||/||b|| 1.220274207496e-01 > 338 KSP Residual norm 1.213657395293e-09 % max 9.498996807506e+01 min > 1.547035768835e-05 max/min 6.140127461085e+06 > 339 KSP preconditioned resid norm 1.208568054553e-09 true resid norm > 7.719553604324e-02 ||r(i)||/||b|| 1.220213801513e-01 > 339 KSP Residual norm 1.208568054553e-09 % max 9.584467596970e+01 min > 1.546917201274e-05 max/min 6.195850423720e+06 > 340 KSP preconditioned resid norm 1.203542206295e-09 true resid norm > 7.719179092494e-02 ||r(i)||/||b|| 1.220154603206e-01 > 340 KSP Residual norm 1.203542206295e-09 % max 9.669939516689e+01 min > 1.546800616544e-05 max/min 6.251574645926e+06 > 341 KSP preconditioned resid norm 1.198578541230e-09 true resid norm > 7.718812001002e-02 ||r(i)||/||b|| 1.220096577817e-01 > 341 KSP Residual norm 1.198578541230e-09 % max 9.755412529717e+01 min > 1.546685964925e-05 max/min 6.307300092550e+06 > 342 KSP preconditioned resid norm 1.193675787557e-09 true resid norm > 7.718452117706e-02 ||r(i)||/||b|| 1.220039691812e-01 > 342 KSP Residual norm 1.193675787557e-09 % max 9.840886600689e+01 min > 1.546573198975e-05 max/min 6.363026727225e+06 > 343 KSP preconditioned resid norm 1.188832709597e-09 true resid norm > 7.718099238265e-02 ||r(i)||/||b|| 1.219983912892e-01 > 343 KSP Residual norm 1.188832709597e-09 % max 9.926361695742e+01 min > 1.546462272342e-05 max/min 6.418754516855e+06 > 344 KSP preconditioned resid norm 1.184048106481e-09 true resid norm > 7.717753165982e-02 ||r(i)||/||b|| 1.219929209965e-01 > 344 KSP Residual norm 1.184048106481e-09 % max 1.001183778244e+02 min > 1.546353140667e-05 max/min 6.474483427579e+06 > 345 KSP preconditioned resid norm 1.179320810902e-09 true resid norm > 7.717413711596e-02 ||r(i)||/||b|| 1.219875553115e-01 > 345 KSP Residual norm 1.179320810902e-09 % max 1.009731482967e+02 min > 1.546245760538e-05 max/min 6.530213428787e+06 > 346 KSP preconditioned resid norm 1.174649687921e-09 true resid norm > 7.717080691875e-02 ||r(i)||/||b|| 1.219822913380e-01 > 346 KSP Residual norm 1.174649687921e-09 % max 1.018279280764e+02 min > 1.546140090201e-05 max/min 6.585944489880e+06 > 347 KSP preconditioned resid norm 1.170033633815e-09 true 
resid norm > 7.716753931049e-02 ||r(i)||/||b|| 1.219771262975e-01 > 347 KSP Residual norm 1.170033633815e-09 % max 1.026827168774e+02 min > 1.546036089024e-05 max/min 6.641676582224e+06 > 348 KSP preconditioned resid norm 1.165471574989e-09 true resid norm > 7.716433258709e-02 ||r(i)||/||b|| 1.219720574964e-01 > 348 KSP Residual norm 1.165471574989e-09 % max 1.035375144253e+02 min > 1.545933718045e-05 max/min 6.697409676543e+06 > 349 KSP preconditioned resid norm 1.160962466922e-09 true resid norm > 7.716118511043e-02 ||r(i)||/||b|| 1.219670823454e-01 > 349 KSP Residual norm 1.160962466922e-09 % max 1.043923204565e+02 min > 1.545832939107e-05 max/min 6.753143746360e+06 > 350 KSP preconditioned resid norm 1.156505293164e-09 true resid norm > 7.715809530016e-02 ||r(i)||/||b|| 1.219621983465e-01 > 350 KSP Residual norm 1.156505293164e-09 % max 1.052471347179e+02 min > 1.545733715254e-05 max/min 6.808878766069e+06 > 351 KSP preconditioned resid norm 1.152099064375e-09 true resid norm > 7.715506162638e-02 ||r(i)||/||b|| 1.219574030814e-01 > 351 KSP Residual norm 1.152099064375e-09 % max 1.061019569664e+02 min > 1.545636010731e-05 max/min 6.864614710693e+06 > 352 KSP preconditioned resid norm 1.147742817401e-09 true resid norm > 7.715208261314e-02 ||r(i)||/||b|| 1.219526942171e-01 > 352 KSP Residual norm 1.147742817401e-09 % max 1.069567869683e+02 min > 1.545539791075e-05 max/min 6.920351555224e+06 > 353 KSP preconditioned resid norm 1.143435614387e-09 true resid norm > 7.714915683577e-02 ||r(i)||/||b|| 1.219480695016e-01 > 353 KSP Residual norm 1.143435614387e-09 % max 1.078116244987e+02 min > 1.545445022700e-05 max/min 6.976089276237e+06 > 354 KSP preconditioned resid norm 1.139176541935e-09 true resid norm > 7.714628291836e-02 ||r(i)||/||b|| 1.219435267600e-01 > 354 KSP Residual norm 1.139176541935e-09 % max 1.086664693415e+02 min > 1.545351672773e-05 max/min 7.031827852270e+06 > 355 KSP preconditioned resid norm 1.134964710284e-09 true resid norm > 7.714345952721e-02 ||r(i)||/||b|| 1.219390638843e-01 > 355 KSP Residual norm 1.134964710284e-09 % max 1.095213212888e+02 min > 1.545259710251e-05 max/min 7.087567258902e+06 > 356 KSP preconditioned resid norm 1.130799252532e-09 true resid norm > 7.714068537802e-02 ||r(i)||/||b|| 1.219346788443e-01 > 356 KSP Residual norm 1.130799252532e-09 % max 1.103761801401e+02 min > 1.545169103719e-05 max/min 7.143307478410e+06 > 357 KSP preconditioned resid norm 1.126679323889e-09 true resid norm > 7.713795922628e-02 ||r(i)||/||b|| 1.219303696729e-01 > 357 KSP Residual norm 1.126679323889e-09 % max 1.112310457026e+02 min > 1.545079823844e-05 max/min 7.199048488371e+06 > 358 KSP preconditioned resid norm 1.122604100951e-09 true resid norm > 7.713527986831e-02 ||r(i)||/||b|| 1.219261344674e-01 > 358 KSP Residual norm 1.122604100951e-09 % max 1.120859177906e+02 min > 1.544991841664e-05 max/min 7.254790269309e+06 > 359 KSP preconditioned resid norm 1.118572781015e-09 true resid norm > 7.713264614048e-02 ||r(i)||/||b|| 1.219219713885e-01 > 359 KSP Residual norm 1.118572781015e-09 % max 1.129407962252e+02 min > 1.544905128897e-05 max/min 7.310532803124e+06 > 360 KSP preconditioned resid norm 1.114584581410e-09 true resid norm > 7.713005691674e-02 ||r(i)||/||b|| 1.219178786563e-01 > 360 KSP Residual norm 1.114584581410e-09 % max 1.137956808336e+02 min > 1.544819658611e-05 max/min 7.366276069785e+06 > 361 KSP preconditioned resid norm 1.110638738862e-09 true resid norm > 7.712751110547e-02 ||r(i)||/||b|| 1.219138545454e-01 > 361 KSP Residual norm 1.110638738862e-09 % 
max 1.146505714496e+02 min > 1.544735404005e-05 max/min 7.422020052905e+06 > 362 KSP preconditioned resid norm 1.106734508879e-09 true resid norm > 7.712500764805e-02 ||r(i)||/||b|| 1.219098973822e-01 > 362 KSP Residual norm 1.106734508879e-09 % max 1.155054679126e+02 min > 1.544652339415e-05 max/min 7.477764734840e+06 > 363 KSP preconditioned resid norm 1.102871165161e-09 true resid norm > 7.712254552760e-02 ||r(i)||/||b|| 1.219060055596e-01 > 363 KSP Residual norm 1.102871165161e-09 % max 1.163603700679e+02 min > 1.544570439898e-05 max/min 7.533510098483e+06 > 364 KSP preconditioned resid norm 1.099047999031e-09 true resid norm > 7.712012375229e-02 ||r(i)||/||b|| 1.219021775097e-01 > 364 KSP Residual norm 1.099047999031e-09 % max 1.172152777659e+02 min > 1.544489680841e-05 max/min 7.589256129056e+06 > 365 KSP preconditioned resid norm 1.095264318893e-09 true resid norm > 7.711774136137e-02 ||r(i)||/||b|| 1.218984117138e-01 > 365 KSP Residual norm 1.095264318893e-09 % max 1.180701908624e+02 min > 1.544410038906e-05 max/min 7.645002809354e+06 > 366 KSP preconditioned resid norm 1.091519449702e-09 true resid norm > 7.711539742160e-02 ||r(i)||/||b|| 1.218947066969e-01 > 366 KSP Residual norm 1.091519449702e-09 % max 1.189251092179e+02 min > 1.544331490869e-05 max/min 7.700750125285e+06 > 367 KSP preconditioned resid norm 1.087812732460e-09 true resid norm > 7.711309103674e-02 ||r(i)||/||b|| 1.218910610423e-01 > 367 KSP Residual norm 1.087812732460e-09 % max 1.197800326979e+02 min > 1.544254014346e-05 max/min 7.756498062180e+06 > 368 KSP preconditioned resid norm 1.084143523729e-09 true resid norm > 7.711082133212e-02 ||r(i)||/||b|| 1.218874733673e-01 > 368 KSP Residual norm 1.084143523729e-09 % max 1.206349611723e+02 min > 1.544177587792e-05 max/min 7.812246604670e+06 > 369 KSP preconditioned resid norm 1.080511195162e-09 true resid norm > 7.710858745788e-02 ||r(i)||/||b|| 1.218839423287e-01 > 369 KSP Residual norm 1.080511195162e-09 % max 1.214898945154e+02 min > 1.544102189887e-05 max/min 7.867995739603e+06 > 370 KSP preconditioned resid norm 1.076915133049e-09 true resid norm > 7.710638859341e-02 ||r(i)||/||b|| 1.218804666293e-01 > 370 KSP Residual norm 1.076915133049e-09 % max 1.223448326055e+02 min > 1.544027799826e-05 max/min 7.923745454537e+06 > 371 KSP preconditioned resid norm 1.073354737886e-09 true resid norm > 7.710422394121e-02 ||r(i)||/||b|| 1.218770450086e-01 > 371 KSP Residual norm 1.073354737886e-09 % max 1.231997753251e+02 min > 1.543954397783e-05 max/min 7.979495735240e+06 > 372 KSP preconditioned resid norm 1.069829423948e-09 true resid norm > 7.710209272561e-02 ||r(i)||/||b|| 1.218736762404e-01 > 372 KSP Residual norm 1.069829423948e-09 % max 1.240547225605e+02 min > 1.543881964126e-05 max/min 8.035246569561e+06 > 373 KSP preconditioned resid norm 1.066338618893e-09 true resid norm > 7.709999419813e-02 ||r(i)||/||b|| 1.218703591416e-01 > 373 KSP Residual norm 1.066338618893e-09 % max 1.249096742015e+02 min > 1.543810479771e-05 max/min 8.090997945550e+06 > 374 KSP preconditioned resid norm 1.062881763364e-09 true resid norm > 7.709792762746e-02 ||r(i)||/||b|| 1.218670925562e-01 > 374 KSP Residual norm 1.062881763364e-09 % max 1.257646301416e+02 min > 1.543739926076e-05 max/min 8.146749851917e+06 > 375 KSP preconditioned resid norm 1.059458310615e-09 true resid norm > 7.709589230803e-02 ||r(i)||/||b|| 1.218638753691e-01 > 375 KSP Residual norm 1.059458310615e-09 % max 1.266195902777e+02 min > 1.543670285377e-05 max/min 8.202502275075e+06 > 376 KSP preconditioned resid norm 
1.056067726151e-09 true resid norm > 7.709388754828e-02 ||r(i)||/||b|| 1.218607064870e-01 > 376 KSP Residual norm 1.056067726151e-09 % max 1.274745545100e+02 min > 1.543601539683e-05 max/min 8.258255205948e+06 > 377 KSP preconditioned resid norm 1.052709487372e-09 true resid norm > 7.709191268570e-02 ||r(i)||/||b|| 1.218575848627e-01 > 377 KSP Residual norm 1.052709487372e-09 % max 1.283295227415e+02 min > 1.543533671742e-05 max/min 8.314008634271e+06 > 378 KSP preconditioned resid norm 1.049383083240e-09 true resid norm > 7.708996706456e-02 ||r(i)||/||b|| 1.218545094598e-01 > 378 KSP Residual norm 1.049383083240e-09 % max 1.291844948787e+02 min > 1.543466665321e-05 max/min 8.369762546949e+06 > 379 KSP preconditioned resid norm 1.046088013948e-09 true resid norm > 7.708805005888e-02 ||r(i)||/||b|| 1.218514792888e-01 > 379 KSP Residual norm 1.046088013948e-09 % max 1.300394708306e+02 min > 1.543400503818e-05 max/min 8.425516935422e+06 > 380 KSP preconditioned resid norm 1.042823790607e-09 true resid norm > 7.708616105731e-02 ||r(i)||/||b|| 1.218484933832e-01 > 380 KSP Residual norm 1.042823790607e-09 % max 1.308944505091e+02 min > 1.543335171343e-05 max/min 8.481271789797e+06 > 381 KSP preconditioned resid norm 1.039589934939e-09 true resid norm > 7.708429945932e-02 ||r(i)||/||b|| 1.218455507940e-01 > 381 KSP Residual norm 1.039589934939e-09 % max 1.317494338290e+02 min > 1.543270652608e-05 max/min 8.537027099320e+06 > 382 KSP preconditioned resid norm 1.036385978986e-09 true resid norm > 7.708246468814e-02 ||r(i)||/||b|| 1.218426506093e-01 > 382 KSP Residual norm 1.036385978986e-09 % max 1.326044207073e+02 min > 1.543206932545e-05 max/min 8.592782854377e+06 > 383 KSP preconditioned resid norm 1.033211464824e-09 true resid norm > 7.708065617992e-02 ||r(i)||/||b|| 1.218397919379e-01 > 383 KSP Residual norm 1.033211464824e-09 % max 1.334594110638e+02 min > 1.543143995964e-05 max/min 8.648539048386e+06 > 384 KSP preconditioned resid norm 1.030065944287e-09 true resid norm > 7.707887338809e-02 ||r(i)||/||b|| 1.218369739159e-01 > 384 KSP Residual norm 1.030065944287e-09 % max 1.343144048204e+02 min > 1.543081828661e-05 max/min 8.704295671538e+06 > 385 KSP preconditioned resid norm 1.026948978700e-09 true resid norm > 7.707711577599e-02 ||r(i)||/||b|| 1.218341956950e-01 > 385 KSP Residual norm 1.026948978700e-09 % max 1.351694019016e+02 min > 1.543020417169e-05 max/min 8.760052712044e+06 > 386 KSP preconditioned resid norm 1.023860138629e-09 true resid norm > 7.707538282676e-02 ||r(i)||/||b|| 1.218314564581e-01 > 386 KSP Residual norm 1.023860138629e-09 % max 1.360244022339e+02 min > 1.542959747165e-05 max/min 8.815810165097e+06 > 387 KSP preconditioned resid norm 1.020799003623e-09 true resid norm > 7.707367403825e-02 ||r(i)||/||b|| 1.218287554116e-01 > 387 KSP Residual norm 1.020799003623e-09 % max 1.368794057461e+02 min > 1.542899805510e-05 max/min 8.871568021284e+06 > 388 KSP preconditioned resid norm 1.017765161980e-09 true resid norm > 7.707198891737e-02 ||r(i)||/||b|| 1.218260917761e-01 > 388 KSP Residual norm 1.017765161980e-09 % max 1.377344123689e+02 min > 1.542840579215e-05 max/min 8.927326272360e+06 > 389 KSP preconditioned resid norm 1.014758210512e-09 true resid norm > 7.707032698231e-02 ||r(i)||/||b|| 1.218234647898e-01 > 389 KSP Residual norm 1.014758210512e-09 % max 1.385894220351e+02 min > 1.542782055322e-05 max/min 8.983084911899e+06 > 390 KSP preconditioned resid norm 1.011777754323e-09 true resid norm > 7.706868777483e-02 ||r(i)||/||b|| 1.218208737286e-01 > 390 KSP Residual 
norm 1.011777754323e-09 % max 1.394444346793e+02 min > 1.542724221860e-05 max/min 9.038843929678e+06 > 391 KSP preconditioned resid norm 1.008823406584e-09 true resid norm > 7.706707083757e-02 ||r(i)||/||b|| 1.218183178695e-01 > 391 KSP Residual norm 1.008823406584e-09 % max 1.402994502381e+02 min > 1.542667066261e-05 max/min 9.094603320870e+06 > 392 KSP preconditioned resid norm 1.005894788330e-09 true resid norm > 7.706547572694e-02 ||r(i)||/||b|| 1.218157965112e-01 > 392 KSP Residual norm 1.005894788330e-09 % max 1.411544686498e+02 min > 1.542610576846e-05 max/min 9.150363077267e+06 > 393 KSP preconditioned resid norm 1.002991528252e-09 true resid norm > 7.706390201354e-02 ||r(i)||/||b|| 1.218133089752e-01 > 393 KSP Residual norm 1.002991528252e-09 % max 1.420094898544e+02 min > 1.542554741975e-05 max/min 9.206123192267e+06 > 394 KSP preconditioned resid norm 1.000113262500e-09 true resid norm > 7.706234928097e-02 ||r(i)||/||b|| 1.218108546031e-01 > 394 KSP Residual norm 1.000113262500e-09 % max 1.428645137936e+02 min > 1.542499550445e-05 max/min 9.261883658399e+06 > 395 KSP preconditioned resid norm 9.972596344905e-10 true resid norm > 7.706081711850e-02 ||r(i)||/||b|| 1.218084327457e-01 > 395 KSP Residual norm 9.972596344905e-10 % max 1.437195404107e+02 min > 1.542444990971e-05 max/min 9.317644470431e+06 > 396 KSP preconditioned resid norm 9.944302947220e-10 true resid norm > 7.705930512894e-02 ||r(i)||/||b|| 1.218060427752e-01 > 396 KSP Residual norm 9.944302947220e-10 % max 1.445745696505e+02 min > 1.542391053187e-05 max/min 9.373405619265e+06 > 397 KSP preconditioned resid norm 9.916249005950e-10 true resid norm > 7.705781292267e-02 ||r(i)||/||b|| 1.218036840757e-01 > 397 KSP Residual norm 9.916249005950e-10 % max 1.454296014594e+02 min > 1.542337726140e-05 max/min 9.429167100999e+06 > 398 KSP preconditioned resid norm 9.888431162377e-10 true resid norm > 7.705634012295e-02 ||r(i)||/||b|| 1.218013560518e-01 > 398 KSP Residual norm 9.888431162377e-10 % max 1.462846357853e+02 min > 1.542284999699e-05 max/min 9.484928908327e+06 > 399 KSP preconditioned resid norm 9.860846123368e-10 true resid norm > 7.705488636085e-02 ||r(i)||/||b|| 1.217990581203e-01 > 399 KSP Residual norm 9.860846123368e-10 % max 1.471396725772e+02 min > 1.542232863599e-05 max/min 9.540691036362e+06 > 400 KSP preconditioned resid norm 9.833490659742e-10 true resid norm > 7.705345127778e-02 ||r(i)||/||b|| 1.217967897143e-01 > 400 KSP Residual norm 9.833490659742e-10 % max 1.479947117857e+02 min > 1.542181308274e-05 max/min 9.596453477403e+06 > 401 KSP preconditioned resid norm 9.806361604678e-10 true resid norm > 7.705203451983e-02 ||r(i)||/||b|| 1.217945502744e-01 > 401 KSP Residual norm 9.806361604678e-10 % max 1.488497533627e+02 min > 1.542130323624e-05 max/min 9.652216228579e+06 > 402 KSP preconditioned resid norm 9.779455852179e-10 true resid norm > 7.705063574650e-02 ||r(i)||/||b|| 1.217923392624e-01 > 402 KSP Residual norm 9.779455852179e-10 % max 1.497047972613e+02 min > 1.542079900965e-05 max/min 9.707979279646e+06 > 403 KSP preconditioned resid norm 9.752770355574e-10 true resid norm > 7.704925462217e-02 ||r(i)||/||b|| 1.217901561479e-01 > 403 KSP Residual norm 9.752770355574e-10 % max 1.505598434358e+02 min > 1.542030029976e-05 max/min 9.763742632051e+06 > 404 KSP preconditioned resid norm 9.726302126073e-10 true resid norm > 7.704789082703e-02 ||r(i)||/||b|| 1.217880004253e-01 > 404 KSP Residual norm 9.726302126073e-10 % max 1.514148918418e+02 min > 1.541980702756e-05 max/min 9.819506273407e+06 > 405 KSP 
preconditioned resid norm 9.700048231357e-10 true resid norm > 7.704654403631e-02 ||r(i)||/||b|| 1.217858715812e-01 > 405 KSP Residual norm 9.700048231357e-10 % max 1.522699424359e+02 min > 1.541931909671e-05 max/min 9.875270203627e+06 > 406 KSP preconditioned resid norm 9.674005794219e-10 true resid norm > 7.704521394321e-02 ||r(i)||/||b|| 1.217837691306e-01 > 406 KSP Residual norm 9.674005794219e-10 % max 1.531249951760e+02 min > 1.541883642568e-05 max/min 9.931034414568e+06 > 407 KSP preconditioned resid norm 9.648171991231e-10 true resid norm > 7.704390024238e-02 ||r(i)||/||b|| 1.217816925910e-01 > 407 KSP Residual norm 9.648171991231e-10 % max 1.539800500208e+02 min > 1.541835892931e-05 max/min 9.986798901676e+06 > 408 KSP preconditioned resid norm 9.622544051468e-10 true resid norm > 7.704260264030e-02 ||r(i)||/||b|| 1.217796414984e-01 > 408 KSP Residual norm 9.622544051468e-10 % max 1.548351069303e+02 min > 1.541788651841e-05 max/min 1.004256366431e+07 > 409 KSP preconditioned resid norm 9.597119255252e-10 true resid norm > 7.704132084797e-02 ||r(i)||/||b|| 1.217776153959e-01 > 409 KSP Residual norm 9.597119255252e-10 % max 1.556901658654e+02 min > 1.541741912358e-05 max/min 1.009832869026e+07 > 410 KSP preconditioned resid norm 9.571894932944e-10 true resid norm > 7.704005458204e-02 ||r(i)||/||b|| 1.217756138356e-01 > 410 KSP Residual norm 9.571894932944e-10 % max 1.565452267880e+02 min > 1.541695665871e-05 max/min 1.015409397934e+07 > 411 KSP preconditioned resid norm 9.546868463763e-10 true resid norm > 7.703880356970e-02 ||r(i)||/||b|| 1.217736363864e-01

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/

From ys453 at cam.ac.uk Fri May 18 05:10:06 2018
From: ys453 at cam.ac.uk (Y. Shidi)
Date: Fri, 18 May 2018 11:10:06 +0100
Subject: [petsc-users] Linear iterative solver cannot converge
In-Reply-To:
References: <7bab65849037a08d487af6f860bc2658@cam.ac.uk>
Message-ID: <0f98542b5208fc578b9fea6806dd6b42@cam.ac.uk>

Hello Matt,

Thank you for your reply. It might be because the previous email was too long. Below are the subsequent iterations.

> In general, the iterative solver is very sensitive to the system. We recommend
> starting with a solver that has worked before for someone by checking the literature.

Ok, I will try to find it.
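For reference, the paired "preconditioned resid norm ... true resid norm ... ||r(i)||/||b||" and "% max ... min ... max/min" lines in the log look like the output of the -ksp_monitor_true_residual and -ksp_monitor_singular_value options (the last two columns being estimates of the extreme singular values of the preconditioned operator). Below is a minimal petsc4py sketch of trying a different solver/preconditioner combination through the options database; it assumes A and b are already assembled PETSc objects, and the particular choices (fgmres, jacobi, the tolerances) are only illustrative, not a recommendation from this thread.

# Sketch: pick the Krylov method and preconditioner via the options
# database instead of hard-coding ILU+GMRES, so combinations can be
# swapped without editing the code. A and b are assumed assembled.
from petsc4py import PETSc

opts = PETSc.Options()
opts["ksp_type"] = "fgmres"              # e.g. fgmres, gmres, bcgs, ...
opts["pc_type"] = "jacobi"               # or "none" to run unpreconditioned
opts["ksp_rtol"] = 1e-6
opts["ksp_max_it"] = 2000
opts["ksp_monitor_true_residual"] = ""   # print preconditioned and true residuals
opts["ksp_converged_reason"] = ""        # report why the iteration stopped

ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
ksp.setOperators(A)                      # A is the assembled system matrix
ksp.setFromOptions()                     # picks up the options set above

x = b.duplicate()                        # solution vector, zero initial guess
ksp.solve(b, x)
print("converged reason = %s, iterations = %d"
      % (ksp.getConvergedReason(), ksp.getIterationNumber()))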
Kind Regards, Shidi 411 KSP preconditioned resid norm 9.546868463763e-10 true resid norm 7.703880356970e-02 ||r(i)||/||b|| 1.217736363864e-01 411 KSP Residual norm 9.546868463763e-10 % max 1.574002896609e+02 min 1.541649904298e-05 max/min 1.020985952920e+07 412 KSP preconditioned resid norm 9.522037274646e-10 true resid norm 7.703756753884e-02 ||r(i)||/||b|| 1.217716826181e-01 412 KSP Residual norm 9.522037274646e-10 % max 1.582553544478e+02 min 1.541604620560e-05 max/min 1.026562533202e+07 413 KSP preconditioned resid norm 9.497398839135e-10 true resid norm 7.703634622699e-02 ||r(i)||/||b|| 1.217697521158e-01 413 KSP Residual norm 9.497398839135e-10 % max 1.591104211134e+02 min 1.541559807295e-05 max/min 1.032139138297e+07 414 KSP preconditioned resid norm 9.472950676296e-10 true resid norm 7.703513938163e-02 ||r(i)||/||b|| 1.217678444804e-01 414 KSP Residual norm 9.472950676296e-10 % max 1.599654896233e+02 min 1.541515456432e-05 max/min 1.037715768310e+07 415 KSP preconditioned resid norm 9.448690349674e-10 true resid norm 7.703394674743e-02 ||r(i)||/||b|| 1.217659593083e-01 415 KSP Residual norm 9.448690349674e-10 % max 1.608205599436e+02 min 1.541471562375e-05 max/min 1.043292421793e+07 416 KSP preconditioned resid norm 9.424615466266e-10 true resid norm 7.703276808042e-02 ||r(i)||/||b|| 1.217640962139e-01 416 KSP Residual norm 9.424615466266e-10 % max 1.616756320416e+02 min 1.541428116222e-05 max/min 1.048869099636e+07 417 KSP preconditioned resid norm 9.400723675537e-10 true resid norm 7.703160313689e-02 ||r(i)||/||b|| 1.217622548118e-01 417 KSP Residual norm 9.400723675537e-10 % max 1.625307058852e+02 min 1.541385112375e-05 max/min 1.054445800601e+07 418 KSP preconditioned resid norm 9.377012668452e-10 true resid norm 7.703045169141e-02 ||r(i)||/||b|| 1.217604347459e-01 418 KSP Residual norm 9.377012668452e-10 % max 1.633857814432e+02 min 1.541342543594e-05 max/min 1.060022524664e+07 419 KSP preconditioned resid norm 9.353480176536e-10 true resid norm 7.702931350860e-02 ||r(i)||/||b|| 1.217586356440e-01 419 KSP Residual norm 9.353480176536e-10 % max 1.642408586848e+02 min 1.541300403949e-05 max/min 1.065599271005e+07 420 KSP preconditioned resid norm 9.330123970972e-10 true resid norm 7.702818836330e-02 ||r(i)||/||b|| 1.217568571502e-01 420 KSP Residual norm 9.330123970972e-10 % max 1.650959375804e+02 min 1.541258686504e-05 max/min 1.071176039597e+07 421 KSP preconditioned resid norm 9.306941861701e-10 true resid norm 7.702707604016e-02 ||r(i)||/||b|| 1.217550989241e-01 421 KSP Residual norm 9.306941861701e-10 % max 1.659510181008e+02 min 1.541217384870e-05 max/min 1.076752830132e+07 422 KSP preconditioned resid norm 9.283931696573e-10 true resid norm 7.702597632846e-02 ||r(i)||/||b|| 1.217533606326e-01 422 KSP Residual norm 9.283931696573e-10 % max 1.668061002175e+02 min 1.541176492885e-05 max/min 1.082329642241e+07 423 KSP preconditioned resid norm 9.261091360496e-10 true resid norm 7.702488900721e-02 ||r(i)||/||b|| 1.217516419265e-01 423 KSP Residual norm 9.261091360496e-10 % max 1.676611839026e+02 min 1.541136004679e-05 max/min 1.087906475442e+07 424 KSP preconditioned resid norm 9.238418774628e-10 true resid norm 7.702381387948e-02 ||r(i)||/||b|| 1.217499424945e-01 424 KSP Residual norm 9.238418774628e-10 % max 1.685162691291e+02 min 1.541095914290e-05 max/min 1.093483329406e+07 425 KSP preconditioned resid norm 9.215911895577e-10 true resid norm 7.702275074538e-02 ||r(i)||/||b|| 1.217482620205e-01 425 KSP Residual norm 9.215911895577e-10 % max 1.693713558704e+02 min 
1.541056215850e-05 max/min 1.099060203829e+07 426 KSP preconditioned resid norm 9.193568714629e-10 true resid norm 7.702169940327e-02 ||r(i)||/||b|| 1.217466001858e-01 426 KSP Residual norm 9.193568714629e-10 % max 1.702264441005e+02 min 1.541016903696e-05 max/min 1.104637098349e+07 427 KSP preconditioned resid norm 9.171387256997e-10 true resid norm 7.702065966772e-02 ||r(i)||/||b|| 1.217449566974e-01 427 KSP Residual norm 9.171387256997e-10 % max 1.710815337941e+02 min 1.540977971745e-05 max/min 1.110214012991e+07 428 KSP preconditioned resid norm 9.149365581086e-10 true resid norm 7.701963134449e-02 ||r(i)||/||b|| 1.217433312483e-01 428 KSP Residual norm 9.149365581086e-10 % max 1.719366249265e+02 min 1.540939414902e-05 max/min 1.115790947157e+07 429 KSP preconditioned resid norm 9.127501777777e-10 true resid norm 7.701861425445e-02 ||r(i)||/||b|| 1.217417235552e-01 429 KSP Residual norm 9.127501777777e-10 % max 1.727917174733e+02 min 1.540901228158e-05 max/min 1.121367900263e+07 430 KSP preconditioned resid norm 9.105793969736e-10 true resid norm 7.701760821750e-02 ||r(i)||/||b|| 1.217401333335e-01 430 KSP Residual norm 9.105793969736e-10 % max 1.736468114109e+02 min 1.540863405594e-05 max/min 1.126944872468e+07 431 KSP preconditioned resid norm 9.084240310736e-10 true resid norm 7.701661305348e-02 ||r(i)||/||b|| 1.217385602984e-01 431 KSP Residual norm 9.084240310736e-10 % max 1.745019067161e+02 min 1.540825942103e-05 max/min 1.132521863423e+07 432 KSP preconditioned resid norm 9.062838984992e-10 true resid norm 7.701562859033e-02 ||r(i)||/||b|| 1.217370041779e-01 432 KSP Residual norm 9.062838984992e-10 % max 1.753570033662e+02 min 1.540788832843e-05 max/min 1.138098872658e+07 433 KSP preconditioned resid norm 9.041588206529e-10 true resid norm 7.701465466315e-02 ||r(i)||/||b|| 1.217354647115e-01 433 KSP Residual norm 9.041588206529e-10 % max 1.762121013392e+02 min 1.540752072662e-05 max/min 1.143675900009e+07 434 KSP preconditioned resid norm 9.020486218545e-10 true resid norm 7.701369109927e-02 ||r(i)||/||b|| 1.217339416261e-01 434 KSP Residual norm 9.020486218545e-10 % max 1.770672006133e+02 min 1.540715656775e-05 max/min 1.149252945115e+07 435 KSP preconditioned resid norm 8.999531292809e-10 true resid norm 7.701273774708e-02 ||r(i)||/||b|| 1.217324346821e-01 435 KSP Residual norm 8.999531292809e-10 % max 1.779223011674e+02 min 1.540679580267e-05 max/min 1.154830007785e+07 436 KSP preconditioned resid norm 8.978721729060e-10 true resid norm 7.701179443778e-02 ||r(i)||/||b|| 1.217309436126e-01 436 KSP Residual norm 8.978721729060e-10 % max 1.787774029806e+02 min 1.540643838466e-05 max/min 1.160407087718e+07 437 KSP preconditioned resid norm 8.958055854435e-10 true resid norm 7.701086102012e-02 ||r(i)||/||b|| 1.217294681788e-01 437 KSP Residual norm 8.958055854435e-10 % max 1.796325060327e+02 min 1.540608426227e-05 max/min 1.165984185044e+07 438 KSP preconditioned resid norm 8.937532022898e-10 true resid norm 7.700993734290e-02 ||r(i)||/||b|| 1.217280081414e-01 438 KSP Residual norm 8.937532022898e-10 % max 1.804876103037e+02 min 1.540573340057e-05 max/min 1.171561298711e+07 439 KSP preconditioned resid norm 8.917148614689e-10 true resid norm 7.700902325376e-02 ||r(i)||/||b|| 1.217265632596e-01 439 KSP Residual norm 8.917148614689e-10 % max 1.813427157743e+02 min 1.540538574951e-05 max/min 1.177138428877e+07 440 KSP preconditioned resid norm 8.896904035794e-10 true resid norm 7.700811860899e-02 ||r(i)||/||b|| 1.217251333064e-01 440 KSP Residual norm 8.896904035794e-10 % max 
1.821978224254e+02 min 1.540504126229e-05 max/min 1.182715575526e+07 441 KSP preconditioned resid norm 8.876796717414e-10 true resid norm 7.700722326477e-02 ||r(i)||/||b|| 1.217237180544e-01 441 KSP Residual norm 8.876796717414e-10 % max 1.830529302383e+02 min 1.540469989749e-05 max/min 1.188292738297e+07 442 KSP preconditioned resid norm 8.856825115457e-10 true resid norm 7.700633707629e-02 ||r(i)||/||b|| 1.217223172747e-01 442 KSP Residual norm 8.856825115457e-10 % max 1.839080391948e+02 min 1.540436161576e-05 max/min 1.193869916730e+07 443 KSP preconditioned resid norm 8.836987710040e-10 true resid norm 7.700545991462e-02 ||r(i)||/||b|| 1.217209307635e-01 443 KSP Residual norm 8.836987710040e-10 % max 1.847631492771e+02 min 1.540402637418e-05 max/min 1.199447110703e+07 444 KSP preconditioned resid norm 8.817283005006e-10 true resid norm 7.700459163848e-02 ||r(i)||/||b|| 1.217195582974e-01 444 KSP Residual norm 8.817283005006e-10 % max 1.856182604675e+02 min 1.540369412984e-05 max/min 1.205024320159e+07 445 KSP preconditioned resid norm 8.797709527445e-10 true resid norm 7.700373211375e-02 ||r(i)||/||b|| 1.217181996645e-01 445 KSP Residual norm 8.797709527445e-10 % max 1.864733727491e+02 min 1.540336484553e-05 max/min 1.210601544657e+07 446 KSP preconditioned resid norm 8.778265827233e-10 true resid norm 7.700288121255e-02 ||r(i)||/||b|| 1.217168546627e-01 446 KSP Residual norm 8.778265827233e-10 % max 1.873284861049e+02 min 1.540303848040e-05 max/min 1.216178784097e+07 447 KSP preconditioned resid norm 8.758950476583e-10 true resid norm 7.700203881086e-02 ||r(i)||/||b|| 1.217155230958e-01 447 KSP Residual norm 8.758950476583e-10 % max 1.881836005185e+02 min 1.540271499238e-05 max/min 1.221756038540e+07 448 KSP preconditioned resid norm 8.739762069604e-10 true resid norm 7.700120478187e-02 ||r(i)||/||b|| 1.217142047635e-01 448 KSP Residual norm 8.739762069604e-10 % max 1.890387159738e+02 min 1.540239435002e-05 max/min 1.227333307263e+07 449 KSP preconditioned resid norm 8.720699221868e-10 true resid norm 7.700037900278e-02 ||r(i)||/||b|| 1.217128994717e-01 449 KSP Residual norm 8.720699221868e-10 % max 1.898938324550e+02 min 1.540207651332e-05 max/min 1.232910590276e+07 450 KSP preconditioned resid norm 8.701760569996e-10 true resid norm 7.699956135165e-02 ||r(i)||/||b|| 1.217116070275e-01 450 KSP Residual norm 8.701760569996e-10 % max 1.907489499467e+02 min 1.540176144765e-05 max/min 1.238487887214e+07 451 KSP preconditioned resid norm 8.682944771243e-10 true resid norm 7.699875171161e-02 ||r(i)||/||b|| 1.217103272463e-01 451 KSP Residual norm 8.682944771243e-10 % max 1.916040684336e+02 min 1.540144910772e-05 max/min 1.244065198628e+07 452 KSP preconditioned resid norm 8.664250503101e-10 true resid norm 7.699794997108e-02 ||r(i)||/||b|| 1.217090599517e-01 452 KSP Residual norm 8.664250503101e-10 % max 1.924591879009e+02 min 1.540113947663e-05 max/min 1.249642522834e+07 453 KSP preconditioned resid norm 8.645676462907e-10 true resid norm 7.699715601200e-02 ||r(i)||/||b|| 1.217078049571e-01 453 KSP Residual norm 8.645676462907e-10 % max 1.933143083340e+02 min 1.540083249795e-05 max/min 1.255219861393e+07 454 KSP preconditioned resid norm 8.627221367466e-10 true resid norm 7.699636972505e-02 ||r(i)||/||b|| 1.217065620897e-01 454 KSP Residual norm 8.627221367466e-10 % max 1.941694297187e+02 min 1.540052815040e-05 max/min 1.260797213072e+07 455 KSP preconditioned resid norm 8.608883952670e-10 true resid norm 7.699559100258e-02 ||r(i)||/||b|| 1.217053311792e-01 455 KSP Residual norm 
8.608883952670e-10 % max 1.950245520408e+02 min 1.540022639631e-05 max/min 1.266374578023e+07 456 KSP preconditioned resid norm 8.590662973146e-10 true resid norm 7.699481973160e-02 ||r(i)||/||b|| 1.217041120472e-01 456 KSP Residual norm 8.590662973146e-10 % max 1.958796752868e+02 min 1.539992720743e-05 max/min 1.271951955670e+07 457 KSP preconditioned resid norm 8.572557201889e-10 true resid norm 7.699405581233e-02 ||r(i)||/||b|| 1.217029045359e-01 457 KSP Residual norm 8.572557201889e-10 % max 1.967347994430e+02 min 1.539963054120e-05 max/min 1.277529346672e+07 458 KSP preconditioned resid norm 8.554565429923e-10 true resid norm 7.699329914254e-02 ||r(i)||/||b|| 1.217017084837e-01 458 KSP Residual norm 8.554565429923e-10 % max 1.975899244964e+02 min 1.539933638367e-05 max/min 1.283106749365e+07 459 KSP preconditioned resid norm 8.536686465960e-10 true resid norm 7.699254961914e-02 ||r(i)||/||b|| 1.217005237276e-01 459 KSP Residual norm 8.536686465960e-10 % max 1.984450504338e+02 min 1.539904468019e-05 max/min 1.288684165512e+07 460 KSP preconditioned resid norm 8.518919136065e-10 true resid norm 7.699180714235e-02 ||r(i)||/||b|| 1.216993501100e-01 460 KSP Residual norm 8.518919136065e-10 % max 1.993001772428e+02 min 1.539875541750e-05 max/min 1.294261593481e+07 461 KSP preconditioned resid norm 8.501262283338e-10 true resid norm 7.699107161430e-02 ||r(i)||/||b|| 1.216981874761e-01 461 KSP Residual norm 8.501262283338e-10 % max 2.001553049106e+02 min 1.539846855934e-05 max/min 1.299839033598e+07 462 KSP preconditioned resid norm 8.483714767594e-10 true resid norm 7.699034293767e-02 ||r(i)||/||b|| 1.216970356721e-01 462 KSP Residual norm 8.483714767594e-10 % max 2.010104334253e+02 min 1.539818407370e-05 max/min 1.305416485887e+07 463 KSP preconditioned resid norm 8.466275465054e-10 true resid norm 7.698962102339e-02 ||r(i)||/||b|| 1.216958945572e-01 463 KSP Residual norm 8.466275465054e-10 % max 2.018655627746e+02 min 1.539790193476e-05 max/min 1.310993949889e+07 464 KSP preconditioned resid norm 8.448943268044e-10 true resid norm 7.698890577352e-02 ||r(i)||/||b|| 1.216947639765e-01 464 KSP Residual norm 8.448943268044e-10 % max 2.027206929469e+02 min 1.539762211089e-05 max/min 1.316571425685e+07 465 KSP preconditioned resid norm 8.431717084698e-10 true resid norm 7.698819709932e-02 ||r(i)||/||b|| 1.216936437900e-01 465 KSP Residual norm 8.431717084698e-10 % max 2.035758239306e+02 min 1.539734457892e-05 max/min 1.322148912673e+07 466 KSP preconditioned resid norm 8.414595838671e-10 true resid norm 7.698749491585e-02 ||r(i)||/||b|| 1.216925338631e-01 466 KSP Residual norm 8.414595838671e-10 % max 2.044309557144e+02 min 1.539706930707e-05 max/min 1.327726411029e+07 467 KSP preconditioned resid norm 8.397578468854e-10 true resid norm 7.698679913097e-02 ||r(i)||/||b|| 1.216914340504e-01 467 KSP Residual norm 8.397578468854e-10 % max 2.052860882871e+02 min 1.539679626817e-05 max/min 1.333303920580e+07 468 KSP preconditioned resid norm 8.380663929100e-10 true resid norm 7.698610965659e-02 ||r(i)||/||b|| 1.216903442126e-01 468 KSP Residual norm 8.380663929100e-10 % max 2.061412216378e+02 min 1.539652542826e-05 max/min 1.338881441779e+07 469 KSP preconditioned resid norm 8.363851187952e-10 true resid norm 7.698542641515e-02 ||r(i)||/||b|| 1.216892642271e-01 469 KSP Residual norm 8.363851187952e-10 % max 2.069963557558e+02 min 1.539625677349e-05 max/min 1.344458973379e+07 470 KSP preconditioned resid norm 8.347139228382e-10 true resid norm 7.698474931774e-02 ||r(i)||/||b|| 1.216881939532e-01 470 
KSP Residual norm 8.347139228382e-10 % max 2.078514906305e+02 min 1.539599027325e-05 max/min 1.350036515622e+07 471 KSP preconditioned resid norm 8.330527047528e-10 true resid norm 7.698407828643e-02 ||r(i)||/||b|| 1.216871332680e-01 471 KSP Residual norm 8.330527047528e-10 % max 2.087066262516e+02 min 1.539572589156e-05 max/min 1.355614069266e+07 472 KSP preconditioned resid norm 8.314013656445e-10 true resid norm 7.698341323890e-02 ||r(i)||/||b|| 1.216860820412e-01 472 KSP Residual norm 8.314013656445e-10 % max 2.095617626089e+02 min 1.539546361499e-05 max/min 1.361191633131e+07 473 KSP preconditioned resid norm 8.297598079857e-10 true resid norm 7.698275409694e-02 ||r(i)||/||b|| 1.216850401492e-01 473 KSP Residual norm 8.297598079857e-10 % max 2.104168996925e+02 min 1.539520341682e-05 max/min 1.366769207237e+07 474 KSP preconditioned resid norm 8.281279355912e-10 true resid norm 7.698210078561e-02 ||r(i)||/||b|| 1.216840074735e-01 474 KSP Residual norm 8.281279355912e-10 % max 2.112720374926e+02 min 1.539494527436e-05 max/min 1.372346791285e+07 475 KSP preconditioned resid norm 8.265056535949e-10 true resid norm 7.698145322348e-02 ||r(i)||/||b|| 1.216829838855e-01 475 KSP Residual norm 8.265056535949e-10 % max 2.121271759996e+02 min 1.539468915743e-05 max/min 1.377924385679e+07 476 KSP preconditioned resid norm 8.248928684267e-10 true resid norm 7.698081134149e-02 ||r(i)||/||b|| 1.216819692760e-01 476 KSP Residual norm 8.248928684267e-10 % max 2.129823152040e+02 min 1.539443504456e-05 max/min 1.383501990087e+07 477 KSP preconditioned resid norm 8.232894877893e-10 true resid norm 7.698017506308e-02 ||r(i)||/||b|| 1.216809635239e-01 477 KSP Residual norm 8.232894877893e-10 % max 2.138374550966e+02 min 1.539418291104e-05 max/min 1.389079604499e+07 478 KSP preconditioned resid norm 8.216954206367e-10 true resid norm 7.697954431722e-02 ||r(i)||/||b|| 1.216799665171e-01 478 KSP Residual norm 8.216954206367e-10 % max 2.146925956681e+02 min 1.539393273158e-05 max/min 1.394657228998e+07 479 KSP preconditioned resid norm 8.201105771517e-10 true resid norm 7.697891903432e-02 ||r(i)||/||b|| 1.216789781454e-01 479 KSP Residual norm 8.201105771517e-10 % max 2.155477369098e+02 min 1.539368449054e-05 max/min 1.400234862825e+07 480 KSP preconditioned resid norm 8.185348687256e-10 true resid norm 7.697829914078e-02 ||r(i)||/||b|| 1.216779982925e-01 480 KSP Residual norm 8.185348687256e-10 % max 2.164028788127e+02 min 1.539343816240e-05 max/min 1.405812506145e+07 481 KSP preconditioned resid norm 8.169682079365e-10 true resid norm 7.697768457143e-02 ||r(i)||/||b|| 1.216770268555e-01 481 KSP Residual norm 8.169682079365e-10 % max 2.172580213682e+02 min 1.539319372853e-05 max/min 1.411390158532e+07 482 KSP preconditioned resid norm 8.154105085294e-10 true resid norm 7.697707525626e-02 ||r(i)||/||b|| 1.216760637237e-01 482 KSP Residual norm 8.154105085294e-10 % max 2.181131645679e+02 min 1.539295116083e-05 max/min 1.416967820458e+07 483 KSP preconditioned resid norm 8.138616853962e-10 true resid norm 7.697647113181e-02 ||r(i)||/||b|| 1.216751087967e-01 483 KSP Residual norm 8.138616853962e-10 % max 2.189683084033e+02 min 1.539271043853e-05 max/min 1.422545491762e+07 484 KSP preconditioned resid norm 8.123216545560e-10 true resid norm 7.697587213190e-02 ||r(i)||/||b|| 1.216741619700e-01 484 KSP Residual norm 8.123216545560e-10 % max 2.198234528663e+02 min 1.539247154007e-05 max/min 1.428123172384e+07 485 KSP preconditioned resid norm 8.107903331364e-10 true resid norm 7.697527818843e-02 ||r(i)||/||b|| 
1.216732231359e-01 485 KSP Residual norm 8.107903331364e-10 % max 2.206785979488e+02 min 1.539223445599e-05 max/min 1.433700861170e+07 486 KSP preconditioned resid norm 8.092676393541e-10 true resid norm 7.697468924499e-02 ||r(i)||/||b|| 1.216722922052e-01 486 KSP Residual norm 8.092676393541e-10 % max 2.215337436429e+02 min 1.539199915274e-05 max/min 1.439278559234e+07 487 KSP preconditioned resid norm 8.077534924970e-10 true resid norm 7.697410523157e-02 ||r(i)||/||b|| 1.216713690674e-01 487 KSP Residual norm 8.077534924970e-10 % max 2.223888899408e+02 min 1.539176560838e-05 max/min 1.444856266650e+07 488 KSP preconditioned resid norm 8.062478129065e-10 true resid norm 7.697352609817e-02 ||r(i)||/||b|| 1.216704536432e-01 488 KSP Residual norm 8.062478129065e-10 % max 2.232440368347e+02 min 1.539153381927e-05 max/min 1.450433981799e+07 489 KSP preconditioned resid norm 8.047505219590e-10 true resid norm 7.697295177444e-02 ||r(i)||/||b|| 1.216695458216e-01 489 KSP Residual norm 8.047505219590e-10 % max 2.240991843173e+02 min 1.539130375307e-05 max/min 1.456011705783e+07 490 KSP preconditioned resid norm 8.032615420497e-10 true resid norm 7.697238220643e-02 ||r(i)||/||b|| 1.216686455173e-01 490 KSP Residual norm 8.032615420497e-10 % max 2.249543323810e+02 min 1.539107539327e-05 max/min 1.461589438249e+07 491 KSP preconditioned resid norm 8.017807965749e-10 true resid norm 7.697181733368e-02 ||r(i)||/||b|| 1.216677526347e-01 491 KSP Residual norm 8.017807965749e-10 % max 2.258094810186e+02 min 1.539084872474e-05 max/min 1.467167178737e+07 492 KSP preconditioned resid norm 8.003082099158e-10 true resid norm 7.697125710155e-02 ||r(i)||/||b|| 1.216668670874e-01 492 KSP Residual norm 8.003082099158e-10 % max 2.266646302229e+02 min 1.539062371415e-05 max/min 1.472744928554e+07 493 KSP preconditioned resid norm 7.988437074224e-10 true resid norm 7.697070145112e-02 ||r(i)||/||b|| 1.216659887823e-01 493 KSP Residual norm 7.988437074224e-10 % max 2.275197799868e+02 min 1.539040036644e-05 max/min 1.478322685373e+07 494 KSP preconditioned resid norm 7.973872153972e-10 true resid norm 7.697015032934e-02 ||r(i)||/||b|| 1.216651176356e-01 494 KSP Residual norm 7.973872153972e-10 % max 2.283749303036e+02 min 1.539017865071e-05 max/min 1.483900450324e+07 495 KSP preconditioned resid norm 7.959386610801e-10 true resid norm 7.696960368018e-02 ||r(i)||/||b|| 1.216642535586e-01 495 KSP Residual norm 7.959386610801e-10 % max 2.292300811662e+02 min 1.538995854689e-05 max/min 1.489478223530e+07 496 KSP preconditioned resid norm 7.944979726328e-10 true resid norm 7.696906145072e-02 ||r(i)||/||b|| 1.216633964678e-01 496 KSP Residual norm 7.944979726328e-10 % max 2.300852325680e+02 min 1.538974004078e-05 max/min 1.495056004574e+07 497 KSP preconditioned resid norm 7.930650791238e-10 true resid norm 7.696852358851e-02 ||r(i)||/||b|| 1.216625462801e-01 497 KSP Residual norm 7.930650791238e-10 % max 2.309403845025e+02 min 1.538952311570e-05 max/min 1.500633793304e+07 498 KSP preconditioned resid norm 7.916399105142e-10 true resid norm 7.696799004048e-02 ||r(i)||/||b|| 1.216617029118e-01 498 KSP Residual norm 7.916399105142e-10 % max 2.317955369631e+02 min 1.538930774757e-05 max/min 1.506211590315e+07 499 KSP preconditioned resid norm 7.902223976426e-10 true resid norm 7.696746075603e-02 ||r(i)||/||b|| 1.216608662829e-01 499 KSP Residual norm 7.902223976426e-10 % max 2.326506899435e+02 min 1.538909393078e-05 max/min 1.511789394424e+07 500 KSP preconditioned resid norm 7.888124722116e-10 true resid norm 7.696693568410e-02 
||r(i)||/||b|| 1.216600363126e-01 500 KSP Residual norm 7.888124722116e-10 % max 2.335058434373e+02 min 1.538888164219e-05 max/min 1.517367206185e+07 Linear solve did not converge due to DIVERGED_ITS iterations 500 KSP Object: 4 MPI processes type: gmres restart=500, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=500, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=14924, cols=14924 total: nonzeros=1393670, allocated nonzeros=1393670 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Krylov solver did not converge in 500 iters; Petsc reason: DIVERGED_ITS, with residual norm 7.88812e-10-------------------------------------------------------------------------- mpirun has exited due to process rank 0 with PID 14460 on node rook exiting improperly. There are three reasons this could occur: 1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination. 2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination" 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter orte_create_session_dirs is set to false. In this case, the run-time cannot detect that the abort call was an abnormal termination. Hence, the only error message you will receive is this one. This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here). You can avoid this message by specifying -quiet on the mpirun command line. -------------------------------------------------------------------------- On 2018-05-18 10:58, Matthew Knepley wrote: > On Fri, May 18, 2018 at 5:54 AM, Y. Shidi wrote: > >> Hello all, >> >> I do not have much knowledge on linear iterative solvers, >> so when use PETSc krylov solvers, I try several combinations >> (e.g. cg+jacobi, gmres+hypre, etc). But I cannot get a >> converged solution. I have used 'preonly' and 'lu' check >> that the system is correctly constructed. >> >> Below is the ksp log output with 500 iterations. > > This is only 411 iterations, and we cannot see what solver was used. > > In general, the iterative solver is very sensitive to the system. We > recommend > starting with a solver that has worked before for someone by checking > the literature. > > Thanks, > > Matt > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ [1]
>
>
> Links:
> ------
> [1] http://www.caam.rice.edu/~mk51/

From knepley at gmail.com  Fri May 18 05:16:46 2018
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 18 May 2018 06:16:46 -0400
Subject: [petsc-users] Linear iterative solver cannot converge
In-Reply-To: <0f98542b5208fc578b9fea6806dd6b42@cam.ac.uk>
References: <7bab65849037a08d487af6f860bc2658@cam.ac.uk>
 <0f98542b5208fc578b9fea6806dd6b42@cam.ac.uk>
Message-ID:

On Fri, May 18, 2018 at 6:10 AM, Y. Shidi wrote:

> Hello Matt,
>
> Thank you for your reply.
> It might be due to the email being too long.
> Below are the subsequent iterations.

1) If you are solving a PDE, GMRES+Jacobi is unlikely to be a scalable
solver. Note that your condition number is 10^7.

2) Requiring a residual norm below 1e-9 might not work. Your system looks
ill-conditioned, and that can magnify round-off errors. If you need more
accuracy, you could try iterative refinement, which would be to use one more
step of Newton's method for your linear problem, or equivalently, solve the
same system again for a solution update with the residual as the rhs.

Thanks,

Matt

>> In general, the iterative solver is very sensitive to the system. We
>> recommend starting with a solver that has worked before for someone by
>> checking the literature.
>
> Ok, I will try to find it.
>
> Kind Regards,
> Shidi
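On point 1, the preconditioner can be switched at run time without touching
the code. A sketch of options that are often tried first for elliptic-type
systems (./app stands in for the actual executable and is assumed to call
KSPSetFromOptions or TSSetFromOptions; hypre must be enabled in the PETSc
build):

  mpirun -n 4 ./app -ksp_type gmres \
      -pc_type hypre -pc_hypre_type boomeramg \
      -ksp_monitor_true_residual -ksp_converged_reason

or -pc_type gamg for PETSc's native algebraic multigrid. The two monitoring
options report the true (unpreconditioned) residual and the reason the solve
stopped, which makes different runs easier to compare; whether any black-box
preconditioner is adequate depends on the PDE, which is why checking the
literature for this class of problem, as suggested above, is the safer start.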
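A minimal sketch of the iterative refinement described in point 2, written
against the C API (ksp, A, b, and x are assumed to be the already configured
solver, matrix, right-hand side, and first solution; setup and cleanup of the
surrounding program are omitted):

  #include <petscksp.h>

  Vec            r, dx;
  PetscErrorCode ierr;

  ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &dx);CHKERRQ(ierr);

  /* true residual of the first solve: r = b - A*x */
  ierr = MatMult(A, x, r);CHKERRQ(ierr);
  ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);

  /* solve for a correction and update the solution: x = x + dx */
  ierr = KSPSolve(ksp, r, dx);CHKERRQ(ierr);
  ierr = VecAXPY(x, 1.0, dx);CHKERRQ(ierr);

  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = VecDestroy(&dx);CHKERRQ(ierr);

One or two such correction steps are usually all that round-off allows; if
the update stops reducing the true residual, the preconditioner itself has
to improve.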
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cpraveen at gmail.com  Fri May 18 08:09:11 2018
From: cpraveen at gmail.com (Praveen C)
Date: Fri, 18 May 2018 18:39:11 +0530
Subject: [petsc-users] Matrix-free preconditioner using shell matrix
Message-ID:

Dear all

I am looking for any example code that shows how to use my own matrix-free
preconditioner using a shell matrix. I am using TS to solve my problem.

Thanks
praveen

From jed at jedbrown.org  Fri May 18 08:53:18 2018
From: jed at jedbrown.org (Jed Brown)
Date: Fri, 18 May 2018 07:53:18 -0600
Subject: [petsc-users] Matrix-free preconditioner using shell matrix
In-Reply-To:
References:
Message-ID: <87603loysx.fsf@jedbrown.org>

See examples at the bottom of the page.

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCShellSetApply.html

Praveen C writes:

> Dear all
>
> I am looking for any example code that shows how to use my own matrix-free preconditioner using a shell matrix. I am using TS to solve my problem.
>
> Thanks
> praveen

From cpraveen at gmail.com  Fri May 18 09:09:23 2018
From: cpraveen at gmail.com (Praveen C)
Date: Fri, 18 May 2018 19:39:23 +0530
Subject: [petsc-users] Matrix-free preconditioner using shell matrix
In-Reply-To: <87603loysx.fsf@jedbrown.org>
References: <87603loysx.fsf@jedbrown.org>
Message-ID:

Thanks, I will check these out.
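The PCShellSetApply pattern on the page linked above boils down to a few
calls. A rough C sketch, with MyCtx and MyPCApply as purely illustrative
names (not from any posted code) and the identity used as a placeholder
for the actual preconditioner:

  #include <petscts.h>

  typedef struct {
    Vec work;   /* whatever state the preconditioner needs */
  } MyCtx;

  static PetscErrorCode MyPCApply(PC pc, Vec x, Vec y)
  {
    MyCtx          *ctx;
    PetscErrorCode ierr;

    ierr = PCShellGetContext(pc, (void **)&ctx);CHKERRQ(ierr);
    /* apply the preconditioner: y = P^{-1} x */
    ierr = VecCopy(x, y);CHKERRQ(ierr);   /* placeholder: identity */
    return 0;
  }

  /* setup, once the TS exists and myctx is a long-lived MyCtx: */
  SNES           snes;
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCSHELL);CHKERRQ(ierr);
  ierr = PCShellSetContext(pc, &myctx);CHKERRQ(ierr);
  ierr = PCShellSetApply(pc, MyPCApply);CHKERRQ(ierr);

The pointer handed to PCShellSetContext is the same pointer that
PCShellGetContext returns inside the apply routine, so the context must
outlive the solver.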
I am solving a non-linear problem using TS. I will use -snes_mf so that jacobian action is computed by petsc using finite differences. I am trying to use shell matrix for preconditioner P as follows. call MatCreateShell(PETSC_COMM_WORLD, n, n, & PETSC_DETERMINE, PETSC_DETERMINE, & ctx, P, ierr); CHKERRQ(ierr) call MatShellSetOperation(P, MATOP_MULT, ApplyPC, ierr); CHKERRQ(ierr) call TSSetRHSJacobian(ts, P, P, RHSJacobian, ctx, ierr); CHKERRQ(ierr) Is ApplyPC supposed to do the following ? subroutine ApplyPC(P, x, y, ierr) ! Compute y = P^(-1) * x end subroutine ApplyPC Thanks praveen > On 18-May-2018, at 7:23 PM, Jed Brown wrote: > > See examples at the bottom of the page. > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCShellSetApply.html > > Praveen C writes: > >> Dear all >> >> I am looking for any example code that shows how to use my own matrix-free preconditioner using a shell matrix. I am using TS to solve my problem. >> >> Thanks >> praveen From jed at jedbrown.org Fri May 18 09:16:31 2018 From: jed at jedbrown.org (Jed Brown) Date: Fri, 18 May 2018 08:16:31 -0600 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: References: <87603loysx.fsf@jedbrown.org> Message-ID: <8736ypoxq8.fsf@jedbrown.org> Praveen C writes: > Thanks, I will check these out. > > I am solving a non-linear problem using TS. I will use -snes_mf so that jacobian action is computed by petsc using finite differences. > > I am trying to use shell matrix for preconditioner P as follows. You don't need to use a shell matrix to implement a shell preconditioner. When using a shell PC, you need to call KSPSetPC() -- you can unwrap that from TS -> SNES -> KSP. > call MatCreateShell(PETSC_COMM_WORLD, n, n, & > PETSC_DETERMINE, PETSC_DETERMINE, & > ctx, P, ierr); CHKERRQ(ierr) > call MatShellSetOperation(P, MATOP_MULT, ApplyPC, ierr); CHKERRQ(ierr) > call TSSetRHSJacobian(ts, P, P, RHSJacobian, ctx, ierr); CHKERRQ(ierr) > > Is ApplyPC supposed to do the following ? > > subroutine ApplyPC(P, x, y, ierr) > > ! Compute y = P^(-1) * x > > end subroutine ApplyPC > > Thanks > praveen > >> On 18-May-2018, at 7:23 PM, Jed Brown wrote: >> >> See examples at the bottom of the page. >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCShellSetApply.html >> >> Praveen C writes: >> >>> Dear all >>> >>> I am looking for any example code that shows how to use my own matrix-free preconditioner using a shell matrix. I am using TS to solve my problem. >>> >>> Thanks >>> praveen From cpraveen at gmail.com Fri May 18 09:20:24 2018 From: cpraveen at gmail.com (Praveen C) Date: Fri, 18 May 2018 19:50:24 +0530 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: <8736ypoxq8.fsf@jedbrown.org> References: <87603loysx.fsf@jedbrown.org> <8736ypoxq8.fsf@jedbrown.org> Message-ID: <82F25F98-4AEC-4AFE-89EB-D524747CE896@gmail.com> > On 18-May-2018, at 7:46 PM, Jed Brown wrote: > > You don't need to use a shell matrix to implement a shell > preconditioner. When using a shell PC, you need to call KSPSetPC() -- > you can unwrap that from TS -> SNES -> KSP. Thanks Jed. Is it the case that I cannot use a shell matrix to implement a shell preconditioner ? Best regards praveen -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Fri May 18 09:29:19 2018 From: jed at jedbrown.org (Jed Brown) Date: Fri, 18 May 2018 08:29:19 -0600 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: <82F25F98-4AEC-4AFE-89EB-D524747CE896@gmail.com> References: <87603loysx.fsf@jedbrown.org> <8736ypoxq8.fsf@jedbrown.org> <82F25F98-4AEC-4AFE-89EB-D524747CE896@gmail.com> Message-ID: <87zi0xnikg.fsf@jedbrown.org> Praveen C writes: >> On 18-May-2018, at 7:46 PM, Jed Brown wrote: >> >> You don't need to use a shell matrix to implement a shell >> preconditioner. When using a shell PC, you need to call KSPSetPC() -- >> you can unwrap that from TS -> SNES -> KSP. > > > Thanks Jed. > > Is it the case that I cannot use a shell matrix to implement a shell preconditioner ? They are two different concepts. You can use either or both. From bsmith at mcs.anl.gov Fri May 18 12:05:25 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 18 May 2018 17:05:25 +0000 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: References: <87603loysx.fsf@jedbrown.org> Message-ID: > On May 18, 2018, at 9:09 AM, Praveen C wrote: > > Thanks, I will check these out. > > I am solving a non-linear problem using TS. I will use -snes_mf so that jacobian action is computed by petsc using finite differences. > > I am trying to use shell matrix for preconditioner P as follows. > > call MatCreateShell(PETSC_COMM_WORLD, n, n, & > PETSC_DETERMINE, PETSC_DETERMINE, & > ctx, P, ierr); CHKERRQ(ierr) > call MatShellSetOperation(P, MATOP_MULT, ApplyPC, ierr); CHKERRQ(ierr) This is wrong. If you have a ApplyPC you need to attach it to the PC not the mat. So you would have SNESGetPC(snes,&pc); PCSetType(pc,PCSHELL); PCShellSetAppy(pc,ApplyPC) Barry > call TSSetRHSJacobian(ts, P, P, RHSJacobian, ctx, ierr); CHKERRQ(ierr) > > Is ApplyPC supposed to do the following ? > > subroutine ApplyPC(P, x, y, ierr) > > ! Compute y = P^(-1) * x > > end subroutine ApplyPC > > Thanks > praveen > >> On 18-May-2018, at 7:23 PM, Jed Brown wrote: >> >> See examples at the bottom of the page. >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCShellSetApply.html >> >> Praveen C writes: >> >>> Dear all >>> >>> I am looking for any example code that shows how to use my own matrix-free preconditioner using a shell matrix. I am using TS to solve my problem. >>> >>> Thanks >>> praveen > From cpraveen at gmail.com Fri May 18 23:23:45 2018 From: cpraveen at gmail.com (Praveen C) Date: Sat, 19 May 2018 09:53:45 +0530 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: References: <87603loysx.fsf@jedbrown.org> Message-ID: <6BC71B3F-74CE-4E16-9DB1-71B807607308@gmail.com> Thanks Barry and Jed. > On 18-May-2018, at 10:35 PM, Smith, Barry F. wrote: > > This is wrong. If you have a ApplyPC you need to attach it to the PC not the mat. So you would have > > SNESGetPC(snes,&pc); > PCSetType(pc,PCSHELL); > PCShellSetAppy(pc,ApplyPC) I am now setting up with PCSHELL, but face a doubt on Mat objects. So in my case (-snes_mf and PCSHELL), I dont have any jacobian and precond matrix. In fortran should I do something like this call TSSetRHSJacobian(ts, PETSC_NULL_MAT, PETSC_NULL_MAT, RHSJacobian, ctx, ierr) But this gives segmentation violation when TSSetRHSJacobian is called. Thanks praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 19 10:46:18 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Sat, 19 May 2018 15:46:18 +0000 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: <6BC71B3F-74CE-4E16-9DB1-71B807607308@gmail.com> References: <87603loysx.fsf@jedbrown.org> <6BC71B3F-74CE-4E16-9DB1-71B807607308@gmail.com> Message-ID: <97E21833-8249-48A2-B851-D7A2ADE3CF13@mcs.anl.gov> > On May 18, 2018, at 11:23 PM, Praveen C wrote: > > Thanks Barry and Jed. > > >> On 18-May-2018, at 10:35 PM, Smith, Barry F. wrote: >> >> This is wrong. If you have a ApplyPC you need to attach it to the PC not the mat. So you would have >> >> SNESGetPC(snes,&pc); >> PCSetType(pc,PCSHELL); >> PCShellSetAppy(pc,ApplyPC) > > I am now setting up with PCSHELL, but face a doubt on Mat objects. > > So in my case (-snes_mf and PCSHELL), I dont have any jacobian and precond matrix. In fortran should I do something like this > > call TSSetRHSJacobian(ts, PETSC_NULL_MAT, PETSC_NULL_MAT, RHSJacobian, ctx, ierr) > > But this gives segmentation violation when TSSetRHSJacobian is called. Praveen, Ahh, we didn't have support for passing PETSC_NULL_MAT from Fortran for this routine. I have added it in the branch barry/fix-null-matrix-set-jacobian/maint which after testing will go into the maint branch. Could you please try this branch and let us know if it resolves the problem (or a new problem pops up)? Barry Unfortunately handling null objects from Fortran requires us to manually tweak the Fortran interface functions and sometimes we forget or don't realize they need fixing until someone reports a problem. > > Thanks > praveen From cpraveen at gmail.com Sat May 19 11:42:57 2018 From: cpraveen at gmail.com (Praveen C) Date: Sat, 19 May 2018 22:12:57 +0530 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: <97E21833-8249-48A2-B851-D7A2ADE3CF13@mcs.anl.gov> References: <87603loysx.fsf@jedbrown.org> <6BC71B3F-74CE-4E16-9DB1-71B807607308@gmail.com> <97E21833-8249-48A2-B851-D7A2ADE3CF13@mcs.anl.gov> Message-ID: <1BA236C4-26B7-4CF4-A2DE-980E102D2E49@gmail.com> Thanks a lot. > On 19-May-2018, at 9:16 PM, Smith, Barry F. wrote: > > Praveen, > > Ahh, we didn't have support for passing PETSC_NULL_MAT from Fortran for this routine. I have added it in the branch barry/fix-null-matrix-set-jacobian/maint which after testing will go into the maint branch. > > Could you please try this branch and let us know if it resolves the problem (or a new problem pops up)? > > Barry > > Unfortunately handling null objects from Fortran requires us to manually tweak the Fortran interface functions and sometimes we forget or don't realize they need fixing until someone reports a problem. I meanwhile passed a Mat object to TSSetRHSJacobian function. It seems that with -snes_mf and PCSHELL, my RHSJacobian is never invoked. Is this the correct behaviour ? In RHSJacobian, I am storing the current solution into my usercontext so that I can use it in my shell preconditioner subroutine RHSJacobian(ts, time, u, J, P, ctx, ierr) call VecCopy(u, ctx%p%v_u, ierr); CHKERRQ(ierr) end However, since my RHSJacobian is never invoked, I dont have access to the current solution needed in my ApplyPC. How can I access the current solution in my ApplyPC function ? Another issue I am facing is this. 
I attach my user context to the PC call TSGetSNES(ts, snes, ierr); CHKERRQ(ierr) call SNESGetKSP(snes, ksp, ierr); CHKERRQ(ierr) call KSPGetPC(ksp, pc, ierr); CHKERRQ(ierr) call PCSetType(pc, PCSHELL, ierr); CHKERRQ(ierr) call PCShellSetContext(pc, ctx, ierr); CHKERRQ(ierr) call PCShellSetApply(pc, ApplyPC, ierr); CHKERRQ(ierr) Inside my ApplyPC, I access ctx using subroutine ApplyPC(pc, x, y, ierr) use petscpc use mtsdata implicit none PC :: pc Vec :: x, y PetscErrorCode :: ierr ! Local variables type(tsdata) :: ctx call PCShellGetContext(pc, ctx, ierr); CHKERRQ(ierr) end But the ctx I get from above is not correct, since it does not have the data I stored in my original ctx. I am rather clueless at this stage. Thanks for your help. Best regards praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 19 12:01:15 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 19 May 2018 17:01:15 +0000 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: <1BA236C4-26B7-4CF4-A2DE-980E102D2E49@gmail.com> References: <87603loysx.fsf@jedbrown.org> <6BC71B3F-74CE-4E16-9DB1-71B807607308@gmail.com> <97E21833-8249-48A2-B851-D7A2ADE3CF13@mcs.anl.gov> <1BA236C4-26B7-4CF4-A2DE-980E102D2E49@gmail.com> Message-ID: > On May 19, 2018, at 11:42 AM, Praveen C wrote: > > Thanks a lot. > >> On 19-May-2018, at 9:16 PM, Smith, Barry F. wrote: >> >> Praveen, >> >> Ahh, we didn't have support for passing PETSC_NULL_MAT from Fortran for this routine. I have added it in the branch barry/fix-null-matrix-set-jacobian/maint which after testing will go into the maint branch. >> >> Could you please try this branch and let us know if it resolves the problem (or a new problem pops up)? >> >> Barry >> >> Unfortunately handling null objects from Fortran requires us to manually tweak the Fortran interface functions and sometimes we forget or don't realize they need fixing until someone reports a problem. > > I meanwhile passed a Mat object to TSSetRHSJacobian function. It seems that with -snes_mf and PCSHELL, my RHSJacobian is never invoked. Is this the correct behaviour ? Yes, if you want your function to be called you need to use -snes_mf_operator instead. > > In RHSJacobian, I am storing the current solution into my usercontext so that I can use it in my shell preconditioner > > subroutine RHSJacobian(ts, time, u, J, P, ctx, ierr) > > call VecCopy(u, ctx%p%v_u, ierr); CHKERRQ(ierr) > > end > > However, since my RHSJacobian is never invoked, I dont have access to the current solution needed in my ApplyPC. How can I access the current solution in my ApplyPC function ? > > Another issue I am facing is this. I attach my user context to the PC > > call TSGetSNES(ts, snes, ierr); CHKERRQ(ierr) > call SNESGetKSP(snes, ksp, ierr); CHKERRQ(ierr) > call KSPGetPC(ksp, pc, ierr); CHKERRQ(ierr) > call PCSetType(pc, PCSHELL, ierr); CHKERRQ(ierr) > call PCShellSetContext(pc, ctx, ierr); CHKERRQ(ierr) > call PCShellSetApply(pc, ApplyPC, ierr); CHKERRQ(ierr) > > Inside my ApplyPC, I access ctx using > > subroutine ApplyPC(pc, x, y, ierr) > use petscpc > use mtsdata > implicit none > PC :: pc > Vec :: x, y > PetscErrorCode :: ierr > ! Local variables > type(tsdata) :: ctx > > call PCShellGetContext(pc, ctx, ierr); CHKERRQ(ierr) > > end > > But the ctx I get from above is not correct, since it does not have the data I stored in my original ctx. This won't work. You need to use a pointer for the context not an entirely new context argument. 
See src/mat/examples/tutorials/ex6f.F90 for how to use derived data types and pointers with contexts in Fortran. Barry > > I am rather clueless at this stage. > > Thanks for your help. > Best regards > praveen From edoardo.alinovi at gmail.com Tue May 22 04:47:19 2018 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Tue, 22 May 2018 11:47:19 +0200 Subject: [petsc-users] Solve a 1d non linear equation in petsc Message-ID: Dear PETSC users, I need to solve a simple equation of the type f(x)=0. f is a non linear operator and I know its analytical form. This operation has to be done locally inside a processor domain. I suppose that petsc provides a very usefull way to carry out this. Would you please suggest me a proper way? Many thanks, Edoardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 22 04:55:25 2018 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 22 May 2018 05:55:25 -0400 Subject: [petsc-users] Solve a 1d non linear equation in petsc In-Reply-To: References: Message-ID: On Tue, May 22, 2018 at 5:47 AM, Edoardo alinovi wrote: > Dear PETSC users, > > I need to solve a simple equation of the type f(x)=0. f is a non linear > operator and I know its analytical form. This operation has to be done > locally inside a processor domain. I suppose that petsc provides a very > usefull way to carry out this. Would you please suggest me a proper way? > Solve it with SNES, exactly as you would a system of equations. Thanks, Matt > Many thanks, > > Edoardo > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Tue May 22 09:53:10 2018 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Tue, 22 May 2018 10:53:10 -0400 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation Message-ID: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> Hi, The given matrix+vector is bogus with SuperLU_Dist on some of our nighlty validation tests since I activated the parallel symbolic factorisation. (with -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 ) I extracted an example system and reproduced the bug with src/ksp/ksp/examples/tests/ex6.c that I can run it with 2 or 3 processes, but with 4 it gives a FPE on process #1: mpirun -n 4 ./ex6 -f AssembleurGD_resolution_no_0_0 -ksp_view -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type superlu_dist -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 ... [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. 
[1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 467 /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [1]PETSC ERROR: [1] MatLUFactorNumeric_SuperLU_DIST line 314 /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [1]PETSC ERROR: [1] MatLUFactorNumeric line 3014 /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/interface/matrix.c [1]PETSC ERROR: [1] PCSetUp_LU line 59 /home/mefpp_ericc/petsc-3.9.2-debug/src/ksp/pc/impls/factor/lu/lu.c [1]PETSC ERROR: [1] PCSetUp line 885 /home/mefpp_ericc/petsc-3.9.2-debug/src/ksp/pc/interface/precon.c [1]PETSC ERROR: [1] KSPSetUp line 294 /home/mefpp_ericc/petsc-3.9.2-debug/src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 [1]PETSC ERROR: ./ex6 on a named lorien by eric Tue May 22 10:39:15 2018 [1]PETSC ERROR: Configure options --prefix=/opt/petsc-3.9.2_debug_openmpi-1.10.2 --with-mpi-compilers=1 --with-mpi-dir=/opt/openmpi-1.10.2 --with-make-np=12 --with-shared-libraries=1 --with-debugging=yes --with-memalign=64 --with-visibility=0 --with-64-bit-indices=0 --download-ml=yes --download-mumps=yes --download-superlu=yes --download-superlu_dist=yes --download-parmetis=yes --download-ptscotch=yes --download-metis=yes --download-suitesparse=yes --download-hypre=yes --with-blaslapack-dir=/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64 --with-mkl_pardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl --with-mkl_cpardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl --with-scalapack=1 --with-scalapack-include=/opt/intel/composer_xe_2015.2.164/mkl/include --with-scalapack-lib="-L/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64" [1]PETSC ERROR: #1 User provided function() line 0 in unknown file ... The given Matrix+Vector are available here: http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_parallel_factorisation/AssembleurGD_resolution_no_0_0 http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_parallel_factorisation/AssembleurGD_resolution_no_0_0.info If I run with -on_error_attach_debugger, I can see a division by zero here: #8 (gdb) #9 0x00007f96a2148e52 in libmetis__FM_2WayCutRefine (ctrl=0x2784d20, graph=0x2784940, ntpwgts=0x7ffdfa323060, niter=4) at /home/mefpp_ericc/petsc-3.9.2-debug/arch-linux2-c-debug/externalpackages/git.metis/libmetis/fm.c:60 60 avgvwgt = gk_min((pwgts[0]+pwgts[1])/20, 2*(pwgts[0]+pwgts[1])/nvtxs); and nvtxs value is "0"... Thanks! Eric From hzhang at mcs.anl.gov Tue May 22 10:37:54 2018 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 22 May 2018 10:37:54 -0500 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> Message-ID: Eric: Likely, you encounter a zero pivot. Run your code with '-ksp_error_if_not_converged' would show it. Adding option '-mat_superlu_dist_replacetinypivot' might help. Hong Hi, > > The given matrix+vector is bogus with SuperLU_Dist on some of our nighlty > validation tests since I activated the parallel symbolic factorisation. 
> (with -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 ) > > I extracted an example system and reproduced the bug with > src/ksp/ksp/examples/tests/ex6.c that I can run it with 2 or 3 processes, > but with 4 it gives a FPE on process #1: > > mpirun -n 4 ./ex6 -f AssembleurGD_resolution_no_0_0 -ksp_view -ksp_type > preonly -pc_type lu -pc_factor_mat_solver_type superlu_dist > -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 > > ... > [1]PETSC ERROR: ------------------------------ > ------------------------------------------ > [1]PETSC ERROR: Caught signal number 8 FPE: Floating Point > Exception,probably divide by zero > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/d > ocumentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 467 > /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/su > perlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatLUFactorNumeric_SuperLU_DIST line 314 > /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/su > perlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatLUFactorNumeric line 3014 > /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/interface/matrix.c > [1]PETSC ERROR: [1] PCSetUp_LU line 59 /home/mefpp_ericc/petsc-3.9.2- > debug/src/ksp/pc/impls/factor/lu/lu.c > [1]PETSC ERROR: [1] PCSetUp line 885 /home/mefpp_ericc/petsc-3.9.2- > debug/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: [1] KSPSetUp line 294 /home/mefpp_ericc/petsc-3.9.2- > debug/src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 > [1]PETSC ERROR: ./ex6 on a named lorien by eric Tue May 22 10:39:15 2018 > [1]PETSC ERROR: Configure options --prefix=/opt/petsc-3.9.2_debug_openmpi-1.10.2 > --with-mpi-compilers=1 --with-mpi-dir=/opt/openmpi-1.10.2 > --with-make-np=12 --with-shared-libraries=1 --with-debugging=yes > --with-memalign=64 --with-visibility=0 --with-64-bit-indices=0 > --download-ml=yes --download-mumps=yes --download-superlu=yes > --download-superlu_dist=yes --download-parmetis=yes --download-ptscotch=yes > --download-metis=yes --download-suitesparse=yes --download-hypre=yes > --with-blaslapack-dir=/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64 > --with-mkl_pardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl > --with-mkl_cpardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl > --with-scalapack=1 --with-scalapack-include=/opt/ > intel/composer_xe_2015.2.164/mkl/include --with-scalapack-lib="-L/opt/i > ntel/composer_xe_2015.2.164/mkl/lib/intel64 -lmkl_scalapack_lp64 > -lmkl_blacs_openmpi_lp64" > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > ... 
> > The given Matrix+Vector are available here: > > http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_parallel_ > factorisation/AssembleurGD_resolution_no_0_0 > > http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_parallel_ > factorisation/AssembleurGD_resolution_no_0_0.info > > If I run with -on_error_attach_debugger, I can see a division by zero here: > > #8 > (gdb) > #9 0x00007f96a2148e52 in libmetis__FM_2WayCutRefine (ctrl=0x2784d20, > graph=0x2784940, ntpwgts=0x7ffdfa323060, niter=4) > at /home/mefpp_ericc/petsc-3.9.2-debug/arch-linux2-c-debug/exte > rnalpackages/git.metis/libmetis/fm.c:60 > 60 avgvwgt = gk_min((pwgts[0]+pwgts[1])/20, > 2*(pwgts[0]+pwgts[1])/nvtxs); > > and nvtxs value is "0"... > > Thanks! > > Eric > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 22 10:41:23 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 22 May 2018 15:41:23 +0000 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> Message-ID: <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> 0x00007f96a2148e52 in libmetis__FM_2WayCutRefine (ctrl=0x2784d20, graph=0x2784940, ntpwgts=0x7ffdfa323060, niter=4) at /home/mefpp_ericc/petsc-3.9.2-debug/arch-linux2-c-debug/externalpackages/git.metis/libmetis/fm.c:60 It appears the crash is in metis, not SuperLU_Dist. So either a bug in Metis or a bug in our Metis is called by ParMetis or SuperLU_Dist. Barry > On May 22, 2018, at 10:37 AM, Hong wrote: > > Eric: > Likely, you encounter a zero pivot. Run your code with '-ksp_error_if_not_converged' would show it. > Adding option '-mat_superlu_dist_replacetinypivot' might help. > Hong > > Hi, > > The given matrix+vector is bogus with SuperLU_Dist on some of our nighlty validation tests since I activated the parallel symbolic factorisation. (with -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 ) > > I extracted an example system and reproduced the bug with src/ksp/ksp/examples/tests/ex6.c that I can run it with 2 or 3 processes, but with 4 it gives a FPE on process #1: > > mpirun -n 4 ./ex6 -f AssembleurGD_resolution_no_0_0 -ksp_view -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type superlu_dist -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 > > ... > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. 
> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 467 /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatLUFactorNumeric_SuperLU_DIST line 314 /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatLUFactorNumeric line 3014 /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/interface/matrix.c > [1]PETSC ERROR: [1] PCSetUp_LU line 59 /home/mefpp_ericc/petsc-3.9.2-debug/src/ksp/pc/impls/factor/lu/lu.c > [1]PETSC ERROR: [1] PCSetUp line 885 /home/mefpp_ericc/petsc-3.9.2-debug/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: [1] KSPSetUp line 294 /home/mefpp_ericc/petsc-3.9.2-debug/src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 > [1]PETSC ERROR: ./ex6 on a named lorien by eric Tue May 22 10:39:15 2018 > [1]PETSC ERROR: Configure options --prefix=/opt/petsc-3.9.2_debug_openmpi-1.10.2 --with-mpi-compilers=1 --with-mpi-dir=/opt/openmpi-1.10.2 --with-make-np=12 --with-shared-libraries=1 --with-debugging=yes --with-memalign=64 --with-visibility=0 --with-64-bit-indices=0 --download-ml=yes --download-mumps=yes --download-superlu=yes --download-superlu_dist=yes --download-parmetis=yes --download-ptscotch=yes --download-metis=yes --download-suitesparse=yes --download-hypre=yes --with-blaslapack-dir=/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64 --with-mkl_pardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl --with-mkl_cpardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl --with-scalapack=1 --with-scalapack-include=/opt/intel/composer_xe_2015.2.164/mkl/include --with-scalapack-lib="-L/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64" > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > ... > > The given Matrix+Vector are available here: > > http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_parallel_factorisation/AssembleurGD_resolution_no_0_0 > > http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_parallel_factorisation/AssembleurGD_resolution_no_0_0.info > > If I run with -on_error_attach_debugger, I can see a division by zero here: > > #8 > (gdb) > #9 0x00007f96a2148e52 in libmetis__FM_2WayCutRefine (ctrl=0x2784d20, graph=0x2784940, ntpwgts=0x7ffdfa323060, niter=4) > at /home/mefpp_ericc/petsc-3.9.2-debug/arch-linux2-c-debug/externalpackages/git.metis/libmetis/fm.c:60 > 60 avgvwgt = gk_min((pwgts[0]+pwgts[1])/20, 2*(pwgts[0]+pwgts[1])/nvtxs); > > and nvtxs value is "0"... > > Thanks! > > Eric > From xsli at lbl.gov Tue May 22 10:45:20 2018 From: xsli at lbl.gov (Xiaoye S. Li) Date: Tue, 22 May 2018 08:45:20 -0700 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> Message-ID: Indeed, I am pretty sure the bug is in ParMETIS. A few years ago, I sent a sample matrix and debug trace to George Karypis, he was going to look at it, but never did. This bug seems to show up when the graph is relatively dense. Can you try to use serial symbolic factorization and Metis? Sherry On Tue, May 22, 2018 at 8:41 AM, Smith, Barry F. 
wrote: > > 0x00007f96a2148e52 in libmetis__FM_2WayCutRefine (ctrl=0x2784d20, > graph=0x2784940, ntpwgts=0x7ffdfa323060, niter=4) > at /home/mefpp_ericc/petsc-3.9.2-debug/arch-linux2-c-debug/ > externalpackages/git.metis/libmetis/fm.c:60 > > It appears the crash is in metis, not SuperLU_Dist. > > So either a bug in Metis or a bug in our Metis is called by ParMetis or > SuperLU_Dist. > > Barry > > > > > > On May 22, 2018, at 10:37 AM, Hong wrote: > > > > Eric: > > Likely, you encounter a zero pivot. Run your code with > '-ksp_error_if_not_converged' would show it. > > Adding option '-mat_superlu_dist_replacetinypivot' might help. > > Hong > > > > Hi, > > > > The given matrix+vector is bogus with SuperLU_Dist on some of our > nighlty validation tests since I activated the parallel symbolic > factorisation. (with -mat_superlu_dist_colperm PARMETIS > -mat_superlu_dist_parsymbfact 1 ) > > > > I extracted an example system and reproduced the bug with > src/ksp/ksp/examples/tests/ex6.c that I can run it with 2 or 3 processes, > but with 4 it gives a FPE on process #1: > > > > mpirun -n 4 ./ex6 -f AssembleurGD_resolution_no_0_0 -ksp_view -ksp_type > preonly -pc_type lu -pc_factor_mat_solver_type superlu_dist > -mat_superlu_dist_colperm PARMETIS -mat_superlu_dist_parsymbfact 1 > > > > ... > > [1]PETSC ERROR: ------------------------------ > ------------------------------------------ > > [1]PETSC ERROR: Caught signal number 8 FPE: Floating Point > Exception,probably divide by zero > > [1]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [1]PETSC ERROR: likely location of problem given in stack below > > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [1]PETSC ERROR: INSTEAD the line number of the start of the > function > > [1]PETSC ERROR: is given. > > [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 467 > /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/ > superlu_dist/superlu_dist.c > > [1]PETSC ERROR: [1] MatLUFactorNumeric_SuperLU_DIST line 314 > /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/impls/aij/mpi/ > superlu_dist/superlu_dist.c > > [1]PETSC ERROR: [1] MatLUFactorNumeric line 3014 > /home/mefpp_ericc/petsc-3.9.2-debug/src/mat/interface/matrix.c > > [1]PETSC ERROR: [1] PCSetUp_LU line 59 /home/mefpp_ericc/petsc-3.9.2- > debug/src/ksp/pc/impls/factor/lu/lu.c > > [1]PETSC ERROR: [1] PCSetUp line 885 /home/mefpp_ericc/petsc-3.9.2- > debug/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: [1] KSPSetUp line 294 /home/mefpp_ericc/petsc-3.9.2- > debug/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Signal received > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 > > [1]PETSC ERROR: ./ex6 on a named lorien by eric Tue May 22 10:39:15 2018 > > [1]PETSC ERROR: Configure options --prefix=/opt/petsc-3.9.2_debug_openmpi-1.10.2 > --with-mpi-compilers=1 --with-mpi-dir=/opt/openmpi-1.10.2 > --with-make-np=12 --with-shared-libraries=1 --with-debugging=yes > --with-memalign=64 --with-visibility=0 --with-64-bit-indices=0 > --download-ml=yes --download-mumps=yes --download-superlu=yes > --download-superlu_dist=yes --download-parmetis=yes --download-ptscotch=yes > --download-metis=yes --download-suitesparse=yes --download-hypre=yes > --with-blaslapack-dir=/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64 > --with-mkl_pardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl > --with-mkl_cpardiso-dir=/opt/intel/composer_xe_2015.2.164/mkl > --with-scalapack=1 --with-scalapack-include=/opt/ > intel/composer_xe_2015.2.164/mkl/include --with-scalapack-lib="-L/opt/ > intel/composer_xe_2015.2.164/mkl/lib/intel64 -lmkl_scalapack_lp64 > -lmkl_blacs_openmpi_lp64" > > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > > ... > > > > The given Matrix+Vector are available here: > > > > http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_ > parallel_factorisation/AssembleurGD_resolution_no_0_0 > > > > http://www.giref.ulaval.ca/~ericc/bug_superlu_dist_ > parallel_factorisation/AssembleurGD_resolution_no_0_0.info > > > > If I run with -on_error_attach_debugger, I can see a division by zero > here: > > > > #8 > > (gdb) > > #9 0x00007f96a2148e52 in libmetis__FM_2WayCutRefine (ctrl=0x2784d20, > graph=0x2784940, ntpwgts=0x7ffdfa323060, niter=4) > > at /home/mefpp_ericc/petsc-3.9.2-debug/arch-linux2-c-debug/ > externalpackages/git.metis/libmetis/fm.c:60 > > 60 avgvwgt = gk_min((pwgts[0]+pwgts[1])/20, > 2*(pwgts[0]+pwgts[1])/nvtxs); > > > > and nvtxs value is "0"... > > > > Thanks! > > > > Eric > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Tue May 22 10:55:25 2018 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Tue, 22 May 2018 11:55:25 -0400 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> Message-ID: On 22/05/18 11:45 AM, Xiaoye S. Li wrote: > This bug seems to show up when the graph is relatively dense. Can you > try to use serial symbolic factorization and Metis? Exactly: this bug shows up when I activate the parallel symbolic factorisation, otherwise I do not have it. Eric From Eric.Chamberland at giref.ulaval.ca Tue May 22 11:00:28 2018 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Tue, 22 May 2018 12:00:28 -0400 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> Message-ID: <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> And I will add a question: Shouldn't there be an automatic switch to parallele factorisation when num. of process is greater than 1 ? Eric On 22/05/18 11:55 AM, Eric Chamberland wrote: > Exactly: this bug shows up when I activate the parallel symbolic > factorisation, otherwise I do not have it. From xsli at lbl.gov Tue May 22 11:11:46 2018 From: xsli at lbl.gov (Xiaoye S. 
Li) Date: Tue, 22 May 2018 09:11:46 -0700 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> Message-ID: Numerical factorization is always parallel (based on number of MPI tasks and OMP_NUM_THREADS you set), the issue here is only related to symbolic factorization (figuring out the nonzero pattern in the LU factors). Default setting is to use sequential symbolic factorization, precisely due to the ParMETIS bugs. (The parallel symbolic factorization needs ParMETIS ordering.) Question for PETSc developers -- Is PT-Scotch an installation option? I can try to use PT_Scotch together with parallel symbolic factorization, which should be more stable than ParMETIS. Sherry On Tue, May 22, 2018 at 9:00 AM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > And I will add a question: > > Shouldn't there be an automatic switch to parallele factorisation when > num. of process is greater than 1 ? > > Eric > > > On 22/05/18 11:55 AM, Eric Chamberland wrote: > >> Exactly: this bug shows up when I activate the parallel symbolic >> factorisation, otherwise I do not have it. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Tue May 22 11:15:41 2018 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Tue, 22 May 2018 17:15:41 +0100 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> Message-ID: <7066019f-3796-c11b-5b7a-574b356d3838@imperial.ac.uk> On 22/05/18 17:11, Xiaoye S. Li wrote: > Numerical factorization is always parallel (based on number of MPI > tasks and OMP_NUM_THREADS you set), the issue here is only related to > symbolic factorization (figuring out the nonzero pattern in the LU > factors). Default setting is to use sequential symbolic factorization, > precisely due to the ParMETIS bugs. > (The parallel symbolic factorization needs ParMETIS ordering.) > > Question for PETSc developers -- > Is PT-Scotch an installation option?? I can try to use PT_Scotch > together with parallel symbolic factorization, which should be more > stable than ParMETIS. --download-ptscotch is a thing (I can't remember what other bits it needs). Cheers, Lawrence From Eric.Chamberland at giref.ulaval.ca Tue May 22 12:57:37 2018 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Tue, 22 May 2018 13:57:37 -0400 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> Message-ID: On 22/05/18 12:11 PM, Xiaoye S. Li wrote: > Default setting is to use sequential symbolic factorization, precisely > due to the ParMETIS bugs. Ok, and I saw you reported the bug "a few years ago" and still have not received a fix... 
I would like to "live with the patch" (ie working in sequential) but our problem is that if we compute err=|Ax-b|, the resolution with *sequential* symbolic factorisation gives ans err around 1e-6 instead of 1e-16 for parallel one (when it works). MUMPS, also gives 1e-16 error levels. In our nightly tests we have to compare computed solutions and they are now acceptable if they are of the order of 1e-15. I do *not* want to raise this comparison to 1e-6 just because there is a bug in parmetis which forces us to use the "unprecise" sequential options... In other words, I am kind of "stuck" with this non-fixed bug now... :/ Hope this can be fixed... Thanks, Eric From bsmith at mcs.anl.gov Tue May 22 13:03:22 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 22 May 2018 18:03:22 +0000 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> Message-ID: Hmm, why would > the resolution with *sequential* symbolic factorisation gives ans err around 1e-6 instead of 1e-16 for parallel one (when it works). ? One would think that doing a "sequential" symbolic factorization won't affect the answer to this huge amount? Perhaps this is the problem that needs to be addressed. Barry > On May 22, 2018, at 12:57 PM, Eric Chamberland wrote: > > On 22/05/18 12:11 PM, Xiaoye S. Li wrote: > > Default setting is to use sequential symbolic factorization, precisely > > due to the ParMETIS bugs. > > Ok, > > and I saw you reported the bug "a few years ago" and still have not received a fix... > > I would like to "live with the patch" (ie working in sequential) but our problem is that if we compute err=|Ax-b|, the resolution with *sequential* symbolic factorisation gives ans err around 1e-6 instead of 1e-16 for parallel one (when it works). MUMPS, also gives 1e-16 error levels. > > In our nightly tests we have to compare computed solutions and they are now acceptable if they are of the order of 1e-15. I do *not* want to raise this comparison to 1e-6 just because there is a bug in parmetis which forces us to use the "unprecise" sequential options... > > In other words, I am kind of "stuck" with this non-fixed bug now... :/ > > Hope this can be fixed... > > Thanks, > > Eric From Eric.Chamberland at giref.ulaval.ca Tue May 22 13:18:19 2018 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Tue, 22 May 2018 14:18:19 -0400 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> Message-ID: <17dd0fc9-db38-95f4-8c67-96cb61f1f0cd@giref.ulaval.ca> On 22/05/18 02:03 PM, Smith, Barry F. wrote: > > Hmm, why would > >> the resolution with *sequential* symbolic factorisation gives ans err around 1e-6 instead of 1e-16 for parallel one (when it works). > > ? One would think that doing a "sequential" symbolic factorization won't affect the answer to this huge amount? Perhaps this is the problem that needs to be addressed. > I do agree that this is a huge amount of difference... and if we agree this is also a bug, than it means there are not one but two bugs that deserve to be fixed... 
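As an aside, the err=|Ax-b| check referred to in this thread can be written in a few lines of C; a sketch, assuming A, x and b are the already assembled matrix, computed solution and right-hand side, is:

#include <petscmat.h>

/* Sketch: compute the true residual norm err = ||b - A*x|| after a solve.
   A, x and b are assumed to exist already; only the check itself is shown. */
PetscErrorCode CheckResidual(Mat A, Vec x, Vec b)
{
  Vec            r;
  PetscReal      err;
  PetscErrorCode ierr;

  ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
  ierr = MatMult(A, x, r);CHKERRQ(ierr);      /* r = A*x     */
  ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);   /* r = b - A*x */
  ierr = VecNorm(r, NORM_2, &err);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "err = |Ax-b| = %g\n", (double)err);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  return 0;
}

This is the quantity being compared against MUMPS and against the sequential symbolic factorization above.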
Thanks, Eric From fdkong.jd at gmail.com Tue May 22 13:22:32 2018 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 22 May 2018 12:22:32 -0600 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: <17dd0fc9-db38-95f4-8c67-96cb61f1f0cd@giref.ulaval.ca> References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> <17dd0fc9-db38-95f4-8c67-96cb61f1f0cd@giref.ulaval.ca> Message-ID: Hi Eric, I am curious if the parallel symbolic factoriation is faster than the sequential version? Do you have timing? Fande, On Tue, May 22, 2018 at 12:18 PM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > > > On 22/05/18 02:03 PM, Smith, Barry F. wrote: > >> >> Hmm, why would >> >> the resolution with *sequential* symbolic factorisation gives ans err >>> around 1e-6 instead of 1e-16 for parallel one (when it works). >>> >> >> ? One would think that doing a "sequential" symbolic factorization >> won't affect the answer to this huge amount? Perhaps this is the problem >> that needs to be addressed. >> >> > I do agree that this is a huge amount of difference... and if we agree > this is also a bug, than it means there are not one but two bugs that > deserve to be fixed... > > Thanks, > > Eric > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Tue May 22 14:22:39 2018 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Tue, 22 May 2018 15:22:39 -0400 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> <17dd0fc9-db38-95f4-8c67-96cb61f1f0cd@giref.ulaval.ca> Message-ID: <06ac7d9f-44ce-dea7-e463-09a33228edf0@giref.ulaval.ca> Hi Fande, I don't know, I am working and validating with a DEBUG version of PETSc, and this "mwe" is a 30x30 matrix... But I "hope" the parallel version is faster for large problems... if it is not maybe it should be somewhat reviewed... Eric On 22/05/18 02:22 PM, Fande Kong wrote: > Hi Eric, > > I am curious if the parallel symbolic factoriation is faster than > the sequential version? Do you have timing? > > > Fande, > > On Tue, May 22, 2018 at 12:18 PM, Eric Chamberland > > wrote: > > > > On 22/05/18 02:03 PM, Smith, Barry F. wrote: > > > Hmm, why would > > the resolution with *sequential* symbolic factorisation > gives ans err around 1e-6 instead of 1e-16 for parallel one > (when it works). > > > ? One would think that doing a "sequential" symbolic > factorization won't affect the answer to this huge amount? > Perhaps this is the problem that needs to be addressed. > > > I do agree that this is a huge amount of difference... and if we > agree this is also a bug, than it means there are not one but two > bugs that deserve to be fixed... > > Thanks, > > Eric > > From xsli at lbl.gov Tue May 22 14:25:18 2018 From: xsli at lbl.gov (Xiaoye S. 
Li) Date: Tue, 22 May 2018 12:25:18 -0700 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: <06ac7d9f-44ce-dea7-e463-09a33228edf0@giref.ulaval.ca> References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> <17dd0fc9-db38-95f4-8c67-96cb61f1f0cd@giref.ulaval.ca> <06ac7d9f-44ce-dea7-e463-09a33228edf0@giref.ulaval.ca> Message-ID: Is it possible to download this particular matrix, so I can do standalone investigation? Sherry On Tue, May 22, 2018 at 12:22 PM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Hi Fande, > > I don't know, I am working and validating with a DEBUG version of PETSc, > and this "mwe" is a 30x30 matrix... > > But I "hope" the parallel version is faster for large problems... if it is > not maybe it should be somewhat reviewed... > > Eric > > > On 22/05/18 02:22 PM, Fande Kong wrote: > >> Hi Eric, >> >> I am curious if the parallel symbolic factoriation is faster than the >> sequential version? Do you have timing? >> >> >> Fande, >> >> On Tue, May 22, 2018 at 12:18 PM, Eric Chamberland < >> Eric.Chamberland at giref.ulaval.ca > >> wrote: >> >> >> >> On 22/05/18 02:03 PM, Smith, Barry F. wrote: >> >> >> Hmm, why would >> >> the resolution with *sequential* symbolic factorisation >> gives ans err around 1e-6 instead of 1e-16 for parallel one >> (when it works). >> >> >> ? One would think that doing a "sequential" symbolic >> factorization won't affect the answer to this huge amount? >> Perhaps this is the problem that needs to be addressed. >> >> >> I do agree that this is a huge amount of difference... and if we >> agree this is also a bug, than it means there are not one but two >> bugs that deserve to be fixed... >> >> Thanks, >> >> Eric >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From klindsay at ucar.edu Tue May 22 15:03:14 2018 From: klindsay at ucar.edu (Keith Lindsay) Date: Tue, 22 May 2018 14:03:14 -0600 Subject: [petsc-users] SuperLU_dist bug with parallel symbolic factorisation In-Reply-To: References: <4ee7c7df-2bc7-45f7-9f87-9128842e78c5@giref.ulaval.ca> <022AB009-DFEB-47AE-B62C-26FF0FE8F1A6@anl.gov> <898094c9-faf3-3f00-33a2-243d7b321e39@giref.ulaval.ca> <17dd0fc9-db38-95f4-8c67-96cb61f1f0cd@giref.ulaval.ca> Message-ID: Hi, I use SuperLU_dist, outside of PETSc, and use the parallel symbolic factorization functionality. In my experience it is significantly faster than the serial symbolic factorization. I don't have clean numbers on hand, but my recollection is that going from serial to parallel reduced time spent in the symbolic factorization by around an order of magnitude. Using ParMETIS also significantly reduced the memory footprint of the symbolic factorization. I suspect that the impact of ParMETIS on performance depends on the application. In my application, I was using a matrix with ~4.2e6 unknowns, an average of 20 non-zeros per row, and running on 256 cores. Keith On Tue, May 22, 2018 at 12:22 PM, Fande Kong wrote: > Hi Eric, > > I am curious if the parallel symbolic factoriation is faster than > the sequential version? Do you have timing? > > > Fande, > > On Tue, May 22, 2018 at 12:18 PM, Eric Chamberland < > Eric.Chamberland at giref.ulaval.ca> wrote: > >> >> >> On 22/05/18 02:03 PM, Smith, Barry F. 
wrote: >> >>> >>> Hmm, why would >>> >>> the resolution with *sequential* symbolic factorisation gives ans err >>>> around 1e-6 instead of 1e-16 for parallel one (when it works). >>>> >>> >>> ? One would think that doing a "sequential" symbolic factorization >>> won't affect the answer to this huge amount? Perhaps this is the problem >>> that needs to be addressed. >>> >>> >> I do agree that this is a huge amount of difference... and if we agree >> this is also a bug, than it means there are not one but two bugs that >> deserve to be fixed... >> >> Thanks, >> >> Eric >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpraveen at gmail.com Wed May 23 00:24:14 2018 From: cpraveen at gmail.com (Praveen C) Date: Wed, 23 May 2018 10:54:14 +0530 Subject: [petsc-users] Matrix-free preconditioner using shell matrix In-Reply-To: References: <87603loysx.fsf@jedbrown.org> <6BC71B3F-74CE-4E16-9DB1-71B807607308@gmail.com> <97E21833-8249-48A2-B851-D7A2ADE3CF13@mcs.anl.gov> <1BA236C4-26B7-4CF4-A2DE-980E102D2E49@gmail.com> Message-ID: <2B141ADC-555C-4F11-9D41-D80BA085ADC4@gmail.com> Thanks a lot Barry. I sorted out the user context problem with your help. But I have to fix some other issues to make it work. Best praveen > On 19-May-2018, at 10:31 PM, Smith, Barry F. wrote: > > > >> On May 19, 2018, at 11:42 AM, Praveen C wrote: >> >> Thanks a lot. >> >>> On 19-May-2018, at 9:16 PM, Smith, Barry F. wrote: >>> >>> Praveen, >>> >>> Ahh, we didn't have support for passing PETSC_NULL_MAT from Fortran for this routine. I have added it in the branch barry/fix-null-matrix-set-jacobian/maint which after testing will go into the maint branch. >>> >>> Could you please try this branch and let us know if it resolves the problem (or a new problem pops up)? >>> >>> Barry >>> >>> Unfortunately handling null objects from Fortran requires us to manually tweak the Fortran interface functions and sometimes we forget or don't realize they need fixing until someone reports a problem. >> >> I meanwhile passed a Mat object to TSSetRHSJacobian function. It seems that with -snes_mf and PCSHELL, my RHSJacobian is never invoked. Is this the correct behaviour ? > > Yes, if you want your function to be called you need to use -snes_mf_operator instead. >> >> In RHSJacobian, I am storing the current solution into my usercontext so that I can use it in my shell preconditioner >> >> subroutine RHSJacobian(ts, time, u, J, P, ctx, ierr) >> >> call VecCopy(u, ctx%p%v_u, ierr); CHKERRQ(ierr) >> >> end >> >> However, since my RHSJacobian is never invoked, I dont have access to the current solution needed in my ApplyPC. How can I access the current solution in my ApplyPC function ? >> >> Another issue I am facing is this. I attach my user context to the PC >> >> call TSGetSNES(ts, snes, ierr); CHKERRQ(ierr) >> call SNESGetKSP(snes, ksp, ierr); CHKERRQ(ierr) >> call KSPGetPC(ksp, pc, ierr); CHKERRQ(ierr) >> call PCSetType(pc, PCSHELL, ierr); CHKERRQ(ierr) >> call PCShellSetContext(pc, ctx, ierr); CHKERRQ(ierr) >> call PCShellSetApply(pc, ApplyPC, ierr); CHKERRQ(ierr) >> >> Inside my ApplyPC, I access ctx using >> >> subroutine ApplyPC(pc, x, y, ierr) >> use petscpc >> use mtsdata >> implicit none >> PC :: pc >> Vec :: x, y >> PetscErrorCode :: ierr >> ! Local variables >> type(tsdata) :: ctx >> >> call PCShellGetContext(pc, ctx, ierr); CHKERRQ(ierr) >> >> end >> >> But the ctx I get from above is not correct, since it does not have the data I stored in my original ctx. > > This won't work. 
You need to use a pointer for the context not an entirely new context argument. See > > src/mat/examples/tutorials/ex6f.F90 > > for how to use derived data types and pointers with contexts in Fortran. > > Barry > >> >> I am rather clueless at this stage. >> >> Thanks for your help. >> Best regards >> praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpraveen at gmail.com Wed May 23 00:29:53 2018 From: cpraveen at gmail.com (Praveen C) Date: Wed, 23 May 2018 10:59:53 +0530 Subject: [petsc-users] Question on PCSHELL with TS Message-ID: <51446CFF-8F15-4C57-9C40-10A4655D11CD@gmail.com> Dear all I am trying to solve a steady state problem using TSPSEUDO and am unclear about the preconditioner. The PDE is 3-d compressible Euler/NS equations solved with FV on unstructured grids. I am using matrix-free approach (-snes_mf_operator) and want to provide my own matrix-free preconditioner using PCSHELL. For the problem du/dt = R(u) the update would be [I - dt*J(u)]*du = dt*R(u), J(u) = R?(u) u = u + du Does my PCSHELL have to provide a preconditioner for J or for I - dt*J ? Thanks praveen From nahmad16 at ku.edu.tr Wed May 23 05:10:18 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Wed, 23 May 2018 15:10:18 +0500 Subject: [petsc-users] Using PETSc for performance evaluation, tuning on multicore machines Message-ID: Hi All, I am a PhD student in parallel computing working on unstructured meshes optimizations on multicore architectures. For this purpose, I want to use a simple block Jacobi iterative solver using unstructured meshes to see how I can improve performance of the solver on a multicore machine (KNL for example). I wanted to know if PETSc can be good choice for this purpose? Will it allow me to experiment with data layout, compiler optimizations (e.g. AVX512) etc.? I thank you for your assistance and useful feedback. -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 23 05:22:15 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 May 2018 06:22:15 -0400 Subject: [petsc-users] Using PETSc for performance evaluation, tuning on multicore machines In-Reply-To: References: Message-ID: On Wed, May 23, 2018 at 6:10 AM, Najeeb Ahmad wrote: > Hi All, > > I am a PhD student in parallel computing working on unstructured meshes > optimizations on multicore architectures. For this purpose, I want to use a > simple block Jacobi iterative solver using unstructured meshes to see how I > can improve performance of the solver on a multicore machine (KNL for > example). I wanted to know if PETSc can be good choice for this purpose? > Will it allow me to experiment with data layout, compiler optimizations > (e.g. AVX512) etc.? > It is intended to be. For example, there are matrix classes specialized for MKL ( http://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/Mat/MATBAIJMKL.html ), and many parts of the code, e.g. VecScatter, which have specific KNL optimizations. Thanks, Matt > I thank you for your assistance and useful feedback. > > -- > *Najeeb Ahmad* > > > *Research and Teaching Assistant* > *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * > > *Computer Science and Engineering* > *Ko? 
University, Istanbul, Turkey* > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 23 07:15:58 2018 From: jed at jedbrown.org (Jed Brown) Date: Wed, 23 May 2018 06:15:58 -0600 Subject: [petsc-users] Question on PCSHELL with TS In-Reply-To: <51446CFF-8F15-4C57-9C40-10A4655D11CD@gmail.com> References: <51446CFF-8F15-4C57-9C40-10A4655D11CD@gmail.com> Message-ID: <87tvqyh8jl.fsf@jedbrown.org> Praveen C writes: > Dear all > > I am trying to solve a steady state problem using TSPSEUDO and am unclear about the preconditioner. The PDE is 3-d compressible Euler/NS equations solved with FV on unstructured grids. > > I am using matrix-free approach (-snes_mf_operator) and want to provide my own matrix-free preconditioner using PCSHELL. > > For the problem > > du/dt = R(u) > > the update would be > > [I - dt*J(u)]*du = dt*R(u), J(u) = R'(u) > > u = u + du > > Does my PCSHELL have to provide a preconditioner for J or for I - dt*J ? You'll be solving with I/dt - J. From valerio.barnabei at gmail.com Wed May 23 10:03:19 2018 From: valerio.barnabei at gmail.com (Valerio Barnabei) Date: Wed, 23 May 2018 17:03:19 +0200 Subject: [petsc-users] [petsc4py] DMPlexCreateFluentFromFile or equivalent? Message-ID: Hello, I was wondering if there's an equivalent method of DMPlexCreateFluentFromFile in petsc4py: unfortunately I can't find anything similar in the source code, and i would love to directly import a fluent unstructured mesh in my code. Alternatively, can you suggest any implemented method in DMplex.pyx to achieve similar results? Thanks in advance for your help. Best Regards, Valerio -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 23 10:06:01 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 May 2018 11:06:01 -0400 Subject: [petsc-users] [petsc4py] DMPlexCreateFluentFromFile or equivalent?
In-Reply-To: References: Message-ID: On Wed, May 23, 2018 at 11:03 AM, Valerio Barnabei < valerio.barnabei at gmail.com> wrote: > Hello, > I was wondering if there's an equivalent method of > DMPlexCreateFluentFromFile in petsc4py: unfortunately I can't find anything > similar in the source code, and i would love to directly import a fluent > unstructured mesh in my code. > Alternatively, can you suggest any implemented method in DMplex.pyx to > achieve similar results? > What kind of Fluent file? This function http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexCreateFromFile.html supports .cas files, http://www.mcs.anl.gov/petsc/petsc-current/src/dm/impls/plex/plexcreate.c.html#DMPlexCreateFromFile Thanks, Matt > Thanks in advance for your help. > Best Regards, > > Valerio > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 23 10:08:49 2018 From: jed at jedbrown.org (Jed Brown) Date: Wed, 23 May 2018 09:08:49 -0600 Subject: [petsc-users] [petsc4py] DMPlexCreateFluentFromFile or equivalent? In-Reply-To: References: Message-ID: <87fu2ih0ji.fsf@jedbrown.org> Just use createFromFile('file.cas') Valerio Barnabei writes: > Hello, > I was wondering if there's an equivalent method of > DMPlexCreateFluentFromFile in petsc4py: unfortunately I can't find anything > similar in the source code, and i would love to directly import a fluent > unstructured mesh in my code. > Alternatively, can you suggest any implemented method in DMplex.pyx to > achieve similar results? > > Thanks in advance for your help. > Best Regards, > > Valerio From jed at jedbrown.org Wed May 23 10:20:52 2018 From: jed at jedbrown.org (Jed Brown) Date: Wed, 23 May 2018 09:20:52 -0600 Subject: [petsc-users] [petsc4py] DMPlexCreateFluentFromFile or equivalent? In-Reply-To: References: <87fu2ih0ji.fsf@jedbrown.org> Message-ID: <87d0xmgzzf.fsf@jedbrown.org> Please always use "reply-all" so that your messages go to the list. This is standard mailing list etiquette. It is important to preserve threading for people who find this discussion later and so that we do not waste our time re-answering the same questions that have already been answered in private side-conversations. You'll likely get an answer faster that way too. Valerio Barnabei writes: > Thanks for you answer. > Since we're on similar topic, is there any chance to import a CFX mesh > (file.grd)? It isn't implemented yet, but can be. Do you have documentation of the format? And would you be interested in helping to implement this (with guidance from us)? > 2018-05-23 17:08 GMT+02:00 Jed Brown : > >> Just use createFromFile('file.cas') >> >> Valerio Barnabei writes: >> >> > Hello, >> > I was wondering if there's an equivalent method of >> > DMPlexCreateFluentFromFile in petsc4py: unfortunately I can't find >> anything >> > similar in the source code, and i would love to directly import a fluent >> > unstructured mesh in my code. >> > Alternatively, can you suggest any implemented method in DMplex.pyx to >> > achieve similar results? >> > >> > Thanks in advance for your help. 
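For completeness, the C routine linked above can also be called directly; a small sketch, where the mesh file name is a placeholder and the call assumes the PETSc 3.9-era DMPlexCreateFromFile() signature, might look like:

#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM             dm;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  /* The reader is chosen from the file extension, so a Fluent .cas file
     goes through the Fluent reader; "mesh.cas" is an invented name. */
  ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.cas", PETSC_TRUE, &dm);CHKERRQ(ierr);
  ierr = DMView(dm, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The petsc4py createFromFile() call suggested earlier in the thread is essentially a thin wrapper around this routine.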
>> > Best Regards, >> > >> > Valerio >> From valerio.barnabei at gmail.com Wed May 23 10:34:00 2018 From: valerio.barnabei at gmail.com (Valerio Barnabei) Date: Wed, 23 May 2018 17:34:00 +0200 Subject: [petsc-users] [petsc4py] DMPlexCreateFluentFromFile or equivalent? In-Reply-To: <87d0xmgzzf.fsf@jedbrown.org> References: <87fu2ih0ji.fsf@jedbrown.org> <87d0xmgzzf.fsf@jedbrown.org> Message-ID: I apologise for my behaviour, I'm not that familiar with mailing lists. I'm quite familiar with the format: it's a multiblock format, but i don't have a clear documentation about the structure of the format. As for helping in implementing, I can try but I think i still need to become more confident with the structure of PETSc itself, before attempting to add something to it. In any case, I suppose I should refer to another mailing list for that purpose, am I right? 2018-05-23 17:20 GMT+02:00 Jed Brown : > Please always use "reply-all" so that your messages go to the list. > This is standard mailing list etiquette. It is important to preserve > threading for people who find this discussion later and so that we do > not waste our time re-answering the same questions that have already > been answered in private side-conversations. You'll likely get an > answer faster that way too. > > Valerio Barnabei writes: > > > Thanks for you answer. > > Since we're on similar topic, is there any chance to import a CFX mesh > > (file.grd)? > > It isn't implemented yet, but can be. Do you have documentation of the > format? And would you be interested in helping to implement this (with > guidance from us)? > > > 2018-05-23 17:08 GMT+02:00 Jed Brown : > > > >> Just use createFromFile('file.cas') > >> > >> Valerio Barnabei writes: > >> > >> > Hello, > >> > I was wondering if there's an equivalent method of > >> > DMPlexCreateFluentFromFile in petsc4py: unfortunately I can't find > >> anything > >> > similar in the source code, and i would love to directly import a > fluent > >> > unstructured mesh in my code. > >> > Alternatively, can you suggest any implemented method in DMplex.pyx to > >> > achieve similar results? > >> > > >> > Thanks in advance for your help. > >> > Best Regards, > >> > > >> > Valerio > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 23 10:35:48 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 May 2018 11:35:48 -0400 Subject: [petsc-users] [petsc4py] DMPlexCreateFluentFromFile or equivalent? In-Reply-To: References: <87fu2ih0ji.fsf@jedbrown.org> <87d0xmgzzf.fsf@jedbrown.org> Message-ID: On Wed, May 23, 2018 at 11:34 AM, Valerio Barnabei < valerio.barnabei at gmail.com> wrote: > I apologise for my behaviour, I'm not that familiar with mailing lists. > > I'm quite familiar with the format: it's a multiblock format, but i don't > have a clear documentation about the structure of the format. > As for helping in implementing, I can try but I think i still need to > become more confident with the structure of PETSc itself, before attempting > to add something to it. > In any case, I suppose I should refer to another mailing list for that > purpose, am I right? > This is fine for that. I think the crucial concern is to locate a specification for any format you might be interested in. That would allow someone, maybe us maybe you, to support it. Thanks, Matt > 2018-05-23 17:20 GMT+02:00 Jed Brown : > >> Please always use "reply-all" so that your messages go to the list. >> This is standard mailing list etiquette. 
It is important to preserve >> threading for people who find this discussion later and so that we do >> not waste our time re-answering the same questions that have already >> been answered in private side-conversations. You'll likely get an >> answer faster that way too. >> >> Valerio Barnabei writes: >> >> > Thanks for you answer. >> > Since we're on similar topic, is there any chance to import a CFX mesh >> > (file.grd)? >> >> It isn't implemented yet, but can be. Do you have documentation of the >> format? And would you be interested in helping to implement this (with >> guidance from us)? >> >> > 2018-05-23 17:08 GMT+02:00 Jed Brown : >> > >> >> Just use createFromFile('file.cas') >> >> >> >> Valerio Barnabei writes: >> >> >> >> > Hello, >> >> > I was wondering if there's an equivalent method of >> >> > DMPlexCreateFluentFromFile in petsc4py: unfortunately I can't find >> >> anything >> >> > similar in the source code, and i would love to directly import a >> fluent >> >> > unstructured mesh in my code. >> >> > Alternatively, can you suggest any implemented method in DMplex.pyx >> to >> >> > achieve similar results? >> >> > >> >> > Thanks in advance for your help. >> >> > Best Regards, >> >> > >> >> > Valerio >> >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Wed May 23 10:40:34 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Wed, 23 May 2018 16:40:34 +0100 Subject: [petsc-users] Load binary matrix created in parallel Message-ID: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> Hello, Is there any way to sequentially load a binary matrix file, which is created in parallel? I tried to use PetscViewerBinaryOpen() and MatLoad() to load the matrix that is created by using 2 cores, and solved a linear system by using this matrix but the results is not correct. Kind Regards, Shidi From jed at jedbrown.org Wed May 23 10:42:07 2018 From: jed at jedbrown.org (Jed Brown) Date: Wed, 23 May 2018 09:42:07 -0600 Subject: [petsc-users] Load binary matrix created in parallel In-Reply-To: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> References: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> Message-ID: <87a7sqgz00.fsf@jedbrown.org> "Y. Shidi" writes: > Hello, > > Is there any way to sequentially load a binary matrix file, which > is created in parallel? > I tried to use PetscViewerBinaryOpen() and MatLoad() to load > the matrix that is created by using 2 cores, and solved a linear > system by using this matrix but the results is not correct. How are you evaluating? From ys453 at cam.ac.uk Wed May 23 10:45:14 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Wed, 23 May 2018 16:45:14 +0100 Subject: [petsc-users] Load binary matrix created in parallel In-Reply-To: <87a7sqgz00.fsf@jedbrown.org> References: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> <87a7sqgz00.fsf@jedbrown.org> Message-ID: <0bf608a279778cc5c73658d18452fc2d@cam.ac.uk> Hi, > How are you evaluating? I put the loaded matrix to a linear system and solve it. Cheers, Shidi On 2018-05-23 16:42, Jed Brown wrote: > "Y. Shidi" writes: > >> Hello, >> >> Is there any way to sequentially load a binary matrix file, which >> is created in parallel? 
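The usual PETSc round trip here is to write the matrix with MatView() on a binary viewer and read it back with MatLoad(); the number of processes used for writing and for reading does not have to match. A minimal C sketch, with an invented file name, is:

#include <petscmat.h>

/* Sketch: write matrix A to a binary file and read it back into a new
   matrix B; the write and the read may be done with different numbers of
   MPI processes.  "matrix.dat" is a placeholder name. */
PetscErrorCode SaveAndReload(Mat A, Mat *B)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.dat", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);                 /* dump the assembled matrix */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.dat", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, B);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*B);CHKERRQ(ierr);
  ierr = MatLoad(*B, viewer);CHKERRQ(ierr);                /* reads the same matrix back */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  return 0;
}

Loading the same file on 1, 2 or more processes should give the same matrix, which is the point made in the replies that follow.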
>> I tried to use PetscViewerBinaryOpen() and MatLoad() to load >> the matrix that is created by using 2 cores, and solved a linear >> system by using this matrix but the results is not correct. > > How are you evaluating? From knepley at gmail.com Wed May 23 10:48:27 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 May 2018 11:48:27 -0400 Subject: [petsc-users] Load binary matrix created in parallel In-Reply-To: <0bf608a279778cc5c73658d18452fc2d@cam.ac.uk> References: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> <87a7sqgz00.fsf@jedbrown.org> <0bf608a279778cc5c73658d18452fc2d@cam.ac.uk> Message-ID: On Wed, May 23, 2018 at 11:45 AM, Y. Shidi wrote: > Hi, > > How are you evaluating? >> > I put the loaded matrix to a linear system and solve it. > Is your claim that you do MatView() and then MatLoad() and get a different matrix? This is unlikely since we test this. Are you sure you have the same rhs? Thanks, Matt > Cheers, > Shidi > > On 2018-05-23 16:42, Jed Brown wrote: > >> "Y. Shidi" writes: >> >> Hello, >>> >>> Is there any way to sequentially load a binary matrix file, which >>> is created in parallel? >>> I tried to use PetscViewerBinaryOpen() and MatLoad() to load >>> the matrix that is created by using 2 cores, and solved a linear >>> system by using this matrix but the results is not correct. >>> >> >> How are you evaluating? >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Wed May 23 10:54:46 2018 From: ys453 at cam.ac.uk (Y. Shidi) Date: Wed, 23 May 2018 16:54:46 +0100 Subject: [petsc-users] Load binary matrix created in parallel In-Reply-To: References: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> <87a7sqgz00.fsf@jedbrown.org> <0bf608a279778cc5c73658d18452fc2d@cam.ac.uk> Message-ID: Hello, > Is your claim that you do > > MatView() I did not call this function; I called MatLoad() directly. > and get a different matrix? This is unlikely since we test this. > Are you sure you have the same rhs? I will check this again. So if a binary matrix file is created, it doesn't matter if different number of processors is used. Thanks, Shidi On 2018-05-23 16:48, Matthew Knepley wrote: > On Wed, May 23, 2018 at 11:45 AM, Y. Shidi wrote: > >> Hi, >> >>> How are you evaluating? >> I put the loaded matrix to a linear system and solve it. > > Is your claim that you do > > MatView() > > and then > > MatLoad() > > and get a different matrix? This is unlikely since we test this. > Are you sure you have the same rhs? > > Thanks, > > Matt > >> Cheers, >> Shidi >> >> On 2018-05-23 16:42, Jed Brown wrote: >> "Y. Shidi" writes: >> >> Hello, >> >> Is there any way to sequentially load a binary matrix file, which >> is created in parallel? >> I tried to use PetscViewerBinaryOpen() and MatLoad() to load >> the matrix that is created by using 2 cores, and solved a linear >> system by using this matrix but the results is not correct. >> >> How are you evaluating? > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [1] > > > Links: > ------ > [1] http://www.caam.rice.edu/~mk51/ From knepley at gmail.com Wed May 23 11:01:24 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 May 2018 12:01:24 -0400 Subject: [petsc-users] Load binary matrix created in parallel In-Reply-To: References: <9d1626ad8a30f831e932e15a85ff3748@cam.ac.uk> <87a7sqgz00.fsf@jedbrown.org> <0bf608a279778cc5c73658d18452fc2d@cam.ac.uk> Message-ID: On Wed, May 23, 2018 at 11:54 AM, Y. Shidi wrote: > Hello, > > Is your claim that you do >> >> MatView() >> > I did not call this function; I called MatLoad() directly. > This is the output function. > > and get a different matrix? This is unlikely since we test this. >> Are you sure you have the same rhs? >> > I will check this again. > So if a binary matrix file is created, > it doesn't matter if different number of processors is used. > Right. Matt > Thanks, > Shidi > > On 2018-05-23 16:48, Matthew Knepley wrote: > >> On Wed, May 23, 2018 at 11:45 AM, Y. Shidi wrote: >> >> Hi, >>> >>> How are you evaluating? >>>> >>> I put the loaded matrix to a linear system and solve it. >>> >> >> Is your claim that you do >> >> MatView() >> >> and then >> >> MatLoad() >> >> and get a different matrix? This is unlikely since we test this. >> Are you sure you have the same rhs? >> >> Thanks, >> >> Matt >> >> Cheers, >>> Shidi >>> >>> On 2018-05-23 16:42, Jed Brown wrote: >>> "Y. Shidi" writes: >>> >>> Hello, >>> >>> Is there any way to sequentially load a binary matrix file, which >>> is created in parallel? >>> I tried to use PetscViewerBinaryOpen() and MatLoad() to load >>> the matrix that is created by using 2 cores, and solved a linear >>> system by using this matrix but the results is not correct. >>> >>> How are you evaluating? >>> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] >> >> >> Links: >> ------ >> [1] http://www.caam.rice.edu/~mk51/ >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zakaryah at gmail.com Wed May 23 12:10:20 2018 From: zakaryah at gmail.com (zakaryah) Date: Wed, 23 May 2018 13:10:20 -0400 Subject: [petsc-users] Simple KSP with known solution gives infinite solution Message-ID: I'm looking for some help with a terribly simple problem. I am trying to solve a linear system which is very small, as a test. The vectors are derived from a minimal 3D DMDA (3x3x3), with 3 dof, so that the RHS and solution vectors have dimension 81 and the matrix is 81x81. I have checked that the matrix is symmetric, invertible (empty null space), and that the desired solution is reasonable, i.e. has no large values. The condition number of the matrix is about 3600. When I run KSPSolve with default settings, I see convergence in a single iteration (gmres, pc_type lu, jacobi, none, or svd). However, the solution is infinite in all entries. Any suggestions for debugging this? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 23 12:40:30 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Wed, 23 May 2018 17:40:30 +0000 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: References: Message-ID: Run with -ksp_converged_reason > On May 23, 2018, at 12:10 PM, zakaryah wrote: > > I'm looking for some help with a terribly simple problem. I am trying to solve a linear system which is very small, as a test. The vectors are derived from a minimal 3D DMDA (3x3x3), with 3 dof, so that the RHS and solution vectors have dimension 81 and the matrix is 81x81. I have checked that the matrix is symmetric, invertible (empty null space), and that the desired solution is reasonable, i.e. has no large values. The condition number of the matrix is about 3600. When I run KSPSolve with default settings, I see convergence in a single iteration (gmres, pc_type lu, jacobi, none, or svd). However, the solution is infinite in all entries. Any suggestions for debugging this? Thanks! From zakaryah at gmail.com Wed May 23 12:47:50 2018 From: zakaryah at gmail.com (zakaryah) Date: Wed, 23 May 2018 13:47:50 -0400 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: References: Message-ID: Additional weird behavior: If I run with -ksp_monitor -ksp_monitor_true_residual -ksp_converged_reason -pc_type jacobi I get: 0 KSP Residual norm 1.389113910577e-03 0 KSP preconditioned resid norm 1.389113910577e-03 true resid norm -nan ||r(i)||/||b|| -nan 1 KSP Residual norm 2.214853823259e-19 1 KSP preconditioned resid norm 2.214853823259e-19 true resid norm -nan ||r(i)||/||b|| -nan Linear solve converged due to CONVERGED_RTOL iterations 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 23 12:59:05 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 23 May 2018 17:59:05 +0000 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: References: Message-ID: <078895A5-0798-4260-8061-6F4265B24BC4@mcs.anl.gov> Add -ksp_pc_side right > On May 23, 2018, at 12:47 PM, zakaryah wrote: > > Additional weird behavior: > > If I run with > > -ksp_monitor -ksp_monitor_true_residual -ksp_converged_reason -pc_type jacobi > > I get: > > 0 KSP Residual norm 1.389113910577e-03 > 0 KSP preconditioned resid norm 1.389113910577e-03 true resid norm -nan ||r(i)||/||b|| -nan > 1 KSP Residual norm 2.214853823259e-19 > 1 KSP preconditioned resid norm 2.214853823259e-19 true resid norm -nan ||r(i)||/||b|| -nan > Linear solve converged due to CONVERGED_RTOL iterations 1 > > From zakaryah at gmail.com Wed May 23 13:09:23 2018 From: zakaryah at gmail.com (zakaryah) Date: Wed, 23 May 2018 14:09:23 -0400 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: <078895A5-0798-4260-8061-6F4265B24BC4@mcs.anl.gov> References: <078895A5-0798-4260-8061-6F4265B24BC4@mcs.anl.gov> Message-ID: I get: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: KSP gmres does not support PRECONDITIONED with RIGHT [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.2, Nov, 09, 2017 [0]PETSC ERROR: ./KSPtest on a arch-linux2-c-debug named login01.hpc.rockefeller.internal by zakaryah Wed May 23 14:06:24 2018 [0]PETSC ERROR: Configure options --prefix=/ru-auth/local/home/zakaryah/PETSc-3.8.2 --download-fblaslapack --download-mumps --download-hypre --download-metis --download-parmetis --download-scalapack [0]PETSC ERROR: #1 KSPSetUpNorms_Private() line 398 in /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itcreate.c [0]PETSC ERROR: #2 KSPSetUp() line 303 in /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 23 13:18:45 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 23 May 2018 18:18:45 +0000 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: References: <078895A5-0798-4260-8061-6F4265B24BC4@mcs.anl.gov> Message-ID: <1BE092F6-F943-4330-B57E-964524A27F34@mcs.anl.gov> Please email the code to petsc-maint at mcs.anl.gov > On May 23, 2018, at 1:09 PM, zakaryah wrote: > > I get: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: KSP gmres does not support PRECONDITIONED with RIGHT > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.8.2, Nov, 09, 2017 > [0]PETSC ERROR: ./KSPtest on a arch-linux2-c-debug named login01.hpc.rockefeller.internal by zakaryah Wed May 23 14:06:24 2018 > [0]PETSC ERROR: Configure options --prefix=/ru-auth/local/home/zakaryah/PETSc-3.8.2 --download-fblaslapack --download-mumps --download-hypre --download-metis --download-parmetis --download-scalapack > [0]PETSC ERROR: #1 KSPSetUpNorms_Private() line 398 in /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itcreate.c > [0]PETSC ERROR: #2 KSPSetUp() line 303 in /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c > From zakaryah at gmail.com Wed May 23 15:30:53 2018 From: zakaryah at gmail.com (zakaryah) Date: Wed, 23 May 2018 16:30:53 -0400 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: <1BE092F6-F943-4330-B57E-964524A27F34@mcs.anl.gov> References: <078895A5-0798-4260-8061-6F4265B24BC4@mcs.anl.gov> <1BE092F6-F943-4330-B57E-964524A27F34@mcs.anl.gov> Message-ID: Hi Barry - I did upload the code but I think I found a solution. Previously, I was trying to use the DMDA separately from the KSP, which probably doesn't make much sense. Now I am setting things up properly with KSPSetDM, KSPSetComputeRHS, and KSPSetComputeOperators. Everything seems to be working now - sorry for the bother. On Wed, May 23, 2018 at 2:18 PM, Smith, Barry F. wrote: > > Please email the code to petsc-maint at mcs.anl.gov > > > > On May 23, 2018, at 1:09 PM, zakaryah wrote: > > > > I get: > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: No support for this operation for this object type > > [0]PETSC ERROR: KSP gmres does not support PRECONDITIONED with RIGHT > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.2, Nov, 09, 2017 > > [0]PETSC ERROR: ./KSPtest on a arch-linux2-c-debug named > login01.hpc.rockefeller.internal by zakaryah Wed May 23 14:06:24 2018 > > [0]PETSC ERROR: Configure options --prefix=/ru-auth/local/home/zakaryah/PETSc-3.8.2 > --download-fblaslapack --download-mumps --download-hypre --download-metis > --download-parmetis --download-scalapack > > [0]PETSC ERROR: #1 KSPSetUpNorms_Private() line 398 in > /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ > ksp/ksp/interface/itcreate.c > > [0]PETSC ERROR: #2 KSPSetUp() line 303 in /rugpfs/fs0/home/zakaryah/ > PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 23 15:32:41 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 23 May 2018 20:32:41 +0000 Subject: [petsc-users] Simple KSP with known solution gives infinite solution In-Reply-To: References: <078895A5-0798-4260-8061-6F4265B24BC4@mcs.anl.gov> <1BE092F6-F943-4330-B57E-964524A27F34@mcs.anl.gov> Message-ID: <8D6E4F49-26AD-4528-85F3-CB86DBFD209C@mcs.anl.gov> Ok, thanks for the update. > On May 23, 2018, at 3:30 PM, zakaryah wrote: > > Hi Barry - I did upload the code but I think I found a solution. Previously, I was trying to use the DMDA separately from the KSP, which probably doesn't make much sense. Now I am setting things up properly with KSPSetDM, KSPSetComputeRHS, and KSPSetComputeOperators. Everything seems to be working now - sorry for the bother. > > On Wed, May 23, 2018 at 2:18 PM, Smith, Barry F. wrote: > > Please email the code to petsc-maint at mcs.anl.gov > > > > On May 23, 2018, at 1:09 PM, zakaryah wrote: > > > > I get: > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: No support for this operation for this object type > > [0]PETSC ERROR: KSP gmres does not support PRECONDITIONED with RIGHT > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.2, Nov, 09, 2017 > > [0]PETSC ERROR: ./KSPtest on a arch-linux2-c-debug named login01.hpc.rockefeller.internal by zakaryah Wed May 23 14:06:24 2018 > > [0]PETSC ERROR: Configure options --prefix=/ru-auth/local/home/zakaryah/PETSc-3.8.2 --download-fblaslapack --download-mumps --download-hypre --download-metis --download-parmetis --download-scalapack > > [0]PETSC ERROR: #1 KSPSetUpNorms_Private() line 398 in /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itcreate.c > > [0]PETSC ERROR: #2 KSPSetUp() line 303 in /rugpfs/fs0/home/zakaryah/PETSc/build/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c > > > > From Michael.Becker at physik.uni-giessen.de Thu May 24 00:24:18 2018 From: Michael.Becker at physik.uni-giessen.de (Michael Becker) Date: Thu, 24 May 2018 07:24:18 +0200 Subject: [petsc-users] Poor weak scaling when solving successive linear systems Message-ID: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> Hello, I added a PETSc solver class to our particle-in-cell simulation code and all calculations seem to be correct. However, some weak scaling tests I did are rather disappointing because the solver's runtime keeps increasing with system size although the number of cores are scaled up accordingly. 
As a result, the solver's share of the total runtime becomes more and more dominant and the system sizes we aim for are unfeasible. It's a simple 3D Poisson problem on a structured grid with Dirichlet boundaries inside the domain, for which I found the cg/gamg combo to work the fastest. Since KSPsolve() is called during every timestep of the simulation to solve the same system with a new rhs vector, assembling the matrix and other PETSc objects should further not be a determining factor. What puzzles me is that the convergence rate is actually good (the residual decreases by an order of magnitude for every KSP iteration) and the number of KSP iterations remains constant over the course of a simulation and is equal for all tested systems. I even increased the (fixed) system size per processor to 30^3 unknowns (which is significantly more than the recommended 10,000), but runtime is still not even close to being constant. This leads me to the conclusion that either I configured PETSc wrong, I don't call the correct PETSc-related functions, or something goes terribly wrong with communication. Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPsolve() was called equally often. I introduced two new logging states - one for the first solve and the final setup and one for the remaining solves. The repeatedly called code segment is PetscScalar *b_array; VecGetArray(b, &b_array); get_b(b_array); VecRestoreArray(b, &barray); KSPSetTolerances(ksp,reltol,1E-50,1E5,1E4); PetscScalar *x_array; VecGetArray(x, &x_array); for (int i = 0; i < N_local; i++) ? x_array[i] = x_array_prev[i]; VecRestoreArray(x, &x_array); KSPSolve(ksp,b,x); KSPGetSolution(ksp,&x); for (int i = 0; i < N_local; i++) ? x_array_prev[i] = x_array[i]; set_x(x_array); I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). Thanks in advance. Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node2-007 with 125 processors, by beckerm Wed May 23 15:15:54 2018 Using Petsc Release Version 3.9.1, unknown Max Max/Min Avg Total Time (sec): 2.567e+02 1.00000 2.567e+02 Objects: 2.438e+04 1.00004 2.438e+04 Flop: 2.125e+10 1.27708 1.963e+10 2.454e+12 Flop/sec: 8.278e+07 1.27708 7.648e+07 9.560e+09 MPI Messages: 1.042e+06 3.36140 7.129e+05 8.911e+07 MPI Message Lengths: 1.344e+09 2.32209 1.439e+03 1.282e+11 MPI Reductions: 2.250e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.9829e+00 2.7% 0.0000e+00 0.0% 3.000e+03 0.0% 3.178e+03 0.0% 1.700e+01 0.1% 1: First Solve: 2.7562e+00 1.1% 3.6885e+09 0.2% 3.549e+05 0.4% 3.736e+03 1.0% 5.500e+02 2.4% 2: Remaining Solves: 2.4695e+02 96.2% 2.4504e+12 99.8% 8.875e+07 99.6% 1.430e+03 99.0% 2.192e+04 97.4% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 3 1.0 5.6386e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: First Solve BuildTwoSided 12 1.0 1.7128e-02 2.1 0.00e+00 0.0 8.8e+03 4.0e+00 0.0e+00 0 0 0 0 0 1 0 2 0 0 0 BuildTwoSidedF 30 1.0 3.3218e-01 3.8 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0 KSPSetUp 9 1.0 3.9077e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 2.7586e+00 1.0 3.26e+07 1.4 3.5e+05 3.7e+03 5.5e+02 1 0 0 1 2 100100100100100 1337 VecTDot 8 1.0 1.9397e-02 3.6 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 2784 VecNorm 6 1.0 6.3949e-03 1.6 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 6333 VecScale 24 1.0 1.2732e-04 2.1 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 40434 VecCopy 1 1.0 1.5807e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 115 1.0 8.6141e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 8 1.0 1.4498e-03 2.4 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 37246 VecAYPX 28 1.0 1.3914e-03 2.2 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 31519 VecAssemblyBegin 2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 103 1.0 6.5608e-03 3.1 0.00e+00 0.0 8.9e+04 1.4e+03 0.0e+00 0 0 0 0 0 0 0 25 9 0 0 VecScatterEnd 103 1.0 6.5023e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMult 29 1.0 4.5694e-02 1.8 6.14e+06 1.2 3.0e+04 2.1e+03 0.0e+00 0 0 0 0 0 1 19 8 5 0 15687 MatMultAdd 24 1.0 2.1485e-02 3.6 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 1 4 5 1 0 7032 MatMultTranspose 24 1.0 1.6713e-02 2.6 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 9040 MatSolve 4 0.0 2.2173e-05 0.0 2.64e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12 MatSOR 48 1.0 8.0235e-02 1.8 1.09e+07 1.3 2.7e+04 1.5e+03 8.0e+00 0 0 0 0 0 3 34 8 3 1 15626 MatLUFactorSym 1 1.0 5.4121e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.4782e-05 5.2 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9 MatResidual 24 1.0 3.5083e-02 2.0 4.55e+06 1.3 2.7e+04 1.5e+03 0.0e+00 0 0 0 0 0 1 14 8 3 0 14794 MatAssemblyBegin 94 1.0 3.3449e-01 3.5 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0 MatAssemblyEnd 94 1.0 1.4880e-01 1.1 0.00e+00 0.0 6.3e+04 2.1e+02 2.3e+02 0 0 0 0 1 5 0 18 1 42 0 MatGetRow 3102093 1.3 4.6411e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 15 0 0 0 0 0 MatGetRowIJ 1 0.0 7.8678e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 6 1.0 4.4495e-01 2.1 0.00e+00 0.0 5.5e+04 1.7e+04 1.2e+01 0 0 0 1 0 12 0 15 71 2 0 MatCreateSubMat 4 1.0 3.7476e-02 1.0 0.00e+00 0.0 2.9e+03 2.7e+02 6.4e+01 0 0 0 0 0 1 0 1 0 12 
0 MatGetOrdering 1 0.0 1.3685e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 6 1.0 6.0548e-02 1.2 0.00e+00 0.0 2.7e+04 1.0e+03 1.2e+01 0 0 0 0 0 2 0 8 2 2 0 MatCoarsen 6 1.0 3.5690e-02 1.1 0.00e+00 0.0 5.3e+04 5.8e+02 3.3e+01 0 0 0 0 0 1 0 15 2 6 0 MatZeroEntries 6 1.0 3.4430e-03 7.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAP 6 1.0 2.8406e-01 1.0 1.13e+07 1.6 6.3e+04 2.6e+03 9.2e+01 0 0 0 0 0 10 33 18 13 17 4307 MatPtAPSymbolic 6 1.0 1.6401e-01 1.0 0.00e+00 0.0 3.4e+04 2.7e+03 4.2e+01 0 0 0 0 0 6 0 10 7 8 0 MatPtAPNumeric 6 1.0 1.2070e-01 1.0 1.13e+07 1.6 2.9e+04 2.6e+03 4.8e+01 0 0 0 0 0 4 33 8 6 9 10136 MatGetLocalMat 6 1.0 4.4053e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 1.0330e-02 1.9 0.00e+00 0.0 2.0e+04 3.5e+03 0.0e+00 0 0 0 0 0 0 0 6 5 0 0 SFSetGraph 12 1.0 1.5497e-05 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 12 1.0 2.5882e-02 1.5 0.00e+00 0.0 2.6e+04 6.2e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0 SFBcastBegin 45 1.0 2.1088e-03 2.5 0.00e+00 0.0 5.4e+04 6.9e+02 0.0e+00 0 0 0 0 0 0 0 15 3 0 0 SFBcastEnd 45 1.0 2.0310e-02 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 2.2022e+00 1.0 0.00e+00 0.0 2.0e+05 5.2e+03 2.8e+02 1 0 0 1 1 80 0 56 78 52 0 GAMG: partLevel 6 1.0 3.2547e-01 1.0 1.13e+07 1.6 6.6e+04 2.5e+03 1.9e+02 0 0 0 0 1 12 33 19 13 35 3759 repartition 2 1.0 1.2660e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0 Invert-Sort 2 1.0 9.5701e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 Move A 2 1.0 2.2409e-02 1.0 0.00e+00 0.0 1.4e+03 5.3e+02 3.4e+01 0 0 0 0 0 1 0 0 0 6 0 Move P 2 1.0 1.6271e-02 1.0 0.00e+00 0.0 1.4e+03 1.3e+01 3.4e+01 0 0 0 0 0 1 0 0 0 6 0 PCSetUp 2 1.0 2.5381e+00 1.0 1.13e+07 1.6 2.7e+05 4.5e+03 5.1e+02 1 0 0 1 2 92 33 75 90 93 482 PCSetUpOnBlocks 4 1.0 3.4523e-04 1.9 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 4 1.0 1.2703e-01 1.1 1.82e+07 1.3 8.6e+04 1.2e+03 8.0e+00 0 0 0 0 0 4 56 24 8 1 16335 --- Event Stage 2: Remaining Solves KSPSolve 999 1.0 1.2762e+02 1.0 2.12e+10 1.3 8.8e+07 1.4e+03 2.2e+04 48100 99 97 97 50100 99 98100 19200 VecTDot 7968 1.0 1.0869e+01 6.1 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 2 2 0 0 35 2 2 0 0 36 4948 VecNorm 5982 1.0 4.5561e+00 3.6 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 1 2 0 0 27 8863 VecScale 23904 1.0 1.1319e-01 2.2 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 45298 VecCopy 999 1.0 1.6182e-01 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 83664 1.0 8.0856e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7968 1.0 1.3577e+00 2.3 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 39613 VecAYPX 27888 1.0 1.3048e+00 2.2 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 33468 VecScatterBegin 100599 1.0 6.5181e+00 3.4 0.00e+00 0.0 8.8e+07 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0 VecScatterEnd 100599 1.0 5.5370e+01 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 MatMult 28887 1.0 4.2860e+01 1.8 6.12e+09 1.2 3.0e+07 2.1e+03 0.0e+00 11 29 33 49 0 12 29 33 49 0 16661 MatMultAdd 23904 1.0 1.4803e+01 2.6 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 4 6 18 8 0 4 6 18 8 0 10166 MatMultTranspose 23904 1.0 1.5364e+01 2.4 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 4 6 18 8 0 4 6 18 8 0 9795 MatSolve 3984 0.0 1.9884e-02 0.0 2.63e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 13 MatSOR 47808 1.0 6.8888e+01 1.7 1.08e+10 1.3 2.7e+07 1.5e+03 
8.0e+03 25 51 30 32 35 26 51 30 32 36 18054 MatResidual 23904 1.0 3.1872e+01 1.9 4.54e+09 1.3 2.7e+07 1.5e+03 0.0e+00 8 21 30 32 0 8 21 30 32 0 16219 PCSetUpOnBlocks 3984 1.0 4.9551e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 3984 1.0 1.0819e+02 1.1 1.81e+10 1.3 8.5e+07 1.2e+03 8.0e+03 42 84 96 80 35 43 84 96 81 36 19056 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 9 11424 0. DMKSP interface 1 0 0 0. Vector 5 52 2371496 0. Matrix 0 72 14138216 0. Distributed Mesh 1 0 0 0. Index Set 2 12 133768 0. IS L to G Mapping 1 0 0 0. Star Forest Graph 2 0 0 0. Discrete System 1 0 0 0. Vec Scatter 1 13 16016 0. Preconditioner 1 9 9676 0. Viewer 1 0 0 0. --- Event Stage 1: First Solve Krylov Solver 8 0 0 0. Vector 140 92 2204792 0. Matrix 140 68 21738552 0. Matrix Coarsen 6 6 3816 0. Index Set 110 100 543240 0. Star Forest Graph 12 12 10368 0. Vec Scatter 31 18 22176 0. Preconditioner 8 0 0 0. --- Event Stage 2: Remaining Solves Vector 23904 23904 1295501184 0. ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 2.26021e-05 Average time for zero size MPI_Send(): 1.52473e-05 #PETSc Option Table entries: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type cg -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4 ----------------------------------------- Libraries compiled on 2018-05-03 16:11:18 on node52-021 Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon Using PETSc directory: /home/ritsat/beckerm/petsc Using PETSc arch: arch-linux-amd-opt ----------------------------------------- Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3 ----------------------------------------- Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include ----------------------------------------- Using C linker: 
/cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl ----------------------------------------- -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-017 with 1000 processors, by beckerm Wed May 23 23:30:46 2018 Using Petsc Release Version 3.9.1, unknown Max Max/Min Avg Total Time (sec): 2.915e+02 1.00000 2.915e+02 Objects: 2.127e+04 1.00005 2.127e+04 Flop: 1.922e+10 1.26227 1.851e+10 1.851e+13 Flop/sec: 6.595e+07 1.26227 6.349e+07 6.349e+10 MPI Messages: 1.075e+06 3.98874 7.375e+05 7.375e+08 MPI Message Lengths: 1.175e+09 2.32017 1.403e+03 1.034e+12 MPI Reductions: 1.199e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.5171e+01 8.6% 0.0000e+00 0.0% 2.700e+04 0.0% 3.178e+03 0.0% 1.700e+01 0.1% 1: First Solve: 3.3123e+00 1.1% 3.1911e+10 0.2% 3.675e+06 0.5% 3.508e+03 1.2% 6.090e+02 5.1% 2: Remaining Solves: 2.6301e+02 90.2% 1.8475e+13 99.8% 7.338e+08 99.5% 1.392e+03 98.7% 1.135e+04 94.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 3 1.0 4.4584e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: First Solve BuildTwoSided 12 1.0 2.2965e-02 1.3 0.00e+00 0.0 8.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 1 0 2 0 0 0 BuildTwoSidedF 30 1.0 4.6278e-01 3.1 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 9 0 2 5 0 0 KSPSetUp 9 1.0 2.1761e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 1 0 0 0 2 0 KSPSolve 1 1.0 3.3169e+00 1.0 3.35e+07 1.4 3.7e+06 3.5e+03 6.1e+02 1 0 0 1 5 100100100100100 9621 VecDotNorm2 4 1.0 4.2942e-03 3.1 4.32e+05 1.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 1 0 0 1 100602 VecMDot 3 1.0 3.0557e-0222.9 3.24e+05 1.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 1 1 0 0 0 10603 VecNorm 6 1.0 2.8319e-03 2.3 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 114409 VecScale 32 1.0 6.2370e-04 1.4 2.70e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 422668 VecSet 124 1.0 5.1708e-03 7.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 8 1.0 1.8489e-03 2.6 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 233648 VecAYPX 25 1.0 3.2370e-03 6.2 1.96e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 59376 VecMAXPY 6 1.0 1.8075e-03 1.9 6.48e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 358516 VecAssemblyBegin 3 1.0 4.0531e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 3 1.0 1.0014e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 108 1.0 1.7932e-02 6.6 0.00e+00 0.0 8.4e+05 1.4e+03 0.0e+00 0 0 0 0 0 0 0 23 9 0 0 VecScatterEnd 108 1.0 8.1686e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 MatMult 29 1.0 6.4687e-02 2.6 6.14e+06 1.2 2.8e+05 2.0e+03 0.0e+00 0 0 0 0 0 1 19 8 4 0 91734 MatMultAdd 24 1.0 5.0761e-02 6.0 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 1 4 4 1 0 25391 MatMultTranspose 24 1.0 2.7226e-02 4.8 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 0 4 4 1 0 47340 MatSolve 4 0.0 4.7922e-05 0.0 1.10e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 229 MatSOR 48 1.0 9.3626e-02 1.9 1.09e+07 1.3 2.6e+05 1.5e+03 0.0e+00 0 0 0 0 0 2 33 7 3 0 111703 MatLUFactorSym 1 1.0 9.6083e-05 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 7.1049e-0537.2 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 463 MatResidual 24 1.0 4.1369e-02 2.3 4.55e+06 1.3 2.6e+05 1.5e+03 0.0e+00 0 0 0 0 0 1 14 7 3 0 105139 MatAssemblyBegin 102 1.0 4.6537e-01 2.9 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 9 0 2 5 0 0 MatAssemblyEnd 102 1.0 1.4218e-01 1.1 0.00e+00 0.0 6.2e+05 2.0e+02 2.5e+02 0 0 0 0 2 4 0 17 1 41 0 MatGetRow 3102093 1.3 5.4764e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 MatGetRowIJ 1 0.0 1.5974e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 6 1.0 4.6659e-01 2.1 0.00e+00 0.0 5.7e+05 1.6e+04 1.2e+01 0 0 
0 1 0 10 0 15 72 2 0 MatCreateSubMat 6 1.0 2.8245e-02 1.0 0.00e+00 0.0 2.2e+04 3.3e+02 9.4e+01 0 0 0 0 1 1 0 1 0 15 0 MatGetOrdering 1 0.0 1.4687e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 6 1.0 1.1661e-01 1.1 0.00e+00 0.0 2.6e+05 9.9e+02 1.2e+01 0 0 0 0 0 3 0 7 2 2 0 MatCoarsen 6 1.0 5.6789e-02 1.0 0.00e+00 0.0 7.1e+05 4.4e+02 5.6e+01 0 0 0 0 0 2 0 19 2 9 0 MatZeroEntries 6 1.0 3.5298e-03 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAP 6 1.0 3.7699e-01 1.0 1.11e+07 1.6 6.3e+05 2.5e+03 9.2e+01 0 0 0 0 1 11 33 17 12 15 27514 MatPtAPSymbolic 6 1.0 2.2081e-01 1.0 0.00e+00 0.0 3.2e+05 2.7e+03 4.2e+01 0 0 0 0 0 7 0 9 7 7 0 MatPtAPNumeric 6 1.0 1.5378e-01 1.0 1.11e+07 1.6 3.0e+05 2.3e+03 4.8e+01 0 0 0 0 0 5 33 8 6 8 67450 MatGetLocalMat 6 1.0 4.8461e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 1.6591e-02 2.4 0.00e+00 0.0 1.9e+05 3.4e+03 0.0e+00 0 0 0 0 0 0 0 5 5 0 0 SFSetGraph 12 1.0 4.1962e-05 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 12 1.0 3.2055e-02 1.2 0.00e+00 0.0 2.7e+05 5.8e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0 SFBcastBegin 68 1.0 2.7685e-03 2.8 0.00e+00 0.0 7.2e+05 5.1e+02 0.0e+00 0 0 0 0 0 0 0 20 3 0 0 SFBcastEnd 68 1.0 3.0165e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 2.5855e+00 1.0 0.00e+00 0.0 2.2e+06 4.7e+03 3.1e+02 1 0 0 1 3 78 0 59 79 51 0 GAMG: partLevel 6 1.0 4.1722e-01 1.0 1.11e+07 1.6 6.5e+05 2.4e+03 2.4e+02 0 0 0 0 2 13 33 18 12 40 24861 repartition 3 1.0 3.8280e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 Invert-Sort 3 1.0 3.2971e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0 Move A 3 1.0 1.6580e-02 1.1 0.00e+00 0.0 9.5e+03 7.4e+02 5.0e+01 0 0 0 0 0 0 0 0 0 8 0 Move P 3 1.0 1.4499e-02 1.1 0.00e+00 0.0 1.3e+04 1.3e+01 5.0e+01 0 0 0 0 0 0 0 0 0 8 0 PCSetUp 2 1.0 3.0173e+00 1.0 1.11e+07 1.6 2.8e+06 4.2e+03 5.8e+02 1 0 0 1 5 91 33 77 91 96 3438 PCSetUpOnBlocks 4 1.0 4.0102e-04 2.7 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 82 PCApply 4 1.0 1.6476e-01 1.3 1.82e+07 1.3 8.2e+05 1.2e+03 0.0e+00 0 0 0 0 0 4 54 22 7 0 105522 --- Event Stage 2: Remaining Solves KSPSolve 999 1.0 1.3831e+02 1.1 1.92e+10 1.3 7.3e+08 1.4e+03 1.1e+04 46100 99 97 95 51100 99 98100 133578 VecDotNorm2 3450 1.0 5.0804e+00 2.2 3.73e+08 1.0 0.0e+00 0.0e+00 3.4e+03 1 2 0 0 29 1 2 0 0 30 73340 VecMDot 2451 1.0 9.5447e+00 3.6 2.35e+08 1.0 0.0e+00 0.0e+00 2.5e+03 2 1 0 0 20 2 1 0 0 22 24644 VecNorm 5448 1.0 8.6350e+00 3.0 2.94e+08 1.0 0.0e+00 0.0e+00 5.4e+03 2 2 0 0 45 2 2 0 0 48 34070 VecScale 27600 1.0 5.1987e-01 1.4 2.33e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 437362 VecSet 72450 1.0 8.8635e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 6900 1.0 8.0184e-01 1.4 3.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 464680 VecAYPX 21699 1.0 1.0895e+00 2.3 1.72e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 155541 VecMAXPY 4902 1.0 1.0245e+00 1.4 4.70e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 459209 VecScatterBegin 87249 1.0 6.4859e+00 2.9 0.00e+00 0.0 7.3e+08 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0 VecScatterEnd 87249 1.0 5.9416e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 12 0 0 0 0 13 0 0 0 0 0 MatMult 25149 1.0 3.5269e+01 1.6 5.34e+09 1.2 2.5e+08 2.0e+03 0.0e+00 9 28 33 48 0 10 28 34 49 0 146467 MatMultAdd 20700 1.0 2.7336e+01 3.9 1.19e+09 1.6 1.3e+08 6.5e+02 0.0e+00 7 6 18 8 0 8 6 18 8 0 40666 MatMultTranspose 20700 1.0 
1.6038e+01 3.0 1.19e+09 1.6 1.3e+08 6.5e+02 0.0e+00 3 6 18 8 0 3 6 18 8 0 69313 MatSolve 3450 0.0 3.8933e-02 0.0 9.47e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 243 MatSOR 41400 1.0 6.3412e+01 1.6 9.37e+09 1.3 2.2e+08 1.5e+03 0.0e+00 20 49 30 32 0 22 49 30 32 0 141687 MatResidual 20700 1.0 2.5046e+01 1.6 3.93e+09 1.3 2.2e+08 1.5e+03 0.0e+00 6 20 30 32 0 7 20 30 32 0 149782 PCSetUpOnBlocks 3450 1.0 5.4419e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 3450 1.0 1.1329e+02 1.1 1.57e+10 1.3 7.0e+08 1.2e+03 0.0e+00 38 81 96 80 0 42 81 96 81 0 132043 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 9 11416 0. DMKSP interface 1 0 0 0. Vector 5 110 15006256 0. Matrix 0 65 14780672 0. Distributed Mesh 1 0 0 0. Index Set 2 18 171852 0. IS L to G Mapping 1 0 0 0. Star Forest Graph 2 0 0 0. Discrete System 1 0 0 0. Vec Scatter 1 13 16016 0. Preconditioner 1 9 9676 0. Viewer 1 0 0 0. --- Event Stage 1: First Solve Krylov Solver 8 0 0 0. Vector 210 104 2238504 0. Matrix 148 83 22951356 0. Matrix Coarsen 6 6 3816 0. Index Set 128 112 590828 0. Star Forest Graph 12 12 10368 0. Vec Scatter 34 21 25872 0. Preconditioner 8 0 0 0. --- Event Stage 2: Remaining Solves Vector 20700 20700 1128260400 0. ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 3.46184e-05 Average time for zero size MPI_Send(): 1.66161e-05 #PETSc Option Table entries: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type gcr -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4 ----------------------------------------- Libraries compiled on 2018-05-03 16:11:18 on node52-021 Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon Using PETSc directory: /home/ritsat/beckerm/petsc Using PETSc arch: arch-linux-amd-opt ----------------------------------------- Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3 
----------------------------------------- Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include ----------------------------------------- Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl ----------------------------------------- From lawrence.mitchell at imperial.ac.uk Thu May 24 02:39:36 2018 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 24 May 2018 08:39:36 +0100 Subject: [petsc-users] Poor weak scaling when solving successive linear systems In-Reply-To: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> Message-ID: > On 24 May 2018, at 06:24, Michael Becker wrote: > > Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPsolve() was called equally often. I introduced two new logging states - one for the first solve and the final setup and one for the remaining solves. The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run. Is this deliberate? 125 proc: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type cg -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg 1000 proc: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type gcr -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg That aside, it looks like you have quite a bit of load imbalance. e.g. in the smoother, where you're doing MatSOR, you have: 125 proc: Calls Time Max/Min time MatSOR 47808 1.0 6.8888e+01 1.7 1000 proc: MatSOR 41400 1.0 6.3412e+01 1.6 VecScatters show similar behaviour. How is your problem distributed across the processes? Cheers, Lawrence From Michael.Becker at physik.uni-giessen.de Thu May 24 04:10:28 2018 From: Michael.Becker at physik.uni-giessen.de (Michael Becker) Date: Thu, 24 May 2018 11:10:28 +0200 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> Message-ID: <9a2ca501-39a6-8b28-7ffc-cd5bdd6aaf23@physik.uni-giessen.de> CG/GCR: I accidentally kept gcr in the batch file. That's still from when I was experimenting with the different methods. The performance is quite similar though. I use the following setup for the ksp object and the vectors: ierr=PetscInitialize(&argc, &argv, (char*)0, (char*)0);CHKERRQ(ierr); ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr); ierr=DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED, ???????????? 
DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz, dims[0], dims[1], dims[2], 1, 1, l_Nx, l_Ny, l_Nz, &da);CHKERRQ(ierr); ierr=DMSetFromOptions(da);CHKERRQ(ierr); ierr=DMSetUp(da);CHKERRQ(ierr); ierr=KSPSetDM(ksp, da);CHKERRQ(ierr); ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr); ierr=VecDuplicate(b, &x);CHKERRQ(ierr); ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr); ierr=VecSet(x,0);CHKERRQ(ierr); ierr=VecSet(b,0);CHKERRQ(ierr); For the 125 case the arrays l_Nx, l_Ny, l_Nz have dimension 5 and every element has value 30. VecGetLocalSize() returns 27000 for every rank. Is there something I didn't consider? Michael Am 24.05.2018 um 09:39 schrieb Lawrence Mitchell: > >> On 24 May 2018, at 06:24, Michael Becker wrote: >> >> Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPsolve() was called equally often. I introduced two new logging states - one for the first solve and the final setup and one for the remaining solves. > The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run. Is this deliberate? > > 125 proc: > > -gamg_est_ksp_type cg > -ksp_norm_type unpreconditioned > -ksp_type cg > -log_view > -mg_levels_esteig_ksp_max_it 10 > -mg_levels_esteig_ksp_type cg > -mg_levels_ksp_max_it 1 > -mg_levels_ksp_norm_type none > -mg_levels_ksp_type richardson > -mg_levels_pc_sor_its 1 > -mg_levels_pc_type sor > -pc_gamg_type classical > -pc_type gamg > > 1000 proc: > > -gamg_est_ksp_type cg > -ksp_norm_type unpreconditioned > -ksp_type gcr > -log_view > -mg_levels_esteig_ksp_max_it 10 > -mg_levels_esteig_ksp_type cg > -mg_levels_ksp_max_it 1 > -mg_levels_ksp_norm_type none > -mg_levels_ksp_type richardson > -mg_levels_pc_sor_its 1 > -mg_levels_pc_type sor > -pc_gamg_type classical > -pc_type gamg > > > That aside, it looks like you have quite a bit of load imbalance. e.g. in the smoother, where you're doing MatSOR, you have: > > 125 proc: > Calls Time Max/Min time > MatSOR 47808 1.0 6.8888e+01 1.7 > > 1000 proc: > > MatSOR 41400 1.0 6.3412e+01 1.6 > > VecScatters show similar behaviour. > > How is your problem distributed across the processes? > > Cheers, > > Lawrence > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 24 08:22:21 2018 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 24 May 2018 09:22:21 -0400 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: <9a2ca501-39a6-8b28-7ffc-cd5bdd6aaf23@physik.uni-giessen.de> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <9a2ca501-39a6-8b28-7ffc-cd5bdd6aaf23@physik.uni-giessen.de> Message-ID: The KSPSolve time goes from 128 to 138 seconds in going from 125 to 1000 processes. Is this the problem? And as Lawrence pointed out there is a lot of "load" imbalance. (This could come from a poor network). VecAXPY has no communication and has significant imbalance. But you seem to have perfect actual load imbalance but this can come from cache effects.... And you are spending almost half the solve time in VecScatter. If you really have this nice regular partitioning of the problem, then your communication is slow, even on 125 processors. (So it is not a scaling issue here, but if you do a one processor test you should see it). Note, AMG coarse grids get bigger as the problem gets bigger, so it is not perfectly scalable pre-asymptotically. 
Nothing is really, because you don't saturate communication until you have at least a 3^D process grid and various random things will cause some non-perfect weak speedup. Mark On Thu, May 24, 2018 at 5:10 AM, Michael Becker < Michael.Becker at physik.uni-giessen.de> wrote: > CG/GCR: I accidentally kept gcr in the batch file. That's still from when > I was experimenting with the different methods. The performance is quite > similar though. > > I use the following setup for the ksp object and the vectors: > > ierr=PetscInitialize(&argc, &argv, (char*)0, (char*)0);CHKERRQ(ierr); > > ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr); > > ierr=DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_ > BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED, > DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz, dims[0], dims[1], > dims[2], 1, 1, l_Nx, l_Ny, l_Nz, &da);CHKERRQ(ierr); > ierr=DMSetFromOptions(da);CHKERRQ(ierr); > ierr=DMSetUp(da);CHKERRQ(ierr); > ierr=KSPSetDM(ksp, da);CHKERRQ(ierr); > > ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr); > > ierr=VecDuplicate(b, &x);CHKERRQ(ierr); > > ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr); > ierr=VecSet(x,0);CHKERRQ(ierr); > ierr=VecSet(b,0);CHKERRQ(ierr); > > For the 125 case the arrays l_Nx, l_Ny, l_Nz have dimension 5 and every > element has value 30. VecGetLocalSize() returns 27000 for every rank. Is > there something I didn't consider? > > Michael > > > > Am 24.05.2018 um 09:39 schrieb Lawrence Mitchell: > > On 24 May 2018, at 06:24, Michael Becker wrote: > > Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPsolve() was called equally often. I introduced two new logging states - one for the first solve and the final setup and one for the remaining solves. > > The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run. Is this deliberate? > > 125 proc: > > -gamg_est_ksp_type cg > -ksp_norm_type unpreconditioned > -ksp_type cg > -log_view > -mg_levels_esteig_ksp_max_it 10 > -mg_levels_esteig_ksp_type cg > -mg_levels_ksp_max_it 1 > -mg_levels_ksp_norm_type none > -mg_levels_ksp_type richardson > -mg_levels_pc_sor_its 1 > -mg_levels_pc_type sor > -pc_gamg_type classical > -pc_type gamg > > 1000 proc: > > -gamg_est_ksp_type cg > -ksp_norm_type unpreconditioned > -ksp_type gcr > -log_view > -mg_levels_esteig_ksp_max_it 10 > -mg_levels_esteig_ksp_type cg > -mg_levels_ksp_max_it 1 > -mg_levels_ksp_norm_type none > -mg_levels_ksp_type richardson > -mg_levels_pc_sor_its 1 > -mg_levels_pc_type sor > -pc_gamg_type classical > -pc_type gamg > > > That aside, it looks like you have quite a bit of load imbalance. e.g. in the smoother, where you're doing MatSOR, you have: > > 125 proc: > Calls Time Max/Min time > MatSOR 47808 1.0 6.8888e+01 1.7 > > 1000 proc: > > MatSOR 41400 1.0 6.3412e+01 1.6 > > VecScatters show similar behaviour. > > How is your problem distributed across the processes? > > Cheers, > > Lawrence > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Michael.Becker at physik.uni-giessen.de Thu May 24 09:49:29 2018 From: Michael.Becker at physik.uni-giessen.de (Michael Becker) Date: Thu, 24 May 2018 16:49:29 +0200 Subject: [petsc-users] Poor weak scaling when solvingsuccessivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <9a2ca501-39a6-8b28-7ffc-cd5bdd6aaf23@physik.uni-giessen.de> Message-ID: <57805b45-1203-98f5-ffe2-fbca3a6d0aa3@physik.uni-giessen.de> Yes, the time increment is the problem. Not because of these 8% in particular, but it gets worse with more processes. Performance does improve drastically on one processor (which I previously never tested); I attached the log_view file. If communication speed is the problem, then I assume fewer processors per node would improve performance and I could investigate that (principally). I assume there's no way to reduce the data volume. But thanks either way, this helped a lot. Michael Am 24.05.2018 um 15:22 schrieb Mark Adams: > The KSPSolve time goes from 128 to 138 seconds in going from 125 to > 1000 processes. Is this the problem? > > And as Lawrence pointed out there is a lot of "load" imbalance. (This > could come from a poor network). VecAXPY has no communication and has > significant imbalance. But you seem to have perfect actual load > imbalance but this can come from cache effects.... > > And you are spending almost half the solve time in VecScatter. If you > really have this nice regular partitioning of the problem, then your > communication is slow, even on 125 processors. (So it is not a scaling > issue here, but if you do a one processor test you should see it). > > Note, AMG coarse grids get bigger as the problem gets bigger, so it is > not perfectly scalable pre-asymptotically. Nothing is really, because > you don't saturate communication until you have at least a 3^D process > grid and various random things will cause some non-perfect weak speedup. > > Mark > > > On Thu, May 24, 2018 at 5:10 AM, Michael Becker > > wrote: > > CG/GCR: I accidentally kept gcr in the batch file. That's still > from when I was experimenting with the different methods. The > performance is quite similar though. > > I use the following setup for the ksp object and the vectors: > > ierr=PetscInitialize(&argc, &argv, (char*)0, > (char*)0);CHKERRQ(ierr); > > ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr); > > ierr=DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED, > ???????????? DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz, dims[0], > dims[1], dims[2], 1, 1, l_Nx, l_Ny, l_Nz, &da);CHKERRQ(ierr); > ierr=DMSetFromOptions(da);CHKERRQ(ierr); > ierr=DMSetUp(da);CHKERRQ(ierr); > ierr=KSPSetDM(ksp, da);CHKERRQ(ierr); > > ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr); > > ierr=VecDuplicate(b, &x);CHKERRQ(ierr); > > ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr); > ierr=VecSet(x,0);CHKERRQ(ierr); > ierr=VecSet(b,0);CHKERRQ(ierr); > > For the 125 case the arrays l_Nx, l_Ny, l_Nz have dimension 5 and > every element has value 30. VecGetLocalSize() returns 27000 for > every rank. Is there something I didn't consider? > > Michael > > > > Am 24.05.2018 um 09:39 schrieb Lawrence Mitchell: >>> On 24 May 2018, at 06:24, Michael Becker >>> wrote: >>> >>> Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPsolve() was called equally often. 
I introduced two new logging states - one for the first solve and the final setup and one for the remaining solves. >> The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run. Is this deliberate? >> >> 125 proc: >> >> -gamg_est_ksp_type cg >> -ksp_norm_type unpreconditioned >> -ksp_type cg >> -log_view >> -mg_levels_esteig_ksp_max_it 10 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 1 >> -mg_levels_ksp_norm_type none >> -mg_levels_ksp_type richardson >> -mg_levels_pc_sor_its 1 >> -mg_levels_pc_type sor >> -pc_gamg_type classical >> -pc_type gamg >> >> 1000 proc: >> >> -gamg_est_ksp_type cg >> -ksp_norm_type unpreconditioned >> -ksp_type gcr >> -log_view >> -mg_levels_esteig_ksp_max_it 10 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 1 >> -mg_levels_ksp_norm_type none >> -mg_levels_ksp_type richardson >> -mg_levels_pc_sor_its 1 >> -mg_levels_pc_type sor >> -pc_gamg_type classical >> -pc_type gamg >> >> >> That aside, it looks like you have quite a bit of load imbalance. e.g. in the smoother, where you're doing MatSOR, you have: >> >> 125 proc: >> Calls Time Max/Min time >> MatSOR 47808 1.0 6.8888e+01 1.7 >> >> 1000 proc: >> >> MatSOR 41400 1.0 6.3412e+01 1.6 >> >> VecScatters show similar behaviour. >> >> How is your problem distributed across the processes? >> >> Cheers, >> >> Lawrence >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 24 09:58:19 2018 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 24 May 2018 10:58:19 -0400 Subject: [petsc-users] Poor weak scaling when solvingsuccessivelinearsystems In-Reply-To: <57805b45-1203-98f5-ffe2-fbca3a6d0aa3@physik.uni-giessen.de> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <9a2ca501-39a6-8b28-7ffc-cd5bdd6aaf23@physik.uni-giessen.de> <57805b45-1203-98f5-ffe2-fbca3a6d0aa3@physik.uni-giessen.de> Message-ID: On Thu, May 24, 2018 at 10:49 AM, Michael Becker < Michael.Becker at physik.uni-giessen.de> wrote: > Yes, the time increment is the problem. Not because of these 8% in > particular, but it gets worse with more processes. > It will never be perfect. 8% is not bad but there is clearly some bad stuff going on here. Note, the VecScatter time could be catching load imbalance for elsewhere in the code. So this could be all load imbalance. > Performance does improve drastically on one processor (which I previously > never tested); > You also get all the memory bandwidth with one process, assuming you have a multi-core processor. This is a big win also. So one processor test is mixing different things. If you can fill one socket then that would isolate communication, including things like packing send buffers. > I attached the log_view file. If communication speed is the problem, then > I assume fewer processors per node would improve performance and I could > investigate that (principally). I assume there's no way to reduce the data > volume. > > But thanks either way, this helped a lot. > > Michael > > > > Am 24.05.2018 um 15:22 schrieb Mark Adams: > > The KSPSolve time goes from 128 to 138 seconds in going from 125 to 1000 > processes. Is this the problem? > > And as Lawrence pointed out there is a lot of "load" imbalance. (This > could come from a poor network). VecAXPY has no communication and has > significant imbalance. But you seem to have perfect actual load imbalance > but this can come from cache effects.... > > And you are spending almost half the solve time in VecScatter. 
If you > really have this nice regular partitioning of the problem, then your > communication is slow, even on 125 processors. (So it is not a scaling > issue here, but if you do a one processor test you should see it). > > Note, AMG coarse grids get bigger as the problem gets bigger, so it is not > perfectly scalable pre-asymptotically. Nothing is really, because you don't > saturate communication until you have at least a 3^D process grid and > various random things will cause some non-perfect weak speedup. > > Mark > > > On Thu, May 24, 2018 at 5:10 AM, Michael Becker < > Michael.Becker at physik.uni-giessen.de> wrote: > >> CG/GCR: I accidentally kept gcr in the batch file. That's still from when >> I was experimenting with the different methods. The performance is quite >> similar though. >> >> I use the following setup for the ksp object and the vectors: >> >> ierr=PetscInitialize(&argc, &argv, (char*)0, (char*)0);CHKERRQ(ierr); >> >> ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr); >> >> ierr=DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BO >> UNDARY_GHOSTED,DM_BOUNDARY_GHOSTED, >> DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz, dims[0], dims[1], >> dims[2], 1, 1, l_Nx, l_Ny, l_Nz, &da);CHKERRQ(ierr); >> ierr=DMSetFromOptions(da);CHKERRQ(ierr); >> ierr=DMSetUp(da);CHKERRQ(ierr); >> ierr=KSPSetDM(ksp, da);CHKERRQ(ierr); >> >> ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr); >> >> ierr=VecDuplicate(b, &x);CHKERRQ(ierr); >> >> ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr); >> ierr=VecSet(x,0);CHKERRQ(ierr); >> ierr=VecSet(b,0);CHKERRQ(ierr); >> >> For the 125 case the arrays l_Nx, l_Ny, l_Nz have dimension 5 and every >> element has value 30. VecGetLocalSize() returns 27000 for every rank. Is >> there something I didn't consider? >> >> Michael >> >> >> >> Am 24.05.2018 um 09:39 schrieb Lawrence Mitchell: >> >> On 24 May 2018, at 06:24, Michael Becker wrote: >> >> Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPsolve() was called equally often. I introduced two new logging states - one for the first solve and the final setup and one for the remaining solves. >> >> The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run. Is this deliberate? >> >> 125 proc: >> >> -gamg_est_ksp_type cg >> -ksp_norm_type unpreconditioned >> -ksp_type cg >> -log_view >> -mg_levels_esteig_ksp_max_it 10 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 1 >> -mg_levels_ksp_norm_type none >> -mg_levels_ksp_type richardson >> -mg_levels_pc_sor_its 1 >> -mg_levels_pc_type sor >> -pc_gamg_type classical >> -pc_type gamg >> >> 1000 proc: >> >> -gamg_est_ksp_type cg >> -ksp_norm_type unpreconditioned >> -ksp_type gcr >> -log_view >> -mg_levels_esteig_ksp_max_it 10 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 1 >> -mg_levels_ksp_norm_type none >> -mg_levels_ksp_type richardson >> -mg_levels_pc_sor_its 1 >> -mg_levels_pc_type sor >> -pc_gamg_type classical >> -pc_type gamg >> >> >> That aside, it looks like you have quite a bit of load imbalance. e.g. in the smoother, where you're doing MatSOR, you have: >> >> 125 proc: >> Calls Time Max/Min time >> MatSOR 47808 1.0 6.8888e+01 1.7 >> >> 1000 proc: >> >> MatSOR 41400 1.0 6.3412e+01 1.6 >> >> VecScatters show similar behaviour. >> >> How is your problem distributed across the processes? 
>> >> Cheers, >> >> Lawrence >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Thu May 24 11:24:10 2018 From: jczhang at mcs.anl.gov (Junchao Zhang) Date: Thu, 24 May 2018 11:24:10 -0500 Subject: [petsc-users] Poor weak scaling when solving successive linear systems In-Reply-To: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> Message-ID: I noticed you used PETSc 3.9.1. You can give a try to the master branch. I added some VecScatter optimizations recently. I don't know if it could help. --Junchao Zhang On Thu, May 24, 2018 at 12:24 AM, Michael Becker < Michael.Becker at physik.uni-giessen.de> wrote: > Hello, > > I added a PETSc solver class to our particle-in-cell simulation code and > all calculations seem to be correct. However, some weak scaling tests I did > are rather disappointing because the solver's runtime keeps increasing with > system size although the number of cores are scaled up accordingly. As a > result, the solver's share of the total runtime becomes more and more > dominant and the system sizes we aim for are unfeasible. > > It's a simple 3D Poisson problem on a structured grid with Dirichlet > boundaries inside the domain, for which I found the cg/gamg combo to work > the fastest. Since KSPsolve() is called during every timestep of the > simulation to solve the same system with a new rhs vector, assembling the > matrix and other PETSc objects should further not be a determining factor. > > What puzzles me is that the convergence rate is actually good (the > residual decreases by an order of magnitude for every KSP iteration) and > the number of KSP iterations remains constant over the course of a > simulation and is equal for all tested systems. > > I even increased the (fixed) system size per processor to 30^3 unknowns > (which is significantly more than the recommended 10,000), but runtime is > still not even close to being constant. > > This leads me to the conclusion that either I configured PETSc wrong, I > don't call the correct PETSc-related functions, or something goes terribly > wrong with communication. > > Could you have a look at the attached log_view files and tell me if > something is particularly odd? The system size per processor is 30^3 and > the simulation ran over 1000 timesteps, which means KSPsolve() was called > equally often. I introduced two new logging states - one for the first > solve and the final setup and one for the remaining solves. > > The repeatedly called code segment is > > PetscScalar *b_array; > VecGetArray(b, &b_array); > get_b(b_array); > VecRestoreArray(b, &barray); > > KSPSetTolerances(ksp,reltol,1E-50,1E5,1E4); > > PetscScalar *x_array; > VecGetArray(x, &x_array); > for (int i = 0; i < N_local; i++) > x_array[i] = x_array_prev[i]; > VecRestoreArray(x, &x_array); > > KSPSolve(ksp,b,x); > > KSPGetSolution(ksp,&x); > for (int i = 0; i < N_local; i++) > x_array_prev[i] = x_array[i]; > > set_x(x_array); > > I noticed that for every individual KSP iteration, six vector objects are > created and destroyed (with CG, more with e.g. GMRES). This seems kind of > wasteful, is this supposed to be like this? Is this even the reason for my > problems? Apart from that, everything seems quite normal to me (but I'm not > the expert here). > > > Thanks in advance. > > Michael > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
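As a side note on the repeatedly called code segment quoted above: the manual copying to and from x_array_prev can also be expressed with a work vector and an initial-guess flag. A rough sketch, not a drop-in replacement (x_prev is a hypothetical vector created with VecDuplicate(x, &x_prev); get_b()/set_x() are the application routines from the quoted code):

  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);

  /* every timestep */
  PetscScalar *b_array, *x_array;
  ierr = VecGetArray(b, &b_array);CHKERRQ(ierr);
  get_b(b_array);                           /* fill the new right-hand side */
  ierr = VecRestoreArray(b, &b_array);CHKERRQ(ierr);

  ierr = VecCopy(x_prev, x);CHKERRQ(ierr);  /* previous solution as the initial guess */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = VecCopy(x, x_prev);CHKERRQ(ierr);  /* keep it for the next timestep */

  ierr = VecGetArray(x, &x_array);CHKERRQ(ierr);
  set_x(x_array);                           /* hand the solution back to the application */
  ierr = VecRestoreArray(x, &x_array);CHKERRQ(ierr);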
URL: From bsmith at mcs.anl.gov Thu May 24 12:50:33 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 24 May 2018 17:50:33 +0000 Subject: [petsc-users] Poor weak scaling when solving successive linear systems In-Reply-To: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> Message-ID: <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Please send the log file for 1000 with cg as the solver. You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) > On May 24, 2018, at 12:24 AM, Michael Becker wrote: > > I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. > This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). > > > Thanks in advance. > > Michael > > > > From michael.wick.1980 at gmail.com Thu May 24 21:03:31 2018 From: michael.wick.1980 at gmail.com (Mike Wick) Date: Thu, 24 May 2018 19:03:31 -0700 Subject: [petsc-users] understanding of the preconditioner setted from KSPSetOperators Message-ID: Hi PETSc team: I just want to get some clarifications on the use of preconditioners in PETSc. Suppose I provide the Pmat as the second argument in KSPSetOperators as a user-specified matrix and it is different from Amat, the actual matrix to be solved. When I specify a preconditioner, say ASM, what does that mean? I am assuming that we are doing a preonly-type thing for solving Pmat, is that right? There is no iterative solver applied to Pmat over the iterations. Best, Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 24 22:04:16 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 25 May 2018 03:04:16 +0000 Subject: [petsc-users] understanding of the preconditioner setted from KSPSetOperators In-Reply-To: References: Message-ID: <90AC5BB6-2DB7-4546-B836-8728B99C7E34@anl.gov> > On May 24, 2018, at 9:03 PM, Mike Wick wrote: > > Hi PETSc team: > > I just want to get some clarifications on the use of preconditioners in PETSc. > > Suppose I provide the Pmat as the second argument in KSPSetOperators as a user-specified matrix and it is different from Amat, the actual matrix to be solved. When I specify a preconditioner, say ASM, what does that mean? It means the Pmat is used to construct the preconditioner. In the case of ASM this means that overlapping blocks of the Pmat are extracted and used to apply the preconditioner. The Amat is only used to do the matrix vector products needed by the Krylov method. > I am assuming that we are doing a preonly-type thing for solving Pmat, is that right? There is no iterative solver applied to Pmat over the iterations. Not directly though there may be, depending on the preconditioner, an iteration inside the application of the preconditioner. 
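A minimal sketch of that usage (Amat, Pmat, b and x are hypothetical, already assembled objects; ASM is chosen to match the question):

  KSP ksp;
  PC  pc;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, Amat, Pmat);CHKERRQ(ierr); /* Amat: matrix-vector products; Pmat: builds the preconditioner */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCASM);CHKERRQ(ierr);             /* overlapping blocks are extracted from Pmat */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

The solver applied on each subdomain block is then selected at run time through the -sub_ksp_type / -sub_pc_type options.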
For example with ASM one could approximately solve each block with, say, 10 iterations of GMRES inside each application of the preconditioner. Barry > > Best, > > Mike From Michael.Becker at physik.uni-giessen.de Fri May 25 01:26:24 2018 From: Michael.Becker at physik.uni-giessen.de (Michael Becker) Date: Fri, 25 May 2018 08:26:24 +0200 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: Ok, I'll do that (and switching to master branch). This is probably going to take a couple days because the cluster is currently under some heavy load. Michael Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: > Please send the log file for 1000 with cg as the solver. > > You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) > > > >> On May 24, 2018, at 12:24 AM, Michael Becker wrote: >> >> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). > Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. > > > >> This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). >> >> >> Thanks in advance. >> >> Michael >> >> >> >> From nahmad16 at ku.edu.tr Sat May 26 14:34:31 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Sun, 27 May 2018 00:34:31 +0500 Subject: [petsc-users] Preprocessing Matrix Market matrices and right hand sides Message-ID: Hi, I am trying to load a matrix and a right hand side into Petsc from a file. 
The matrix and rhs are preprocessed from matrix market format using the following python script: import scipy.io, PetscBinaryIO A = scipy.io.mmread('sherman2.mtx') PetscBinaryIO.PetscBinaryIO().writeMatSciPy(open('sherman2','w'), A) B = scipy.io.mmread('sherman2_rhs1.mtx') PetscBinaryIO.PetscBinaryIO().writeVec(open('sherman2_rhs','w'), B) The binary matrix sherman2 and rhs sherman2_rhs are then being loaded into Petsc using the following code: PetscErrorCode ierr; PetscViewer fd; /* viewer */ Mat A; /* linear system matrix */ Vec b; /* RHS */ char file[2][PETSC_MAX_PATH_LEN]; PetscBool flg, PetscPreLoad = PETSC_FALSE; PetscDraw dc; PetscInt size; PetscReal value; PetscInitialize(&argc,&args,(char*)0,help);if (ierr) return ierr; //PetscPrintf(MPI_COMM ierr = PetscOptionsGetString(PETSC_NULL,PETSC_NULL,"-f",file[0],PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); if (!flg) SETERRQ(PETSC_COMM_WORLD,1,"Must indicate binary file with the -f option"); // Load Matrix A ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file[0],FILE_MODE_READ,&fd);CHKERRQ(ierr); ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); ierr = MatSetFromOptions(A);CHKERRQ(ierr); ierr = MatLoad(A,fd);CHKERRQ(ierr); ierr = MatView(A, PETSC_VIEWER_STDOUT_WORLD); // Load RHS flg = PETSC_FALSE; ierr = PetscOptionsGetString(PETSC_NULL,PETSC_NULL,"-rhs",file[1],PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); if (!flg) SETERRQ(PETSC_COMM_WORLD,1,"Must indicate rhs file with the -rhs option"); ierr = VecCreate(PETSC_COMM_WORLD, &b);CHKERRQ(ierr); ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,file[1],FILE_MODE_READ,&fd);CHKERRQ(ierr); ierr = VecSetFromOptions(b);CHKERRQ(ierr); ierr = VecLoad(b,fd);CHKERRQ(ierr); While the matrix loads correctly, when I try to load the vector b, the behavior of the program becomes unexpected afterwards. For instance, if I try to view the vector, sometimes it prints and sometimes not and even when it prints, it only shows some entries in process 0 and then stops. AM I DOING SOMETHING WRONG IN PREPROCESSING? Or am I missing something? Thank you -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat May 26 18:02:16 2018 From: jed at jedbrown.org (Jed Brown) Date: Sat, 26 May 2018 17:02:16 -0600 Subject: [petsc-users] Preprocessing Matrix Market matrices and right hand sides In-Reply-To: References: Message-ID: <87wovqc96v.fsf@jedbrown.org> Can you reproduce using src/ksp/ksp/examples/tutorials/ex10.c? Najeeb Ahmad writes: > Hi, > > I am trying to load a matrix and a right hand side into Petsc from a file. 
> The matrix and rhs are preprocessed from matrix market format using the > following python script: > > import scipy.io, PetscBinaryIO > > A = scipy.io.mmread('sherman2.mtx') > PetscBinaryIO.PetscBinaryIO().writeMatSciPy(open('sherman2','w'), A) > > B = scipy.io.mmread('sherman2_rhs1.mtx') > PetscBinaryIO.PetscBinaryIO().writeVec(open('sherman2_rhs','w'), B) > > The binary matrix sherman2 and rhs sherman2_rhs are then being loaded into > Petsc using the following code: > > PetscErrorCode ierr; > PetscViewer fd; /* viewer */ > Mat A; /* linear system matrix */ > Vec b; /* RHS */ > char file[2][PETSC_MAX_PATH_LEN]; > PetscBool flg, PetscPreLoad = PETSC_FALSE; > PetscDraw dc; > PetscInt size; > PetscReal value; > > PetscInitialize(&argc,&args,(char*)0,help);if (ierr) return ierr; > //PetscPrintf(MPI_COMM > ierr = > PetscOptionsGetString(PETSC_NULL,PETSC_NULL,"-f",file[0],PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); > if (!flg) SETERRQ(PETSC_COMM_WORLD,1,"Must indicate binary file with the > -f option"); > // Load Matrix A > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,file[0],FILE_MODE_READ,&fd);CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > ierr = MatLoad(A,fd);CHKERRQ(ierr); > ierr = MatView(A, PETSC_VIEWER_STDOUT_WORLD); > > > // Load RHS > flg = PETSC_FALSE; > ierr = > PetscOptionsGetString(PETSC_NULL,PETSC_NULL,"-rhs",file[1],PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr); > if (!flg) SETERRQ(PETSC_COMM_WORLD,1,"Must indicate rhs file with the > -rhs option"); > > ierr = VecCreate(PETSC_COMM_WORLD, &b);CHKERRQ(ierr); > ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr); > ierr = > PetscViewerBinaryOpen(PETSC_COMM_WORLD,file[1],FILE_MODE_READ,&fd);CHKERRQ(ierr); > ierr = VecSetFromOptions(b);CHKERRQ(ierr); > > ierr = VecLoad(b,fd);CHKERRQ(ierr); > > > While the matrix loads correctly, when I try to load the vector b, the > behavior of the program becomes unexpected afterwards. For instance, if I > try to view the vector, sometimes it prints and sometimes not and even when > it prints, it only shows some entries in process 0 and then stops. > > AM I DOING SOMETHING WRONG IN PREPROCESSING? Or am I missing something? > > Thank you > > > > -- > *Najeeb Ahmad* > > > *Research and Teaching Assistant* > *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * > > *Computer Science and Engineering* > *Ko? University, Istanbul, Turkey* From mbuerkle at web.de Sun May 27 21:39:55 2018 From: mbuerkle at web.de (Marius Buerkle) Date: Mon, 28 May 2018 04:39:55 +0200 Subject: [petsc-users] MatGetValues / MatGetRow Message-ID: An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun May 27 21:58:49 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 28 May 2018 02:58:49 +0000 Subject: [petsc-users] MatGetValues / MatGetRow In-Reply-To: References: Message-ID: > On May 27, 2018, at 9:39 PM, Marius Buerkle wrote: > > Hi ! > > Are MatGetValues / MatGetRow implemented for MATELEMENTAL? A quick check seems to say no. > Or is there another way to access the elements of an ELEMENTAL matrix? There is no obvious way to access the elements of an Elemental matrix except to directly access the elemental data structure (which presumably is hairy) or to MatConvert(A,MATDENSE,MAT_INITIAL_MATRIX,&Adense); and then access in the dense format (note this conversion is expensive and needs to do communication). 
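A sketch of that conversion route (A is the Elemental matrix from the question; only locally owned rows of the converted matrix can be inspected this way):

  Mat                Adense;
  PetscInt           rstart, rend, i, ncols;
  const PetscInt    *cols;
  const PetscScalar *vals;

  ierr = MatConvert(A, MATDENSE, MAT_INITIAL_MATRIX, &Adense);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(Adense, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatGetRow(Adense, i, &ncols, &cols, &vals);CHKERRQ(ierr);
    /* ... use the ncols entries of row i here ... */
    ierr = MatRestoreRow(Adense, i, &ncols, &cols, &vals);CHKERRQ(ierr);
  }
  ierr = MatDestroy(&Adense);CHKERRQ(ierr);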
Why do you wish to access the values directly instead of use them indirectly with the API, such as matrix-vector products, factorizations etc? Barry > > best, > marius From bsmith at mcs.anl.gov Mon May 28 00:20:32 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 28 May 2018 05:20:32 +0000 Subject: [petsc-users] MatGetValues / MatGetRow In-Reply-To: References: <76B25822-6FD1-411D-BA60-E3353226538B@mcs.anl.gov> Message-ID: <27F8AD6F-CC02-49FF-A725-1E3FFA67FA2C@mcs.anl.gov> > On May 28, 2018, at 12:15 AM, Marius Buerkle wrote: > > No M is stored as MPIAIJ, as all but the A11 and A33 blocks are very sparse. I want to add A11 and A33 which are calculuted seperatly using MATSOLVERELEMENTAL into the bigger matrix M and then use a sparse solver on M. Hmm, how big are the A11 and A33? Who calculates them separately? I would use MatConvert() on these two "separate" matrices as I mentioned before and call MatSetValues() with the resulting dense matrices to put the "dense" values into the sparse M matrix. Barry > > > > > > On May 27, 2018, at 11:23 PM, Marius Buerkle wrote: > > > > Thanks for the swift reply. I want to from, eventually invert, a sparse matrix of the form M=[A11 A12 A13 , A21 A22 A23 , A31 A32 A33], where A11 and A33 are dense square matrices which have a (much) smaller dimension than the other (sparse) sub matrices of M. A11 and A33 are obtained from a separate calculation. > > I don't understand how this relates to your elemental question? > > How are you storing the M matrix? As Elemental matrix? > > Barry > > > > > > > > > > > > >> > >> Hi ! > >> > >> Are MatGetValues / MatGetRow implemented for MATELEMENTAL? > > > > A quick check seems to say no. > > > >> Or is there another way to access the elements of an ELEMENTAL matrix? > > > > There is no obvious way to access the elements of an Elemental matrix except to directly access the elemental data structure (which presumably is hairy) or to MatConvert(A,MATDENSE,MAT_INITIAL_MATRIX,&Adense); and then access in the dense format (note this conversion is expensive and needs to do communication). > > > > > > Why do you wish to access the values directly instead of use them indirectly with the API, such as matrix-vector products, factorizations etc? > > > > > > > > Barry > > > >> > >> best, > >> marius > > > From mbuerkle at web.de Mon May 28 00:36:54 2018 From: mbuerkle at web.de (Marius Buerkle) Date: Mon, 28 May 2018 07:36:54 +0200 Subject: [petsc-users] MatGetValues / MatGetRow In-Reply-To: <27F8AD6F-CC02-49FF-A725-1E3FFA67FA2C@mcs.anl.gov> References: <76B25822-6FD1-411D-BA60-E3353226538B@mcs.anl.gov> <27F8AD6F-CC02-49FF-A725-1E3FFA67FA2C@mcs.anl.gov> Message-ID: An HTML attachment was scrubbed... URL: From nahmad16 at ku.edu.tr Mon May 28 09:58:49 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Mon, 28 May 2018 19:58:49 +0500 Subject: [petsc-users] Re-configuring PETSc Message-ID: Hi All, I have Petsc release version 3.9.2 configured with the following options: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --download-fblaslapack=1 Now I want to use PCILU in my code and when I set the PC type to PCILU in the code, I get the following error: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for possible LU and Cholesky solvers [0]PETSC ERROR: Could not locate a solver package. 
Perhaps you must ./configure with --download- [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad Mon May 28 17:52:41 2018 [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --download-fblaslapack=1 [0]PETSC ERROR: #1 MatGetFactor() line 4318 in /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: #3 PCSetUp() line 923 in /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #4 KSPSetUp() line 381 in /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 KSPSolve() line 612 in /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #6 SolveSystem() line 60 in /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c I assume that I am missing LU package like SuperLU_dist for instance and I need to download and configure it with Petsc. I am wondering what is the best way to reconfigure Petsc to download and use the appropriate package to support PCILU? You advice is highly appreciated. -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon May 28 10:23:51 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 28 May 2018 10:23:51 -0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: Message-ID: On Mon, 28 May 2018, Najeeb Ahmad wrote: > Hi All, > > I have Petsc release version 3.9.2 configured with the following options: > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > --download-fblaslapack=1 > > Now I want to use PCILU in my code and when I set the PC type to PCILU in > the code, I get the following error: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > possible LU and Cholesky solvers > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > ./configure with --download- > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad Mon > May 28 17:52:41 2018 > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > --with-fc=mpiifort --download-fblaslapack=1 > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: #3 PCSetUp() line 923 in > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #5 KSPSolve() line 612 in > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 SolveSystem() line 60 in > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > I assume that I am missing LU package like SuperLU_dist for instance and I > need to download and configure it with Petsc. yes - petsc has sequential LU - but you need superlu_dist/mumps for parallel lu. > > I am wondering what is the best way to reconfigure Petsc to download and > use the appropriate package to support PCILU? Rerun configure with the additional option --download-superlu_dist=1. You can do this with current PETSC_ARCH you are using [i.e reinstall over the current build] - or use a different PETSC_ARCH - so both builds exist and useable. Satish > > You advice is highly appreciated. > > From nahmad16 at ku.edu.tr Mon May 28 10:32:45 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Mon, 28 May 2018 20:32:45 +0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: Message-ID: Thanks a lot Satish for your prompt reply. I just checked that SuperLU_dist package works only for matrices of type aij. and uses lu preconditioner. I am currently working with baij matrix. What is the best preconditioner choice for baij matrices on parallel machines? Thanks On Mon, May 28, 2018 at 8:23 PM, Satish Balay wrote: > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > Hi All, > > > > I have Petsc release version 3.9.2 configured with the following options: > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > > --download-fblaslapack=1 > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU in > > the code, I get the following error: > > > > [0]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [0]PETSC ERROR: See > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > > possible LU and Cholesky solvers > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > ./configure with --download- > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for > > trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad > Mon > > May 28 17:52:41 2018 > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > --with-fc=mpiifort --download-fblaslapack=1 > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > I assume that I am missing LU package like SuperLU_dist for instance and > I > > need to download and configure it with Petsc. > > yes - petsc has sequential LU - but you need superlu_dist/mumps for > parallel lu. > > > > > I am wondering what is the best way to reconfigure Petsc to download and > > use the appropriate package to support PCILU? > > Rerun configure with the additional option --download-superlu_dist=1. > > You can do this with current PETSC_ARCH you are using [i.e reinstall > over the current build] - or use a different PETSC_ARCH - so both > builds exist and useable. > > Satish > > > > > You advice is highly appreciated. > > > > > > -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon May 28 10:51:39 2018 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 28 May 2018 10:51:39 -0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: Message-ID: I guess you would have to switch to AIJ format for superlu_dist. [superlu_dist internally has its own format - so there is a conversion there anyway] Satish On Mon, 28 May 2018, Najeeb Ahmad wrote: > Thanks a lot Satish for your prompt reply. > > I just checked that SuperLU_dist package works only for matrices of type > aij. and uses lu preconditioner. I am currently working with baij matrix. > What is the best preconditioner choice for baij matrices on parallel > machines? > > Thanks > > On Mon, May 28, 2018 at 8:23 PM, Satish Balay wrote: > > > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > > > Hi All, > > > > > > I have Petsc release version 3.9.2 configured with the following options: > > > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > > > --download-fblaslapack=1 > > > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU in > > > the code, I get the following error: > > > > > > [0]PETSC ERROR: --------------------- Error Message > > > -------------------------------------------------------------- > > > [0]PETSC ERROR: See > > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > > > possible LU and Cholesky solvers > > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > > ./configure with --download- > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > > for > > > trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad > > Mon > > > May 28 17:52:41 2018 > > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > > --with-fc=mpiifort --download-fblaslapack=1 > > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > > > > I assume that I am missing LU package like SuperLU_dist for instance and > > I > > > need to download and configure it with Petsc. > > > > yes - petsc has sequential LU - but you need superlu_dist/mumps for > > parallel lu. > > > > > > > > I am wondering what is the best way to reconfigure Petsc to download and > > > use the appropriate package to support PCILU? > > > > Rerun configure with the additional option --download-superlu_dist=1. > > > > You can do this with current PETSC_ARCH you are using [i.e reinstall > > over the current build] - or use a different PETSC_ARCH - so both > > builds exist and useable. > > > > Satish > > > > > > > > You advice is highly appreciated. > > > > > > > > > > > > > From nahmad16 at ku.edu.tr Mon May 28 11:07:47 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Mon, 28 May 2018 21:07:47 +0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: Message-ID: Thank you Satish for your kind help :) On Mon, May 28, 2018 at 8:51 PM, Satish Balay wrote: > I guess you would have to switch to AIJ format for superlu_dist. > > [superlu_dist internally has its own format - so there is a conversion > there anyway] > > Satish > > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > Thanks a lot Satish for your prompt reply. > > > > I just checked that SuperLU_dist package works only for matrices of type > > aij. and uses lu preconditioner. I am currently working with baij matrix. > > What is the best preconditioner choice for baij matrices on parallel > > machines? > > > > Thanks > > > > On Mon, May 28, 2018 at 8:23 PM, Satish Balay wrote: > > > > > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > > > > > Hi All, > > > > > > > > I have Petsc release version 3.9.2 configured with the following > options: > > > > > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc > --with-fc=mpiifort > > > > --download-fblaslapack=1 > > > > > > > > Now I want to use PCILU in my code and when I set the PC type to > PCILU in > > > > the code, I get the following error: > > > > > > > > [0]PETSC ERROR: --------------------- Error Message > > > > -------------------------------------------------------------- > > > > [0]PETSC ERROR: See > > > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > for > > > > possible LU and Cholesky solvers > > > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > > > ./configure with --download- > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html > > > for > > > > trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by > nahmad > > > Mon > > > > May 28 17:52:41 2018 > > > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > > > --with-fc=mpiifort --download-fblaslapack=1 > > > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > > > > > > > I assume that I am missing LU package like SuperLU_dist for instance > and > > > I > > > > need to download and configure it with Petsc. > > > > > > yes - petsc has sequential LU - but you need superlu_dist/mumps for > > > parallel lu. > > > > > > > > > > > I am wondering what is the best way to reconfigure Petsc to download > and > > > > use the appropriate package to support PCILU? > > > > > > Rerun configure with the additional option --download-superlu_dist=1. > > > > > > You can do this with current PETSC_ARCH you are using [i.e reinstall > > > over the current build] - or use a different PETSC_ARCH - so both > > > builds exist and useable. > > > > > > Satish > > > > > > > > > > > You advice is highly appreciated. > > > > > > > > > > > > > > > > > > > > > > -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 28 11:28:57 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 28 May 2018 16:28:57 +0000 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: Message-ID: <8523D3F6-6994-40F8-B82B-28B63BA5AAA3@anl.gov> > On May 28, 2018, at 9:58 AM, Najeeb Ahmad wrote: > > Hi All, > > I have Petsc release version 3.9.2 configured with the following options: > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --download-fblaslapack=1 > > Now I want to use PCILU in my code and when I set the PC type to PCILU in the code, I get the following error: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for possible LU and Cholesky solvers > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must ./configure with --download- > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad Mon May 28 17:52:41 2018 > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --download-fblaslapack=1 > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: #3 PCSetUp() line 923 in /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #4 KSPSetUp() line 381 in /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #5 KSPSolve() line 612 in /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 SolveSystem() line 60 in /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > I assume that I am missing LU package like SuperLU_dist for instance and I need to download and configure it with Petsc. > > I am wondering what is the best way to reconfigure Petsc to download and use the appropriate package to support PCILU? There is no support for parallel ILU in PETSc (or with its external packages). You can use -pc_type bjacobi or -pc_type asm and -sub_pc_type ilu and get block Jacobi or overlapping additive Schwarz with ILU on each block (which is "sort of" a parallel ILU). Barry > > You advice is highly appreciated. > > -- > Najeeb Ahmad > > Research and Teaching Assistant > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > Computer Science and Engineering > Ko? University, Istanbul, Turkey > From bsmith at mcs.anl.gov Mon May 28 11:32:24 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 28 May 2018 16:32:24 +0000 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: Message-ID: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> > On May 28, 2018, at 10:32 AM, Najeeb Ahmad wrote: > > Thanks a lot Satish for your prompt reply. > > I just checked that SuperLU_dist package works only for matrices of type aij. and uses lu preconditioner. I am currently working with baij matrix. What is the best preconditioner choice for baij matrices on parallel machines? The best preconditioner is always problem specific. Where does your problem come from? CFD? Structural mechanics? other apps? Anyways you probably want to make your code be able to switch between AIJ and BAIJ at run time since the different formats support somewhat different solvers. If your code alls MatSetFromOptions then you can switch via the command line option -mat_type aij or baij Barry > > Thanks > > On Mon, May 28, 2018 at 8:23 PM, Satish Balay wrote: > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > Hi All, > > > > I have Petsc release version 3.9.2 configured with the following options: > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > > --download-fblaslapack=1 > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU in > > the code, I get the following error: > > > > [0]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [0]PETSC ERROR: See > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > > possible LU and Cholesky solvers > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > ./configure with --download- > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > > trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad Mon > > May 28 17:52:41 2018 > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > --with-fc=mpiifort --download-fblaslapack=1 > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > I assume that I am missing LU package like SuperLU_dist for instance and I > > need to download and configure it with Petsc. > > yes - petsc has sequential LU - but you need superlu_dist/mumps for parallel lu. > > > > > I am wondering what is the best way to reconfigure Petsc to download and > > use the appropriate package to support PCILU? > > Rerun configure with the additional option --download-superlu_dist=1. > > You can do this with current PETSC_ARCH you are using [i.e reinstall > over the current build] - or use a different PETSC_ARCH - so both > builds exist and useable. > > Satish > > > > > You advice is highly appreciated. > > > > > > > > > -- > Najeeb Ahmad > > Research and Teaching Assistant > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > Computer Science and Engineering > Ko? University, Istanbul, Turkey > From nahmad16 at ku.edu.tr Tue May 29 03:05:36 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Tue, 29 May 2018 13:05:36 +0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> References: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> Message-ID: On Mon, May 28, 2018 at 9:32 PM, Smith, Barry F. wrote: > > > > On May 28, 2018, at 10:32 AM, Najeeb Ahmad wrote: > > > > Thanks a lot Satish for your prompt reply. > > > > I just checked that SuperLU_dist package works only for matrices of type > aij. and uses lu preconditioner. I am currently working with baij matrix. > What is the best preconditioner choice for baij matrices on parallel > machines? > > The best preconditioner is always problem specific. Where does your > problem come from? CFD? Structural mechanics? other apps? > I am interested in writing solver for reservoir simulation employing FVM and unstructured grids. My main objective is to study performance of the code with different data structures/data layouts and architecture specific optimizations, specifically targeting the multicore architectures like KNL for instance. Later the study may be extended to include GPUs. The options for switching between AIJ and BAIJ etc. are therefore very useful for my study. The purpose why I wanted to change the preconditioner is that the default preconditioner is giving me different iterations count for different number of processesors. I would rather like a preconditioner that would give me same iteration count for any processor count so that I can better compare the performance results. 
Your suggestions in this regard are highly appreciated, specifically with reference to the following points: - Is it possible to explicitly use high bandwidth memory in PETSc for selected object placement (e.g. using memkind library for instance)? - What would it take to take advantage of architecture specific compiler flags to achieve good performance on a given platform (e.g. -xMIC-AVX512 for AVX512 on KNL, #pragma SIMD etc.). Sorry for some very basic questions as I am a novice PETSc user. Thanks for your time :) > > Anyways you probably want to make your code be able to switch between > AIJ and BAIJ at run time since the different formats support somewhat > different solvers. If your code alls MatSetFromOptions then you can switch > via the command line option -mat_type aij or baij > > Barry > > > > > Thanks > > > > On Mon, May 28, 2018 at 8:23 PM, Satish Balay wrote: > > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > > > Hi All, > > > > > > I have Petsc release version 3.9.2 configured with the following > options: > > > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc > --with-fc=mpiifort > > > --download-fblaslapack=1 > > > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU > in > > > the code, I get the following error: > > > > > > [0]PETSC ERROR: --------------------- Error Message > > > -------------------------------------------------------------- > > > [0]PETSC ERROR: See > > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > > > possible LU and Cholesky solvers > > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > > ./configure with --download- > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for > > > trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by > nahmad Mon > > > May 28 17:52:41 2018 > > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > > --with-fc=mpiifort --download-fblaslapack=1 > > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > > > > I assume that I am missing LU package like SuperLU_dist for instance > and I > > > need to download and configure it with Petsc. > > > > yes - petsc has sequential LU - but you need superlu_dist/mumps for > parallel lu. > > > > > > > > I am wondering what is the best way to reconfigure Petsc to download > and > > > use the appropriate package to support PCILU? > > > > Rerun configure with the additional option --download-superlu_dist=1. > > > > You can do this with current PETSC_ARCH you are using [i.e reinstall > > over the current build] - or use a different PETSC_ARCH - so both > > builds exist and useable. > > > > Satish > > > > > > > > You advice is highly appreciated. 
> > > > > > > > > > > > > > > > -- > > Najeeb Ahmad > > > > Research and Teaching Assistant > > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > > Computer Science and Engineering > > Ko? University, Istanbul, Turkey > > > > -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From nahmad16 at ku.edu.tr Tue May 29 03:08:21 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Tue, 29 May 2018 13:08:21 +0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: <8523D3F6-6994-40F8-B82B-28B63BA5AAA3@anl.gov> References: <8523D3F6-6994-40F8-B82B-28B63BA5AAA3@anl.gov> Message-ID: On Mon, May 28, 2018 at 9:28 PM, Smith, Barry F. wrote: > > > > On May 28, 2018, at 9:58 AM, Najeeb Ahmad wrote: > > > > Hi All, > > > > I have Petsc release version 3.9.2 configured with the following options: > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > --download-fblaslapack=1 > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU > in the code, I get the following error: > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/ > linearsolvertable.html for possible LU and Cholesky solvers > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > ./configure with --download- > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad > Mon May 28 17:52:41 2018 > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > --with-fc=mpiifort --download-fblaslapack=1 > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > [0]PETSC ERROR: #3 PCSetUp() line 923 in /home/nahmad/PETSc/petsc/src/ > ksp/pc/interface/precon.c > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in /home/nahmad/PETSc/petsc/src/ > ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #5 KSPSolve() line 612 in /home/nahmad/PETSc/petsc/src/ > ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 SolveSystem() line 60 in /home/nahmad/Aramco/petsc/ > petsc/BlockSolveTest/src/main.c > > > > > > I assume that I am missing LU package like SuperLU_dist for instance and > I need to download and configure it with Petsc. > > > > I am wondering what is the best way to reconfigure Petsc to download and > use the appropriate package to support PCILU? > > > There is no support for parallel ILU in PETSc (or with its external > packages). You can use -pc_type bjacobi or -pc_type asm and -sub_pc_type > ilu and get block Jacobi or overlapping additive Schwarz with ILU on each > >block (which is "sort of" a parallel ILU). > > > > > > > Barry > Thank you for the info and the useful suggestions Barry. I will try it out. > > > > > You advice is highly appreciated. > > > > -- > > Najeeb Ahmad > > > > Research and Teaching Assistant > > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > > Computer Science and Engineering > > Ko? 
University, Istanbul, Turkey > > > > -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbuerkle at web.de Tue May 29 03:31:55 2018 From: mbuerkle at web.de (Marius Buerkle) Date: Tue, 29 May 2018 10:31:55 +0200 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? Message-ID: Hi, ? I tried to set?MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore? entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? From knepley at gmail.com Tue May 29 05:40:46 2018 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 29 May 2018 06:40:46 -0400 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> Message-ID: On Tue, May 29, 2018 at 4:05 AM, Najeeb Ahmad wrote: > > > On Mon, May 28, 2018 at 9:32 PM, Smith, Barry F. > wrote: > >> >> >> > On May 28, 2018, at 10:32 AM, Najeeb Ahmad wrote: >> > >> > Thanks a lot Satish for your prompt reply. >> > >> > I just checked that SuperLU_dist package works only for matrices of >> type aij. and uses lu preconditioner. I am currently working with baij >> matrix. What is the best preconditioner choice for baij matrices on >> parallel machines? >> >> The best preconditioner is always problem specific. Where does your >> problem come from? CFD? Structural mechanics? other apps? >> > > I am interested in writing solver for reservoir simulation > employing FVM and unstructured grids. My main objective is to study > performance of the code with different data structures/data layouts and > architecture specific optimizations, specifically targeting the multicore > architectures like KNL for instance. Later the study may be extended to > include GPUs. The options for switching between AIJ and BAIJ etc. are > therefore very useful for my study. > I would strongly encourage you to create a simple performance model first. It should demonstrate that the investigation can produce a positive result. For example, performance for stencil codes like this is usually limited by memory bandwidth. The bandwidth advantage would be at most 3x with KNL/GPU vs Skylake, and it would be nil if the Skylake were to get MCDRAM as we all expect. > The purpose why I wanted to change the preconditioner is that the > default preconditioner is giving me different iterations count for > different number of processesors. I would rather like a preconditioner that > would give me same iteration count for any processor count so that I can > better compare the performance results. > > Your suggestions in this regard are highly appreciated, > specifically with reference to the following points: > > - Is it possible to explicitly use high bandwidth memory in PETSc > for selected object placement (e.g. using memkind library for instance)? > I believe Richard has some extensions for this. Richard? > - What would it take to take advantage of architecture specific > compiler flags to achieve good performance on a given platform (e.g. > -xMIC-AVX512 for AVX512 on KNL, #pragma SIMD etc.). > Usually intrinsics. We already have this for MatMult() for some architectures. 
Thanks, Matt > > Sorry for some very basic questions as I am a novice PETSc user. > > Thanks for your time :) > >> >> Anyways you probably want to make your code be able to switch >> between AIJ and BAIJ at run time since the different formats support >> somewhat different solvers. If your code alls MatSetFromOptions then you >> can switch via the command line option -mat_type aij or baij >> >> Barry >> >> > >> > Thanks >> > >> > On Mon, May 28, 2018 at 8:23 PM, Satish Balay >> wrote: >> > On Mon, 28 May 2018, Najeeb Ahmad wrote: >> > >> > > Hi All, >> > > >> > > I have Petsc release version 3.9.2 configured with the following >> options: >> > > >> > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc >> --with-fc=mpiifort >> > > --download-fblaslapack=1 >> > > >> > > Now I want to use PCILU in my code and when I set the PC type to >> PCILU in >> > > the code, I get the following error: >> > > >> > > [0]PETSC ERROR: --------------------- Error Message >> > > -------------------------------------------------------------- >> > > [0]PETSC ERROR: See >> > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for >> > > possible LU and Cholesky solvers >> > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must >> > > ./configure with --download- >> > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/d >> ocumentation/faq.html for >> > > trouble shooting. >> > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown >> > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by >> nahmad Mon >> > > May 28 17:52:41 2018 >> > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc >> > > --with-fc=mpiifort --download-fblaslapack=1 >> > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in >> > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c >> > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in >> > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c >> > > [0]PETSC ERROR: #3 PCSetUp() line 923 in >> > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c >> > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in >> > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c >> > > [0]PETSC ERROR: #5 KSPSolve() line 612 in >> > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c >> > > [0]PETSC ERROR: #6 SolveSystem() line 60 in >> > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c >> > > >> > > >> > > I assume that I am missing LU package like SuperLU_dist for instance >> and I >> > > need to download and configure it with Petsc. >> > >> > yes - petsc has sequential LU - but you need superlu_dist/mumps for >> parallel lu. >> > >> > > >> > > I am wondering what is the best way to reconfigure Petsc to download >> and >> > > use the appropriate package to support PCILU? >> > >> > Rerun configure with the additional option --download-superlu_dist=1. >> > >> > You can do this with current PETSC_ARCH you are using [i.e reinstall >> > over the current build] - or use a different PETSC_ARCH - so both >> > builds exist and useable. >> > >> > Satish >> > >> > > >> > > You advice is highly appreciated. >> > > >> > > >> > >> > >> > >> > >> > -- >> > Najeeb Ahmad >> > >> > Research and Teaching Assistant >> > PARallel and MultiCORE Computing Laboratory (ParCoreLab) >> > Computer Science and Engineering >> > Ko? University, Istanbul, Turkey >> > >> >> > > > -- > *Najeeb Ahmad* > > > *Research and Teaching Assistant* > *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * > > *Computer Science and Engineering* > *Ko? 
University, Istanbul, Turkey* > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Michael.Becker at physik.uni-giessen.de Tue May 29 06:18:42 2018 From: Michael.Becker at physik.uni-giessen.de (Michael Becker) Date: Tue, 29 May 2018 13:18:42 +0200 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: Hello again, here are the updated log_view files for 125 and 1000 processors. I ran both problems twice, the first time with all processors per node allocated ("-1.txt"), the second with only half on twice the number of nodes ("-2.txt"). >> On May 24, 2018, at 12:24 AM, Michael Becker wrote: >> >> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). > Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. I mean this, right in the log_view output: > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > ... > > --- Event Stage 1: First Solve > > ... > > --- Event Stage 2: Remaining Solves > > Vector 23904 23904 1295501184 0. I logged the exact number of KSP iterations over the 999 timesteps and its exactly 23904/6 = 3984. Michael Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: > Please send the log file for 1000 with cg as the solver. > > You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) > > > >> On May 24, 2018, at 12:24 AM, Michael Becker wrote: >> >> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). > Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. > > > >> This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). >> >> >> Thanks in advance. >> >> Michael >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-022 with 125 processors, by beckerm Fri May 25 09:33:10 2018 Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500 Max Max/Min Avg Total Time (sec): 2.916e+02 1.00000 2.916e+02 Objects: 2.438e+04 1.00004 2.438e+04 Flop: 2.125e+10 1.27708 1.963e+10 2.454e+12 Flop/sec: 7.287e+07 1.27708 6.733e+07 8.416e+09 MPI Messages: 1.042e+06 3.36140 7.129e+05 8.911e+07 MPI Message Lengths: 1.344e+09 2.32209 1.439e+03 1.282e+11 MPI Reductions: 2.250e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 4.3792e+01 15.0% 0.0000e+00 0.0% 3.000e+03 0.0% 3.178e+03 0.0% 1.700e+01 0.1% 1: First Solve: 2.5655e+00 0.9% 3.6885e+09 0.2% 3.549e+05 0.4% 3.736e+03 1.0% 5.500e+02 2.4% 2: Remaining Solves: 2.4525e+02 84.1% 2.4504e+12 99.8% 8.875e+07 99.6% 1.430e+03 99.0% 2.192e+04 97.4% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 3 1.0 4.0317e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: First Solve BuildTwoSided 12 1.0 9.7456e-03 1.5 0.00e+00 0.0 8.8e+03 4.0e+00 0.0e+00 0 0 0 0 0 0 0 2 0 0 0 BuildTwoSidedF 30 1.0 2.9124e-01 3.7 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 6 0 2 5 0 0 KSPSetUp 9 1.0 3.9537e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 2.5694e+00 1.0 3.26e+07 1.4 3.5e+05 3.7e+03 5.5e+02 1 0 0 1 2 100100100100100 1436 VecTDot 8 1.0 1.0836e-02 6.5 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 4983 VecNorm 6 1.0 2.1179e-03 3.5 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 19123 VecScale 24 1.0 2.5225e-04 4.4 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20408 VecCopy 1 1.0 1.3018e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 115 1.0 7.8964e-04 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 8 1.0 1.0571e-03 1.8 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 51081 VecAYPX 28 1.0 1.4100e-03 2.2 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 31104 VecAssemblyBegin 2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 2 1.0 1.9073e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 103 1.0 6.7844e-03 3.4 0.00e+00 0.0 8.9e+04 1.4e+03 0.0e+00 0 0 0 0 0 0 0 25 9 0 0 VecScatterEnd 103 1.0 5.8765e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMult 29 1.0 4.4128e-02 1.7 6.14e+06 1.2 3.0e+04 2.1e+03 0.0e+00 0 0 0 0 0 1 19 8 5 0 16244 MatMultAdd 24 1.0 1.6727e-02 2.7 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 9033 MatMultTranspose 24 1.0 1.5692e-02 2.4 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 9628 MatSolve 4 0.0 2.2888e-05 0.0 2.64e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12 MatSOR 48 1.0 7.2616e-02 1.7 1.09e+07 1.3 2.7e+04 1.5e+03 8.0e+00 0 0 0 0 0 3 34 8 3 1 17266 MatLUFactorSym 1 1.0 6.6996e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.5020e-05 5.2 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9 MatResidual 24 1.0 3.6082e-02 2.1 4.55e+06 1.3 2.7e+04 1.5e+03 0.0e+00 0 0 0 0 0 1 14 8 3 0 14385 MatAssemblyBegin 94 1.0 2.9352e-01 3.4 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 6 0 2 5 0 0 MatAssemblyEnd 94 1.0 8.8632e-02 1.1 0.00e+00 0.0 6.3e+04 2.1e+02 2.3e+02 0 0 0 0 1 3 0 18 1 42 0 MatGetRow 3102093 1.3 4.2884e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 15 0 0 0 0 0 MatGetRowIJ 1 0.0 8.8215e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 6 1.0 4.7427e-01 2.5 0.00e+00 0.0 5.5e+04 1.7e+04 1.2e+01 0 0 0 1 0 13 0 15 71 2 0 MatCreateSubMat 4 1.0 8.0028e-03 1.0 0.00e+00 0.0 2.9e+03 2.7e+02 6.4e+01 0 0 0 0 0 0 0 1 0 12 
0 MatGetOrdering 1 0.0 1.3018e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 6 1.0 5.7495e-02 1.2 0.00e+00 0.0 2.7e+04 1.0e+03 1.2e+01 0 0 0 0 0 2 0 8 2 2 0 MatCoarsen 6 1.0 2.0511e-02 1.1 0.00e+00 0.0 5.3e+04 5.8e+02 3.3e+01 0 0 0 0 0 1 0 15 2 6 0 MatZeroEntries 6 1.0 3.5179e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAP 6 1.0 2.6506e-01 1.0 1.13e+07 1.6 6.3e+04 2.6e+03 9.2e+01 0 0 0 0 0 10 33 18 13 17 4615 MatPtAPSymbolic 6 1.0 1.5077e-01 1.0 0.00e+00 0.0 3.4e+04 2.7e+03 4.2e+01 0 0 0 0 0 6 0 10 7 8 0 MatPtAPNumeric 6 1.0 1.1295e-01 1.0 1.13e+07 1.6 2.9e+04 2.6e+03 4.8e+01 0 0 0 0 0 4 33 8 6 9 10831 MatGetLocalMat 6 1.0 4.4863e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 1.0457e-02 1.7 0.00e+00 0.0 2.0e+04 3.5e+03 0.0e+00 0 0 0 0 0 0 0 6 5 0 0 SFSetGraph 12 1.0 2.1935e-05 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 12 1.0 1.6796e-02 1.1 0.00e+00 0.0 2.6e+04 6.2e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0 SFBcastBegin 45 1.0 2.0542e-03 2.5 0.00e+00 0.0 5.4e+04 6.9e+02 0.0e+00 0 0 0 0 0 0 0 15 3 0 0 SFBcastEnd 45 1.0 8.7860e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 2.0872e+00 1.0 0.00e+00 0.0 2.0e+05 5.2e+03 2.8e+02 1 0 0 1 1 81 0 56 78 52 0 GAMG: partLevel 6 1.0 2.7715e-01 1.0 1.13e+07 1.6 6.6e+04 2.5e+03 1.9e+02 0 0 0 0 1 11 33 19 13 35 4414 repartition 2 1.0 8.5306e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0 Invert-Sort 2 1.0 1.1656e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 Move A 2 1.0 4.7252e-03 1.1 0.00e+00 0.0 1.4e+03 5.3e+02 3.4e+01 0 0 0 0 0 0 0 0 0 6 0 Move P 2 1.0 4.5433e-03 1.1 0.00e+00 0.0 1.4e+03 1.3e+01 3.4e+01 0 0 0 0 0 0 0 0 0 6 0 PCSetUp 2 1.0 2.3749e+00 1.0 1.13e+07 1.6 2.7e+05 4.5e+03 5.1e+02 1 0 0 1 2 93 33 75 90 93 515 PCSetUpOnBlocks 4 1.0 2.7108e-04 1.4 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 4 1.0 1.1422e-01 1.1 1.82e+07 1.3 8.6e+04 1.2e+03 8.0e+00 0 0 0 0 0 4 56 24 8 1 18166 --- Event Stage 2: Remaining Solves KSPSolve 999 1.0 1.2777e+02 1.1 2.12e+10 1.3 8.8e+07 1.4e+03 2.2e+04 42100 99 97 97 50100 99 98100 19178 VecTDot 7968 1.0 1.1053e+01 6.1 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 1 2 0 0 35 1 2 0 0 36 4866 VecNorm 5982 1.0 4.1078e+00 6.9 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 1 2 0 0 27 9830 VecScale 23904 1.0 1.1072e-01 2.3 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 46310 VecCopy 999 1.0 1.2563e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 83664 1.0 6.9843e-01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7968 1.0 1.0304e+00 1.8 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 52196 VecAYPX 27888 1.0 1.3915e+00 2.3 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 31384 VecScatterBegin 100599 1.0 6.4764e+00 3.5 0.00e+00 0.0 8.8e+07 1.4e+03 0.0e+00 1 0 99 97 0 2 0 99 98 0 0 VecScatterEnd 100599 1.0 5.6109e+01 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 9 0 0 0 0 0 MatMult 28887 1.0 4.4493e+01 1.8 6.12e+09 1.2 3.0e+07 2.1e+03 0.0e+00 10 29 33 49 0 12 29 33 49 0 16049 MatMultAdd 23904 1.0 1.4431e+01 2.5 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 10428 MatMultTranspose 23904 1.0 1.5629e+01 2.4 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 9629 MatSolve 3984 0.0 1.9469e-02 0.0 2.63e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14 MatSOR 47808 1.0 6.8757e+01 1.7 1.08e+10 1.3 2.7e+07 1.5e+03 
8.0e+03 22 51 30 32 35 26 51 30 32 36 18089 MatResidual 23904 1.0 3.1760e+01 1.9 4.54e+09 1.3 2.7e+07 1.5e+03 0.0e+00 7 21 30 32 0 8 21 30 32 0 16276 PCSetUpOnBlocks 3984 1.0 5.4686e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 3984 1.0 1.0766e+02 1.1 1.81e+10 1.3 8.5e+07 1.2e+03 8.0e+03 36 84 96 80 35 43 84 96 81 36 19149 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 9 11424 0. DMKSP interface 1 0 0 0. Vector 5 52 2371496 0. Matrix 0 72 14138216 0. Distributed Mesh 1 0 0 0. Index Set 2 12 133768 0. IS L to G Mapping 1 0 0 0. Star Forest Graph 2 0 0 0. Discrete System 1 0 0 0. Vec Scatter 1 13 16432 0. Preconditioner 1 9 9676 0. Viewer 1 0 0 0. --- Event Stage 1: First Solve Krylov Solver 8 0 0 0. Vector 140 92 2204792 0. Matrix 140 68 21738552 0. Matrix Coarsen 6 6 3816 0. Index Set 110 100 543240 0. Star Forest Graph 12 12 10368 0. Vec Scatter 31 18 22752 0. Preconditioner 8 0 0 0. --- Event Stage 2: Remaining Solves Vector 23904 23904 1295501184 0. ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 1.71661e-05 Average time for zero size MPI_Send(): 1.54705e-05 #PETSc Option Table entries: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type cg -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4 ----------------------------------------- Libraries compiled on 2018-05-25 07:05:14 on node1-001 Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon Using PETSc directory: /home/ritsat/beckerm/petsc Using PETSc arch: arch-linux-amd-opt ----------------------------------------- Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3 ----------------------------------------- Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include ----------------------------------------- Using C linker: 
/cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl ----------------------------------------- -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-028 with 125 processors, by beckerm Fri May 25 10:11:49 2018 Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500 Max Max/Min Avg Total Time (sec): 2.488e+02 1.00000 2.488e+02 Objects: 2.438e+04 1.00004 2.438e+04 Flop: 2.125e+10 1.27708 1.963e+10 2.454e+12 Flop/sec: 8.539e+07 1.27708 7.890e+07 9.862e+09 MPI Messages: 1.042e+06 3.36140 7.129e+05 8.911e+07 MPI Message Lengths: 1.344e+09 2.32209 1.439e+03 1.282e+11 MPI Reductions: 2.250e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.9069e+00 2.8% 0.0000e+00 0.0% 3.000e+03 0.0% 3.178e+03 0.0% 1.700e+01 0.1% 1: First Solve: 2.5499e+00 1.0% 3.6885e+09 0.2% 3.549e+05 0.4% 3.736e+03 1.0% 5.500e+02 2.4% 2: Remaining Solves: 2.3939e+02 96.2% 2.4504e+12 99.8% 8.875e+07 99.6% 1.430e+03 99.0% 2.192e+04 97.4% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 3 1.0 5.2118e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: First Solve BuildTwoSided 12 1.0 6.8238e-03 1.8 0.00e+00 0.0 8.8e+03 4.0e+00 0.0e+00 0 0 0 0 0 0 0 2 0 0 0 BuildTwoSidedF 30 1.0 3.0505e-01 4.1 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0 KSPSetUp 9 1.0 3.2511e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 2.5530e+00 1.0 3.26e+07 1.4 3.5e+05 3.7e+03 5.5e+02 1 0 0 1 2 100100100100100 1445 VecTDot 8 1.0 6.3581e-03 3.8 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 8493 VecNorm 6 1.0 1.4081e-03 2.7 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 28762 VecScale 24 1.0 1.2040e-04 2.1 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 42756 VecCopy 1 1.0 1.5712e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 115 1.0 8.0633e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 8 1.0 1.1771e-03 1.4 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 45877 VecAYPX 28 1.0 1.3962e-03 1.7 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 31412 VecAssemblyBegin 2 1.0 2.3842e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 2 1.0 2.3842e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 103 1.0 6.2523e-03 3.1 0.00e+00 0.0 8.9e+04 1.4e+03 0.0e+00 0 0 0 0 0 0 0 25 9 0 0 VecScatterEnd 103 1.0 3.7810e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMult 29 1.0 3.4570e-02 1.5 6.14e+06 1.2 3.0e+04 2.1e+03 0.0e+00 0 0 0 0 0 1 19 8 5 0 20735 MatMultAdd 24 1.0 1.3932e-02 2.5 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 10845 MatMultTranspose 24 1.0 1.3560e-02 2.5 1.37e+06 1.6 1.6e+04 6.5e+02 0.0e+00 0 0 0 0 0 0 4 5 1 0 11143 MatSolve 4 0.0 2.1935e-05 0.0 2.64e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12 MatSOR 48 1.0 7.0858e-02 1.3 1.09e+07 1.3 2.7e+04 1.5e+03 8.0e+00 0 0 0 0 0 3 34 8 3 1 17694 MatLUFactorSym 1 1.0 4.9114e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 5.6028e-0519.6 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2 MatResidual 24 1.0 2.6907e-02 1.7 4.55e+06 1.3 2.7e+04 1.5e+03 0.0e+00 0 0 0 0 0 1 14 8 3 0 19289 MatAssemblyBegin 94 1.0 3.0747e-01 3.6 0.00e+00 0.0 7.1e+03 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0 MatAssemblyEnd 94 1.0 8.1496e-02 1.1 0.00e+00 0.0 6.3e+04 2.1e+02 2.3e+02 0 0 0 0 1 3 0 18 1 42 0 MatGetRow 3102093 1.3 4.3272e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 15 0 0 0 0 0 MatGetRowIJ 1 0.0 7.1526e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 6 1.0 4.7091e-01 2.6 0.00e+00 0.0 5.5e+04 1.7e+04 1.2e+01 0 0 0 1 0 12 0 15 71 2 0 MatCreateSubMat 4 1.0 6.9880e-03 1.0 0.00e+00 0.0 2.9e+03 2.7e+02 6.4e+01 0 0 0 0 0 0 0 1 0 
12 0 MatGetOrdering 1 0.0 1.2994e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 6 1.0 5.7326e-02 1.2 0.00e+00 0.0 2.7e+04 1.0e+03 1.2e+01 0 0 0 0 0 2 0 8 2 2 0 MatCoarsen 6 1.0 1.6099e-02 1.0 0.00e+00 0.0 5.3e+04 5.8e+02 3.3e+01 0 0 0 0 0 1 0 15 2 6 0 MatZeroEntries 6 1.0 3.4292e-03 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAP 6 1.0 2.6140e-01 1.0 1.13e+07 1.6 6.3e+04 2.6e+03 9.2e+01 0 0 0 0 0 10 33 18 13 17 4680 MatPtAPSymbolic 6 1.0 1.4820e-01 1.0 0.00e+00 0.0 3.4e+04 2.7e+03 4.2e+01 0 0 0 0 0 6 0 10 7 8 0 MatPtAPNumeric 6 1.0 1.0990e-01 1.0 1.13e+07 1.6 2.9e+04 2.6e+03 4.8e+01 0 0 0 0 0 4 33 8 6 9 11131 MatGetLocalMat 6 1.0 4.5252e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 8.3039e-03 1.6 0.00e+00 0.0 2.0e+04 3.5e+03 0.0e+00 0 0 0 0 0 0 0 6 5 0 0 SFSetGraph 12 1.0 1.4544e-05 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 12 1.0 1.2054e-02 1.1 0.00e+00 0.0 2.6e+04 6.2e+02 0.0e+00 0 0 0 0 0 0 0 7 1 0 0 SFBcastBegin 45 1.0 2.0356e-03 2.3 0.00e+00 0.0 5.4e+04 6.9e+02 0.0e+00 0 0 0 0 0 0 0 15 3 0 0 SFBcastEnd 45 1.0 5.1184e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 2.0881e+00 1.0 0.00e+00 0.0 2.0e+05 5.2e+03 2.8e+02 1 0 0 1 1 82 0 56 78 52 0 GAMG: partLevel 6 1.0 2.7127e-01 1.0 1.13e+07 1.6 6.6e+04 2.5e+03 1.9e+02 0 0 0 0 1 11 33 19 13 35 4510 repartition 2 1.0 6.8378e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0 Invert-Sort 2 1.0 6.4492e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 Move A 2 1.0 4.3352e-03 1.1 0.00e+00 0.0 1.4e+03 5.3e+02 3.4e+01 0 0 0 0 0 0 0 0 0 6 0 Move P 2 1.0 3.6781e-03 1.1 0.00e+00 0.0 1.4e+03 1.3e+01 3.4e+01 0 0 0 0 0 0 0 0 0 6 0 PCSetUp 2 1.0 2.3659e+00 1.0 1.13e+07 1.6 2.7e+05 4.5e+03 5.1e+02 1 0 0 1 2 93 33 75 90 93 517 PCSetUpOnBlocks 4 1.0 3.0303e-04 1.7 1.29e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 4 1.0 1.0916e-01 1.0 1.82e+07 1.3 8.6e+04 1.2e+03 8.0e+00 0 0 0 0 0 4 56 24 8 1 19009 --- Event Stage 2: Remaining Solves KSPSolve 999 1.0 1.2028e+02 1.0 2.12e+10 1.3 8.8e+07 1.4e+03 2.2e+04 47100 99 97 97 49100 99 98100 20373 VecTDot 7968 1.0 6.3756e+00 3.7 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 1 2 0 0 35 1 2 0 0 36 8436 VecNorm 5982 1.0 4.5791e+00 7.1 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 1 2 0 0 27 8818 VecScale 23904 1.0 1.0700e-01 2.1 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 47920 VecCopy 999 1.0 1.1231e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 83664 1.0 7.0631e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7968 1.0 1.1656e+00 1.4 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 46141 VecAYPX 27888 1.0 1.3165e+00 1.6 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 33173 VecScatterBegin 100599 1.0 6.1421e+00 3.2 0.00e+00 0.0 8.8e+07 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0 VecScatterEnd 100599 1.0 3.6060e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 8 0 0 0 0 0 MatMult 28887 1.0 3.5612e+01 1.6 6.12e+09 1.2 3.0e+07 2.1e+03 0.0e+00 11 29 33 49 0 11 29 33 49 0 20052 MatMultAdd 23904 1.0 1.1237e+01 2.0 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 13392 MatMultTranspose 23904 1.0 1.3723e+01 2.5 1.37e+09 1.6 1.6e+07 6.5e+02 0.0e+00 4 6 18 8 0 4 6 18 8 0 10966 MatSolve 3984 0.0 1.9485e-02 0.0 2.63e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 13 MatSOR 47808 1.0 6.6101e+01 1.3 1.08e+10 1.3 2.7e+07 1.5e+03 
8.0e+03 25 51 30 32 35 26 51 30 32 36 18816 MatResidual 23904 1.0 2.6469e+01 1.7 4.54e+09 1.3 2.7e+07 1.5e+03 0.0e+00 8 21 30 32 0 8 21 30 32 0 19530 PCSetUpOnBlocks 3984 1.0 5.2657e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 3984 1.0 1.0306e+02 1.0 1.81e+10 1.3 8.5e+07 1.2e+03 8.0e+03 41 84 96 80 35 43 84 96 81 36 20004 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 9 11424 0. DMKSP interface 1 0 0 0. Vector 5 52 2371496 0. Matrix 0 72 14138216 0. Distributed Mesh 1 0 0 0. Index Set 2 12 133768 0. IS L to G Mapping 1 0 0 0. Star Forest Graph 2 0 0 0. Discrete System 1 0 0 0. Vec Scatter 1 13 16432 0. Preconditioner 1 9 9676 0. Viewer 1 0 0 0. --- Event Stage 1: First Solve Krylov Solver 8 0 0 0. Vector 140 92 2204792 0. Matrix 140 68 21738552 0. Matrix Coarsen 6 6 3816 0. Index Set 110 100 543240 0. Star Forest Graph 12 12 10368 0. Vec Scatter 31 18 22752 0. Preconditioner 8 0 0 0. --- Event Stage 2: Remaining Solves Vector 23904 23904 1295501184 0. ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 2.13623e-05 Average time for zero size MPI_Send(): 1.46084e-05 #PETSc Option Table entries: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type cg -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4 ----------------------------------------- Libraries compiled on 2018-05-25 07:05:14 on node1-001 Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon Using PETSc directory: /home/ritsat/beckerm/petsc Using PETSc arch: arch-linux-amd-opt ----------------------------------------- Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3 ----------------------------------------- Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include ----------------------------------------- Using C linker: 
/cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl ----------------------------------------- -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-010 with 1000 processors, by beckerm Tue May 29 11:18:21 2018 Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500 Max Max/Min Avg Total Time (sec): 3.316e+02 1.00000 3.316e+02 Objects: 2.440e+04 1.00004 2.440e+04 Flop: 2.124e+10 1.27708 2.041e+10 2.041e+13 Flop/sec: 6.405e+07 1.27708 6.156e+07 6.156e+10 MPI Messages: 1.238e+06 3.99536 8.489e+05 8.489e+08 MPI Message Lengths: 1.343e+09 2.32238 1.393e+03 1.183e+12 MPI Reductions: 2.256e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.5695e+01 7.7% 0.0000e+00 0.0% 2.700e+04 0.0% 3.178e+03 0.0% 1.700e+01 0.1% 1: First Solve: 3.1540e+00 1.0% 3.0885e+10 0.2% 3.675e+06 0.4% 3.508e+03 1.1% 6.220e+02 2.8% 2: Remaining Solves: 3.0274e+02 91.3% 2.0380e+13 99.8% 8.452e+08 99.6% 1.384e+03 98.9% 2.191e+04 97.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 3 1.0 5.2404e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: First Solve BuildTwoSided 12 1.0 2.2128e-02 1.4 0.00e+00 0.0 8.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 1 0 2 0 0 0 BuildTwoSidedF 30 1.0 3.9611e-01 2.4 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 7 0 2 5 0 0 KSPSetUp 9 1.0 6.6152e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 3.1572e+00 1.0 3.25e+07 1.4 3.7e+06 3.5e+03 6.2e+02 1 0 0 1 3 100100100100100 9782 VecTDot 8 1.0 1.7718e-02 2.9 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 24382 VecNorm 6 1.0 1.7011e-03 2.3 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 190463 VecScale 24 1.0 1.6880e-04 2.5 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 282104 VecCopy 1 1.0 1.3494e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 124 1.0 8.7070e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 8 1.0 1.2469e-03 1.6 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 346451 VecAYPX 28 1.0 1.6160e-03 2.2 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 219183 VecAssemblyBegin 3 1.0 3.0994e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 3 1.0 4.0531e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 108 1.0 7.2830e-03 2.8 0.00e+00 0.0 8.4e+05 1.4e+03 0.0e+00 0 0 0 0 0 0 0 23 9 0 0 VecScatterEnd 108 1.0 5.8424e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMult 29 1.0 3.7680e-02 1.4 6.14e+06 1.2 2.8e+05 2.0e+03 0.0e+00 0 0 0 0 0 1 19 8 4 0 157481 MatMultAdd 24 1.0 3.1618e-02 4.1 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 1 4 4 1 0 40763 MatMultTranspose 24 1.0 1.7325e-02 3.0 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 0 4 4 1 0 74394 MatSolve 4 0.0 4.8161e-05 0.0 1.10e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 228 MatSOR 48 1.0 8.3678e-02 1.4 1.09e+07 1.3 2.6e+05 1.5e+03 8.0e+00 0 0 0 0 0 2 34 7 3 1 124983 MatLUFactorSym 1 1.0 1.0300e-04 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 7.1049e-0537.2 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 463 MatResidual 24 1.0 2.9594e-02 1.6 4.55e+06 1.3 2.6e+05 1.5e+03 0.0e+00 0 0 0 0 0 1 14 7 3 0 146971 MatAssemblyBegin 102 1.0 3.9857e-01 2.3 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 8 0 2 5 0 0 MatAssemblyEnd 102 1.0 1.3652e-01 1.0 0.00e+00 0.0 6.2e+05 2.0e+02 2.5e+02 0 0 0 0 1 4 0 17 1 40 0 MatGetRow 3102093 1.3 4.5841e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 MatGetRowIJ 1 0.0 1.6928e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 6 1.0 5.0106e-01 2.2 0.00e+00 0.0 5.7e+05 1.6e+04 1.2e+01 0 0 0 1 0 11 0 15 72 2 0 MatCreateSubMat 6 1.0 2.8865e-02 1.0 0.00e+00 0.0 2.2e+04 3.3e+02 9.4e+01 0 0 0 
0 0 1 0 1 0 15 0 MatGetOrdering 1 0.0 1.3614e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 6 1.0 1.1707e-01 1.1 0.00e+00 0.0 2.6e+05 9.9e+02 1.2e+01 0 0 0 0 0 3 0 7 2 2 0 MatCoarsen 6 1.0 5.5459e-02 1.0 0.00e+00 0.0 7.1e+05 4.4e+02 5.6e+01 0 0 0 0 0 2 0 19 2 9 0 MatZeroEntries 6 1.0 3.5138e-03 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAP 6 1.0 3.8423e-01 1.0 1.11e+07 1.6 6.3e+05 2.5e+03 9.2e+01 0 0 0 0 0 12 34 17 12 15 26996 MatPtAPSymbolic 6 1.0 2.1874e-01 1.0 0.00e+00 0.0 3.2e+05 2.7e+03 4.2e+01 0 0 0 0 0 7 0 9 7 7 0 MatPtAPNumeric 6 1.0 1.5509e-01 1.0 1.11e+07 1.6 3.0e+05 2.3e+03 4.8e+01 0 0 0 0 0 5 34 8 6 8 66883 MatGetLocalMat 6 1.0 4.7982e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 1.4448e-02 2.0 0.00e+00 0.0 1.9e+05 3.4e+03 0.0e+00 0 0 0 0 0 0 0 5 5 0 0 SFSetGraph 12 1.0 1.8835e-05 9.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 12 1.0 3.1634e-02 1.2 0.00e+00 0.0 2.7e+05 5.8e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0 SFBcastBegin 68 1.0 2.7318e-03 2.8 0.00e+00 0.0 7.2e+05 5.1e+02 0.0e+00 0 0 0 0 0 0 0 20 3 0 0 SFBcastEnd 68 1.0 3.0540e-02 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 GAMG: createProl 6 1.0 2.4582e+00 1.0 0.00e+00 0.0 2.2e+06 4.7e+03 3.1e+02 1 0 0 1 1 78 0 59 79 50 0 GAMG: partLevel 6 1.0 4.2463e-01 1.0 1.11e+07 1.6 6.5e+05 2.4e+03 2.4e+02 0 0 0 0 1 13 34 18 12 39 24427 repartition 3 1.0 3.3462e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 Invert-Sort 3 1.0 3.3751e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0 Move A 3 1.0 1.7274e-02 1.1 0.00e+00 0.0 9.5e+03 7.4e+02 5.0e+01 0 0 0 0 0 1 0 0 0 8 0 Move P 3 1.0 1.4379e-02 1.1 0.00e+00 0.0 1.3e+04 1.3e+01 5.0e+01 0 0 0 0 0 0 0 0 0 8 0 PCSetUp 2 1.0 2.9061e+00 1.0 1.11e+07 1.6 2.8e+06 4.2e+03 5.8e+02 1 0 0 1 3 92 34 77 91 94 3569 PCSetUpOnBlocks 4 1.0 4.0293e-04 3.0 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 82 PCApply 4 1.0 1.3660e-01 1.0 1.82e+07 1.3 8.2e+05 1.2e+03 8.0e+00 0 0 0 0 0 4 56 22 7 1 127272 --- Event Stage 2: Remaining Solves KSPSolve 999 1.0 1.7180e+02 1.1 2.12e+10 1.3 8.4e+08 1.4e+03 2.2e+04 51100 99 97 97 56100 99 98100 118630 VecTDot 7964 1.0 1.1651e+01 2.4 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 2 2 0 0 35 3 2 0 0 36 36911 VecNorm 5980 1.0 1.2358e+01 3.1 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 2 2 0 0 27 3 2 0 0 27 26129 VecScale 23892 1.0 1.3560e-01 2.5 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 349607 VecCopy 999 1.0 1.4402e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 83622 1.0 8.0502e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7964 1.0 1.2322e+00 1.6 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 349007 VecAYPX 27874 1.0 1.6865e+00 2.1 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 209016 VecScatterBegin 100549 1.0 7.0089e+00 2.9 0.00e+00 0.0 8.4e+08 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0 VecScatterEnd 100549 1.0 6.5406e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 13 0 0 0 0 14 0 0 0 0 0 MatMult 28873 1.0 3.9569e+01 1.5 6.11e+09 1.2 2.8e+08 2.0e+03 0.0e+00 10 29 33 48 0 11 29 34 49 0 149320 MatMultAdd 23892 1.0 3.2684e+01 4.0 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 8 6 18 8 0 9 6 18 8 0 39256 MatMultTranspose 23892 1.0 2.0947e+01 3.1 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 3 6 18 8 0 4 6 18 8 0 61253 MatSolve 3982 0.0 4.6725e-02 0.0 1.09e+07 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 234 MatSOR 47784 1.0 8.3053e+01 1.3 
1.08e+10 1.3 2.6e+08 1.5e+03 8.0e+03 23 51 30 32 35 25 51 30 32 36 124862 MatResidual 23892 1.0 3.1663e+01 1.7 4.53e+09 1.3 2.6e+08 1.5e+03 0.0e+00 7 21 30 32 0 8 21 30 32 0 136752 PCSetUpOnBlocks 3982 1.0 5.4314e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 3982 1.0 1.4286e+02 1.0 1.81e+10 1.3 8.1e+08 1.2e+03 8.0e+03 43 85 96 81 35 47 85 96 82 36 120863 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 9 11424 0. DMKSP interface 1 0 0 0. Vector 5 52 2382208 0. Matrix 0 65 14780672 0. Distributed Mesh 1 0 0 0. Index Set 2 18 171852 0. IS L to G Mapping 1 0 0 0. Star Forest Graph 2 0 0 0. Discrete System 1 0 0 0. Vec Scatter 1 13 16432 0. Preconditioner 1 9 9676 0. Viewer 1 0 0 0. --- Event Stage 1: First Solve Krylov Solver 8 0 0 0. Vector 152 104 2238504 0. Matrix 148 83 22951356 0. Matrix Coarsen 6 6 3816 0. Index Set 128 112 590828 0. Star Forest Graph 12 12 10368 0. Vec Scatter 34 21 26544 0. Preconditioner 8 0 0 0. --- Event Stage 2: Remaining Solves Vector 23892 23892 1302241424 0. ======================================================================================================================== Average time to get PetscTime(): 1.19209e-07 Average time for MPI_Barrier(): 3.35693e-05 Average time for zero size MPI_Send(): 1.84231e-05 #PETSc Option Table entries: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type cg -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4 ----------------------------------------- Libraries compiled on 2018-05-25 07:05:14 on node1-001 Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon Using PETSc directory: /home/ritsat/beckerm/petsc Using PETSc arch: arch-linux-amd-opt ----------------------------------------- Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3 ----------------------------------------- Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include 
----------------------------------------- Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl ----------------------------------------- -------------- next part -------------- ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /home/ritsat/beckerm/ppp_test/plasmapic on a arch-linux-amd-opt named node1-010 with 1000 processors, by beckerm Tue May 29 11:37:28 2018 Using Petsc Development GIT revision: v3.9.2-503-g9e88a8b GIT Date: 2018-05-24 08:01:24 -0500 Max Max/Min Avg Total Time (sec): 2.914e+02 1.00000 2.914e+02 Objects: 2.440e+04 1.00004 2.440e+04 Flop: 2.124e+10 1.27708 2.041e+10 2.041e+13 Flop/sec: 7.289e+07 1.27708 7.005e+07 7.005e+10 MPI Messages: 1.238e+06 3.99536 8.489e+05 8.489e+08 MPI Message Lengths: 1.343e+09 2.32238 1.393e+03 1.183e+12 MPI Reductions: 2.256e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.5285e+01 8.7% 0.0000e+00 0.0% 2.700e+04 0.0% 3.178e+03 0.0% 1.700e+01 0.1% 1: First Solve: 3.1432e+00 1.1% 3.0885e+10 0.2% 3.675e+06 0.4% 3.508e+03 1.1% 6.220e+02 2.8% 2: Remaining Solves: 2.6295e+02 90.2% 2.0380e+13 99.8% 8.452e+08 99.6% 1.384e+03 98.9% 2.191e+04 97.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 3 1.0 5.2595e-04 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: First Solve BuildTwoSided 12 1.0 1.2117e-02 1.5 0.00e+00 0.0 8.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 2 0 0 0 BuildTwoSidedF 30 1.0 4.6420e-01 3.2 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 9 0 2 5 0 0 KSPSetUp 9 1.0 5.0461e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 3.1475e+00 1.0 3.25e+07 1.4 3.7e+06 3.5e+03 6.2e+02 1 0 0 1 3 100100100100100 9812 VecTDot 8 1.0 6.8884e-03 3.7 4.32e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 1 0 0 1 62713 VecNorm 6 1.0 1.6949e-03 2.6 3.24e+05 1.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 1 0 0 1 191160 VecScale 24 1.0 1.4758e-04 2.4 5.43e+04 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 322665 VecCopy 1 1.0 1.4782e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 124 1.0 8.3828e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 8 1.0 1.2336e-03 1.5 4.32e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 350201 VecAYPX 28 1.0 1.5278e-03 2.0 3.58e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 231838 VecAssemblyBegin 3 1.0 8.8215e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 3 1.0 5.0068e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 108 1.0 6.5110e-03 2.9 0.00e+00 0.0 8.4e+05 1.4e+03 0.0e+00 0 0 0 0 0 0 0 23 9 0 0 VecScatterEnd 108 1.0 4.8535e-02 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 MatMult 29 1.0 3.4050e-02 1.4 6.14e+06 1.2 2.8e+05 2.0e+03 0.0e+00 0 0 0 0 0 1 19 8 4 0 174274 MatMultAdd 24 1.0 2.1619e-02 3.0 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 1 4 4 1 0 59618 MatMultTranspose 24 1.0 1.6357e-02 2.4 1.37e+06 1.6 1.5e+05 6.5e+02 0.0e+00 0 0 0 0 0 0 4 4 1 0 78797 MatSolve 4 0.0 5.1498e-05 0.0 1.10e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 213 MatSOR 48 1.0 7.6160e-02 1.3 1.09e+07 1.3 2.6e+05 1.5e+03 8.0e+00 0 0 0 0 0 2 34 7 3 1 137320 MatLUFactorSym 1 1.0 1.0586e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 7.1049e-0537.2 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 463 MatResidual 24 1.0 2.6069e-02 1.5 4.55e+06 1.3 2.6e+05 1.5e+03 0.0e+00 0 0 0 0 0 1 14 7 3 0 166848 MatAssemblyBegin 102 1.0 4.6661e-01 3.0 0.00e+00 0.0 6.5e+04 1.0e+04 0.0e+00 0 0 0 0 0 10 0 2 5 0 0 MatAssemblyEnd 102 1.0 1.0982e-01 1.1 0.00e+00 0.0 6.2e+05 2.0e+02 2.5e+02 0 0 0 0 1 3 0 17 1 40 0 MatGetRow 3102093 1.3 5.3594e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 MatGetRowIJ 1 0.0 1.6928e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 6 1.0 4.6788e-01 2.3 0.00e+00 0.0 5.7e+05 1.6e+04 1.2e+01 0 0 0 1 0 10 0 15 72 2 0 MatCreateSubMat 6 1.0 1.8935e-02 1.0 0.00e+00 0.0 2.2e+04 3.3e+02 9.4e+01 0 0 
0 0 0 1 0 1 0 15 0 MatGetOrdering 1 0.0 1.4997e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatIncreaseOvrlp 6 1.0 1.1188e-01 1.1 0.00e+00 0.0 2.6e+05 9.9e+02 1.2e+01 0 0 0 0 0 3 0 7 2 2 0 MatCoarsen 6 1.0 3.2188e-02 1.0 0.00e+00 0.0 7.1e+05 4.4e+02 5.6e+01 0 0 0 0 0 1 0 19 2 9 0 MatZeroEntries 6 1.0 3.6952e-03 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAP 6 1.0 3.6765e-01 1.0 1.11e+07 1.6 6.3e+05 2.5e+03 9.2e+01 0 0 0 0 0 12 34 17 12 15 28214 MatPtAPSymbolic 6 1.0 2.1124e-01 1.0 0.00e+00 0.0 3.2e+05 2.7e+03 4.2e+01 0 0 0 0 0 7 0 9 7 7 0 MatPtAPNumeric 6 1.0 1.4190e-01 1.0 1.11e+07 1.6 3.0e+05 2.3e+03 4.8e+01 0 0 0 0 0 5 34 8 6 8 73100 MatGetLocalMat 6 1.0 4.7829e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 1.2071e-02 2.1 0.00e+00 0.0 1.9e+05 3.4e+03 0.0e+00 0 0 0 0 0 0 0 5 5 0 0 SFSetGraph 12 1.0 2.4796e-05 8.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 12 1.0 1.8736e-02 1.1 0.00e+00 0.0 2.7e+05 5.8e+02 0.0e+00 0 0 0 0 0 1 0 7 1 0 0 SFBcastBegin 68 1.0 2.7175e-03 2.8 0.00e+00 0.0 7.2e+05 5.1e+02 0.0e+00 0 0 0 0 0 0 0 20 3 0 0 SFBcastEnd 68 1.0 1.5780e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 2.5062e+00 1.0 0.00e+00 0.0 2.2e+06 4.7e+03 3.1e+02 1 0 0 1 1 79 0 59 79 50 0 GAMG: partLevel 6 1.0 3.9494e-01 1.0 1.11e+07 1.6 6.5e+05 2.4e+03 2.4e+02 0 0 0 0 1 12 34 18 12 39 26264 repartition 3 1.0 2.5189e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 3 0 Invert-Sort 3 1.0 2.0480e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 2 0 Move A 3 1.0 1.1837e-02 1.1 0.00e+00 0.0 9.5e+03 7.4e+02 5.0e+01 0 0 0 0 0 0 0 0 0 8 0 Move P 3 1.0 9.3570e-03 1.1 0.00e+00 0.0 1.3e+04 1.3e+01 5.0e+01 0 0 0 0 0 0 0 0 0 8 0 PCSetUp 2 1.0 2.9206e+00 1.0 1.11e+07 1.6 2.8e+06 4.2e+03 5.8e+02 1 0 0 1 3 93 34 77 91 94 3552 PCSetUpOnBlocks 4 1.0 4.2915e-04 2.6 3.29e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 77 PCApply 4 1.0 1.2181e-01 1.0 1.82e+07 1.3 8.2e+05 1.2e+03 8.0e+00 0 0 0 0 0 4 56 22 7 1 142729 --- Event Stage 2: Remaining Solves KSPSolve 999 1.0 1.3992e+02 1.1 2.12e+10 1.3 8.4e+08 1.4e+03 2.2e+04 46100 99 97 97 51100 99 98100 145661 VecTDot 7964 1.0 7.8852e+00 2.9 4.30e+08 1.0 0.0e+00 0.0e+00 8.0e+03 1 2 0 0 35 2 2 0 0 36 54539 VecNorm 5980 1.0 8.5339e+00 6.9 3.23e+08 1.0 0.0e+00 0.0e+00 6.0e+03 1 2 0 0 27 2 2 0 0 27 37840 VecScale 23892 1.0 1.3013e-01 2.5 5.40e+07 2.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 364277 VecCopy 999 1.0 1.3009e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 83622 1.0 7.4316e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7964 1.0 1.2206e+00 1.5 4.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 352338 VecAYPX 27874 1.0 1.5360e+00 1.9 3.56e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 229492 VecScatterBegin 100549 1.0 6.4482e+00 3.0 0.00e+00 0.0 8.4e+08 1.4e+03 0.0e+00 2 0 99 97 0 2 0 99 98 0 0 VecScatterEnd 100549 1.0 5.2444e+01 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 10 0 0 0 0 0 MatMult 28873 1.0 3.4346e+01 1.4 6.11e+09 1.2 2.8e+08 2.0e+03 0.0e+00 9 29 33 48 0 10 29 34 49 0 172028 MatMultAdd 23892 1.0 2.0934e+01 2.9 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 6 6 18 8 0 7 6 18 8 0 61290 MatMultTranspose 23892 1.0 1.4770e+01 2.2 1.37e+09 1.6 1.5e+08 6.5e+02 0.0e+00 3 6 18 8 0 3 6 18 8 0 86867 MatSolve 3982 0.0 4.7890e-02 0.0 1.09e+07 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 228 MatSOR 47784 1.0 7.1105e+01 1.3 
1.08e+10 1.3 2.6e+08 1.5e+03 8.0e+03 23 51 30 32 35 26 51 30 32 36 145842 MatResidual 23892 1.0 2.4455e+01 1.4 4.53e+09 1.3 2.6e+08 1.5e+03 0.0e+00 7 21 30 32 0 7 21 30 32 0 177060 PCSetUpOnBlocks 3982 1.0 5.4274e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 3982 1.0 1.1804e+02 1.0 1.81e+10 1.3 8.1e+08 1.2e+03 8.0e+03 40 85 96 81 35 44 85 96 82 36 146277 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 9 11424 0. DMKSP interface 1 0 0 0. Vector 5 52 2382208 0. Matrix 0 65 14780672 0. Distributed Mesh 1 0 0 0. Index Set 2 18 171852 0. IS L to G Mapping 1 0 0 0. Star Forest Graph 2 0 0 0. Discrete System 1 0 0 0. Vec Scatter 1 13 16432 0. Preconditioner 1 9 9676 0. Viewer 1 0 0 0. --- Event Stage 1: First Solve Krylov Solver 8 0 0 0. Vector 152 104 2238504 0. Matrix 148 83 22951356 0. Matrix Coarsen 6 6 3816 0. Index Set 128 112 590828 0. Star Forest Graph 12 12 10368 0. Vec Scatter 34 21 26544 0. Preconditioner 8 0 0 0. --- Event Stage 2: Remaining Solves Vector 23892 23892 1302241424 0. ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 3.93867e-05 Average time for zero size MPI_Send(): 1.59838e-05 #PETSc Option Table entries: -gamg_est_ksp_type cg -ksp_norm_type unpreconditioned -ksp_type cg -log_view -mg_levels_esteig_ksp_max_it 10 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 1 -mg_levels_ksp_norm_type none -mg_levels_ksp_type richardson -mg_levels_pc_sor_its 1 -mg_levels_pc_type sor -pc_gamg_type classical -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 PETSC_ARCH=arch-linux-amd-opt --download-f2cblaslapack --with-mpi-dir=/cm/shared/apps/mvapich2/intel-17.0.1/2.0 --download-hypre --download-ml --with-fc=0 --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 --with-batch --with-x --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=4 ----------------------------------------- Libraries compiled on 2018-05-25 07:05:14 on node1-001 Machine characteristics: Linux-2.6.32-696.18.7.el6.x86_64-x86_64-with-redhat-6.6-Carbon Using PETSc directory: /home/ritsat/beckerm/petsc Using PETSc arch: arch-linux-amd-opt ----------------------------------------- Using C compiler: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc -fPIC -wd1572 -O3 ----------------------------------------- Using include paths: -I/home/ritsat/beckerm/petsc/include -I/home/ritsat/beckerm/petsc/arch-linux-amd-opt/include -I/cm/shared/apps/mvapich2/intel-17.0.1/2.0/include 
----------------------------------------- Using C linker: /cm/shared/apps/mvapich2/intel-17.0.1/2.0/bin/mpicc Using libraries: -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lpetsc -Wl,-rpath,/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -L/home/ritsat/beckerm/petsc/arch-linux-amd-opt/lib -lHYPRE -lml -lf2clapack -lf2cblas -lX11 -ldl ----------------------------------------- From bsmith at mcs.anl.gov Tue May 29 11:04:22 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 29 May 2018 16:04:22 +0000 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: <7A64BC7D-B0AE-411F-A39E-483286C6FF4D@anl.gov> Can you please run the "2" cases again? I want to see how reproducible they are. Thanks Barry > On May 29, 2018, at 6:18 AM, Michael Becker wrote: > > Hello again, > > here are the updated log_view files for 125 and 1000 processors. I ran both problems twice, the first time with all processors per node allocated ("-1.txt"), the second with only half on twice the number of nodes ("-2.txt"). > >>> On May 24, 2018, at 12:24 AM, Michael Becker >>> wrote: >>> >>> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). >>> >> Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. >> > > I mean this, right in the log_view output: > >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> ... >> >> --- Event Stage 1: First Solve >> >> ... >> >> --- Event Stage 2: Remaining Solves >> >> Vector 23904 23904 1295501184 0. > I logged the exact number of KSP iterations over the 999 timesteps and its exactly 23904/6 = 3984. > Michael > > > Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: >> Please send the log file for 1000 with cg as the solver. >> >> You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) >> >> >> >> >>> On May 24, 2018, at 12:24 AM, Michael Becker >>> wrote: >>> >>> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). >>> >> Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. >> >> >> >> >>> This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). >>> >>> >>> Thanks in advance. 
>>> >>> Michael >>> >>> >>> >>> >>> > > From valerio.barnabei at uniroma1.it Tue May 29 12:14:57 2018 From: valerio.barnabei at uniroma1.it (Valerio Barnabei) Date: Tue, 29 May 2018 19:14:57 +0200 Subject: [petsc-users] [petsc4py] DMPlex and DT class Message-ID: Hello, I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, snes/ex77 in python using petsc4py. Unfortunately I'm having trouble finding something analogue to the DT class of C++ PETSc, to call methods like PetscFECreateDefault and similar. Is this something that can be achieved using petsc4py? I mean, is there something included in DM, DMDA or DMPlex that takes care of the discretization of a generic value field that I'm missing? (As far as I can see and understand, no DT class is implemented in petsc4py) I hope i explained myself, unfortunately I'm still a new user. Thanks in advance for your help. Best regards, Valerio -- ___________________________________________ *Il tuo?5?diventa 1000* Fai crescere la tua universit? Dona il?5?per?mille?alla Sapienza Codice fiscale:?*80209930587* https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-universita-con-il-cinque-mille -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 29 13:03:55 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 29 May 2018 18:03:55 +0000 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: <10D25B6B-2A77-4690-9716-61C9AA24214C@anl.gov> Here is the bar chart I mentioned you should generate. As you can see the larger problem has 1.5096 extra seconds in VecTDot, 3.9548 extra seconds in Norm, is 1.266 seconds faster in Matmult, is 9.697 seconds slower in MatMultAdd, 1.047 secs slower in MatMultTranspose and 5.006 seconds slower in MatSOR. All of these together match the total extra time of 19.64 well (19.94 compared to 19.64). The Dot and Norm could be explained by using 8 times as many processes slowing down the reductions a bit. I cannot explain the slow down in MatSOR at all since that is embarrassingly parallel and should scale perfectly. I also cannot explain the huge jump in MatMultAdd()! This is why I ask you to run again to see if the numbers are consistent. Barry [cid:20ADD8C8-1DC4-4781-94E4-9C8AFC99F6C1 at mcs.anl.gov] On May 29, 2018, at 6:18 AM, Michael Becker > wrote: Hello again, here are the updated log_view files for 125 and 1000 processors. I ran both problems twice, the first time with all processors per node allocated ("-1.txt"), the second with only half on twice the number of nodes ("-2.txt"). On May 24, 2018, at 12:24 AM, Michael Becker > wrote: I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. I mean this, right in the log_view output: Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage ... --- Event Stage 1: First Solve ... 
--- Event Stage 2: Remaining Solves Vector 23904 23904 1295501184 0. I logged the exact number of KSP iterations over the 999 timesteps and its exactly 23904/6 = 3984. Michael Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: Please send the log file for 1000 with cg as the solver. You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) On May 24, 2018, at 12:24 AM, Michael Becker > wrote: I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). Thanks in advance. Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Untitled.png Type: image/png Size: 285204 bytes Desc: Untitled.png URL: From bsmith at mcs.anl.gov Tue May 29 14:29:41 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 29 May 2018 19:29:41 +0000 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> Message-ID: <6139EB46-0918-4B5E-92DF-C85DB51975DF@mcs.anl.gov> > On May 29, 2018, at 3:05 AM, Najeeb Ahmad wrote: > > > > On Mon, May 28, 2018 at 9:32 PM, Smith, Barry F. wrote: > > > > On May 28, 2018, at 10:32 AM, Najeeb Ahmad wrote: > > > > Thanks a lot Satish for your prompt reply. > > > > I just checked that SuperLU_dist package works only for matrices of type aij. and uses lu preconditioner. I am currently working with baij matrix. What is the best preconditioner choice for baij matrices on parallel machines? > > The best preconditioner is always problem specific. Where does your problem come from? CFD? Structural mechanics? other apps? > > I am interested in writing solver for reservoir simulation employing FVM and unstructured grids. My main objective is to study performance of the code with different data structures/data layouts and architecture specific optimizations, specifically targeting the multicore architectures like KNL for instance. Later the study may be extended to include GPUs. The options for switching between AIJ and BAIJ etc. are therefore very useful for my study. > > The purpose why I wanted to change the preconditioner is that the default preconditioner is giving me different iterations count for different number of processesors. I would rather like a preconditioner that would give me same iteration count for any processor count so that I can better compare the performance results. Pretty much all preconditioners will give at least slightly different iteration counts for different number of processes. This must be taken into account in evaluating the "performance" of the solver and its implementation. You can try -pc_type gamg or -pc_type hypre (also requires running ./configure with the additional option --download-hypre). 
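A minimal C sketch of letting the preconditioner be chosen at run time this way (my own illustration, not from this thread; it assumes A, b and x are already assembled, and the function name is arbitrary):

#include <petscksp.h>

PetscErrorCode SolveAndReport(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PetscInt       its;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* picks up -ksp_type, -pc_type, ... */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"iterations = %D\n",its);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  return 0;
}

Runs with, e.g., -ksp_type cg -pc_type gamg or -pc_type hypre can then be compared across process counts without recompiling.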
> > Your suggestions in this regard are highly appreciated, specifically with reference to the following points: > > - Is it possible to explicitly use high bandwidth memory in PETSc for selected object placement (e.g. using memkind library for instance)? We've found that placing some objects in high bandwidth memory and some in regular memory is a painful and thankless task. You should just run all in high bandwidth memory, otherwise on KNL you will get really crappy performance. > - What would it take to take advantage of architecture specific compiler flags to achieve good performance on a given platform (e.g. -xMIC-AVX512 for AVX512 on KNL, #pragma SIMD etc.). > We've found that just using these as compile time flags doesn't help much (that is the compiler is not smart enough to really take advantage of this vectorization). Please see the attached paper for a bunch of discussions about performance and KNL. Barry > Sorry for some very basic questions as I am a novice PETSc user. > > Thanks for your time :) > > Anyways you probably want to make your code be able to switch between AIJ and BAIJ at run time since the different formats support somewhat different solvers. If your code alls MatSetFromOptions then you can switch via the command line option -mat_type aij or baij > > Barry > > > > > Thanks > > > > On Mon, May 28, 2018 at 8:23 PM, Satish Balay wrote: > > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > > > Hi All, > > > > > > I have Petsc release version 3.9.2 configured with the following options: > > > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > > > --download-fblaslapack=1 > > > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU in > > > the code, I get the following error: > > > > > > [0]PETSC ERROR: --------------------- Error Message > > > -------------------------------------------------------------- > > > [0]PETSC ERROR: See > > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > > > possible LU and Cholesky solvers > > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > > ./configure with --download- > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > > > trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad Mon > > > May 28 17:52:41 2018 > > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > > --with-fc=mpiifort --download-fblaslapack=1 > > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > > > > I assume that I am missing LU package like SuperLU_dist for instance and I > > > need to download and configure it with Petsc. > > > > yes - petsc has sequential LU - but you need superlu_dist/mumps for parallel lu. 
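A sketch of the AIJ/BAIJ run-time switch Barry mentions above (sizes, block size and preallocation counts are made-up placeholders, and n is assumed divisible by bs when a blocked format is selected):

#include <petscmat.h>

PetscErrorCode CreateMatrix(PetscInt n, PetscInt bs, Mat *A)
{
  PetscErrorCode ierr;

  ierr = MatCreate(PETSC_COMM_WORLD,A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);    /* honors -mat_type aij | baij | ... */
  /* preallocation hints for both formats; the calls for the type that was not
     selected are simply ignored */
  ierr = MatSeqAIJSetPreallocation(*A,9*bs,NULL);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(*A,9*bs,NULL,3*bs,NULL);CHKERRQ(ierr);
  ierr = MatSeqBAIJSetPreallocation(*A,bs,9,NULL);CHKERRQ(ierr);
  ierr = MatMPIBAIJSetPreallocation(*A,bs,9,NULL,3,NULL);CHKERRQ(ierr);
  return 0;
}

With this, -mat_type aij and -mat_type baij select the storage format on the command line, which fits the data-layout comparison described above.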
> > > > > > > > I am wondering what is the best way to reconfigure Petsc to download and > > > use the appropriate package to support PCILU? > > > > Rerun configure with the additional option --download-superlu_dist=1. > > > > You can do this with current PETSC_ARCH you are using [i.e reinstall > > over the current build] - or use a different PETSC_ARCH - so both > > builds exist and useable. > > > > Satish > > > > > > > > You advice is highly appreciated. > > > > > > > > > > > > > > > > -- > > Najeeb Ahmad > > > > Research and Teaching Assistant > > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > > Computer Science and Engineering > > Ko? University, Istanbul, Turkey > > > > > > > -- > Najeeb Ahmad > > Research and Teaching Assistant > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > Computer Science and Engineering > Ko? University, Istanbul, Turkey > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: paper.pdf Type: application/pdf Size: 2277996 bytes Desc: paper.pdf URL: From bsmith at mcs.anl.gov Tue May 29 14:33:46 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 29 May 2018 19:33:46 +0000 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: Message-ID: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem. Barry > On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: > > > Hi, > > I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? From knepley at gmail.com Tue May 29 16:21:59 2018 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 29 May 2018 17:21:59 -0400 Subject: [petsc-users] [petsc4py] DMPlex and DT class In-Reply-To: References: Message-ID: On Tue, May 29, 2018 at 1:14 PM, Valerio Barnabei < valerio.barnabei at uniroma1.it> wrote: > Hello, > I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, > snes/ex77 in python using petsc4py. Unfortunately I'm having trouble > finding something analogue to the DT class of C++ PETSc, to call methods > like PetscFECreateDefault and similar. > Is this something that can be achieved using petsc4py? > I mean, is there something included in DM, DMDA or DMPlex that takes care > of the discretization of a generic value field that I'm missing? (As far as > I can see and understand, no DT class is implemented in petsc4py) > I hope i explained myself, unfortunately I'm still a new user. > Hi Valerio, There are no Python interfaces for DT because it is all experimental code. We have not yet agreed that this is the correct way to do things, so its all my own C experimentation. I could help you understand it to make Python interfaces if you wanted to. Thanks, Matt > Thanks in advance for your help. > > Best regards, > Valerio > > ___________________________________________ > *Il tuo 5 diventa 1000* > Fai crescere la tua universit? 
> Dona il 5 per mille alla Sapienza > Codice fiscale: *80209930587* > https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer > sita-con-il-cinque-mille > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Tue May 29 16:41:46 2018 From: jczhang at mcs.anl.gov (Junchao Zhang) Date: Tue, 29 May 2018 16:41:46 -0500 Subject: [petsc-users] Poor weak scaling when solving successivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: The log files have something like "Average time for zero size MPI_Send(): 1.84231e-05". It looks you ran on a cluster with a very slow network. A typical machine should give less than 1/10 of the latency you have. An easy way to try is just running the code on a machine with a faster network and see what happens. Also, how many cores & numa domains does a compute node have? I could not figure out how you distributed the 125 MPI ranks evenly. --Junchao Zhang On Tue, May 29, 2018 at 6:18 AM, Michael Becker < Michael.Becker at physik.uni-giessen.de> wrote: > Hello again, > > here are the updated log_view files for 125 and 1000 processors. I ran > both problems twice, the first time with all processors per node allocated > ("-1.txt"), the second with only half on twice the number of nodes > ("-2.txt"). > > On May 24, 2018, at 12:24 AM, Michael Becker wrote: > > I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). > > Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. > > > I mean this, right in the log_view output: > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > ... > > --- Event Stage 1: First Solve > > ... > > --- Event Stage 2: Remaining Solves > > Vector 23904 23904 1295501184 0. > > I logged the exact number of KSP iterations over the 999 timesteps and its > exactly 23904/6 = 3984. > > Michael > > > > Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: > > Please send the log file for 1000 with cg as the solver. > > You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) > > > > > On May 24, 2018, at 12:24 AM, Michael Becker wrote: > > I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). > > Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. 
> > > > > This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). > > > Thanks in advance. > > Michael > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue May 29 22:01:13 2018 From: hongzhang at anl.gov (Zhang, Hong) Date: Wed, 30 May 2018 03:01:13 +0000 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: References: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> Message-ID: <938429CF-8D28-4428-9B23-DF405571D0B8@anl.gov> On May 29, 2018, at 3:05 AM, Najeeb Ahmad > wrote: On Mon, May 28, 2018 at 9:32 PM, Smith, Barry F. > wrote: > On May 28, 2018, at 10:32 AM, Najeeb Ahmad > wrote: > > Thanks a lot Satish for your prompt reply. > > I just checked that SuperLU_dist package works only for matrices of type aij. and uses lu preconditioner. I am currently working with baij matrix. What is the best preconditioner choice for baij matrices on parallel machines? The best preconditioner is always problem specific. Where does your problem come from? CFD? Structural mechanics? other apps? I am interested in writing solver for reservoir simulation employing FVM and unstructured grids. My main objective is to study performance of the code with different data structures/data layouts and architecture specific optimizations, specifically targeting the multicore architectures like KNL for instance. Later the study may be extended to include GPUs. The options for switching between AIJ and BAIJ etc. are therefore very useful for my study. The purpose why I wanted to change the preconditioner is that the default preconditioner is giving me different iterations count for different number of processesors. I would rather like a preconditioner that would give me same iteration count for any processor count so that I can better compare the performance results. Your suggestions in this regard are highly appreciated, specifically with reference to the following points: - Is it possible to explicitly use high bandwidth memory in PETSc for selected object placement (e.g. using memkind library for instance)? Yes, see http://www.mcs.anl.gov/petsc/petsc-3.8/src/sys/memory/mhbw.c.html This was developed to use memkind to handle adjoint checkpointing where I want to use HBW memory for computation and DRAM for storing checkpoints. But it can be used for your purpose as well. When configure PETSc, use "--with-memkind-dir=" to specify the location of the memkind library. The runtime option "-malloc_hbw" will allow you to allocate all PETSc objects in HBW memory. If the HBW memory is ran out, it falls back to DRAM. If you want to place selective objects in DRAM, you can do PetscMallocSetDRAM() ... allocate your objects ... PetscMallocResetDRAM() An example usage can be found at http://www.mcs.anl.gov/petsc/petsc-dev/src/ts/trajectory/impls/memory/trajmemory.c - What would it take to take advantage of architecture specific compiler flags to achieve good performance on a given platform (e.g. -xMIC-AVX512 for AVX512 on KNL, #pragma SIMD etc.). To build PETSc on KNL with AVX512 enabled, see the example scripts config/examples/arch-linux-knl.py config/examples/arch-cray-xc40-knl-opt.py Note that the MatMult kernel (for AIJ and SELL) has been manually optimized for best performance. Hong (Mr.) Sorry for some very basic questions as I am a novice PETSc user. 
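A sketch of the selective-placement pattern Hong describes above: run with -malloc_hbw so PETSc allocations land in high-bandwidth memory by default, and bracket the objects that should stay in ordinary DRAM (checkpoint storage, say) with the two calls he names. The checkpoint vector here is only an illustrative placeholder.

#include <petscvec.h>

PetscErrorCode CreateCheckpointInDRAM(MPI_Comm comm, PetscInt N, Vec *chk)
{
  PetscErrorCode ierr;

  ierr = PetscMallocSetDRAM();CHKERRQ(ierr);     /* subsequent PetscMalloc()s use DRAM */
  ierr = VecCreateMPI(comm,PETSC_DECIDE,N,chk);CHKERRQ(ierr);
  ierr = PetscMallocResetDRAM();CHKERRQ(ierr);   /* back to the -malloc_hbw default */
  return 0;
}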
Thanks for your time :) Anyways you probably want to make your code be able to switch between AIJ and BAIJ at run time since the different formats support somewhat different solvers. If your code alls MatSetFromOptions then you can switch via the command line option -mat_type aij or baij Barry > > Thanks > > On Mon, May 28, 2018 at 8:23 PM, Satish Balay > wrote: > On Mon, 28 May 2018, Najeeb Ahmad wrote: > > > Hi All, > > > > I have Petsc release version 3.9.2 configured with the following options: > > > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > > --download-fblaslapack=1 > > > > Now I want to use PCILU in my code and when I set the PC type to PCILU in > > the code, I get the following error: > > > > [0]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [0]PETSC ERROR: See > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for > > possible LU and Cholesky solvers > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must > > ./configure with --download- > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > > trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by nahmad Mon > > May 28 17:52:41 2018 > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc > > --with-fc=mpiifort --download-fblaslapack=1 > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > [0]PETSC ERROR: #3 PCSetUp() line 923 in > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #5 KSPSolve() line 612 in > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 SolveSystem() line 60 in > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c > > > > > > I assume that I am missing LU package like SuperLU_dist for instance and I > > need to download and configure it with Petsc. > > yes - petsc has sequential LU - but you need superlu_dist/mumps for parallel lu. > > > > > I am wondering what is the best way to reconfigure Petsc to download and > > use the appropriate package to support PCILU? > > Rerun configure with the additional option --download-superlu_dist=1. > > You can do this with current PETSC_ARCH you are using [i.e reinstall > over the current build] - or use a different PETSC_ARCH - so both > builds exist and useable. > > Satish > > > > > You advice is highly appreciated. > > > > > > > > > -- > Najeeb Ahmad > > Research and Teaching Assistant > PARallel and MultiCORE Computing Laboratory (ParCoreLab) > Computer Science and Engineering > Ko? University, Istanbul, Turkey > -- Najeeb Ahmad Research and Teaching Assistant PARallel and MultiCORE Computing Laboratory (ParCoreLab) Computer Science and Engineering Ko? University, Istanbul, Turkey -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From valerio.barnabei at uniroma1.it Wed May 30 02:38:18 2018 From: valerio.barnabei at uniroma1.it (Valerio Barnabei) Date: Wed, 30 May 2018 09:38:18 +0200 Subject: [petsc-users] [petsc4py] DMPlex and DT class In-Reply-To: References: Message-ID: Thank you Matt, I've read some of your works and I'm interested in your approach. I would like to keep in touch with you, both to consider the option to make python interfaces for DT class, and to have a deeper understanding of Sieve. Best regards, Valerio Francesco Barnabei Il 29 mag 2018 23:22, "Matthew Knepley" ha scritto: On Tue, May 29, 2018 at 1:14 PM, Valerio Barnabei < valerio.barnabei at uniroma1.it> wrote: > Hello, > I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, > snes/ex77 in python using petsc4py. Unfortunately I'm having trouble > finding something analogue to the DT class of C++ PETSc, to call methods > like PetscFECreateDefault and similar. > Is this something that can be achieved using petsc4py? > I mean, is there something included in DM, DMDA or DMPlex that takes care > of the discretization of a generic value field that I'm missing? (As far as > I can see and understand, no DT class is implemented in petsc4py) > I hope i explained myself, unfortunately I'm still a new user. > Hi Valerio, There are no Python interfaces for DT because it is all experimental code. We have not yet agreed that this is the correct way to do things, so its all my own C experimentation. I could help you understand it to make Python interfaces if you wanted to. Thanks, Matt > Thanks in advance for your help. > > Best regards, > Valerio > > ___________________________________________ > *Il tuo 5 diventa 1000* > Fai crescere la tua universit? > Dona il 5 per mille alla Sapienza > Codice fiscale: *80209930587* > https://www.uniroma1.it/it/pagina/fai-crescere-la-tua- > universita-con-il-cinque-mille > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- ___________________________________________ *Il tuo?5?diventa 1000* Fai crescere la tua universit? Dona il?5?per?mille?alla Sapienza Codice fiscale:?*80209930587* https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-universita-con-il-cinque-mille -------------- next part -------------- An HTML attachment was scrubbed... URL: From Michael.Becker at physik.uni-giessen.de Wed May 30 03:27:07 2018 From: Michael.Becker at physik.uni-giessen.de (Michael Becker) Date: Wed, 30 May 2018 10:27:07 +0200 Subject: [petsc-users] Poor weak scaling when solvingsuccessivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: Barry: On its way. Could take a couple days again. Junchao: I unfortunately don't have access to a cluster with a faster network. This one has a mixed 4X QDR-FDR InfiniBand 2:1 blocking fat-tree network, which I realize causes parallel slowdown if the nodes are not connected to the same switch. Each node has 24 processors (2x12/socket) and four NUMA domains (two for each socket). The ranks are usually not distributed perfectly even, i.e. for 125 processes, of the six required nodes, five would use 21 cores and one 20. Would using another CPU type make a difference communication-wise? 
I could switch to faster ones (on the same network), but I always assumed this would only improve performance of the stuff that is unrelated to communication. Michael > The log files have something like "Average time for zero size > MPI_Send(): 1.84231e-05". It looks you ran on a cluster with a very > slow network. A typical machine should give less than 1/10 of the > latency you have. An easy way to try is just running the code on a > machine with a faster network and see what happens. > > Also, how many cores & numa domains does a compute node have? I could > not figure out how you distributed the 125 MPI ranks evenly. > > --Junchao Zhang > > On Tue, May 29, 2018 at 6:18 AM, Michael Becker > > wrote: > > Hello again, > > here are the updated log_view files for 125 and 1000 processors. I > ran both problems twice, the first time with all processors per > node allocated ("-1.txt"), the second with only half on twice the > number of nodes ("-2.txt"). > > >>> On May 24, 2018, at 12:24 AM, Michael Becker >>> wrote: >>> >>> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). >> Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. > > I mean this, right in the log_view output: > >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> ... >> >> --- Event Stage 1: First Solve >> >> ... >> >> --- Event Stage 2: Remaining Solves >> >> Vector 23904 23904 1295501184 0. > I logged the exact number of KSP iterations over the 999 timesteps > and its exactly 23904/6 = 3984. > > Michael > > > > Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: >> Please send the log file for 1000 with cg as the solver. >> >> You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) >> >> >> >>> On May 24, 2018, at 12:24 AM, Michael Becker >>> wrote: >>> >>> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). >> Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. >> >> >> >>> This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). >>> >>> >>> Thanks in advance. >>> >>> Michael >>> >>> >>> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From valerio.barnabei at uniroma1.it Wed May 30 04:45:33 2018 From: valerio.barnabei at uniroma1.it (Valerio Barnabei) Date: Wed, 30 May 2018 11:45:33 +0200 Subject: [petsc-users] [petsc4py] DMPlex and DT class In-Reply-To: References: Message-ID: Hello again, I have a few questions about DT and DMPlex in general, not exclusively related to petsc4py implementation: in fact at the current state of my project, which is still being set up, I can choose either to program with C++ or python. -So far, we used PETSc indirectly through libmesh interface. We have a fully working FEM code for FSI problems, and that's our starting point for further research. We're currently interested in developing our own, lighter, simpler interface for the same class of problems, by simply cutting out libmesh, and directly accessing to PETSc. We are strongly determined to implement the "space-time variational formulation of incompressible flows" by Tezduyar and Takizawa (eventually you can find theory informations in https://www.worldscientific.com/doi/abs/10.1142/S0218202512300013). In this approach the 3D problem becomes a 4D problem including the time dimension, as well discretized with it's own temporal basis functions, resulting in a sequence of time slabs (spatial meshes of a space-time slab is the deformed versions of each other). Unfortunately, as I already mentioned, I'm not quite confident with DT and DMPlex, and even reading examples has not helped my understanding. Do you think DT and DMPlex can handle this kind of 4D representation, or do you think this is incompatible with the current state of that classes? -If DMPlex and DT could handle the above mentioned model, do you think we could get any help in implementing the python interface for petsc4py? -If DMPlex and DT could NOT handle the above mentioned model, do you think they can be adapted to achieve that purpose? If they can be adapted, is it a huge, deep modification of those classes, to the point it is not worth even trying? I apologize for my verbose mail, I'm available for further explanation if required. Best regards, Valerio Francesco Barnabei 2018-05-30 9:38 GMT+02:00 Valerio Barnabei : > Thank you Matt, > I've read some of your works and I'm interested in your approach. I would > like to keep in touch with you, both to consider the option to make python > interfaces for DT class, and to have a deeper understanding of Sieve. > > Best regards, > Valerio Francesco Barnabei > > Il 29 mag 2018 23:22, "Matthew Knepley" ha scritto: > > On Tue, May 29, 2018 at 1:14 PM, Valerio Barnabei < > valerio.barnabei at uniroma1.it> wrote: > >> Hello, >> I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, >> snes/ex77 in python using petsc4py. Unfortunately I'm having trouble >> finding something analogue to the DT class of C++ PETSc, to call methods >> like PetscFECreateDefault and similar. >> Is this something that can be achieved using petsc4py? >> I mean, is there something included in DM, DMDA or DMPlex that takes care >> of the discretization of a generic value field that I'm missing? (As far as >> I can see and understand, no DT class is implemented in petsc4py) >> I hope i explained myself, unfortunately I'm still a new user. >> > > Hi Valerio, > > There are no Python interfaces for DT because it is all experimental code. > We have not yet agreed that this is > the correct way to do things, so its all my own C experimentation. I could > help you understand it to make Python > interfaces if you wanted to. 
> > Thanks, > > Matt > > >> Thanks in advance for your help. >> >> Best regards, >> Valerio >> >> ___________________________________________ >> *Il tuo 5 diventa 1000* >> Fai crescere la tua universit? >> Dona il 5 per mille alla Sapienza >> Codice fiscale: *80209930587* >> https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer >> sita-con-il-cinque-mille >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- ___________________________________________ *Il tuo?5?diventa 1000* Fai crescere la tua universit? Dona il?5?per?mille?alla Sapienza Codice fiscale:?*80209930587* https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-universita-con-il-cinque-mille -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 30 07:12:03 2018 From: jed at jedbrown.org (Jed Brown) Date: Wed, 30 May 2018 06:12:03 -0600 Subject: [petsc-users] [petsc4py] DMPlex and DT class In-Reply-To: References: Message-ID: <87fu29filo.fsf@jedbrown.org> Are you doing space-time adaptivity? Via space-time conforming meshes or with hanging nodes? If you just solve a space-time slab (even with hanging node refinement) then there is no need to store 4D topology. This would save you the need to deal with orientation in 4D, which is somewhat complicated and would be hard to debug. Furthermore, if your spatial and/or temporal adaptivity are split (this can be a different spatially adaptive mesh with an adaptive width "time slab", just not simultaneous space-time adaptivity) then the space-time formulation (after choice of quadrature) is algebraically equivalent to a Runge-Kutta method. Valerio Barnabei writes: > Hello again, > I have a few questions about DT and DMPlex in general, not exclusively > related to petsc4py implementation: in fact at the current state of my > project, which is still being set up, I can choose either to program with > C++ or python. > > -So far, we used PETSc indirectly through libmesh interface. We have a > fully working FEM code for FSI problems, and that's our starting point for > further research. We're currently interested in developing our own, > lighter, simpler interface for the same class of problems, by simply > cutting out libmesh, and directly accessing to PETSc. We are strongly > determined to implement the "space-time variational formulation of > incompressible flows" by Tezduyar and Takizawa (eventually you can find > theory informations in > https://www.worldscientific.com/doi/abs/10.1142/S0218202512300013). In this > approach the 3D problem becomes a 4D problem including the time dimension, > as well discretized with it's own temporal basis functions, resulting in a > sequence of time slabs (spatial meshes of a space-time slab is the deformed > versions of each other). Unfortunately, as I already mentioned, I'm not > quite confident with DT and DMPlex, and even reading examples has not > helped my understanding. > Do you think DT and DMPlex can handle this kind of 4D representation, or do > you think this is incompatible with the current state of that classes? > > -If DMPlex and DT could handle the above mentioned model, do you think we > could get any help in implementing the python interface for petsc4py? > > -If DMPlex and DT could NOT handle the above mentioned model, do you think > they can be adapted to achieve that purpose? 
If they can be adapted, is it > a huge, deep modification of those classes, to the point it is not worth > even trying? > > > I apologize for my verbose mail, I'm available for further explanation if > required. > > Best regards, > Valerio Francesco Barnabei > > > 2018-05-30 9:38 GMT+02:00 Valerio Barnabei : > >> Thank you Matt, >> I've read some of your works and I'm interested in your approach. I would >> like to keep in touch with you, both to consider the option to make python >> interfaces for DT class, and to have a deeper understanding of Sieve. >> >> Best regards, >> Valerio Francesco Barnabei >> >> Il 29 mag 2018 23:22, "Matthew Knepley" ha scritto: >> >> On Tue, May 29, 2018 at 1:14 PM, Valerio Barnabei < >> valerio.barnabei at uniroma1.it> wrote: >> >>> Hello, >>> I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, >>> snes/ex77 in python using petsc4py. Unfortunately I'm having trouble >>> finding something analogue to the DT class of C++ PETSc, to call methods >>> like PetscFECreateDefault and similar. >>> Is this something that can be achieved using petsc4py? >>> I mean, is there something included in DM, DMDA or DMPlex that takes care >>> of the discretization of a generic value field that I'm missing? (As far as >>> I can see and understand, no DT class is implemented in petsc4py) >>> I hope i explained myself, unfortunately I'm still a new user. >>> >> >> Hi Valerio, >> >> There are no Python interfaces for DT because it is all experimental code. >> We have not yet agreed that this is >> the correct way to do things, so its all my own C experimentation. I could >> help you understand it to make Python >> interfaces if you wanted to. >> >> Thanks, >> >> Matt >> >> >>> Thanks in advance for your help. >>> >>> Best regards, >>> Valerio >>> >>> ___________________________________________ >>> *Il tuo 5 diventa 1000* >>> Fai crescere la tua universit? >>> Dona il 5 per mille alla Sapienza >>> Codice fiscale: *80209930587* >>> https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer >>> sita-con-il-cinque-mille >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > ___________________________________________ > *Il tuo?5?diventa 1000* > Fai > crescere la tua universit? > Dona il?5?per?mille?alla Sapienza > Codice > fiscale:?*80209930587* > > https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-universita-con-il-cinque-mille > From knepley at gmail.com Wed May 30 07:47:25 2018 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 30 May 2018 08:47:25 -0400 Subject: [petsc-users] [petsc4py] DMPlex and DT class In-Reply-To: References: Message-ID: On Wed, May 30, 2018 at 5:45 AM, Valerio Barnabei < valerio.barnabei at uniroma1.it> wrote: > Hello again, > I have a few questions about DT and DMPlex in general, not exclusively > related to petsc4py implementation: in fact at the current state of my > project, which is still being set up, I can choose either to program with > C++ or python. > > -So far, we used PETSc indirectly through libmesh interface. We have a > fully working FEM code for FSI problems, and that's our starting point for > further research. We're currently interested in developing our own, > lighter, simpler interface for the same class of problems, by simply > cutting out libmesh, and directly accessing to PETSc. 
We are strongly > determined to implement the "space-time variational formulation of > incompressible flows" by Tezduyar and Takizawa (eventually you can find > theory informations in https://www.worldscientific.com/doi/abs/10.1142/ > S0218202512300013). In this approach the 3D problem becomes a 4D problem > including the time dimension, as well discretized with it's own temporal > basis functions, resulting in a sequence of time slabs (spatial meshes of a > space-time slab is the deformed versions of each other). Unfortunately, as > I already mentioned, I'm not quite confident with DT and DMPlex, and even > reading examples has not helped my understanding. > Do you think DT and DMPlex can handle this kind of 4D representation, or > do you think this is incompatible with the current state of that classes? > Too easy answer: yes More nuanced answer: I am not an expert in this area. I agree with Jed that representing it fully is probably not what you want. Usually these 4D things are tensor products. Its not hard to do this in Plex (I do it in PyLith), but the discretization support is somewhat clunky for this. The truly beautiful way to do this is implemented in Firedrake (which uses Plex underneath). I want to incorporate it, but have not had any time. Bottom line: DT is not going to do this out of box. However, in order to do it, you would need all the pieces in DT, judging by the way its done in Firedrake: https://www.geosci-model-dev.net/9/3803/2016/ > -If DMPlex and DT could handle the above mentioned model, do you think we > could get any help in implementing the python interface for petsc4py? > Yes, its not that hard to wrap stuff. The reason its not already done is that DT has a bunch of arrays passing in the interface, and that is still not automatic in 2018. > -If DMPlex and DT could NOT handle the above mentioned model, do you think > they can be adapted to achieve that purpose? If they can be adapted, is it > a huge, deep modification of those classes, to the point it is not worth > even trying? > I think the adaptation is not a research effort, since everything has already been worked out. I think its definitely an MS project level of difficulty in the programming and verification. > I apologize for my verbose mail, I'm available for further explanation if > required. > No problem. The questions are interesting. Thanks, Matt > Best regards, > Valerio Francesco Barnabei > > > 2018-05-30 9:38 GMT+02:00 Valerio Barnabei : > >> Thank you Matt, >> I've read some of your works and I'm interested in your approach. I >> would like to keep in touch with you, both to consider the option to make >> python interfaces for DT class, and to have a deeper understanding of >> Sieve. >> >> Best regards, >> Valerio Francesco Barnabei >> >> Il 29 mag 2018 23:22, "Matthew Knepley" ha scritto: >> >> On Tue, May 29, 2018 at 1:14 PM, Valerio Barnabei < >> valerio.barnabei at uniroma1.it> wrote: >> >>> Hello, >>> I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, >>> snes/ex77 in python using petsc4py. Unfortunately I'm having trouble >>> finding something analogue to the DT class of C++ PETSc, to call methods >>> like PetscFECreateDefault and similar. >>> Is this something that can be achieved using petsc4py? >>> I mean, is there something included in DM, DMDA or DMPlex that takes >>> care of the discretization of a generic value field that I'm missing? 
(As >>> far as I can see and understand, no DT class is implemented in petsc4py) >>> I hope i explained myself, unfortunately I'm still a new user. >>> >> >> Hi Valerio, >> >> There are no Python interfaces for DT because it is all experimental >> code. We have not yet agreed that this is >> the correct way to do things, so its all my own C experimentation. I >> could help you understand it to make Python >> interfaces if you wanted to. >> >> Thanks, >> >> Matt >> >> >>> Thanks in advance for your help. >>> >>> Best regards, >>> Valerio >>> >>> ___________________________________________ >>> *Il tuo 5 diventa 1000* >>> Fai crescere la tua universit? >>> Dona il 5 per mille alla Sapienza >>> Codice fiscale: *80209930587* >>> https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer >>> sita-con-il-cinque-mille >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > ___________________________________________ > *Il tuo 5 diventa 1000* > Fai crescere la tua universit? > Dona il 5 per mille alla Sapienza > Codice fiscale: *80209930587* > https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer > sita-con-il-cinque-mille > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed May 30 07:53:00 2018 From: jczhang at mcs.anl.gov (Junchao Zhang) Date: Wed, 30 May 2018 07:53:00 -0500 Subject: [petsc-users] Poor weak scaling when solvingsuccessivelinearsystems In-Reply-To: References: <6d9b6488-3957-0d74-148a-323c78d42f6a@physik.uni-giessen.de> <079D8B4D-C57B-4054-B9BD-05644E52C3A4@anl.gov> Message-ID: If you have an example code and can share that, I can test and further profile it on our machine. --Junchao Zhang On Wed, May 30, 2018 at 3:27 AM, Michael Becker < Michael.Becker at physik.uni-giessen.de> wrote: > Barry: On its way. Could take a couple days again. > > Junchao: I unfortunately don't have access to a cluster with a faster > network. This one has a mixed 4X QDR-FDR InfiniBand 2:1 blocking fat-tree > network, which I realize causes parallel slowdown if the nodes are not > connected to the same switch. Each node has 24 processors (2x12/socket) and > four NUMA domains (two for each socket). > The ranks are usually not distributed perfectly even, i.e. for 125 > processes, of the six required nodes, five would use 21 cores and one 20. > Would using another CPU type make a difference communication-wise? I could > switch to faster ones (on the same network), but I always assumed this > would only improve performance of the stuff that is unrelated to > communication. > > Michael > > > > The log files have something like "Average time for zero size MPI_Send(): > 1.84231e-05". It looks you ran on a cluster with a very slow network. A > typical machine should give less than 1/10 of the latency you have. An easy > way to try is just running the code on a machine with a faster network and > see what happens. > > Also, how many cores & numa domains does a compute node have? I could not > figure out how you distributed the 125 MPI ranks evenly. 
> > --Junchao Zhang > > On Tue, May 29, 2018 at 6:18 AM, Michael Becker < > Michael.Becker at physik.uni-giessen.de> wrote: > >> Hello again, >> >> here are the updated log_view files for 125 and 1000 processors. I ran >> both problems twice, the first time with all processors per node allocated >> ("-1.txt"), the second with only half on twice the number of nodes >> ("-2.txt"). >> >> On May 24, 2018, at 12:24 AM, Michael Becker wrote: >> >> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). >> >> Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. >> >> >> I mean this, right in the log_view output: >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> ... >> >> --- Event Stage 1: First Solve >> >> ... >> >> --- Event Stage 2: Remaining Solves >> >> Vector 23904 23904 1295501184 0. >> >> I logged the exact number of KSP iterations over the 999 timesteps and >> its exactly 23904/6 = 3984. >> >> Michael >> >> >> >> Am 24.05.2018 um 19:50 schrieb Smith, Barry F.: >> >> Please send the log file for 1000 with cg as the solver. >> >> You should make a bar chart of each event for the two cases to see which ones are taking more time and which are taking less (we cannot tell with the two logs you sent us since they are for different solvers.) >> >> >> >> >> On May 24, 2018, at 12:24 AM, Michael Becker wrote: >> >> I noticed that for every individual KSP iteration, six vector objects are created and destroyed (with CG, more with e.g. GMRES). >> >> Hmm, it is certainly not intended at vectors be created and destroyed within each KSPSolve() could you please point us to the code that makes you think they are being created and destroyed? We create all the work vectors at KSPSetUp() and destroy them in KSPReset() not during the solve. Not that this would be a measurable distance. >> >> >> >> >> This seems kind of wasteful, is this supposed to be like this? Is this even the reason for my problems? Apart from that, everything seems quite normal to me (but I'm not the expert here). >> >> >> Thanks in advance. >> >> Michael >> >> >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From valerio.barnabei at uniroma1.it Wed May 30 08:12:26 2018 From: valerio.barnabei at uniroma1.it (Valerio Barnabei) Date: Wed, 30 May 2018 15:12:26 +0200 Subject: [petsc-users] [petsc4py] DMPlex and DT class In-Reply-To: References: Message-ID: Jed, Matt, Thank you for your kind answers. I will soon write you back with a more precise (and concise) description of the space time method we intend to implement in our code, and maybe have further confrontation on this subject. Your knowledge and experience is really precious to me. Thank you again. 
Valerio 2018-05-30 14:47 GMT+02:00 Matthew Knepley : > On Wed, May 30, 2018 at 5:45 AM, Valerio Barnabei < > valerio.barnabei at uniroma1.it> wrote: > >> Hello again, >> I have a few questions about DT and DMPlex in general, not exclusively >> related to petsc4py implementation: in fact at the current state of my >> project, which is still being set up, I can choose either to program with >> C++ or python. >> >> -So far, we used PETSc indirectly through libmesh interface. We have a >> fully working FEM code for FSI problems, and that's our starting point for >> further research. We're currently interested in developing our own, >> lighter, simpler interface for the same class of problems, by simply >> cutting out libmesh, and directly accessing to PETSc. We are strongly >> determined to implement the "space-time variational formulation of >> incompressible flows" by Tezduyar and Takizawa (eventually you can find >> theory informations in https://www.worldscientific.co >> m/doi/abs/10.1142/S0218202512300013). In this approach the 3D problem >> becomes a 4D problem including the time dimension, as well discretized with >> it's own temporal basis functions, resulting in a sequence of time slabs >> (spatial meshes of a space-time slab is the deformed versions of each >> other). Unfortunately, as I already mentioned, I'm not quite confident with >> DT and DMPlex, and even reading examples has not helped my understanding. >> Do you think DT and DMPlex can handle this kind of 4D representation, or >> do you think this is incompatible with the current state of that classes? >> > > Too easy answer: yes > > More nuanced answer: I am not an expert in this area. I agree with Jed > that representing it fully is probably not what you want. Usually these > 4D things are tensor products. Its not hard to do this in Plex (I do it in > PyLith), but the discretization support is somewhat clunky for this. The > truly beautiful way to do this is implemented in Firedrake (which uses > Plex underneath). I want to incorporate it, but have not had any time. > > Bottom line: DT is not going to do this out of box. However, in order to > do it, you would need all the pieces in DT, judging by the way its done > in Firedrake: https://www.geosci-model-dev.net/9/3803/2016/ > > >> -If DMPlex and DT could handle the above mentioned model, do you think we >> could get any help in implementing the python interface for petsc4py? >> > > Yes, its not that hard to wrap stuff. The reason its not already done is > that DT has a bunch of arrays passing in the interface, and that is > still not automatic in 2018. > > >> -If DMPlex and DT could NOT handle the above mentioned model, do you >> think they can be adapted to achieve that purpose? If they can be adapted, >> is it a huge, deep modification of those classes, to the point it is not >> worth even trying? >> > > I think the adaptation is not a research effort, since everything has > already been worked out. I think its definitely an MS project level of > difficulty > in the programming and verification. > > >> I apologize for my verbose mail, I'm available for further explanation if >> required. >> > > No problem. The questions are interesting. > > Thanks, > > Matt > > >> Best regards, >> Valerio Francesco Barnabei >> >> >> 2018-05-30 9:38 GMT+02:00 Valerio Barnabei >> : >> >>> Thank you Matt, >>> I've read some of your works and I'm interested in your approach. 
I >>> would like to keep in touch with you, both to consider the option to make >>> python interfaces for DT class, and to have a deeper understanding of >>> Sieve. >>> >>> Best regards, >>> Valerio Francesco Barnabei >>> >>> Il 29 mag 2018 23:22, "Matthew Knepley" ha scritto: >>> >>> On Tue, May 29, 2018 at 1:14 PM, Valerio Barnabei < >>> valerio.barnabei at uniroma1.it> wrote: >>> >>>> Hello, >>>> I'm trying to figure how to translate snes/ex12, snes/ex56, snes/ex62, >>>> snes/ex77 in python using petsc4py. Unfortunately I'm having trouble >>>> finding something analogue to the DT class of C++ PETSc, to call methods >>>> like PetscFECreateDefault and similar. >>>> Is this something that can be achieved using petsc4py? >>>> I mean, is there something included in DM, DMDA or DMPlex that takes >>>> care of the discretization of a generic value field that I'm missing? (As >>>> far as I can see and understand, no DT class is implemented in petsc4py) >>>> I hope i explained myself, unfortunately I'm still a new user. >>>> >>> >>> Hi Valerio, >>> >>> There are no Python interfaces for DT because it is all experimental >>> code. We have not yet agreed that this is >>> the correct way to do things, so its all my own C experimentation. I >>> could help you understand it to make Python >>> interfaces if you wanted to. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks in advance for your help. >>>> >>>> Best regards, >>>> Valerio >>>> >>>> ___________________________________________ >>>> *Il tuo 5 diventa 1000* >>>> Fai crescere la tua universit? >>>> Dona il 5 per mille alla Sapienza >>>> Codice fiscale: *80209930587* >>>> https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer >>>> sita-con-il-cinque-mille >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >> >> ___________________________________________ >> *Il tuo 5 diventa 1000* >> Fai crescere la tua universit? >> Dona il 5 per mille alla Sapienza >> Codice fiscale: *80209930587* >> https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-univer >> sita-con-il-cinque-mille >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- ___________________________________________ *Il tuo?5?diventa 1000* Fai crescere la tua universit? Dona il?5?per?mille?alla Sapienza Codice fiscale:?*80209930587* https://www.uniroma1.it/it/pagina/fai-crescere-la-tua-universita-con-il-cinque-mille -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 30 12:58:02 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 30 May 2018 17:58:02 +0000 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Message-ID: Fixed in the branch barry/fix-mat-new-nonzero-locations/maint Once this passes testing it will go into the maint branch and then the next patch release but you can use it now in the branch barry/fix-mat-new-nonzero-locations/maint Thanks for the report and reproducible example Barry > On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: > > Sure, I made a small reproducer, it is Fortran though I hope that is ok. 
If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is set to true the new nonzero element is inserted, if MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR have been set to false I get an error again. > > > program newnonzero > #include > use petscmat > implicit none > > Mat :: A > PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 > PetscScalar :: v(1) > PetscReal :: info(MAT_INFO_SIZE) > PetscErrorCode :: ierr > > integer :: nproc,iproc,i > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) > > call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) > > n=3 > m=n > call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) > > > call MatGetOwnershipRange(A,nl1,nl2,ierr) > do i=nl1,nl2-1 > idxn(1)=i > idxm(1)=i > v(1)=1d0 > call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > end do > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) > !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR ,PETSC_FALSE,ierr) > !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > > > idxn(1)=0 > idxm(1)=n-1 > if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > v(1)=2d0 > call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > end if > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > v(1)=2d0 > call MatGetValues(A,1,idxn,1,idxm, v,ierr) > write(6,*) v > end if > > call PetscFinalize(ierr) > > end program newnonzero > > > > $ mpiexec.hydra -n 3 ./a.out > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) into matrix > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 > [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018 > [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1 > [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c > (0.000000000000000E+000,0.000000000000000E+000) > > > > Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem. > > Barry > > >> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: >> >> >> Hi, >> >> I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? > From mbuerkle at web.de Wed May 30 18:55:16 2018 From: mbuerkle at web.de (Marius Buerkle) Date: Thu, 31 May 2018 01:55:16 +0200 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Message-ID: Thanks for the quick fix, I will test it and report back. I have another maybe related question, if MAT_NEW_NONZERO_LOCATIONS is true and let's say 1 new nonzero position is created it does not allocated 1 but several new nonzeros but only use 1. I think that is normal, right? But, at least as far as I understand the manual, a subsequent call of mat assemble with MAT_FINAL_ASSEMBLY should compress out the unused allocations and release the memory, is this correct? If so, this did not work for me, even after doing MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this normal? 
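One way to see what happens to the extra preallocation Marius asks about is MatGetInfo(); a small C sketch follows (the matrix A, its preallocation and its assembly are assumed to exist elsewhere, and the function name is illustrative):

#include <petscmat.h>

/* Sketch: report how much of the preallocation of an assembled AIJ matrix was
 * actually used. With the fix in barry/fix-mat-new-nonzero-locations/maint,
 * insertions outside the preallocated pattern are silently dropped when
 * MAT_NEW_NONZERO_LOCATIONS is PETSC_FALSE. */
PetscErrorCode ReportPreallocation(Mat A)
{
  MatInfo        info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatSetOption(A, MAT_NEW_NONZERO_LOCATIONS, PETSC_FALSE);CHKERRQ(ierr);
  /* ... MatSetValues() calls and MatAssemblyBegin/End(A, MAT_FINAL_ASSEMBLY) ... */
  ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
  /* nz_allocated stays at the preallocated size: assembly compresses unused
   * slots to the end of the internal arrays but does not free them (see
   * Barry's reply further down); nz_used and nz_unneeded show what was
   * actually consumed. */
  ierr = PetscPrintf(PETSC_COMM_SELF, "nz_allocated %g  nz_used %g  nz_unneeded %g  mallocs %g\n",
                     info.nz_allocated, info.nz_used, info.nz_unneeded, info.mallocs);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}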
> > Fixed in the branch barry/fix-mat-new-nonzero-locations/maint > > Once this passes testing it will go into the maint branch and then the next patch release but you can use it now in the branch barry/fix-mat-new-nonzero-locations/maint > > Thanks for the report and reproducible example > > Barry > > > > On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: > > > > Sure, I made a small reproducer, it is Fortran though I hope that is ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is set to true the new nonzero element is inserted, if MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR have been set to false I get an error again. > > > > > > program newnonzero > > #include > > use petscmat > > implicit none > > > > Mat :: A > > PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 > > PetscScalar :: v(1) > > PetscReal :: info(MAT_INFO_SIZE) > > PetscErrorCode :: ierr > > > > integer :: nproc,iproc,i > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > > > call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) > > > > call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) > > > > n=3 > > m=n > > call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) > > > > > > call MatGetOwnershipRange(A,nl1,nl2,ierr) > > do i=nl1,nl2-1 > > idxn(1)=i > > idxm(1)=i > > v(1)=1d0 > > call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > > end do > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > > > call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > > !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) > > !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR ,PETSC_FALSE,ierr) > > !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > > > > > > idxn(1)=0 > > idxm(1)=n-1 > > if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > > v(1)=2d0 > > call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > > end if > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > > > if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > > v(1)=2d0 > > call MatGetValues(A,1,idxn,1,idxm, v,ierr) > > write(6,*) v > > end if > > > > call PetscFinalize(ierr) > > > > end program newnonzero > > > > > > > > $ mpiexec.hydra -n 3 ./a.out > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Argument out of range > > [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) into matrix > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 > > [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018 > > [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1 > > [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c > > (0.000000000000000E+000,0.000000000000000E+000) > > > > > > > > Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem. > > > > Barry > > > > > >> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: > >> > >> > >> Hi, > >> > >> I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? > > > > From bsmith at mcs.anl.gov Wed May 30 19:07:42 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 31 May 2018 00:07:42 +0000 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Message-ID: > On May 30, 2018, at 6:55 PM, Marius Buerkle wrote: > > Thanks for the quick fix, I will test it and report back. > I have another maybe related question, if MAT_NEW_NONZERO_LOCATIONS is true and let's say 1 new nonzero position is created it does not allocated 1 but several new nonzeros but only use 1. Correct > I think that is normal, right? Yes > But, at least as far as I understand the manual, a subsequent call of mat assemble with > MAT_FINAL_ASSEMBLY should compress out the unused allocations and release the memory, is this correct? It "compresses it out" (by shifting all the nonzero entries to the beginning of the internal i, j, and a arrays), but does NOT release any memory. 
Since the values are stored in one big contiguous array (obtained with a single malloc) it cannot just free part of the array, so the extra locations just sit harmlessly at the end if the array unused. > If so, this did not work for me, even after doing > MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this normal? Yes, Barry > >> >> Fixed in the branch barry/fix-mat-new-nonzero-locations/maint >> >> Once this passes testing it will go into the maint branch and then the next patch release but you can use it now in the branch barry/fix-mat-new-nonzero-locations/maint >> >> Thanks for the report and reproducible example >> >> Barry >> >> >>> On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: >>> >>> Sure, I made a small reproducer, it is Fortran though I hope that is ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is set to true the new nonzero element is inserted, if MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR have been set to false I get an error again. >>> >>> >>> program newnonzero >>> #include >>> use petscmat >>> implicit none >>> >>> Mat :: A >>> PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 >>> PetscScalar :: v(1) >>> PetscReal :: info(MAT_INFO_SIZE) >>> PetscErrorCode :: ierr >>> >>> integer :: nproc,iproc,i >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>> >>> call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) >>> >>> call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) >>> >>> n=3 >>> m=n >>> call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) >>> >>> >>> call MatGetOwnershipRange(A,nl1,nl2,ierr) >>> do i=nl1,nl2-1 >>> idxn(1)=i >>> idxm(1)=i >>> v(1)=1d0 >>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) >>> end do >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) >>> >>> call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR ,PETSC_FALSE,ierr) >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) >>> >>> >>> idxn(1)=0 >>> idxm(1)=n-1 >>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then >>> v(1)=2d0 >>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) >>> end if >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) >>> >>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then >>> v(1)=2d0 >>> call MatGetValues(A,1,idxn,1,idxm, v,ierr) >>> write(6,*) v >>> end if >>> >>> call PetscFinalize(ierr) >>> >>> end program newnonzero >>> >>> >>> >>> $ mpiexec.hydra -n 3 ./a.out >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) into matrix >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 >>> [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018 >>> [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1 >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c >>> [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c >>> (0.000000000000000E+000,0.000000000000000E+000) >>> >>> >>> >>> Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem. >>> >>> Barry >>> >>> >>>> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: >>>> >>>> >>>> Hi, >>>> >>>> I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? >>> >> >> From mbuerkle at web.de Thu May 31 02:37:00 2018 From: mbuerkle at web.de (Marius Buerkle) Date: Thu, 31 May 2018 09:37:00 +0200 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Message-ID: The fix for MAT_NEW_NONZERO_LOCATIONS, thanks again. I have yet another question, sorry. The recent version of MUMPS supports distributed and sparse RHS is there any chance that this will be supported in PETSc in the near future? ? ? > On May 30, 2018, at 6:55 PM, Marius Buerkle wrote: > > Thanks for the quick fix, I will test it and report back. > I have another maybe related question, if MAT_NEW_NONZERO_LOCATIONS is true and let's say 1 new nonzero position is created it does not allocated 1 but several new nonzeros but only use 1. Correct > I think that is normal, right? 
Yes > But, at least as far as I understand the manual, a subsequent call of mat assemble with > MAT_FINAL_ASSEMBLY should compress out the unused allocations and release the memory, is this correct? It "compresses it out" (by shifting all the nonzero entries to the beginning of the internal i, j, and a arrays), but does NOT release any memory. Since the values are stored in one big contiguous array (obtained with a single malloc) it cannot just free part of the array, so the extra locations just sit harmlessly at the end if the array unused. > If so, this did not work for me, even after doing > MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this normal? Yes, Barry > >> >> Fixed in the branch barry/fix-mat-new-nonzero-locations/maint >> >> Once this passes testing it will go into the maint branch and then the next patch release but you can use it now in the branch barry/fix-mat-new-nonzero-locations/maint >> >> Thanks for the report and reproducible example >> >> Barry >> >> >>> On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: >>> >>> Sure, I made a small reproducer, it is Fortran though I hope that is ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is set to true the new nonzero element is inserted, if MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR have been set to false I get an error again. >>> >>> >>> program newnonzero >>> #include >>> use petscmat >>> implicit none >>> >>> Mat :: A >>> PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 >>> PetscScalar :: v(1) >>> PetscReal :: info(MAT_INFO_SIZE) >>> PetscErrorCode :: ierr >>> >>> integer :: nproc,iproc,i >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>> >>> call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) >>> >>> call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) >>> >>> n=3 >>> m=n >>> call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) >>> >>> >>> call MatGetOwnershipRange(A,nl1,nl2,ierr) >>> do i=nl1,nl2-1 >>> idxn(1)=i >>> idxm(1)=i >>> v(1)=1d0 >>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) >>> end do >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) >>> >>> call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR ,PETSC_FALSE,ierr) >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) >>> >>> >>> idxn(1)=0 >>> idxm(1)=n-1 >>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then >>> v(1)=2d0 >>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) >>> end if >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) >>> >>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then >>> v(1)=2d0 >>> call MatGetValues(A,1,idxn,1,idxm, v,ierr) >>> write(6,*) v >>> end if >>> >>> call PetscFinalize(ierr) >>> >>> end program newnonzero >>> >>> >>> >>> $ mpiexec.hydra -n 3 ./a.out >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) into 
matrix >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 >>> [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018 >>> [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1 >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c >>> [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c >>> (0.000000000000000E+000,0.000000000000000E+000) >>> >>> >>> >>> Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem. >>> >>> Barry >>> >>> >>>> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: >>>> >>>> >>>> Hi, >>>> >>>> I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? >>> >> >> ? From stefano.zampini at gmail.com Thu May 31 03:21:27 2018 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 31 May 2018 11:21:27 +0300 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Message-ID: The current version of the MUMPS code in PETSc supports sparse right hand sides for sequential solvers. You can call MatMatSolve(A,X,B) with B of type MATTRANSPOSE, with the inner matrix being a MATSEQAIJ 2018-05-31 10:37 GMT+03:00 Marius Buerkle : > The fix for MAT_NEW_NONZERO_LOCATIONS, thanks again. > > I have yet another question, sorry. The recent version of MUMPS supports > distributed and sparse RHS is there any chance that this will be supported > in PETSc in the near future? 
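A rough C sketch of the approach Stefano describes above (names and the factorization boilerplate are illustrative; per his note this path currently targets the sequential solver, and the MatMatSolve() man page takes its arguments in the order factored matrix, right-hand sides B, solution X):

#include <petscmat.h>

/* Sketch: solve A X = B with MUMPS where the columns of B are sparse. The
 * right-hand sides are stored as the rows of the MATSEQAIJ matrix Bseq and
 * passed to MatMatSolve() through a MATTRANSPOSE wrapper, i.e. B = Bseq^T;
 * X must be a dense matrix of matching size. */
PetscErrorCode SolveSparseRHS(Mat A, Mat Bseq, Mat X)
{
  Mat            F, Bt;
  IS             isrow, iscol;
  MatFactorInfo  finfo;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &isrow, &iscol);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&finfo);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, isrow, iscol, &finfo);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F, A, &finfo);CHKERRQ(ierr);
  ierr = MatCreateTranspose(Bseq, &Bt);CHKERRQ(ierr); /* type MATTRANSPOSE, inner MATSEQAIJ */
  ierr = MatMatSolve(F, Bt, X);CHKERRQ(ierr);          /* solves A X = Bseq^T */
  ierr = MatDestroy(&Bt);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  ierr = ISDestroy(&isrow);CHKERRQ(ierr);
  ierr = ISDestroy(&iscol);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}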
> > > > > > On May 30, 2018, at 6:55 PM, Marius Buerkle wrote: > > > > Thanks for the quick fix, I will test it and report back. > > I have another maybe related question, if MAT_NEW_NONZERO_LOCATIONS is > true and let's say 1 new nonzero position is created it does not allocated > 1 but several new nonzeros but only use 1. > > Correct > > > I think that is normal, right? > > Yes > > > But, at least as far as I understand the manual, a subsequent call of > mat assemble with > > MAT_FINAL_ASSEMBLY should compress out the unused allocations and > release the memory, is this correct? > > It "compresses it out" (by shifting all the nonzero entries to the > beginning of the internal i, j, and a arrays), but does NOT release any > memory. Since the values are stored in one big contiguous array (obtained > with a single malloc) it cannot just free part of the array, so the extra > locations just sit harmlessly at the end if the array unused. > > > If so, this did not work for me, even after doing > > MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this normal? > > Yes, > > Barry > > > > >> > >> Fixed in the branch barry/fix-mat-new-nonzero-locations/maint > >> > >> Once this passes testing it will go into the maint branch and then the > next patch release but you can use it now in the branch > barry/fix-mat-new-nonzero-locations/maint > >> > >> Thanks for the report and reproducible example > >> > >> Barry > >> > >> > >>> On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: > >>> > >>> Sure, I made a small reproducer, it is Fortran though I hope that is > ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is > set to true the new nonzero element is inserted, if > MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR > or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new > nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS > is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR > have been set to false I get an error again. 
> >>> > >>> > >>> program newnonzero > >>> #include > >>> use petscmat > >>> implicit none > >>> > >>> Mat :: A > >>> PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 > >>> PetscScalar :: v(1) > >>> PetscReal :: info(MAT_INFO_SIZE) > >>> PetscErrorCode :: ierr > >>> > >>> integer :: nproc,iproc,i > >>> > >>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > >>> > >>> call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) > >>> > >>> call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) > >>> > >>> n=3 > >>> m=n > >>> call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m, > 1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) > >>> > >>> > >>> call MatGetOwnershipRange(A,nl1,nl2,ierr) > >>> do i=nl1,nl2-1 > >>> idxn(1)=i > >>> idxm(1)=i > >>> v(1)=1d0 > >>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > >>> end do > >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > >>> > >>> call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) > >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR > ,PETSC_FALSE,ierr) > >>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > >>> > >>> > >>> idxn(1)=0 > >>> idxm(1)=n-1 > >>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > >>> v(1)=2d0 > >>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > >>> end if > >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > >>> > >>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > >>> v(1)=2d0 > >>> call MatGetValues(A,1,idxn,1,idxm, v,ierr) > >>> write(6,*) v > >>> end if > >>> > >>> call PetscFinalize(ierr) > >>> > >>> end program newnonzero > >>> > >>> > >>> > >>> $ mpiexec.hydra -n 3 ./a.out > >>> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) > into matrix > >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. 
> >>> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 > >>> [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 > 09:42:40 2018 > >>> [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 > --download-elemental=yes --download-metis=yes --download-parmetis=yes > --download-mumps=yes --with-scalapack-lib="/home/ > marius/intel/compilers_and_libraries_2018.2.199/linux/ > mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group > /home/marius/intel/compilers_and_libraries_2018.2.199/ > linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_ > and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a > /home/marius/intel/compilers_and_libraries_2018.2.199/ > linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_ > and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a > -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx > --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/ > marius/intel/compilers_and_libraries_2018.2.199/linux/ > mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group > /home/marius/intel/compilers_and_libraries_2018.2.199/ > linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_ > and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a > /home/marius/intel/compilers_and_libraries_2018.2.199/ > linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_ > and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a > -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 > --download-superlu_dist=yes --download-ptscotch=yes --with-x > --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 > --with-mkl_pardiso=1 --with-scalapack=1 > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in > /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: #2 MatSetValues() line 1312 in > /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c > >>> (0.000000000000000E+000,0.000000000000000E+000) > >>> > >>> > >>> > >>> Please send complete error message; type of matrix used etc. Ideally > code that demonstrates the problem. > >>> > >>> Barry > >>> > >>> > >>>> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: > >>>> > >>>> > >>>> Hi, > >>>> > >>>> I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I > understood MatSetValues should simply ignore entries which would give rise > to new nonzero values not creating a new entry and not cause an error, but > I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is > this option supposed to work or not? > >>> > >> > >> > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From nahmad16 at ku.edu.tr Thu May 31 05:13:42 2018 From: nahmad16 at ku.edu.tr (Najeeb Ahmad) Date: Thu, 31 May 2018 15:13:42 +0500 Subject: [petsc-users] Re-configuring PETSc In-Reply-To: <938429CF-8D28-4428-9B23-DF405571D0B8@anl.gov> References: <289F7EFF-CD71-486B-89E6-94AFF5C2DAD3@anl.gov> <938429CF-8D28-4428-9B23-DF405571D0B8@anl.gov> Message-ID: Thank you Barry for your useful comments and the paper and Hong for some very very useful information. I will get back to you if I need any further assistance. By the way, in my experience, PETSc has got one of the best user support system I have ever experienced for any software/library. 
Information is timely and relevant, Three cheers for all PETSc experts in this group :) Have a great day, On Wed, May 30, 2018 at 8:01 AM, Zhang, Hong wrote: > > > On May 29, 2018, at 3:05 AM, Najeeb Ahmad wrote: > > > > On Mon, May 28, 2018 at 9:32 PM, Smith, Barry F. > wrote: > >> >> >> > On May 28, 2018, at 10:32 AM, Najeeb Ahmad wrote: >> > >> > Thanks a lot Satish for your prompt reply. >> > >> > I just checked that SuperLU_dist package works only for matrices of >> type aij. and uses lu preconditioner. I am currently working with baij >> matrix. What is the best preconditioner choice for baij matrices on >> parallel machines? >> >> The best preconditioner is always problem specific. Where does your >> problem come from? CFD? Structural mechanics? other apps? >> > > I am interested in writing solver for reservoir simulation > employing FVM and unstructured grids. My main objective is to study > performance of the code with different data structures/data layouts and > architecture specific optimizations, specifically targeting the multicore > architectures like KNL for instance. Later the study may be extended to > include GPUs. The options for switching between AIJ and BAIJ etc. are > therefore very useful for my study. > > The purpose why I wanted to change the preconditioner is that the > default preconditioner is giving me different iterations count for > different number of processesors. I would rather like a preconditioner that > would give me same iteration count for any processor count so that I can > better compare the performance results. > > Your suggestions in this regard are highly appreciated, > specifically with reference to the following points: > > - Is it possible to explicitly use high bandwidth memory in PETSc > for selected object placement (e.g. using memkind library for instance)? > > > Yes, see http://www.mcs.anl.gov/petsc/petsc-3.8/src/sys/memory/mhbw.c.html > > This was developed to use memkind to handle adjoint checkpointing where I > want to use HBW memory for computation and DRAM for storing checkpoints. > But it can be used for your purpose as well. > > When configure PETSc, use "--with-memkind-dir=" to specify the location of > the memkind library. > The runtime option "-malloc_hbw" will allow you to allocate all PETSc > objects in HBW memory. If the HBW memory is ran out, it falls back to DRAM. > > If you want to place selective objects in DRAM, you can do > > PetscMallocSetDRAM() > ... allocate your objects ... > PetscMallocResetDRAM() > > An example usage can be found at > http://www.mcs.anl.gov/petsc/petsc-dev/src/ts/trajectory/ > impls/memory/trajmemory.c > > > - What would it take to take advantage of architecture specific > compiler flags to achieve good performance on a given platform (e.g. > -xMIC-AVX512 for AVX512 on KNL, #pragma SIMD etc.). > > > To build PETSc on KNL with AVX512 enabled, see the example scripts > config/examples/arch-linux-knl.py > config/examples/arch-cray-xc40-knl-opt.py > > Note that the MatMult kernel (for AIJ and SELL) has been manually > optimized for best performance. > > Hong (Mr.) > > Sorry for some very basic questions as I am a novice PETSc user. > > Thanks for your time :) > >> >> Anyways you probably want to make your code be able to switch >> between AIJ and BAIJ at run time since the different formats support >> somewhat different solvers. 
If your code alls MatSetFromOptions then you >> can switch via the command line option -mat_type aij or baij >> >> Barry >> >> > >> > Thanks >> > >> > On Mon, May 28, 2018 at 8:23 PM, Satish Balay >> wrote: >> > On Mon, 28 May 2018, Najeeb Ahmad wrote: >> > >> > > Hi All, >> > > >> > > I have Petsc release version 3.9.2 configured with the following >> options: >> > > >> > > Configure options --with-cc=mpiicc --with-cxx=mpiicpc >> --with-fc=mpiifort >> > > --download-fblaslapack=1 >> > > >> > > Now I want to use PCILU in my code and when I set the PC type to >> PCILU in >> > > the code, I get the following error: >> > > >> > > [0]PETSC ERROR: --------------------- Error Message >> > > -------------------------------------------------------------- >> > > [0]PETSC ERROR: See >> > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for >> > > possible LU and Cholesky solvers >> > > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must >> > > ./configure with --download- >> > > [0]PETSC ERROR: See http://www.mcs.anl.gov/ >> petsc/documentation/faq.html for >> > > trouble shooting. >> > > [0]PETSC ERROR: Petsc Release Version 3.9.2, unknown >> > > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named Karachi by >> nahmad Mon >> > > May 28 17:52:41 2018 >> > > [0]PETSC ERROR: Configure options --with-cc=mpiicc --with-cxx=mpiicpc >> > > --with-fc=mpiifort --download-fblaslapack=1 >> > > [0]PETSC ERROR: #1 MatGetFactor() line 4318 in >> > > /home/nahmad/PETSc/petsc/src/mat/interface/matrix.c >> > > [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in >> > > /home/nahmad/PETSc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c >> > > [0]PETSC ERROR: #3 PCSetUp() line 923 in >> > > /home/nahmad/PETSc/petsc/src/ksp/pc/interface/precon.c >> > > [0]PETSC ERROR: #4 KSPSetUp() line 381 in >> > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c >> > > [0]PETSC ERROR: #5 KSPSolve() line 612 in >> > > /home/nahmad/PETSc/petsc/src/ksp/ksp/interface/itfunc.c >> > > [0]PETSC ERROR: #6 SolveSystem() line 60 in >> > > /home/nahmad/Aramco/petsc/petsc/BlockSolveTest/src/main.c >> > > >> > > >> > > I assume that I am missing LU package like SuperLU_dist for instance >> and I >> > > need to download and configure it with Petsc. >> > >> > yes - petsc has sequential LU - but you need superlu_dist/mumps for >> parallel lu. >> > >> > > >> > > I am wondering what is the best way to reconfigure Petsc to download >> and >> > > use the appropriate package to support PCILU? >> > >> > Rerun configure with the additional option --download-superlu_dist=1. >> > >> > You can do this with current PETSC_ARCH you are using [i.e reinstall >> > over the current build] - or use a different PETSC_ARCH - so both >> > builds exist and useable. >> > >> > Satish >> > >> > > >> > > You advice is highly appreciated. >> > > >> > > >> > >> > >> > >> > >> > -- >> > Najeeb Ahmad >> > >> > Research and Teaching Assistant >> > PARallel and MultiCORE Computing Laboratory (ParCoreLab) >> > Computer Science and Engineering >> > Ko? University, Istanbul, Turkey >> > >> >> > > > -- > *Najeeb Ahmad* > > > *Research and Teaching Assistant * > *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * > > *Computer Science and Engineering * > *Ko? University, Istanbul, Turkey* > > > -- *Najeeb Ahmad* *Research and Teaching Assistant* *PARallel and MultiCORE Computing Laboratory (ParCoreLab) * *Computer Science and Engineering* *Ko? University, Istanbul, Turkey* -------------- next part -------------- An HTML attachment was scrubbed... 
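Barry's advice about switching formats at run time amounts to creating the matrix and solver generically and letting the options database choose; a minimal C sketch follows (sizes and the assembly loop are placeholders for the application's own). For the iteration-count question, point Jacobi (-pc_type jacobi) or no preconditioner (-pc_type none) are the usual ways to get counts that do not depend on the number of processes, at the cost of slower convergence.

/* Sketch: create the Mat and KSP generically so the format, preconditioner and
 * solver can be chosen on the command line, e.g.
 *   ./app -mat_type aij  -ksp_type cg -pc_type jacobi
 *   ./app -mat_type baij -ksp_type cg -pc_type bjacobi
 */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  KSP            ksp;
  PetscInt       n = 100;                      /* placeholder global size */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* -mat_type aij | baij | ... */
  ierr = MatSetUp(A);CHKERRQ(ierr);            /* or explicit preallocation calls */
  /* ... MatSetValues()/MatSetValuesBlocked() assembly goes here ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* -ksp_type, -pc_type, -ksp_monitor ... */
  /* ... KSPSolve(ksp, b, x) with the application's vectors ... */

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}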
URL: From bsmith at mcs.anl.gov Thu May 31 11:25:05 2018 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 31 May 2018 16:25:05 +0000 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> Message-ID: <58D5E959-0E3C-4EF2-8270-882D1EA6EA38@mcs.anl.gov> Hong, Can you see about adding support for distributed right hand side? Thanks Barry > On May 31, 2018, at 2:37 AM, Marius Buerkle wrote: > > The fix for MAT_NEW_NONZERO_LOCATIONS, thanks again. > > I have yet another question, sorry. The recent version of MUMPS supports distributed and sparse RHS is there any chance that this will be supported in PETSc in the near future? > > > > >> On May 30, 2018, at 6:55 PM, Marius Buerkle wrote: >> >> Thanks for the quick fix, I will test it and report back. >> I have another maybe related question, if MAT_NEW_NONZERO_LOCATIONS is true and let's say 1 new nonzero position is created it does not allocated 1 but several new nonzeros but only use 1. > > Correct > >> I think that is normal, right? > > Yes > >> But, at least as far as I understand the manual, a subsequent call of mat assemble with >> MAT_FINAL_ASSEMBLY should compress out the unused allocations and release the memory, is this correct? > > It "compresses it out" (by shifting all the nonzero entries to the beginning of the internal i, j, and a arrays), but does NOT release any memory. Since the values are stored in one big contiguous array (obtained with a single malloc) it cannot just free part of the array, so the extra locations just sit harmlessly at the end if the array unused. > >> If so, this did not work for me, even after doing >> MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this normal? > > Yes, > > Barry > >> >>> >>> Fixed in the branch barry/fix-mat-new-nonzero-locations/maint >>> >>> Once this passes testing it will go into the maint branch and then the next patch release but you can use it now in the branch barry/fix-mat-new-nonzero-locations/maint >>> >>> Thanks for the report and reproducible example >>> >>> Barry >>> >>> >>>> On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: >>>> >>>> Sure, I made a small reproducer, it is Fortran though I hope that is ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is set to true the new nonzero element is inserted, if MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR have been set to false I get an error again. 
>>>> >>>> >>>> program newnonzero >>>> #include >>>> use petscmat >>>> implicit none >>>> >>>> Mat :: A >>>> PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 >>>> PetscScalar :: v(1) >>>> PetscReal :: info(MAT_INFO_SIZE) >>>> PetscErrorCode :: ierr >>>> >>>> integer :: nproc,iproc,i >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>>> >>>> call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) >>>> >>>> call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) >>>> >>>> n=3 >>>> m=n >>>> call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) >>>> >>>> >>>> call MatGetOwnershipRange(A,nl1,nl2,ierr) >>>> do i=nl1,nl2-1 >>>> idxn(1)=i >>>> idxm(1)=i >>>> v(1)=1d0 >>>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) >>>> end do >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) >>>> >>>> call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) >>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) >>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR ,PETSC_FALSE,ierr) >>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) >>>> >>>> >>>> idxn(1)=0 >>>> idxm(1)=n-1 >>>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then >>>> v(1)=2d0 >>>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) >>>> end if >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) >>>> >>>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then >>>> v(1)=2d0 >>>> call MatGetValues(A,1,idxn,1,idxm, v,ierr) >>>> write(6,*) v >>>> end if >>>> >>>> call PetscFinalize(ierr) >>>> >>>> end program newnonzero >>>> >>>> >>>> >>>> $ mpiexec.hydra -n 3 ./a.out >>>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>> [0]PETSC ERROR: Argument out of range >>>> [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) into matrix >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>>> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018 >>>> [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018 >>>> [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1 >>>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c >>>> [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c >>>> (0.000000000000000E+000,0.000000000000000E+000) >>>> >>>> >>>> >>>> Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem. >>>> >>>> Barry >>>> >>>> >>>>> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote: >>>>> >>>>> >>>>> Hi, >>>>> >>>>> I tried to set MAT_NEW_NONZERO_LOCATIONS to false, as far as I understood MatSetValues should simply ignore entries which would give rise to new nonzero values not creating a new entry and not cause an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not? >>>> >>> >>> > From hzhang at mcs.anl.gov Thu May 31 12:09:44 2018 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 31 May 2018 12:09:44 -0500 Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working? In-Reply-To: <58D5E959-0E3C-4EF2-8270-882D1EA6EA38@mcs.anl.gov> References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov> <58D5E959-0E3C-4EF2-8270-882D1EA6EA38@mcs.anl.gov> Message-ID: I see MUMPS http://mumps.enseeiht.fr/ - *Sparse multiple right-hand side, distributed solution*; Exploitation of sparsity in the right-hand sides PETSc interface computes mumps *distributed solution *as default (this is not new) (ICNTL(21) = 1) I will add support for *Sparse multiple right-hand side.* Hong On Thu, May 31, 2018 at 11:25 AM, Smith, Barry F. wrote: > > Hong, > > Can you see about adding support for distributed right hand side? > > Thanks > > Barry > > > > On May 31, 2018, at 2:37 AM, Marius Buerkle wrote: > > > > The fix for MAT_NEW_NONZERO_LOCATIONS, thanks again. 
> > > > I have yet another question, sorry. The recent version of MUMPS supports > distributed and sparse RHS is there any chance that this will be supported > in PETSc in the near future? > > > > > > > > > >> On May 30, 2018, at 6:55 PM, Marius Buerkle wrote: > >> > >> Thanks for the quick fix, I will test it and report back. > >> I have another maybe related question, if MAT_NEW_NONZERO_LOCATIONS is > true and let's say 1 new nonzero position is created it does not allocated > 1 but several new nonzeros but only use 1. > > > > Correct > > > >> I think that is normal, right? > > > > Yes > > > >> But, at least as far as I understand the manual, a subsequent call of > mat assemble with > >> MAT_FINAL_ASSEMBLY should compress out the unused allocations and > release the memory, is this correct? > > > > It "compresses it out" (by shifting all the nonzero entries to the > beginning of the internal i, j, and a arrays), but does NOT release any > memory. Since the values are stored in one big contiguous array (obtained > with a single malloc) it cannot just free part of the array, so the extra > locations just sit harmlessly at the end if the array unused. > > > >> If so, this did not work for me, even after doing > >> MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this > normal? > > > > Yes, > > > > Barry > > > >> > >>> > >>> Fixed in the branch barry/fix-mat-new-nonzero-locations/maint > >>> > >>> Once this passes testing it will go into the maint branch and then the > next patch release but you can use it now in the branch > barry/fix-mat-new-nonzero-locations/maint > >>> > >>> Thanks for the report and reproducible example > >>> > >>> Barry > >>> > >>> > >>>> On May 29, 2018, at 7:51 PM, Marius Buerkle wrote: > >>>> > >>>> Sure, I made a small reproducer, it is Fortran though I hope that is > ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error, if it is > set to true the new nonzero element is inserted, if > MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR > or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new > nonzero is also created without an error, but if MAT_NEW_NONZERO_LOCATIONS > is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR > have been set to false I get an error again. 
> >>>> > >>>> > >>>> program newnonzero > >>>> #include > >>>> use petscmat > >>>> implicit none > >>>> > >>>> Mat :: A > >>>> PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2 > >>>> PetscScalar :: v(1) > >>>> PetscReal :: info(MAT_INFO_SIZE) > >>>> PetscErrorCode :: ierr > >>>> > >>>> integer :: nproc,iproc,i > >>>> > >>>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > >>>> > >>>> call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr) > >>>> > >>>> call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr ) > >>>> > >>>> n=3 > >>>> m=n > >>>> call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m, > 1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr) > >>>> > >>>> > >>>> call MatGetOwnershipRange(A,nl1,nl2,ierr) > >>>> do i=nl1,nl2-1 > >>>> idxn(1)=i > >>>> idxm(1)=i > >>>> v(1)=1d0 > >>>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > >>>> end do > >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > >>>> > >>>> call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > >>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr) > >>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR > ,PETSC_FALSE,ierr) > >>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr) > >>>> > >>>> > >>>> idxn(1)=0 > >>>> idxm(1)=n-1 > >>>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > >>>> v(1)=2d0 > >>>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr) > >>>> end if > >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > >>>> > >>>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then > >>>> v(1)=2d0 > >>>> call MatGetValues(A,1,idxn,1,idxm, v,ierr) > >>>> write(6,*) v > >>>> end if > >>>> > >>>> call PetscFinalize(ierr) > >>>> > >>>> end program newnonzero > >>>> > >>>> > >>>> > >>>> $ mpiexec.hydra -n 3 ./a.out > >>>> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >>>> [0]PETSC ERROR: Argument out of range > >>>> [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) > into matrix > >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. 
> >>>> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018
> >>>> [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018
> >>>> [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1
> >>>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c
> >>>> [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c
> >>>> (0.000000000000000E+000,0.000000000000000E+000)
> >>>>
> >>>> Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem.
> >>>>
> >>>> Barry
> >>>>
> >>>>> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I tried to set MAT_NEW_NONZERO_LOCATIONS to false; as far as I understood, MatSetValues should simply ignore entries which would give rise to new nonzero values, not creating a new entry and not causing an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mbuerkle at web.de  Thu May 31 17:52:59 2018
From: mbuerkle at web.de (Marius Buerkle)
Date: Fri, 1 Jun 2018 00:52:59 +0200
Subject: [petsc-users] MAT_NEW_NONZERO_LOCATIONS working?
In-Reply-To: 
References: <1153922A-8876-492B-8AD9-F41DD718FD21@anl.gov>
	<58D5E959-0E3C-4EF2-8270-882D1EA6EA38@mcs.anl.gov>
Message-ID: 

Thanks a lot guys, very helpful.

I see MUMPS (http://mumps.enseeiht.fr/): Sparse multiple right-hand side, distributed solution; Exploitation of sparsity in the right-hand sides. The PETSc interface computes the MUMPS distributed solution as default (this is not new) (ICNTL(21) = 1).

I will add support for Sparse multiple right-hand side.

Hong

On Thu, May 31, 2018 at 11:25 AM, Smith, Barry F. wrote:

   Hong,
      Can you see about adding support for distributed right hand side?

   Thanks

     Barry

> On May 31, 2018, at 2:37 AM, Marius Buerkle wrote:
>
> The fix for MAT_NEW_NONZERO_LOCATIONS, thanks again.
>
> I have yet another question, sorry. The recent version of MUMPS supports distributed and sparse RHS; is there any chance that this will be supported in PETSc in the near future?
>
>> On May 30, 2018, at 6:55 PM, Marius Buerkle wrote:
>>
>> Thanks for the quick fix, I will test it and report back.
>> I have another maybe related question: if MAT_NEW_NONZERO_LOCATIONS is true and, let's say, 1 new nonzero position is created, it does not allocate 1 but several new nonzeros, yet only uses 1.
>
> Correct
>
>> I think that is normal, right?
>
> Yes
>
>> But, at least as far as I understand the manual, a subsequent call of mat assemble with
>> MAT_FINAL_ASSEMBLY should compress out the unused allocations and release the memory, is this correct?
>
> It "compresses it out" (by shifting all the nonzero entries to the beginning of the internal i, j, and a arrays), but does NOT release any memory. Since the values are stored in one big contiguous array (obtained with a single malloc) it cannot just free part of the array, so the extra locations just sit harmlessly at the end of the array unused.
>
>> If so, this did not work for me; even after doing
>> MAT_FINAL_ASSEMBLY the unused nonzero allocations remain. Is this normal?
>
> Yes,
>
> Barry
>
>>> Fixed in the branch barry/fix-mat-new-nonzero-locations/maint
>>>
>>> Once this passes testing it will go into the maint branch and then the next patch release, but you can use it now in the branch barry/fix-mat-new-nonzero-locations/maint
>>>
>>> Thanks for the report and reproducible example
>>>
>>> Barry
>>>
>>>> On May 29, 2018, at 7:51 PM, Marius Buerkle wrote:
>>>>
>>>> Sure, I made a small reproducer; it is Fortran though, I hope that is ok. If MAT_NEW_NONZERO_LOCATIONS is set to false I get an error; if it is set to true the new nonzero element is inserted; if MAT_NEW_NONZERO_LOCATIONS is false and either MAT_NEW_NONZERO_LOCATION_ERR or MAT_NEW_NONZERO_ALLOCATION_ERR is set to false afterwards then the new nonzero is also created without an error; but if MAT_NEW_NONZERO_LOCATIONS is set to false after MAT_NEW_NONZERO_LOCATION_ERR/MAT_NEW_NONZERO_ALLOCATION_ERR have been set to false I get an error again.
>>>>
>>>> program newnonzero
>>>> #include <petsc/finclude/petscmat.h>
>>>> use petscmat
>>>> implicit none
>>>>
>>>> Mat :: A
>>>> PetscInt :: dnnz,onnz,n,m,idxm(1),idxn(1),nl1,nl2
>>>> PetscScalar :: v(1)
>>>> PetscReal :: info(MAT_INFO_SIZE)
>>>> PetscErrorCode :: ierr
>>>>
>>>> integer :: nproc,iproc,i
>>>>
>>>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr)
>>>>
>>>> call MPI_COMM_SIZE(PETSC_COMM_WORLD, nproc,ierr)
>>>>
>>>> call MPI_Comm_rank( PETSC_COMM_WORLD, iproc, ierr )
>>>>
>>>> n=3
>>>> m=n
>>>> call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr)
>>>>
>>>> call MatGetOwnershipRange(A,nl1,nl2,ierr)
>>>> do i=nl1,nl2-1
>>>> idxn(1)=i
>>>> idxm(1)=i
>>>> v(1)=1d0
>>>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr)
>>>> end do
>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
>>>>
>>>> call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr)
>>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_FALSE,ierr)
>>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE,ierr)
>>>> !~ call MatSetOption(A,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ierr)
>>>>
>>>> idxn(1)=0
>>>> idxm(1)=n-1
>>>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then
>>>> v(1)=2d0
>>>> call MatSetValues(A,1,idxn,1,idxm, v,INSERT_VALUES,ierr)
>>>> end if
>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
>>>>
>>>> if ((idxn(1).ge.nl1).and.(idxn(1).le.nl2-1)) then
>>>> v(1)=2d0
>>>> call MatGetValues(A,1,idxn,1,idxm, v,ierr)
>>>> write(6,*) v
>>>> end if
>>>>
>>>> call PetscFinalize(ierr)
>>>>
>>>> end program newnonzero
>>>>
>>>> $ mpiexec.hydra -n 3 ./a.out
>>>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>>>> [0]PETSC ERROR: Argument out of range
>>>> [0]PETSC ERROR: Inserting a new nonzero at global row/column (0, 2) into matrix
>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>>>> [0]PETSC ERROR: Petsc Release Version 3.9.2, May, 20, 2018
>>>> [0]PETSC ERROR: ./a.out on a named tono-hpc1 by marius Wed May 30 09:42:40 2018
>>>> [0]PETSC ERROR: Configure options --prefix=/home/marius/prog/petsc/3.9.2 --download-elemental=yes --download-metis=yes --download-parmetis=yes --download-mumps=yes --with-scalapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --FC=mpiifort --CC=mpicc --CXX=mpicxx --with-scalar-type=complex --with-mpi-dir= --with-blaslapack-lib="/home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_sequential.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_core.a /home/marius/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm -ldl" --with-cxx-dialect=C++11 --download-superlu_dist=yes --download-ptscotch=yes --with-x --with-debugging=1 --download-superlu=yes --with-mkl_cpardiso=1 --with-mkl_pardiso=1 --with-scalapack=1
>>>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 607 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/impls/aij/mpi/mpiaij.c
>>>> [0]PETSC ERROR: #2 MatSetValues() line 1312 in /home/marius/prog/petsc/petsc-3.9.2/src/mat/interface/matrix.c
>>>> (0.000000000000000E+000,0.000000000000000E+000)
>>>>
>>>> Please send complete error message; type of matrix used etc. Ideally code that demonstrates the problem.
>>>>
>>>> Barry
>>>>
>>>>> On May 29, 2018, at 3:31 AM, Marius Buerkle wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I tried to set MAT_NEW_NONZERO_LOCATIONS to false; as far as I understood, MatSetValues should simply ignore entries which would give rise to new nonzero values, not creating a new entry and not causing an error, but I get "[1]PETSC ERROR: Inserting a new nonzero at global row/column". Is this option supposed to work or not?
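
One practical way to observe the allocated-but-unused storage Barry describes is MatGetInfo(), which reports allocated, used, and unneeded nonzero counts. The sketch below is only an illustration, not code from this thread: it assumes the PETSc 3.9-era Fortran bindings (MatGetInfo fills a double precision array indexed with MAT_INFO_NZ_ALLOCATED and friends), and the program name, matrix size, and deliberately generous preallocation are made up for the example.

program checknzinfo
#include <petsc/finclude/petscmat.h>
use petscmat
implicit none

Mat :: A
PetscInt :: n,m,i,nl1,nl2,idxm(1),idxn(1)
PetscScalar :: v(1)
PetscReal :: info(MAT_INFO_SIZE)
PetscErrorCode :: ierr

call PetscInitialize(PETSC_NULL_CHARACTER,ierr)

! over-preallocate on purpose: room for 3 diagonal-block nonzeros per row, only the diagonal is filled
n=3
m=n
call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,m,3,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,A,ierr)

call MatGetOwnershipRange(A,nl1,nl2,ierr)
do i=nl1,nl2-1
idxn(1)=i
idxm(1)=i
v(1)=1d0
call MatSetValues(A,1,idxn,1,idxm,v,INSERT_VALUES,ierr)
end do
call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)

! per-rank bookkeeping after final assembly: compare how much of the
! preallocated storage is actually occupied by assembled nonzeros
call MatGetInfo(A,MAT_LOCAL,info,ierr)
write(6,*) 'allocated',info(MAT_INFO_NZ_ALLOCATED),'used',info(MAT_INFO_NZ_USED),'unneeded',info(MAT_INFO_NZ_UNNEEDED)

call MatDestroy(A,ierr)
call PetscFinalize(ierr)

end program checknzinfo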
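On the MUMPS side of the exchange: selecting MUMPS for the factorization goes through the usual PETSc options, and, per Hong's remark, the PETSc interface already asks MUMPS for the distributed solution (ICNTL(21) = 1); only the distributed and sparse right-hand-side features that Barry and Hong discuss were not yet exposed at the time of this thread. Assuming a code that calls KSPSetFromOptions(), a typical run would look something like

mpiexec -n 4 ./a.out -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps

where -pc_factor_mat_solver_type is the PETSc 3.9 spelling (older releases use -pc_factor_mat_solver_package), and individual MUMPS parameters can be passed with the -mat_mumps_icntl_<n> options if needed.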