[petsc-users] Strange GAMG performance for mixed FE formulation
Matthew Knepley
knepley at gmail.com
Wed Mar 2 19:48:20 CST 2016
On Wed, Mar 2, 2016 at 7:15 PM, Justin Chang <jychang48 at gmail.com> wrote:
> Barry,
>
> Attached are the log_summary output for each preconditioner.
>
MatPtAP takes all the time. It looks like there is no coarsening at all at
the first level. Mark, can you see what is going on here?
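
Until then, one untested guess (the option names are real GAMG options, but I
have not tried them on this problem): you could try tuning the aggregation for
the Schur block by adding something like the following to your
solver_parameters dict (assuming you hold the options in a dict named
solver_parameters; the names here are only illustrative):

    # hedged sketch: tighten GAMG's aggregation for the fieldsplit_1 block
    gamg_tweaks = {
        "fieldsplit_1_pc_gamg_threshold": 0.01,   # drop weak edges from the aggregation graph
        "fieldsplit_1_pc_gamg_square_graph": 1,   # square the graph once before aggregating
        "fieldsplit_1_pc_gamg_sym_graph": True,   # symmetrize the graph used for aggregation
    }
    solver_parameters.update(gamg_tweaks)

But let's see what Mark says first.
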
Matt
> Thanks,
> Justin
>
>
> On Wednesday, March 2, 2016, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>> Justin,
>>
>> Do you have the -log_summary output for these runs?
>>
>> Barry
>>
>> > On Mar 2, 2016, at 4:28 PM, Justin Chang <jychang48 at gmail.com> wrote:
>> >
>> > Dear all,
>> >
>> > Using the firedrake project, I am solving this simple mixed poisson
>> problem:
>> >
>> > mesh = UnitCubeMesh(40,40,40)
>> > V = FunctionSpace(mesh,"RT",1)
>> > Q = FunctionSpace(mesh,"DG",0)
>> > W = V*Q
>> >
>> > v, p = TrialFunctions(W)
>> > w, q = TestFunctions(W)
>> >
>> > f = Function(Q)
>> >
>> > f.interpolate(Expression("12*pi*pi*sin(pi*x[0]*2)*sin(pi*x[1]*2)*sin(2*pi*x[2])"))
>> >
>> > a = dot(v,w)*dx - p*div(w)*dx + div(v)*q*dx
>> > L = f*q*dx
>> >
>> > u = Function(W)
>> > solve(a==L,u,solver_parameters={...})
>> >
>> > This problem has 1161600 degrees of freedom. The solver_parameters are:
>> >
>> > -ksp_type gmres
>> > -pc_type fieldsplit
>> > -pc_fieldsplit_type schur
>> > -pc_fieldsplit_schur_fact_type upper
>> > -pc_fieldsplit_schur_precondition selfp
>> > -fieldsplit_0_ksp_type preonly
>> > -fieldsplit_0_pc_type bjacobi
>> > -fieldsplit_1_ksp_type preonly
>> > -fieldsplit_1_pc_type hypre/ml/gamg
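>> >
>> > (For reference, these are passed to solve() through the solver_parameters
>> > dict; modulo the "solver_" options prefix, the dict looks roughly like the
>> > sketch below, where "params" is just an illustrative name:
>> >
>> > params = {
>> >     "ksp_type": "gmres",
>> >     "pc_type": "fieldsplit",
>> >     "pc_fieldsplit_type": "schur",
>> >     "pc_fieldsplit_schur_fact_type": "upper",
>> >     "pc_fieldsplit_schur_precondition": "selfp",
>> >     "fieldsplit_0_ksp_type": "preonly",
>> >     "fieldsplit_0_pc_type": "bjacobi",
>> >     "fieldsplit_1_ksp_type": "preonly",
>> >     "fieldsplit_1_pc_type": "gamg",  # swapped for "hypre" or "ml" below
>> > }
>> > solve(a == L, u, solver_parameters=params)
>> > )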
>> >
>> > For the last option, I compared the wall-clock timings for hypre, ml, and
>> > gamg. Here are the strong-scaling results (across 64 cores, 8 cores per
>> > Intel Xeon E5-2670 node):
>> >
>> > hypre:
>> > 1 core: 47.5 s, 12 solver iters
>> > 2 cores: 34.1 s, 15 solver iters
>> > 4 cores: 21.5 s, 15 solver iters
>> > 8 cores: 16.6 s, 15 solver iters
>> > 16 cores: 10.2 s, 15 solver iters
>> > 24 cores: 7.66 s, 15 solver iters
>> > 32 cores: 6.31 s, 15 solver iters
>> > 40 cores: 5.68 s, 15 solver iters
>> > 48 cores: 5.36 s, 16 solver iters
>> > 56 cores: 5.12 s, 16 solver iters
>> > 64 cores: 4.99 s, 16 solver iters
>> >
>> > ml:
>> > 1 core: 4.44 s, 14 solver iters
>> > 2 cores: 2.85 s, 16 solver iters
>> > 4 cores: 1.6 s, 17 solver iters
>> > 8 cores: 0.966 s, 17 solver iters
>> > 16 cores: 0.585 s, 18 solver iters
>> > 24 cores: 0.440 s, 18 solver iters
>> > 32 cores: 0.375 s, 18 solver iters
>> > 40 cores: 0.332 s, 18 solver iters
>> > 48 cores: 0.307 s, 17 solver iters
>> > 56 cores: 0.290 s, 18 solver iters
>> > 64 cores: 0.281 s, 18 solver iters
>> >
>> > gamg:
>> > 1 core: 613 s, 12 solver iters
>> > 2 cores: 204 s, 15 solver iters
>> > 4 cores: 77.1 s, 15 solver iters
>> > 8 cores: 38.1 s, 15 solver iters
>> > 16 cores: 15.9 s, 16 solver iters
>> > 24 cores: 9.24 s, 16 solver iters
>> > 32 cores: 5.92 s, 16 solver iters
>> > 40 cores: 4.72 s, 16 solver iters
>> > 48 cores: 3.89 s, 16 solver iters
>> > 56 cores: 3.65 s, 16 solver iters
>> > 64 cores: 3.46 s, 16 solver iters
>> >
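>> > (For a rough sense of the scaling: from 1 to 64 cores, gamg speeds up by
>> > about 613/3.46 ~ 177x, versus roughly 15.8x for ml and 9.5x for hypre,
>> > against an ideal 64x.)
>> >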
>> > The performance difference between ML and HYPRE makes sense to me, but
>> > what I am really confused about is GAMG. It seems GAMG is really slow on a
>> > single core, but something internal is causing it to speed up
>> > super-linearly as I increase the number of MPI processes. Shouldn't ML and
>> > GAMG have roughly the same performance? I am not sure what log outputs to
>> > give you guys, but for starters, below is the -ksp_view output for the
>> > single-core case with GAMG:
>> >
>> > KSP Object:(solver_) 1 MPI processes
>> > type: gmres
>> > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>> > GMRES: happy breakdown tolerance 1e-30
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
>> > left preconditioning
>> > using PRECONDITIONED norm type for convergence test
>> > PC Object:(solver_) 1 MPI processes
>> > type: fieldsplit
>> > FieldSplit with Schur preconditioner, factorization UPPER
>> > Preconditioner for the Schur complement formed from Sp, an
>> assembled approximation to S, which uses (lumped, if requested) A00's
>> diagonal's inverse
>> > Split info:
>> > Split number 0 Defined by IS
>> > Split number 1 Defined by IS
>> > KSP solver for A00 block
>> > KSP Object: (solver_fieldsplit_0_) 1 MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_0_) 1 MPI processes
>> > type: bjacobi
>> > block Jacobi: number of blocks = 1
>> > Local solve is same for all blocks, in the following KSP and
>> PC objects:
>> > KSP Object: (solver_fieldsplit_0_sub_) 1
>> MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_0_sub_) 1
>> MPI processes
>> > type: ilu
>> > ILU: out-of-place factorization
>> > 0 levels of fill
>> > tolerance for zero pivot 2.22045e-14
>> > matrix ordering: natural
>> > factor fill ratio given 1., needed 1.
>> > Factored matrix follows:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > package used to perform factorization: petsc
>> > total: nonzeros=5385600, allocated nonzeros=5385600
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object: (solver_fieldsplit_0_) 1
>> MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > total: nonzeros=5385600, allocated nonzeros=5385600
>> > total number of mallocs used during MatSetValues calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object: (solver_fieldsplit_0_) 1 MPI
>> processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > total: nonzeros=5385600, allocated nonzeros=5385600
>> > total number of mallocs used during MatSetValues calls =0
>> > not using I-node routines
>> > KSP solver for S = A11 - A10 inv(A00) A01
>> > KSP Object: (solver_fieldsplit_1_) 1 MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_1_) 1 MPI processes
>> > type: gamg
>> > MG: type is MULTIPLICATIVE, levels=5 cycles=v
>> > Cycles per PCApply=1
>> > Using Galerkin computed coarse grid matrices
>> > GAMG specific options
>> > Threshold for dropping small values from graph 0.
>> > AGG specific options
>> > Symmetric graph false
>> > Coarse grid solver -- level -------------------------------
>> > KSP Object: (solver_fieldsplit_1_mg_coarse_)
>> 1 MPI processes
>> > type: preonly
>> > maximum iterations=1, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_1_mg_coarse_)
>> 1 MPI processes
>> > type: bjacobi
>> > block Jacobi: number of blocks = 1
>> > Local solve is same for all blocks, in the following KSP
>> and PC objects:
>> > KSP Object:
>> (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes
>> > type: preonly
>> > maximum iterations=1, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object:
>> (solver_fieldsplit_1_mg_coarse_sub_) 1 MPI processes
>> > type: lu
>> > LU: out-of-place factorization
>> > tolerance for zero pivot 2.22045e-14
>> > using diagonal shift on blocks to prevent zero pivot
>> [INBLOCKS]
>> > matrix ordering: nd
>> > factor fill ratio given 5., needed 1.
>> > Factored matrix follows:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=9, cols=9
>> > package used to perform factorization: petsc
>> > total: nonzeros=81, allocated nonzeros=81
>> > total number of mallocs used during
>> MatSetValues calls =0
>> > using I-node routines: found 2 nodes, limit
>> used is 5
>> > linear system matrix = precond matrix:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=9, cols=9
>> > total: nonzeros=81, allocated nonzeros=81
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > using I-node routines: found 2 nodes, limit used is
>> 5
>> > linear system matrix = precond matrix:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=9, cols=9
>> > total: nonzeros=81, allocated nonzeros=81
>> > total number of mallocs used during MatSetValues calls =0
>> > using I-node routines: found 2 nodes, limit used is 5
>> > Down solver (pre-smoother) on level 1
>> -------------------------------
>> > KSP Object: (solver_fieldsplit_1_mg_levels_1_)
>> 1 MPI processes
>> > type: chebyshev
>> > Chebyshev: eigenvalue estimates: min = 0.0999525, max =
>> 1.09948
>> > Chebyshev: eigenvalues estimated using gmres with
>> translations [0. 0.1; 0. 1.1]
>> > KSP Object:
>> (solver_fieldsplit_1_mg_levels_1_esteig_) 1 MPI processes
>> > type: gmres
>> > GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> > GMRES: happy breakdown tolerance 1e-30
>> > maximum iterations=10, initial guess is zero
>> > tolerances: relative=1e-12, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using PRECONDITIONED norm type for convergence test
>> > maximum iterations=2
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using nonzero initial guess
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_1_mg_levels_1_)
>> 1 MPI processes
>> > type: sor
>> > SOR: type = local_symmetric, iterations = 1, local
>> iterations = 1, omega = 1.
>> > linear system matrix = precond matrix:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=207, cols=207
>> > total: nonzeros=42849, allocated nonzeros=42849
>> > total number of mallocs used during MatSetValues calls =0
>> > using I-node routines: found 42 nodes, limit used is 5
>> > Up solver (post-smoother) same as down solver (pre-smoother)
>> > Down solver (pre-smoother) on level 2
>> -------------------------------
>> > KSP Object: (solver_fieldsplit_1_mg_levels_2_)
>> 1 MPI processes
>> > type: chebyshev
>> > Chebyshev: eigenvalue estimates: min = 0.0996628, max =
>> 1.09629
>> > Chebyshev: eigenvalues estimated using gmres with
>> translations [0. 0.1; 0. 1.1]
>> > KSP Object:
>> (solver_fieldsplit_1_mg_levels_2_esteig_) 1 MPI processes
>> > type: gmres
>> > GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> > GMRES: happy breakdown tolerance 1e-30
>> > maximum iterations=10, initial guess is zero
>> > tolerances: relative=1e-12, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using PRECONDITIONED norm type for convergence test
>> > maximum iterations=2
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using nonzero initial guess
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_1_mg_levels_2_)
>> 1 MPI processes
>> > type: sor
>> > SOR: type = local_symmetric, iterations = 1, local
>> iterations = 1, omega = 1.
>> > linear system matrix = precond matrix:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=5373, cols=5373
>> > total: nonzeros=28852043, allocated nonzeros=28852043
>> > total number of mallocs used during MatSetValues calls =0
>> > using I-node routines: found 1481 nodes, limit used is 5
>> > Up solver (post-smoother) same as down solver (pre-smoother)
>> > Down solver (pre-smoother) on level 3
>> -------------------------------
>> > KSP Object: (solver_fieldsplit_1_mg_levels_3_)
>> 1 MPI processes
>> > type: chebyshev
>> > Chebyshev: eigenvalue estimates: min = 0.0994294, max =
>> 1.09372
>> > Chebyshev: eigenvalues estimated using gmres with
>> translations [0. 0.1; 0. 1.1]
>> > KSP Object:
>> (solver_fieldsplit_1_mg_levels_3_esteig_) 1 MPI processes
>> > type: gmres
>> > GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> > GMRES: happy breakdown tolerance 1e-30
>> > maximum iterations=10, initial guess is zero
>> > tolerances: relative=1e-12, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using PRECONDITIONED norm type for convergence test
>> > maximum iterations=2
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using nonzero initial guess
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_1_mg_levels_3_)
>> 1 MPI processes
>> > type: sor
>> > SOR: type = local_symmetric, iterations = 1, local
>> iterations = 1, omega = 1.
>> > linear system matrix = precond matrix:
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=52147, cols=52147
>> > total: nonzeros=38604909, allocated nonzeros=38604909
>> > total number of mallocs used during MatSetValues calls =2
>> > not using I-node routines
>> > Up solver (post-smoother) same as down solver (pre-smoother)
>> > Down solver (pre-smoother) on level 4
>> -------------------------------
>> > KSP Object: (solver_fieldsplit_1_mg_levels_4_)
>> 1 MPI processes
>> > type: chebyshev
>> > Chebyshev: eigenvalue estimates: min = 0.158979, max =
>> 1.74876
>> > Chebyshev: eigenvalues estimated using gmres with
>> translations [0. 0.1; 0. 1.1]
>> > KSP Object:
>> (solver_fieldsplit_1_mg_levels_4_esteig_) 1 MPI processes
>> > type: gmres
>> > GMRES: restart=30, using Classical (unmodified)
>> Gram-Schmidt Orthogonalization with no iterative refinement
>> > GMRES: happy breakdown tolerance 1e-30
>> > maximum iterations=10, initial guess is zero
>> > tolerances: relative=1e-12, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using PRECONDITIONED norm type for convergence test
>> > maximum iterations=2
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using nonzero initial guess
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_1_mg_levels_4_)
>> 1 MPI processes
>> > type: sor
>> > SOR: type = local_symmetric, iterations = 1, local
>> iterations = 1, omega = 1.
>> > linear system matrix followed by preconditioner matrix:
>> > Mat Object: (solver_fieldsplit_1_) 1
>> MPI processes
>> > type: schurcomplement
>> > rows=384000, cols=384000
>> > Schur complement A11 - A10 inv(A00) A01
>> > A11
>> > Mat Object: (solver_fieldsplit_1_)
>> 1 MPI processes
>> > type: seqaij
>> > rows=384000, cols=384000
>> > total: nonzeros=384000, allocated nonzeros=384000
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > A10
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=384000, cols=777600
>> > total: nonzeros=1919999, allocated nonzeros=1919999
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > KSP of A00
>> > KSP Object: (solver_fieldsplit_0_)
>> 1 MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_0_)
>> 1 MPI processes
>> > type: bjacobi
>> > block Jacobi: number of blocks = 1
>> > Local solve is same for all blocks, in the
>> following KSP and PC objects:
>> > KSP Object:
>> (solver_fieldsplit_0_sub_) 1 MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object:
>> (solver_fieldsplit_0_sub_) 1 MPI processes
>> > type: ilu
>> > ILU: out-of-place factorization
>> > 0 levels of fill
>> > tolerance for zero pivot 2.22045e-14
>> > matrix ordering: natural
>> > factor fill ratio given 1., needed 1.
>> > Factored matrix follows:
>> > Mat Object:
>> 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > package used to perform factorization:
>> petsc
>> > total: nonzeros=5385600, allocated
>> nonzeros=5385600
>> > total number of mallocs used during
>> MatSetValues calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object:
>> (solver_fieldsplit_0_) 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > total: nonzeros=5385600, allocated
>> nonzeros=5385600
>> > total number of mallocs used during
>> MatSetValues calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object:
>> (solver_fieldsplit_0_) 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > total: nonzeros=5385600, allocated
>> nonzeros=5385600
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > A01
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=384000
>> > total: nonzeros=1919999, allocated nonzeros=1919999
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=384000, cols=384000
>> > total: nonzeros=3416452, allocated nonzeros=3416452
>> > total number of mallocs used during MatSetValues calls =0
>> > not using I-node routines
>> > Up solver (post-smoother) same as down solver (pre-smoother)
>> > linear system matrix followed by preconditioner matrix:
>> > Mat Object: (solver_fieldsplit_1_) 1 MPI
>> processes
>> > type: schurcomplement
>> > rows=384000, cols=384000
>> > Schur complement A11 - A10 inv(A00) A01
>> > A11
>> > Mat Object: (solver_fieldsplit_1_)
>> 1 MPI processes
>> > type: seqaij
>> > rows=384000, cols=384000
>> > total: nonzeros=384000, allocated nonzeros=384000
>> > total number of mallocs used during MatSetValues calls
>> =0
>> > not using I-node routines
>> > A10
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=384000, cols=777600
>> > total: nonzeros=1919999, allocated nonzeros=1919999
>> > total number of mallocs used during MatSetValues calls
>> =0
>> > not using I-node routines
>> > KSP of A00
>> > KSP Object: (solver_fieldsplit_0_)
>> 1 MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object: (solver_fieldsplit_0_)
>> 1 MPI processes
>> > type: bjacobi
>> > block Jacobi: number of blocks = 1
>> > Local solve is same for all blocks, in the following
>> KSP and PC objects:
>> > KSP Object:
>> (solver_fieldsplit_0_sub_) 1 MPI processes
>> > type: preonly
>> > maximum iterations=10000, initial guess is zero
>> > tolerances: relative=1e-05, absolute=1e-50,
>> divergence=10000.
>> > left preconditioning
>> > using NONE norm type for convergence test
>> > PC Object:
>> (solver_fieldsplit_0_sub_) 1 MPI processes
>> > type: ilu
>> > ILU: out-of-place factorization
>> > 0 levels of fill
>> > tolerance for zero pivot 2.22045e-14
>> > matrix ordering: natural
>> > factor fill ratio given 1., needed 1.
>> > Factored matrix follows:
>> > Mat Object: 1 MPI
>> processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > package used to perform factorization: petsc
>> > total: nonzeros=5385600, allocated
>> nonzeros=5385600
>> > total number of mallocs used during
>> MatSetValues calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object:
>> (solver_fieldsplit_0_) 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > total: nonzeros=5385600, allocated
>> nonzeros=5385600
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object: (solver_fieldsplit_0_)
>> 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=777600
>> > total: nonzeros=5385600, allocated nonzeros=5385600
>> > total number of mallocs used during MatSetValues
>> calls =0
>> > not using I-node routines
>> > A01
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=777600, cols=384000
>> > total: nonzeros=1919999, allocated nonzeros=1919999
>> > total number of mallocs used during MatSetValues calls
>> =0
>> > not using I-node routines
>> > Mat Object: 1 MPI processes
>> > type: seqaij
>> > rows=384000, cols=384000
>> > total: nonzeros=3416452, allocated nonzeros=3416452
>> > total number of mallocs used during MatSetValues calls =0
>> > not using I-node routines
>> > linear system matrix = precond matrix:
>> > Mat Object: 1 MPI processes
>> > type: nest
>> > rows=1161600, cols=1161600
>> > Matrix object:
>> > type=nest, rows=2, cols=2
>> > MatNest structure:
>> > (0,0) : prefix="solver_fieldsplit_0_", type=seqaij,
>> rows=777600, cols=777600
>> > (0,1) : type=seqaij, rows=777600, cols=384000
>> > (1,0) : type=seqaij, rows=384000, cols=777600
>> > (1,1) : prefix="solver_fieldsplit_1_", type=seqaij,
>> rows=384000, cols=384000
>> >
>> > Any insight as to what's happening? By the way, this firedrake/petsc-mapdes
>> > build is from way back in October 2015 (yes, much has changed since, but
>> > reinstalling/updating firedrake and petsc on LANL's firewalled HPC machines
>> > is a big pain in the ass).
>> >
>> > Thanks,
>> > Justin
>>
>>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener