[petsc-dev] Error running on Titan with GPUs & GNU
Mark Adams
mfadams at lbl.gov
Fri Nov 2 13:25:52 CDT 2018
I also just tested it with GAMG and it seems fine. Hypre ran as well, but it
is not clear that it used the GPUs:
14:13 master= ~/petsc/src/snes/examples/tutorials$ jsrun -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type hypre -ksp_type fgmres -snes_monitor_short -snes_rtol 1.e-5 -ksp_view
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
0 SNES Function norm 0.239155
KSP Object: 1 MPI processes
type: fgmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
Cycle type V
Maximum number of levels 25
Maximum number of iterations PER hypre call 1
Convergence tolerance PER hypre call 0.
Threshold for strong coupling 0.25
Interpolation truncation factor 0.
Interpolation: max elements per row 0
Number of levels of aggressive coarsening 0
Number of paths for aggressive coarsening 1
Maximum row sums 0.9
Sweeps down 1
Sweeps up 1
Sweeps on coarse 1
Relax down symmetric-SOR/Jacobi
Relax up symmetric-SOR/Jacobi
Relax on coarse Gaussian-elimination
Relax weight (all) 1.
Outer relax weight (all) 1.
Using CF-relaxation
Not using more complex smoothers.
Measure type local
Coarsen type Falgout
Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaijcusparse
rows=64, cols=64, bs=4
total: nonzeros=1024, allocated nonzeros=1024
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
1 SNES Function norm 6.80716e-05
KSP Object: 1 MPI processes
type: fgmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: hypre
HYPRE BoomerAMG preconditioning
Cycle type V
Maximum number of levels 25
Maximum number of iterations PER hypre call 1
Convergence tolerance PER hypre call 0.
Threshold for strong coupling 0.25
Interpolation truncation factor 0.
Interpolation: max elements per row 0
Number of levels of aggressive coarsening 0
Number of paths for aggressive coarsening 1
Maximum row sums 0.9
Sweeps down 1
Sweeps up 1
Sweeps on coarse 1
Relax down symmetric-SOR/Jacobi
Relax up symmetric-SOR/Jacobi
Relax on coarse Gaussian-elimination
Relax weight (all) 1.
Outer relax weight (all) 1.
Using CF-relaxation
Not using more complex smoothers.
Measure type local
Coarsen type Falgout
Interpolation type classical
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaijcusparse
rows=64, cols=64, bs=4
total: nonzeros=1024, allocated nonzeros=1024
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 16 nodes, limit used is 5
2 SNES Function norm 4.093e-11
Number of SNES iterations = 2
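
To compare this run with the quoted -pc_type none run, the
-snes_monitor_short lines can be pulled out with a small script. This is
only a minimal sketch and not part of PETSc: the function name
parse_snes_norms is hypothetical, and it assumes the
"<iteration> SNES Function norm <value>" line format shown in the output
above, optionally prefixed by mail quote markers.

```python
import re

# Matches lines such as "1 SNES Function norm 6.80716e-05", optionally
# prefixed by mail quote markers ("> > "). The format is assumed from the
# -snes_monitor_short output shown in this thread.
_NORM_RE = re.compile(r"^[>\s]*(\d+)\s+SNES Function norm\s+(\S+)\s*$")


def parse_snes_norms(text):
    """Return a list of (iteration, norm) pairs found in monitor output."""
    norms = []
    for line in text.splitlines():
        match = _NORM_RE.match(line)
        if match:
            norms.append((int(match.group(1)), float(match.group(2))))
    return norms


# Norms copied from the two runs in this thread.
hypre_log = """\
0 SNES Function norm 0.239155
1 SNES Function norm 6.80716e-05
2 SNES Function norm 4.093e-11
"""

none_log = """\
> > 0 SNES Function norm 0.239155
> > 1 SNES Function norm 6.82338e-05
> > 2 SNES Function norm 3.346e-10
"""

hypre_norms = parse_snes_norms(hypre_log)
none_norms = parse_snes_norms(none_log)

# Print the two runs side by side, one SNES iteration per line.
for (it, h_norm), (_, n_norm) in zip(hypre_norms, none_norms):
    print(f"{it}: hypre={h_norm:.6g}  none={n_norm:.6g}")
```

This only lines up the residual norms of the two runs; it says nothing
about whether hypre actually executed on the GPU.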
On Fri, Nov 2, 2018 at 2:10 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
>
> > On Nov 2, 2018, at 1:03 PM, Mark Adams <mfadams at lbl.gov> wrote:
> >
> > FYI, I seem to have the new GPU machine at ORNL (summitdev) working with
> > GPUs. That is good enough for now.
> > Thanks,
>
> Excellent!
>
> >
> > 14:00 master= ~/petsc/src/snes/examples/tutorials$ jsrun -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -pc_type none -ksp_type fgmres -snes_monitor_short -snes_rtol 1.e-5 -ksp_view
> > lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
> > 0 SNES Function norm 0.239155
> > KSP Object: 1 MPI processes
> > type: fgmres
> > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> > happy breakdown tolerance 1e-30
> > maximum iterations=10000, initial guess is zero
> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> > PC Object: 1 MPI processes
> > type: none
> > linear system matrix = precond matrix:
> > Mat Object: 1 MPI processes
> > type: seqaijcusparse
> > rows=64, cols=64, bs=4
> > total: nonzeros=1024, allocated nonzeros=1024
> > total number of mallocs used during MatSetValues calls =0
> > using I-node routines: found 16 nodes, limit used is 5
> > 1 SNES Function norm 6.82338e-05
> > KSP Object: 1 MPI processes
> > type: fgmres
> > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> > happy breakdown tolerance 1e-30
> > maximum iterations=10000, initial guess is zero
> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> > right preconditioning
> > using UNPRECONDITIONED norm type for convergence test
> > PC Object: 1 MPI processes
> > type: none
> > linear system matrix = precond matrix:
> > Mat Object: 1 MPI processes
> > type: seqaijcusparse
> > rows=64, cols=64, bs=4
> > total: nonzeros=1024, allocated nonzeros=1024
> > total number of mallocs used during MatSetValues calls =0
> > using I-node routines: found 16 nodes, limit used is 5
> > 2 SNES Function norm 3.346e-10
> > Number of SNES iterations = 2
> > 14:01 master= ~/petsc/src/snes/examples/tutorials$
> >
> >
> >
> > On Thu, Nov 1, 2018 at 9:33 AM Mark Adams <mfadams at lbl.gov> wrote:
> >
> >
> > On Wed, Oct 31, 2018 at 12:30 PM Mark Adams <mfadams at lbl.gov> wrote:
> >
> >
> > On Wed, Oct 31, 2018 at 6:59 AM Karl Rupp <rupp at iue.tuwien.ac.at> wrote:
> > Hi Mark,
> >
> > ah, I was confused by the Python information at the beginning of
> > configure.log. So it is picking up the correct compiler.
> >
> > Have you tried uncommenting the check for GNU?
> >
> > Yes, but I am getting an error that the cuda files do not find mpi.h.
> >
> >
> > I'm getting a make error.
> >
> > Thanks,
>
>