petsc-users Digest, Vol 2, Issue 32
Barry Smith
bsmith at mcs.anl.gov
Tue Feb 24 12:25:44 CST 2009
SNESSolve            100 1.0 2.9339e+03 1.0 9.17e+05 1.0 0.0e+00 0.0e+00 1.0e+03 100 100  0  0  1  100
SNESLineSearch       202 1.0 7.9707e+02 1.0 4.35e+05 1.0 0.0e+00 0.0e+00 4.0e+02  27  13  0  0  0   27
SNESFunctionEval     302 1.0 1.1836e+03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+02  40   0  0  0  0   40
SNESJacobianEval     202 1.0 1.7238e+03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02  59   0  0  0  0   59
The final column above (I have removed the later ones for clarity) is
the problem. Your function evaluation is taking 40% of the time and
your Jacobian evaluation 59%, while the PETSc linear solver takes only
about 1%, so something is seriously wrong in your function evaluation
and Jacobian evaluation code.
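If you want to see exactly where that time goes, one option is to wrap
your own residual routine (and similarly the Jacobian routine) in a
user-defined logging event so it shows up as its own line in the
-log_summary output. A rough sketch only (FormFunction and the event
name are placeholders for your own callback, and the
PetscLogEventRegister() argument order has changed across PETSc
releases, so check the manual page for your version):

  #include "petscsnes.h"

  static PetscLogEvent USER_Residual;   /* user-defined profiling event */

  /* placeholder for your own SNES residual callback */
  PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
  {
    PetscErrorCode ierr;
    ierr = PetscLogEventBegin(USER_Residual, 0, 0, 0, 0);CHKERRQ(ierr);
    /* ... assemble the nonlinear residual into f ... */
    ierr = PetscLogEventEnd(USER_Residual, 0, 0, 0, 0);CHKERRQ(ierr);
    return 0;
  }

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;
    ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);
    /* recent argument order shown; older releases (such as 2.3.3) pass
       the event pointer first and use *_COOKIE constants instead of a
       classid */
    ierr = PetscLogEventRegister("UserResidual", PETSC_OBJECT_CLASSID,
                                 &USER_Residual);CHKERRQ(ierr);
    /* ... create the SNES, call SNESSetFunction(snes, r, FormFunction, ctx),
       solve as usual, and run with -log_summary to see the new event ... */
    ierr = PetscFinalize();CHKERRQ(ierr);
    return 0;
  }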
Barry
On Feb 24, 2009, at 12:17 PM, STEPHANE TCHOUANMO wrote:
> Here is my -log_summary:
> Something looks a bit strange to me; it's the MPI Reductions below.
> Other than that, I don't see anything relevant.
> What do you think?
> Thanks
>
>
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> ./diff-conv-opt on a linux-gnu named linux-stchouan with 1 processor, by stephane Tue Feb 24 13:54:35 2009
> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124
>
>                          Max       Max/Min     Avg        Total
> Time (sec):           2.937e+03    1.00000   2.937e+03
> Objects:              3.420e+03    1.00000   3.420e+03
> Flops:                2.690e+09    1.00000   2.690e+09  2.690e+09
> Flops/sec:            9.161e+05    1.00000   9.161e+05  9.161e+05
> MPI Messages:         0.000e+00    0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00    0.00000   0.000e+00  0.000e+00
> MPI Reductions:       1.189e+05    1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                           e.g., VecAXPY() for real vectors of length N --> 2N flops
>                           and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:  ----- Time ------   ----- Flops -----   --- Messages ---   -- Message Lengths --   -- Reductions --
>                       Avg     %Total      Avg     %Total    counts   %Total      Avg        %Total     counts   %Total
>  0:  Main Stage:   2.9367e+03 100.0%   2.6905e+09 100.0%   0.000e+00   0.0%   0.000e+00       0.0%    1.106e+03   0.9%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops/sec: Max - maximum over all processors
>                        Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
> ------------------------------------------------------------------------------------------------------------------------
>
> Event                Count      Time (sec)      Flops/sec                           --- Global ---    --- Stage ---    Total
>                    Max Ratio  Max      Ratio   Max      Ratio  Mess  Avg len Reduct  %T %F %M %L %R    %T %F %M %L %R  Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecDot               202 1.0 3.0360e-02 1.0 3.96e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0    396
> VecMDot              202 1.0 3.0552e-02 1.0 3.94e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0    394
> VecNorm             1110 1.0 1.2257e+00 1.0 5.40e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0    0  2  0  0  0     54
> VecScale             404 1.0 3.5342e-02 1.0 3.41e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0    341
> VecCopy              507 1.0 8.4626e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> VecSet              1408 1.0 1.1664e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> VecAXPY              202 1.0 2.6221e-02 1.0 4.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0    459
> VecWAXPY             202 1.0 4.4239e-02 1.0 1.36e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0    136
> VecMAXPY             404 1.0 7.3515e-02 1.0 3.27e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0    0  1  0  0  0    327
> VecAssemblyBegin     302 1.0 9.2960e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> VecAssemblyEnd       302 1.0 5.5790e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> VecScatterBegin      603 1.0 1.9933e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> VecNormalize         404 1.0 5.5408e-01 1.0 6.52e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0    0  1  0  0  0     65
> MatMult              404 1.0 2.6457e+00 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 22  0  0  0    0 22  0  0  0    226
> MatSolve             404 1.0 4.6454e+00 1.0 1.28e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 22  0  0  0    0 22  0  0  0    128
> MatLUFactorNum       202 1.0 1.5211e+01 1.0 8.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1 50  0  0  0    1 50  0  0  0     89
> MatILUFactorSym      100 1.0 1.9993e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02  0  0  0  0  0    0  0  0  0  9      0
> MatAssemblyBegin     404 1.0 9.6217e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> MatAssemblyEnd       404 1.0 1.4601e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> MatGetRowIJ          100 1.0 2.4641e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> MatGetOrdering       100 1.0 7.6755e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02  0  0  0  0  0    0  0  0  0 18      0
> MatZeroEntries        99 1.0 3.6160e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> SNESSolve            100 1.0 2.9339e+03 1.0 9.17e+05 1.0 0.0e+00 0.0e+00 1.0e+03 100 100 0  0  1  100 100 0  0 91      1
> SNESLineSearch       202 1.0 7.9707e+02 1.0 4.35e+05 1.0 0.0e+00 0.0e+00 4.0e+02  27 13  0  0  0   27 13  0  0 37      0
> SNESFunctionEval     302 1.0 1.1836e+03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+02  40  0  0  0  0   40  0  0  0 27      0
> SNESJacobianEval     202 1.0 1.7238e+03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02  59  0  0  0  0   59  0  0  0 18      0
> KSPGMRESOrthog       202 1.0 7.0303e-02 1.0 3.42e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0    0  1  0  0  0    342
> KSPSetup             202 1.0 4.6391e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0    0  0  0  0  0      0
> KSPSolve             202 1.0 2.4101e+01 1.0 9.65e+07 1.0 0.0e+00 0.0e+00 3.0e+02  1 86  0  0  0    1 86  0  0 27     97
> PCSetUp              202 1.0 1.7296e+01 1.0 7.78e+07 1.0 0.0e+00 0.0e+00 3.0e+02  1 50  0  0  0    1 50  0  0 27     78
> PCApply              404 1.0 4.6487e+00 1.0 1.28e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 22  0  0  0    0 22  0  0  0    128
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type      Creations   Destructions   Memory      Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
> Index Set 904 901 107564984 0
> Vec 1511 1497 357441684 0
> Vec Scatter 604 604 0 0
> Matrix 101 99 942432084 0
> SNES 100 99 12276 0
> Krylov Solver 100 99 1671120 0
> Preconditioner 100 99 14256 0
> ======================================================================
> Average time to get PetscTime(): 1.49164e-06
> OptionTable: -snes_converged_reason
> OptionTable: -snes_max_it 20
> OptionTable: -snes_rtol 0.0000001
> OptionTable: -snes_stol 0.001
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8
> Configure run at: Mon Feb 23 23:01:43 2009
> Configure options: --with-debugging=no -with-shared --download-mpich=1
> -----------------------------------------
>
>
>
>
> >
> > Message: 7
> > Date: Tue, 24 Feb 2009 18:05:38 +0100
> > From: STEPHANE TCHOUANMO <tchouanm at msn.com>
> > Subject: Petsc slowness on a single processor machine?
> > To: <petsc-users at mcs.anl.gov>
> > Message-ID: <BAY107-W5D076FE1F3A7037240F88C2AF0 at phx.gbl>
> > Content-Type: text/plain; charset="windows-1256"
> >
> >
> > Dear all,
> >
> > I use PETSc to solve a nonlinear convection-diffusion type equation
> > in porous media.
> > The assembled linear system has size 30000, and solving the problem
> > takes about an hour on a single-processor machine (my laptop...).
> > I run a hundred time steps with on average 4 Newton iterations per
> > time step, so an hour seems huge to me with the optimized version
> > of PETSc.
> > I tried to profile my code with the options "-log_summary" and
> > "-info". What I get out of it is that the first time step lasts
> > about 15 min, representing 25% of the total time, which I think is
> > not normal.
> > The thing is, these 15 min are spent in the second call to the
> > PETSc routine "VecScatterCreate()", right after the first Newton
> > residual.
> >
> > Any idea?
> > Thanks.
> >
> > Stephane
> >
> >
> >
> >
> > ------------------------------
> >
> > Message: 8
> > Date: Tue, 24 Feb 2009 11:22:18 -0600
> > From: Matthew Knepley <knepley at gmail.com>
> > Subject: Re: Petsc slowness on a single processor machine?
> > To: PETSc users list <petsc-users at mcs.anl.gov>
> > Message-ID:
> > <a9f269830902240922y6751b40aqf8ea84257c139fd9 at mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > 2009/2/24 STEPHANE TCHOUANMO <tchouanm at msn.com>
> >
> > > Dear all,
> > >
> > > I use PETSc to solve a nonlinear convection-diffusion type equation
> > > in porous media.
> > > The assembled linear system has size 30000, and solving the problem
> > > takes about an hour on a single-processor machine (my laptop...).
> > > I run a hundred time steps with on average 4 Newton iterations per
> > > time step, so an hour seems huge to me with the optimized version
> > > of PETSc.
> > > I tried to profile my code with the options "-log_summary" and
> > > "-info". What I get out of it is that the first time step lasts
> > > about 15 min, representing 25% of the total time, which I think is
> > > not normal.
> > > The thing is, these 15 min are spent in the second call to the
> > > PETSc routine "VecScatterCreate()", right after the first Newton
> > > residual.
> > >
> >
> > We can't say anything without seeing the entire output of -log_summary.
> >
> > Matt
> >
> >
> > >
> > > Any idea?
> > > Thanks.
> > >
> > > Stephane
> > >
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> > experiments is infinitely more interesting than any results to which
> > their experiments lead.
> > -- Norbert Wiener
> >
>