[petsc-users] Enquiry regarding log summary results

Jed Brown jedbrown at mcs.anl.gov
Wed Oct 3 10:59:59 CDT 2012


There is an inordinate amount of time being spent in VecScatterEnd(). That
sometimes indicates a very bad partition. Also, are your "48 cores" real
physical cores or just "logical cores" (they look like cores to the operating
system and are usually advertised as "threads" by the vendor, but are nothing
like cores in reality)? That can cause a huge load imbalance and very
confusing results as over-subscribed threads compete for shared resources.
Step back to 24 processes and then 12, and send the -log_summary output for
each.
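
For example, to generate the two logs to compare (a sketch; the MPI launcher,
executable name, and output file names are placeholders, not from this thread):

  mpiexec -n 24 ./your_solver -log_summary > log_24procs.txt
  mpiexec -n 12 ./your_solver -log_summary > log_12procs.txt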

On Wed, Oct 3, 2012 at 8:08 AM, TAY wee-beng <zonexo at gmail.com> wrote:

>  On 2/10/2012 2:43 PM, Jed Brown wrote:
>
> On Tue, Oct 2, 2012 at 8:35 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>
>>  Hi,
>>
>> I have combined the momentum linear eqns involving x, y, z into 1 large
>> matrix. The Poisson eqn is solved using HYPRE's Struct format, so it's not
>> included. I run the code for 50 timesteps (hence 50 KSPSolves) using 96
>> procs. The log_summary is given below. I have some questions:
>>
>> 1. After combining the matrices, I should have only 1 PETSc matrix. Why
>> does it say there are 4 matrices, 12 vectors, etc.?
>>
>
>  They are part of preconditioning. Are you sure you're using Hypre for
> this? It looks like you are using bjacobi/ilu.
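>
> (A quick check, if in doubt: run with -ksp_view, which prints the KSP and PC
> objects actually being used; if Hypre were active, the PC type would be
> reported as hypre there.)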
>
>
>>
>> 2. I'm looking at the stages which take the longest time. It seems that
>> MatAssemblyBegin, VecNorm, VecAssemblyBegin, and VecScatterEnd have very
>> high ratios. The ratios of some others are also not too good (~1.6 - 2). So
>> are these stages the reason why my code is not scaling well? What can I do
>> to improve it?
>>
>
>  3/4 of the solve time is evenly balanced between MatMult, MatSolve,
> MatLUFactorNumeric, and VecNorm+VecDot.
>
>  The high VecAssembly time might be due to generating a lot of entries
> off-process?
>
>  In any case, this looks like an _extremely_ slow network, perhaps it's
> misconfigured?
>
>
> My cluster is configured with 48 procs per node. I re-ran the case using
> only 48 procs, so there is no need to pass over a 'slow' interconnect. I'm
> now also using GAMG and BCGS for the Poisson and momentum eqns respectively.
> I have also separated the x, y, z components of the momentum eqn into 3
> separate linear eqns to debug the problem.
>
> Results show that the stage "momentum_z" is taking a lot of time. I wonder
> if it has to do with the fact that I am partitioning my grids in the z
> direction. VecScatterEnd and MatMult are taking a lot of time. The ratios of
> VecNormalize, VecScatterEnd, VecNorm, and VecAssemblyBegin are also not good.
>
> I wonder why a lot of entries are generated off-process.
>
> I create my RHS vector using:
>
> call
> VecCreateMPI(MPI_COMM_WORLD,ijk_xyz_end-ijk_xyz_sta,PETSC_DECIDE,b_rhs_semi_z,ierr)
>
> where ijk_xyz_sta and ijk_xyz_end are obtained from
>
> call MatGetOwnershipRange(A_semi_z,ijk_xyz_sta,ijk_xyz_end,ierr)
>
> I then insert the values into the vector using:
>
> call VecSetValues(b_rhs_semi_z , ijk_xyz_end - ijk_xyz_sta ,
> (/ijk_xyz_sta : ijk_xyz_end - 1/) , q_semi_vect_z(ijk_xyz_sta + 1 :
> ijk_xyz_end) , INSERT_VALUES , ierr)
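>
> For reference, a minimal sketch of the same insertion with the index list
> built explicitly through an implied-do (the array ix and loop variable i are
> illustrative names added here; since every index lies inside the range
> returned by MatGetOwnershipRange, all entries should be set locally):
>
>   PetscInt :: i
>   PetscInt :: ix(ijk_xyz_end - ijk_xyz_sta)
>
>   ! global 0-based row indices owned by this process
>   ix = (/ (i, i = ijk_xyz_sta, ijk_xyz_end - 1) /)
>
>   call VecSetValues(b_rhs_semi_z, ijk_xyz_end - ijk_xyz_sta, ix, &
>                     q_semi_vect_z(ijk_xyz_sta + 1 : ijk_xyz_end), &
>                     INSERT_VALUES, ierr)
>   call VecAssemblyBegin(b_rhs_semi_z, ierr)
>   call VecAssemblyEnd(b_rhs_semi_z, ierr)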
>
> What should I do to correct the problem?
>
> Thanks
>
>
>
>
>>
>> Btw, I insert matrix using:
>>
>> do ijk = ijk_xyz_sta+1, ijk_xyz_end
>>
>>     II = ijk - 1    ! Fortran shift to 0-based
>>
>>     call
>> MatSetValues(A_semi_xyz,1,II,7,int_semi_xyz(ijk,1:7),semi_mat_xyz(ijk,1:7),INSERT_VALUES,ierr)
>>
>> end do
>>
>> where ijk_xyz_sta/ijk_xyz_end are the starting/ending indices
>>
>> int_semi_xyz(ijk,1:7) stores the 7 column global indices
>>
>> semi_mat_xyz has the corresponding values.
>>
>> and I insert vectors using:
>>
>> call
>> VecSetValues(b_rhs_semi_xyz,ijk_xyz_end_mz-ijk_xyz_sta_mz,(/ijk_xyz_sta_mz:ijk_xyz_end_mz-1/),q_semi_vect_xyz(ijk_xyz_sta_mz+1:ijk_xyz_end_mz),INSERT_VALUES,ierr)
>>
>> Thanks!
>>
>>
>> Yours sincerely,
>>
>> TAY wee-beng
>>
>>  On 30/9/2012 11:30 PM, Jed Brown wrote:
>>
>> You can measure the time spent in Hypre via PCApply and PCSetUp, but you
>> can't get finer grained integrated profiling because it was not set up that
>> way.
>> On Sep 30, 2012 3:26 PM, "TAY wee-beng" <zonexo at gmail.com> wrote:
>>
>>>  On 27/9/2012 1:44 PM, Matthew Knepley wrote:
>>>
>>> On Thu, Sep 27, 2012 at 3:49 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm doing a log summary for my 3d cfd code. I have some questions:
>>>>
>>>> 1. if I'm solving 3 linear equations using ksp, is the result given in
>>>> the log summary the total of the 3 linear eqns' performance? How can I get
>>>> the performance for each individual eqn?
>>>>
>>>
>>>  Use logging stages:
>>> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogStagePush.html
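>>>
>>> A minimal Fortran sketch of wrapping each solve in its own stage (the
>>> stage names and the ksp/vector variable names here are illustrative):
>>>
>>>   PetscLogStage :: stage_x, stage_y, stage_z
>>>
>>>   call PetscLogStageRegister('momentum_x', stage_x, ierr)
>>>   call PetscLogStageRegister('momentum_y', stage_y, ierr)
>>>   call PetscLogStageRegister('momentum_z', stage_z, ierr)
>>>
>>>   call PetscLogStagePush(stage_x, ierr)
>>>   call KSPSolve(ksp_x, b_x, x_x, ierr)
>>>   call PetscLogStagePop(ierr)
>>>   ! ...similarly push/pop stage_y and stage_z around the other two solves;
>>>   ! each stage then gets its own section in the -log_summary output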
>>>
>>>
>>>> 2. If I run my code for 10 time steps, does the log summary give the
>>>> total or average performance/ratio?
>>>>
>>>
>>>  Total.
>>>
>>>
>>>> 3. Besides PETSc, I'm also using HYPRE's native geometric MG (Struct)
>>>> to solve my Cartesian grid CFD Poisson eqn. Is there any way I can use
>>>> PETSc's log summary to get HYPRE's performance? If I use BoomerAMG through
>>>> PETSc, can I get its performance?
>>>
>>>
>>>  If you mean flops, only if you count them yourself and tell PETSc
>>> using
>>> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogFlops.html
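>>>
>>> For example, a sketch of crediting flops you counted yourself for the
>>> Struct solve (nnz is a hypothetical count you would have to compute; the
>>> 2*nnz estimate is purely illustrative):
>>>
>>>   PetscLogDouble :: flops
>>>
>>>   flops = 2.0d0 * nnz          ! e.g. one multiply and one add per nonzero
>>>   call PetscLogFlops(flops, ierr)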
>>>
>>>  This is the disadvantage of using packages that do not properly
>>> monitor things :)
>>>
>>>      Matt
>>>
>>>
>>> So you mean if I use BoomerAMG through PETSc, there is no proper way of
>>> evaluating its performance, besides using PetscLogFlops?
>>>
>>>
>>>> --
>>>> Yours sincerely,
>>>>
>>>> TAY wee-beng
>>>>
>>>>
>>>
>>>
>>>  --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>>
>>>
>>
>
>

