[petsc-users] Configuring petsc with MPI on ubuntu quad-core

Vijay S. Mahadevan vijay.m at gmail.com
Wed Feb 2 17:38:00 CST 2011


Here are the performance statistics for the 1- and 2-processor runs.

/usr/lib/petsc/linux-gnu-cxx-opt/bin/mpiexec -n 1 ./ex20 -grid 20 -log_summary

                         Max       Max/Min        Avg      Total
Time (sec):           8.452e+00      1.00000   8.452e+00
Objects:              1.470e+02      1.00000   1.470e+02
Flops:                5.045e+09      1.00000   5.045e+09  5.045e+09
Flops/sec:            5.969e+08      1.00000   5.969e+08  5.969e+08
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       4.440e+02      1.00000

/usr/lib/petsc/linux-gnu-cxx-opt/bin/mpiexec -n 2 ./ex20 -grid 20 -log_summary

                         Max       Max/Min        Avg      Total
Time (sec):           7.851e+00      1.00000   7.851e+00
Objects:              2.000e+02      1.00000   2.000e+02
Flops:                4.670e+09      1.00580   4.657e+09  9.313e+09
Flops/sec:            5.948e+08      1.00580   5.931e+08  1.186e+09
MPI Messages:         7.965e+02      1.00000   7.965e+02  1.593e+03
MPI Message Lengths:  1.412e+07      1.00000   1.773e+04  2.824e+07
MPI Reductions:       1.046e+03      1.00000

I am not entirely sure I am reading those statistics correctly, but if
there is anything more you need, please let me know.
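
For what it's worth, here is my rough arithmetic on the numbers above:

  speedup          = 8.452 s / 7.851 s               ~ 1.08x
  total flops      = 9.313e9 (n=2) vs 5.045e9 (n=1)  ~ 1.85x more work
  per-process rate = 5.93e8 vs 5.97e8 flops/s        (essentially unchanged)

Each process seems to compute at about the same rate, but the two-process
run does nearly twice the total work, which would explain why the wall
time barely improves.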

Vijay

On Wed, Feb 2, 2011 at 5:15 PM, Matthew Knepley <knepley at gmail.com> wrote:
> On Wed, Feb 2, 2011 at 5:04 PM, Vijay S. Mahadevan <vijay.m at gmail.com>
> wrote:
>>
>> Matt,
>>
>> The --with-debugging=1 option is certainly not meant for performance
>> studies, but I did not expect it to yield essentially the same CPU time
>> on two processors as on one for snes/ex20; i.e., my runs with 1 and 2
>> processors take approximately the same amount of time to compute the
>> solution. I am currently reconfiguring without debugging and will let
>> you know what that yields.
>>
>> On a similar note, is there anything extra that needs to be done to
>> make use of multi-core machines with MPI? I am not sure whether this is
>> even related to PETSc, but it could be an MPI configuration option that
>> either I or the configure process is missing. All ideas are much
>> appreciated.
>
> Sparse MatVec (MatMult) is a memory-bandwidth-limited operation. On most
> cheap multicore machines there is a single memory bus, so using more
> cores gains you very little extra performance. I still suspect you are
> not actually running in parallel, because even so you would usually see
> a small speedup. That is why I suggested looking at -log_summary: it
> tells you how many processes were run and breaks down the time.
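>
> As a quick sanity check (just a sketch, separate from ex20), a program
> like the one below prints one line per rank; if "mpiexec -n 2" prints
> "rank 0 of 1" twice, the mpiexec you are invoking does not belong to
> the MPI that PETSc was built against:
>
>   #include <petscsys.h>
>   #include <stdio.h>
>
>   int main(int argc, char **argv)
>   {
>     PetscMPIInt rank, size;
>     PetscInitialize(&argc, &argv, NULL, NULL);
>     MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>     MPI_Comm_size(PETSC_COMM_WORLD, &size);
>     /* every rank prints; a real 2-process run gives "rank 0 of 2"
>        and "rank 1 of 2" */
>     printf("rank %d of %d\n", rank, size);
>     PetscFinalize();
>     return 0;
>   }
>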
>    Matt
>
>>
>> Vijay
>>
>> On Wed, Feb 2, 2011 at 4:53 PM, Matthew Knepley <knepley at gmail.com> wrote:
>> > On Wed, Feb 2, 2011 at 4:46 PM, Vijay S. Mahadevan <vijay.m at gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I am trying to configure my PETSc install with an MPI installation to
>> >> make use of a dual quad-core desktop system running Ubuntu. But even
>> >> though the configure/make process went through without problems, the
>> >> scalability of the programs does not seem to be what I expected.
>> >> My configure options are
>> >>
>> >> --download-f-blas-lapack=1 --with-mpi-dir=/usr/lib/ --download-mpich=1
>> >> --with-mpi-shared=0 --with-shared=0 --COPTFLAGS=-g
>> >> --download-parmetis=1 --download-superlu_dist=1 --download-hypre=1
>> >> --download-blacs=1 --download-scalapack=1 --with-clanguage=C++
>> >> --download-plapack=1 --download-mumps=1 --download-umfpack=yes
>> >> --with-debugging=1 --with-errorchecking=yes
>> >
>> > 1) For performance studies, make a build using --with-debugging=0
>> > 2) Look at -log_summary for a breakdown of performance
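>> >
>> > For 1), something along these lines (the extra flags here are only
>> > illustrative; keep whatever download/package options you need):
>> >
>> >   ./configure --with-debugging=0 --with-clanguage=C++ \
>> >     --download-mpich=1 --CXXOPTFLAGS='-O3'
>> >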
>> >    Matt
>> >
>> >>
>> >> Is there something else that needs to be done as part of the configure
>> >> process to enable decent scaling? I am only comparing runs with
>> >> mpiexec (-n 1) and (-n 2), but they seem to take approximately the
>> >> same time, judging from -log_summary. If it helps, I have been testing
>> >> with snes/examples/tutorials/ex20.c throughout, with a custom -grid
>> >> parameter from the command line to control the number of unknowns.
>> >>
>> >> If this is something you have seen before with this configuration, or
>> >> if you need anything else to analyze the problem, do let me know.
>> >>
>> >> Thanks,
>> >> Vijay
>> >
>> >
>> >
>> > --
>> > What most experimenters take for granted before they begin their
>> > experiments
>> > is infinitely more interesting than any results to which their
>> > experiments
>> > lead.
>> > -- Norbert Wiener
>> >
>
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
>

