PETSc is an MPI library; it is not an OpenMP library. Only some of the external packages that PETSc uses can use OpenMP; components such as GAMG will make essentially no use of OpenMP.

  Barry
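For example (a sketch only; the executable name, thread/process counts, and any problem-size options are placeholders, not taken from this thread), the parallelism has to come from MPI ranks rather than from OpenMP threads:

  # OpenMP threads alone do not parallelize the PETSc solve; one rank does all the work
  export OMP_NUM_THREADS=60
  ./ex5

  # the PETSc way: one MPI rank per core, one OpenMP thread per rank
  export OMP_NUM_THREADS=1
  mpiexec -n 60 ./ex5 -log_view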
  
On Jun 1, 2022, at 1:37 PM, Lidia <lidia.varsh@mail.ioffe.ru> wrote:
  <div class=""><p class="">Dear Matt,</p><p class="">Thank you for the rule of 10,000 variables per process! We have
      run ex.5 with matrix 1e4 x 1e4 at our cluster and got a good
      performance dynamics (see the figure "performance.png" -
      dependency of the solving time in seconds on the number of cores).
      We have used GAMG preconditioner (multithread: we have added the
      option "<span style="color: rgb(29, 28, 29); font-family:
        Slack-Lato, Slack-Fractions, appleLogo, sans-serif; font-size:
        15px; font-style: normal; font-variant-ligatures:
        common-ligatures; font-variant-caps: normal; font-weight: 400;
        letter-spacing: normal; orphans: 2; text-align: left;
        text-indent: 0px; text-transform: none; white-space: normal;
        widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;
        background-color: rgb(255, 255, 255); text-decoration-thickness:
        initial; text-decoration-style: initial; text-decoration-color:
        initial; display: inline !important; float: none;" class="">-pc_gamg_use_parallel_coarse_grid_solver"</span>)
      and GMRES solver. And we have set one openMP thread to every MPI
      process. Now the ex.5 is working good on many mpi processes! But
      the running uses about 100 GB of RAM.<br class="">
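A command line for such a run might look roughly like the following (a sketch only; the launcher, rank count, and any ex.5 size arguments are illustrative):

  mpiexec -n 60 ./ex5 -ksp_type gmres -pc_type gamg -pc_gamg_use_parallel_coarse_grid_solver -ksp_monitor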
    </p><p class="">How we can run ex.5 using many openMP threads without mpi? If we
      just change the running command, the
      cores are not loaded normally: usually just one core is loaded in
      100 % and others are idle. Sometimes all cores are working in 100
      %
      during 1 second but then again become idle about 30 seconds. Can
      the preconditioner use many threads and how to activate this
      option?</p><p class="">The solving times (the time of the solver work) using 60 openMP
      threads is 511 seconds now, and while using 60 MPI processes -
      13.19 seconds.</p><p class="">ksp_monitor outs for both cases (many openMP threads or many MPI
      processes) are attached.</p><p class=""><br class="">
    </p><p class="">Thank you!</p>
    Best,<br class="">
    Lidia<br class="">
    <div class="moz-cite-prefix"><br class="">
    </div>
    <div class="moz-cite-prefix">On 31.05.2022 15:21, Matthew Knepley
      wrote:<br class="">
    </div>
    <blockquote type="cite" cite="mid:CAMYG4G=2+46y86OfYccv4UJV991gyOJndaZC_7UuZM-aanDnOA@mail.gmail.com" class="">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8" class="">
      <div dir="ltr" class="">I have looked at the local logs. First, you have
        run problems of size 12  and 24. As a rule of thumb, you need
        10,000
        <div class="">variables per process in order to see good speedup.</div>
        <div class=""><br class="">
        </div>
        <div class="">  Thanks,</div>
        <div class=""><br class="">
        </div>
        <div class="">     Matt</div>
      </div>
      <br class="">
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, May 31, 2022 at 8:19
          AM Matthew Knepley <<a href="mailto:knepley@gmail.com" moz-do-not-send="true" class="moz-txt-link-freetext">knepley@gmail.com</a>>
          wrote:<br class="">
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div dir="ltr" class="">
            <div dir="ltr" class="">On Tue, May 31, 2022 at 7:39 AM Lidia <<a href="mailto:lidia.varsh@mail.ioffe.ru" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
              wrote:<br class="">
            </div>
            <div class="gmail_quote">
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
                0.8ex;border-left:1px solid
                rgb(204,204,204);padding-left:1ex">
                <div class=""><p class="">Matt, Mark, thank you much for your answers!</p><p class=""><br class="">
                  </p><p class="">Now we have run example # 5 on our computer cluster
                    and on the local server and also have not seen any
                    performance increase, but by unclear reason running
                    times on the local server are much better than on
                    the cluster.</p>
                </div>
              </blockquote>
              <div class="">I suspect that you are trying to get speedup without
                increasing the memory bandwidth:</div>
              <div class=""><br class="">
              </div>
              <div class="">  <a href="https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup</a></div>
              <div class=""><br class="">
              </div>
              <div class="">  Thanks,</div>
              <div class=""><br class="">
              </div>
              <div class="">     Matt <br class="">
              </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
                0.8ex;border-left:1px solid
                rgb(204,204,204);padding-left:1ex">
                <div class=""><p class="">Now we will try to run petsc #5 example inside a
                    docker container on our server and see if the
                    problem is in our environment. I'll write you the
                    results of this test as soon as we get it.</p><p class="">The ksp_monitor outs for the 5th test at the
                    current local server configuration (for 2 and 4 mpi
                    processes) and for the cluster (for 1 and 3 mpi
                    processes) are attached .</p><p class=""><br class="">
                  </p><p class="">And one more question. Potentially we can use 10
                    nodes and 96 threads at each node on our cluster.
                    What do you think, which combination of numbers of
                    mpi processes and openmp threads may be the best for
                    the 5th example?<br class="">
                  </p><p class="">Thank you!<br class="">
                  </p><p class=""><br class="">
                  </p>
                  Best,<br class="">
                  Lidiia<br class="">
                  <div class=""><br class="">
                  </div>
                  <div class="">On 31.05.2022 05:42, Mark Adams wrote:<br class="">
                  </div>
                  <blockquote type="cite" class="">
                    <div dir="ltr" class="">And if you see "NO" change in
                      performance I suspect the solver/matrix is all on
                      one processor.
                      <div class="">(PETSc does not use threads by default so
                        threads should not change anything).</div>
                      <div class=""><br class="">
                      </div>
                      <div class="">As Matt said, it is best to start with a
                        PETSc example that does something like what you
                        want (parallel linear solve, see
                        src/ksp/ksp/tutorials for examples), and then
                        add your code to it.</div>
                      <div class="">That way you get the basic infrastructure in
                        place for you, which is pretty obscure to the
                        uninitiated.</div>
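For example (a sketch; it assumes a working PETSc build with PETSC_DIR and PETSC_ARCH set, and the rank count is arbitrary), one of the KSP tutorials can be built and run in parallel like this:

  cd $PETSC_DIR/src/ksp/ksp/tutorials
  make ex5
  mpiexec -n 4 ./ex5 -ksp_monitor -log_view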
                      <div class=""><br class="">
                      </div>
                      <div class="">Mark</div>
                    </div>
                    <br class="">
                    <div class="gmail_quote">
                      <div dir="ltr" class="gmail_attr">On Mon, May 30,
                        2022 at 10:18 PM Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">knepley@gmail.com</a>>
                        wrote:<br class="">
                      </div>
                      <blockquote class="gmail_quote" style="margin:0px
                        0px 0px 0.8ex;border-left:1px solid
                        rgb(204,204,204);padding-left:1ex">
                        <div dir="ltr" class="">
                          <div dir="ltr" class="">On Mon, May 30, 2022 at 10:12
                            PM Lidia <<a href="mailto:lidia.varsh@mail.ioffe.ru" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
                            wrote:<br class="">
                          </div>
                          <div class="gmail_quote">
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex">Dear
                              colleagues,<br class="">
                              <br class="">
                              Is here anyone who have solved big sparse
                              linear matrices using PETSC?<br class="">
                            </blockquote>
                            <div class=""><br class="">
                            </div>
                            <div class="">There are lots of publications with
                              this kind of data. Here is one recent
                              one: <a href="https://arxiv.org/abs/2204.01722" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://arxiv.org/abs/2204.01722</a></div>
                            <div class=""> </div>
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex"> We
                              have found NO performance improvement
                              while using more and more mpi <br class="">
                              processes (1-2-3) and open-mp threads
                              (from 1 to 72 threads). Did anyone <br class="">
                              faced to this problem? Does anyone know
                              any possible reasons of such <br class="">
                              behaviour?<br class="">
                            </blockquote>
                            <div class=""><br class="">
                            </div>
                            <div class="">Solver behavior is dependent on the
                              input matrix. The only general-purpose
                              solvers</div>
                            <div class="">are direct, but they do not scale
                              linearly and have high memory
                              requirements.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">Thus, in order to make progress you
                              will have to be specific about your
                              matrices.</div>
                            <div class=""> </div>
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex"> We use
                              AMG preconditioner and GMRES solver from
                              KSP package, as our <br class="">
                              matrix is large (from 100 000 to 1e+6 rows
                              and columns), sparse, <br class="">
                              non-symmetric and includes both positive
                              and negative values. But <br class="">
                              performance problems also exist while
                              using CG solvers with symmetric <br class="">
                              matrices.<br class="">
                            </blockquote>
                            <div class=""><br class="">
                            </div>
                            <div class="">There are many PETSc examples, such as
                              example 5 for the Laplacian, that exhibit</div>
                            <div class="">good scaling with both AMG and GMG.</div>
                            <div class=""> </div>
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex"> Could
                              anyone help us to set appropriate options
                              of the preconditioner <br class="">
                              and solver? Now we use default parameters,
                              maybe they are not the best, <br class="">
                              but we do not know a good combination. Or
                              maybe you could suggest any <br class="">
                              other pairs of preconditioner+solver for
                              such tasks?<br class="">
                              <br class="">
                              I can provide more information: the
                              matrices that we solve, c++ script <br class="">
                              to run solving using petsc and any
                              statistics obtained by our runs.<br class="">
                            </blockquote>
                            <div class=""><br class="">
                            </div>
                            <div class="">First, please provide a description of
                              the linear system, and the output of</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">  -ksp_view -ksp_monitor_true_residual
                              -ksp_converged_reason -log_view</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">for each test case.</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">  Thanks,</div>
                            <div class=""><br class="">
                            </div>
                            <div class="">     Matt</div>
                            <div class=""> </div>
                            <blockquote class="gmail_quote" style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex"> Thank
                              you in advance!<br class="">
                              <br class="">
                              Best regards,<br class="">
                              Lidiia Varshavchik,<br class="">
                              Ioffe Institute, St. Petersburg, Russia<br class="">
                            </blockquote>
                          </div>
                          <br clear="all" class="">
                          <div class=""><br class="">
                          </div>
                          -- <br class="">
                          <div dir="ltr" class="">
                            <div dir="ltr" class="">
                              <div class="">
                                <div dir="ltr" class="">
                                  <div class="">
                                    <div dir="ltr" class="">
                                      <div class="">What most experimenters take
                                        for granted before they begin
                                        their experiments is infinitely
                                        more interesting than any
                                        results to which their
                                        experiments lead.<br class="">
                                        -- Norbert Wiener</div>
                                      <div class=""><br class="">
                                      </div>
                                      <div class=""><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank" moz-do-not-send="true" class="">https://www.cse.buffalo.edu/~knepley/</a><br class="">
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                    </div>
                  </blockquote>
                </div>
              </blockquote>
            </div>
            <br clear="all" class="">
            <div class=""><br class="">
            </div>
            -- <br class="">
            <div dir="ltr" class="">
              <div dir="ltr" class="">
                <div class="">
                  <div dir="ltr" class="">
                    <div class="">
                      <div dir="ltr" class="">
                        <div class="">What most experimenters take for granted
                          before they begin their experiments is
                          infinitely more interesting than any results
                          to which their experiments lead.<br class="">
                          -- Norbert Wiener</div>
                        <div class=""><br class="">
                        </div>
                        <div class=""><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank" moz-do-not-send="true" class="">https://www.cse.buffalo.edu/~knepley/</a><br class="">
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </blockquote>
      </div>
      <br clear="all" class="">
      <div class=""><br class="">
      </div>
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

<span id="cid:A578CF80-2032-4B46-96FE-931EECEE344E"><performance.png></span><span id="cid:DE6DD3E9-E3EA-4A3A-9BCD-282B169B934A"><testOpenMP.txt></span><span id="cid:718D9EFF-D145-486E-B086-AE60B6C17267"><testMpi60.txt></span></div></blockquote></div><br class=""></div></body></html>