<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Dear colleagues,</p>
    <p>Thank you very much for the help!</p>
    <p>Now the code seems to be working well!<br>
    </p>
    Best,<br>
    Lidiia<br>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">On 03.06.2022 15:19, Matthew Knepley
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAMYG4Gm+BCLxfyL4Q22zSgsxLbZvWbu7LiaQqui8sNhU4rofOg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div dir="ltr">On Fri, Jun 3, 2022 at 6:42 AM Lidia <<a
            href="mailto:lidia.varsh@mail.ioffe.ru"
            moz-do-not-send="true" class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
          wrote:<br>
        </div>
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <p>Dear Matt, Barry,</p>
              <p>thank you for the information about openMP!</p>
              <p>Now all processes are fully loaded. However, we see
                strange behaviour in the running times of different
                iterations (described below). Could you please explain
                the reason and how we can improve it?<br>
              </p>
              <p>We need to quickly solve a large (about 1e6 rows) square
                sparse non-symmetric system many times (about 1e5 times)
                consecutively. The matrix is the same at every iteration,
                and the right-hand-side vector B changes slowly (we think
                that its change at every iteration should be less than
                0.001 %). So we use each previous solution vector X as
                the initial guess for the next iteration. An AMG
                preconditioner and the GMRES solver are used.<br>
              </p>
              <p>We have tested the code using a matrix with 631 000
                rows over 15 consecutive iterations, using vector X
                from the previous iteration as the initial guess. The
                right-hand-side vector B and matrix A are constant
                during the whole run. The time of the first iteration
                is large (about 2 seconds) and quickly decreases in the
                following iterations (the average time of the last
                iterations was about 0.00008 s). But some iterations in
                the middle (# 2 and # 12) take a huge time, 0.999063
                seconds (see the attached figure with the time
                dynamics). This time of 0.999 seconds does not depend on
                the size of the matrix or on the number of MPI
                processes, and these time jumps also exist if we vary
                vector B. Why do these time jumps appear and how can we
                avoid them?</p>
            </div>
          </blockquote>
          <div><br>
          </div>
          <div>PETSc is not taking this time. It must come from
            somewhere else in your code. Notice that no iterations are
            taken for any subsequent solves, so no operations other than
            the residual norm check (and preconditioner application) are
            being performed.</div>
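          <div>A minimal sketch of why a repeated solve with an unchanged
            right-hand side takes no iterations: the convergence test runs
            on the initial residual before any update is made. This is not
            PETSc code. A plain Jacobi iteration stands in for
            preconditioned GMRES, and the names jacobi_solve and
            residual_norm as well as the small test matrix are
            illustrative assumptions.</div>
<pre>
```python
def residual_norm(a, x, b):
    """Euclidean norm of b - A x for a dense list-of-lists matrix."""
    total = 0.0
    for row, bi in zip(a, b):
        r = bi - sum(aij * xj for aij, xj in zip(row, x))
        total += r * r
    return total ** 0.5


def jacobi_solve(a, b, x0, rtol=1e-10, max_it=1000):
    """Jacobi iteration with a warm start; returns (x, updates performed)."""
    x = list(x0)
    b_norm = sum(bi * bi for bi in b) ** 0.5
    for it in range(max_it):
        # The convergence test comes before any update, so an initial
        # guess that is already converged costs only this residual check.
        if rtol * b_norm >= residual_norm(a, x, b):
            return x, it
        x = [
            (bi - sum(row[j] * x[j] for j in range(len(x)) if j != i)) / row[i]
            for i, (row, bi) in enumerate(zip(a, b))
        ]
    return x, max_it


# Small diagonally dominant system (illustrative values only).
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]

x1, its_first = jacobi_solve(A, b, [0.0, 0.0, 0.0])  # cold start
x2, its_second = jacobi_solve(A, b, x1)              # warm start, same b
print("first solve updates:", its_first)
print("second solve updates:", its_second)
```
</pre>
          <div>With the same A and B, the second call returns right after
            the residual check, which matches the observation that the
            later solves are very cheap.</div>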
          <div><br>
          </div>
          <div>  Thanks,</div>
          <div><br>
          </div>
          <div>     Matt</div>
          <div> </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <p>The ksp_monitor output for this run (which included 15
                iterations) using 36 MPI processes and a file with the
                memory bandwidth information (testSpeed) are also
                attached. We can provide our C++ code if needed.<br>
              </p>
              <p>Thanks a lot!<br>
              </p>
              Best,<br>
              Lidiia<br>
              <p><br>
              </p>
              <p><br>
              </p>
              <div>On 01.06.2022 21:14, Matthew Knepley wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">
                  <div dir="ltr">On Wed, Jun 1, 2022 at 1:43 PM Lidia
                    <<a href="mailto:lidia.varsh@mail.ioffe.ru"
                      target="_blank" moz-do-not-send="true"
                      class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
                    wrote:<br>
                  </div>
                  <div class="gmail_quote">
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex">
                      <div>
                        <p>Dear Matt,</p>
                        <p>Thank you for the rule of 10,000 variables
                          per process! We have run ex.5 with a 1e4
                          x 1e4 matrix on our cluster and got good
                          performance scaling (see the figure
                          "performance.png": dependence of the solve
                          time in seconds on the number of cores). We
                          have used the GAMG preconditioner (multithread:
                          we have added the option
                          "-pc_gamg_use_parallel_coarse_grid_solver")
                          and the GMRES solver. And we have assigned one
                          OpenMP thread to every MPI process. Now ex.5
                          works well on many MPI processes! But the
                          run uses about 100 GB of RAM.<br>
                        </p>
                        <p>How can we run ex.5 using many OpenMP threads
                          without MPI? If we just change the run
                          command, the cores are not loaded properly:
                          usually just one core is loaded at 100 % and
                          the others are idle. Sometimes all cores
                          work at 100 % for 1 second but then
                          become idle again for about 30 seconds. Can the
                          preconditioner use many threads, and how do we
                          activate this option?</p>
                      </div>
                    </blockquote>
                    <div><br>
                    </div>
                    <div>Maybe you could describe what you are trying to
                      accomplish? Threads and processes are not really
                      different, except for memory sharing. However,
                      sharing large complex data structures rarely
                      works. That is why they get partitioned and
                      operate effectively as distributed memory. You
                      would not really save memory by using</div>
                    <div>threads in this instance, if that is your goal.
                      This is detailed in the talks in this session (see
                      2016 PP Minisymposium on this page <a
                        href="https://cse.buffalo.edu/~knepley/relacs.html"
                        target="_blank" moz-do-not-send="true"
                        class="moz-txt-link-freetext">https://cse.buffalo.edu/~knepley/relacs.html</a>).</div>
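                    <div>As a rough illustration of the partitioning
                      mentioned above, the sketch below splits the rows
                      of a matrix into contiguous blocks, one block per
                      process, with the remainder going to the lowest
                      ranks. The function name ownership_ranges is
                      hypothetical, and the splitting rule is an
                      assumption about a typical default layout, not a
                      statement about what any particular run did. The
                      row and process counts are the ones quoted in this
                      thread.</div>
<pre>
```python
def ownership_ranges(n_rows, n_procs):
    """Split n_rows into n_procs contiguous [start, end) blocks.

    The first (n_rows % n_procs) ranks receive one extra row, so block
    sizes differ by at most one.
    """
    base, extra = divmod(n_rows, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        local = base + (1 if extra > rank else 0)
        ranges.append((start, start + local))
        start += local
    return ranges


# 631 000 rows on 36 processes, as in the test described in this thread.
ranges = ownership_ranges(631_000, 36)
sizes = [end - begin for begin, end in ranges]
print("rank 0 owns rows", ranges[0])
print("largest block:", max(sizes), "smallest block:", min(sizes))
```
</pre>
                    <div>Each process stores and works on only its own
                      block, which is why threads sharing one address
                      space would not reduce the total memory in this
                      setting.</div>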
                    <div><br>
                    </div>
                    <div>  Thanks,</div>
                    <div><br>
                    </div>
                    <div>     Matt</div>
                    <div> </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex">
                      <div>
                        <p>The solve time (the time the solver spends
                          working) using 60 OpenMP threads is 511 seconds
                          now, while using 60 MPI processes it is 13.19
                          seconds.</p>
                        <p>The ksp_monitor outputs for both cases (many
                          OpenMP threads or many MPI processes) are
                          attached.</p>
                        <p><br>
                        </p>
                        <p>Thank you!</p>
                        Best,<br>
                        Lidia<br>
                        <div><br>
                        </div>
                        <div>On 31.05.2022 15:21, Matthew Knepley wrote:<br>
                        </div>
                        <blockquote type="cite">
                          <div dir="ltr">I have looked at the local
                            logs. First, you have run problems of size
                            12  and 24. As a rule of thumb, you need
                            10,000
                            <div>variables per process in order to see
                              good speedup.</div>
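                            <div>The rule of thumb can be turned into a
                              quick arithmetic check. This is only a
                              sketch: the function names are
                              hypothetical, the 10,000 threshold is the
                              rule of thumb stated above, and the problem
                              sizes are the ones mentioned in this
                              thread.</div>
<pre>
```python
def max_useful_processes(n_rows, min_rows_per_process=10_000):
    """Largest process count that keeps every process at or above the threshold."""
    return max(1, n_rows // min_rows_per_process)


def rows_per_process(n_rows, n_procs):
    """Average number of rows each process owns."""
    return n_rows / n_procs


# Sizes taken from this thread: 12 and 24 unknowns, 631 000 rows, about 1e6 rows.
for n in (12, 24, 631_000, 1_000_000):
    print(n, "rows: up to", max_useful_processes(n), "process(es)")

# The 36-process run on the 631 000-row matrix stays above the threshold.
print("rows per process on 36 processes:", rows_per_process(631_000, 36))
```
</pre>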
                            <div><br>
                            </div>
                            <div>  Thanks,</div>
                            <div><br>
                            </div>
                            <div>     Matt</div>
                          </div>
                          <br>
                          <div class="gmail_quote">
                            <div dir="ltr" class="gmail_attr">On Tue,
                              May 31, 2022 at 8:19 AM Matthew Knepley
                              <<a href="mailto:knepley@gmail.com"
                                target="_blank" moz-do-not-send="true"
                                class="moz-txt-link-freetext">knepley@gmail.com</a>>
                              wrote:<br>
                            </div>
                            <blockquote class="gmail_quote"
                              style="margin:0px 0px 0px
                              0.8ex;border-left:1px solid
                              rgb(204,204,204);padding-left:1ex">
                              <div dir="ltr">
                                <div dir="ltr">On Tue, May 31, 2022 at
                                  7:39 AM Lidia <<a
                                    href="mailto:lidia.varsh@mail.ioffe.ru"
                                    target="_blank"
                                    moz-do-not-send="true"
                                    class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
                                  wrote:<br>
                                </div>
                                <div class="gmail_quote">
                                  <blockquote class="gmail_quote"
                                    style="margin:0px 0px 0px
                                    0.8ex;border-left:1px solid
                                    rgb(204,204,204);padding-left:1ex">
                                    <div>
                                      <p>Matt, Mark, thank you very much
                                        for your answers!</p>
                                      <p><br>
                                      </p>
                                      <p>Now we have run example # 5 on
                                        our computer cluster and on the
                                        local server and also have not
                                        seen any performance increase,
                                        but for some unclear reason the
                                        running times on the local server
                                        are much better than on the
                                        cluster.</p>
                                    </div>
                                  </blockquote>
                                  <div>I suspect that you are trying to
                                    get speedup without increasing the
                                    memory bandwidth:</div>
                                  <div><br>
                                  </div>
                                  <div>  <a
href="https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup"
                                      target="_blank"
                                      moz-do-not-send="true"
                                      class="moz-txt-link-freetext">https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup</a></div>
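                                  <div>A back-of-the-envelope sketch of
                                    this point, assuming the solve is
                                    limited purely by memory bandwidth:
                                    the best possible speedup on p
                                    processes is then the ratio of the
                                    achieved bandwidth on p processes to
                                    the bandwidth on one process. The
                                    function name is hypothetical and the
                                    bandwidth figures below are made-up
                                    placeholders, not measurements from
                                    this thread; measured values from a
                                    bandwidth benchmark would replace
                                    them.</div>
<pre>
```python
def bandwidth_bound_speedup(bandwidth_by_procs):
    """Upper bound on speedup if run time scales inversely with bandwidth.

    bandwidth_by_procs maps a process count to the aggregate memory
    bandwidth achieved with that many processes (any consistent unit).
    The entry for 1 process is the baseline.
    """
    base = bandwidth_by_procs[1]
    return {p: bw / base for p, bw in sorted(bandwidth_by_procs.items())}


# Placeholder numbers in GB/s (NOT measured): bandwidth stops growing
# once the memory bus is saturated, and so does the speedup.
example_bandwidth = {1: 10.0, 2: 19.0, 4: 30.0, 8: 32.0}
speedups = bandwidth_bound_speedup(example_bandwidth)
for p, s in speedups.items():
    print(p, "processes: speedup at most", round(s, 2))
```
</pre>
                                  <div>Under this assumption, adding
                                    cores beyond the point where the
                                    bandwidth flattens gives almost no
                                    further speedup.</div>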
                                  <div><br>
                                  </div>
                                  <div>  Thanks,</div>
                                  <div><br>
                                  </div>
                                  <div>     Matt <br>
                                  </div>
                                  <blockquote class="gmail_quote"
                                    style="margin:0px 0px 0px
                                    0.8ex;border-left:1px solid
                                    rgb(204,204,204);padding-left:1ex">
                                    <div>
                                      <p>Now we will try to run PETSc
                                        example #5 inside a Docker
                                        container on our server to see
                                        whether the problem is in our
                                        environment. I'll send you the
                                        results of this test as soon as
                                        we get them.</p>
                                      <p>The ksp_monitor outputs for the
                                        5th test with the current local
                                        server configuration (for 2 and
                                        4 MPI processes) and for the
                                        cluster (for 1 and 3 MPI
                                        processes) are attached.</p>
                                      <p><br>
                                      </p>
                                      <p>One more question: potentially
                                        we can use 10 nodes with 96
                                        threads on each node of our
                                        cluster. Which combination of
                                        the number of MPI processes and
                                        OpenMP threads do you think
                                        would be best for the 5th
                                        example?<br>
                                      </p>
                                      <p>Thank you!<br>
                                      </p>
                                      <p><br>
                                      </p>
                                      Best,<br>
                                      Lidiia<br>
                                      <div><br>
                                      </div>
                                      <div>On 31.05.2022 05:42, Mark
                                        Adams wrote:<br>
                                      </div>
                                      <blockquote type="cite">
                                        <div dir="ltr">And if you see
                                          "NO" change in performance I
                                          suspect the solver/matrix is
                                          all on one processor.
                                          <div>(PETSc does not use
                                            threads by default so
                                            threads should not change
                                            anything).</div>
                                          <div><br>
                                          </div>
                                          <div>As Matt said, it is best
                                            to start with a PETSc
                                            example that does something
                                            like what you want (parallel
                                            linear solve, see
                                            src/ksp/ksp/tutorials for
                                            examples), and then add your
                                            code to it.</div>
                                          <div>That way you get the
                                            basic infrastructure in
                                            place for you, which is
                                            pretty obscure to the
                                            uninitiated.</div>
                                          <div><br>
                                          </div>
                                          <div>Mark</div>
                                        </div>
                                        <br>
                                        <div class="gmail_quote">
                                          <div dir="ltr"
                                            class="gmail_attr">On Mon,
                                            May 30, 2022 at 10:18 PM
                                            Matthew Knepley <<a
                                              href="mailto:knepley@gmail.com"
                                              target="_blank"
                                              moz-do-not-send="true"
                                              class="moz-txt-link-freetext">knepley@gmail.com</a>>
                                            wrote:<br>
                                          </div>
                                          <blockquote
                                            class="gmail_quote"
                                            style="margin:0px 0px 0px
                                            0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
                                            <div dir="ltr">
                                              <div dir="ltr">On Mon, May
                                                30, 2022 at 10:12 PM
                                                Lidia <<a
                                                  href="mailto:lidia.varsh@mail.ioffe.ru"
                                                  target="_blank"
                                                  moz-do-not-send="true"
class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>> wrote:<br>
                                              </div>
                                              <div class="gmail_quote">
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0px 0px
                                                  0px
                                                  0.8ex;border-left:1px
                                                  solid
                                                  rgb(204,204,204);padding-left:1ex">Dear
                                                  colleagues,<br>
                                                  <br>
                                                  Is there anyone here
                                                  who has solved big
                                                  sparse linear systems
                                                  using PETSc?<br>
                                                </blockquote>
                                                <div><br>
                                                </div>
                                                <div>There are lots of
                                                  publications with this
                                                  kind of data. Here is
                                                  one recent one: <a
                                                    href="https://arxiv.org/abs/2204.01722"
                                                    target="_blank"
                                                    moz-do-not-send="true"
class="moz-txt-link-freetext">https://arxiv.org/abs/2204.01722</a></div>
                                                <div> </div>
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0px 0px
                                                  0px
                                                  0.8ex;border-left:1px
                                                  solid
                                                  rgb(204,204,204);padding-left:1ex">
                                                  We have found NO
                                                  performance
                                                  improvement when
                                                  using more and more
                                                  MPI <br>
                                                  processes (1-2-3) and
                                                  OpenMP threads (from
                                                  1 to 72 threads). Has
                                                  anyone <br>
                                                  faced this problem?
                                                  Does anyone know any
                                                  possible reasons for
                                                  such <br>
                                                  behaviour?<br>
                                                </blockquote>
                                                <div><br>
                                                </div>
                                                <div>Solver behavior is
                                                  dependent on the input
                                                  matrix. The only
                                                  general-purpose
                                                  solvers</div>
                                                <div>are direct, but
                                                  they do not scale
                                                  linearly and have high
                                                  memory requirements.</div>
                                                <div><br>
                                                </div>
                                                <div>Thus, in order to
                                                  make progress you will
                                                  have to be specific
                                                  about your matrices.</div>
                                                <div> </div>
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0px 0px
                                                  0px
                                                  0.8ex;border-left:1px
                                                  solid
                                                  rgb(204,204,204);padding-left:1ex">
                                                  We use the AMG
                                                  preconditioner and the
                                                  GMRES solver from the
                                                  KSP package, as our <br>
                                                  matrix is large (from
                                                  100 000 to 1e+6 rows
                                                  and columns), sparse,
                                                  <br>
                                                  non-symmetric and
                                                  includes both positive
                                                  and negative values.
                                                  But <br>
                                                  performance problems
                                                  also exist when using
                                                  CG solvers with
                                                  symmetric <br>
                                                  matrices.<br>
                                                </blockquote>
                                                <div><br>
                                                </div>
                                                <div>There are many
                                                  PETSc examples, such
                                                  as example 5 for the
                                                  Laplacian, that
                                                  exhibit</div>
                                                <div>good scaling with
                                                  both AMG and GMG.</div>
                                                <div> </div>
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0px 0px
                                                  0px
                                                  0.8ex;border-left:1px
                                                  solid
                                                  rgb(204,204,204);padding-left:1ex">
                                                  Could anyone help us
                                                  set appropriate
                                                  options for the
                                                  preconditioner <br>
                                                  and solver? Currently
                                                  we use the default
                                                  parameters; maybe they
                                                  are not the best, <br>
                                                  but we do not know a
                                                  good combination. Or
                                                  maybe you could
                                                  suggest <br>
                                                  other
                                                  preconditioner+solver
                                                  pairs for such
                                                  problems?<br>
                                                  <br>
                                                  I can provide more
                                                  information: the
                                                  matrices that we
                                                  solve, the C++ code <br>
                                                  that runs the solve
                                                  using PETSc, and any
                                                  statistics obtained
                                                  from our runs.<br>
                                                </blockquote>
                                                <div><br>
                                                </div>
                                                <div>First, please
                                                  provide a description
                                                  of the linear system,
                                                  and the output of</div>
                                                <div><br>
                                                </div>
                                                <div>  -ksp_view
                                                  -ksp_monitor_true_residual
                                                  -ksp_converged_reason
                                                  -log_view</div>
                                                <div><br>
                                                </div>
                                                <div>for each test case.</div>
                                                <div><br>
                                                </div>
                                                <div>  Thanks,</div>
                                                <div><br>
                                                </div>
                                                <div>     Matt</div>
                                                <div> </div>
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0px 0px
                                                  0px
                                                  0.8ex;border-left:1px
                                                  solid
                                                  rgb(204,204,204);padding-left:1ex">
                                                  Thank you in advance!<br>
                                                  <br>
                                                  Best regards,<br>
                                                  Lidiia Varshavchik,<br>
                                                  Ioffe Institute, St.
                                                  Petersburg, Russia<br>
                                                </blockquote>
                                              </div>
                                              <br clear="all">
                                              <div><br>
                                              </div>
                                              -- <br>
                                              <div dir="ltr">
                                                <div dir="ltr">
                                                  <div>
                                                    <div dir="ltr">
                                                      <div>
                                                        <div dir="ltr">
                                                          <div>What most experimenters take for granted before
                                                          they begin their experiments is infinitely more
                                                          interesting than any results to which their
                                                          experiments lead.<br>
                                                          -- Norbert Wiener</div>
                                                          <div><br>
                                                          </div>
                                                          <div><a
                                                          href="http://www.cse.buffalo.edu/~knepley/"
target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </div>
                                          </blockquote>
                                        </div>
                                      </blockquote>
                                    </div>
                                  </blockquote>
                                </div>
                              </div>
                            </blockquote>
                          </div>
                        </blockquote>
                      </div>
                    </blockquote>
                  </div>
                </div>
              </blockquote>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
  </body>
</html>