<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Dear colleagues,</p>
<p>Thank you very much for the help!</p>
<p>Now the code seems to be working well!<br>
</p>
Best,<br>
Lidiia<br>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 03.06.2022 15:19, Matthew Knepley
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAMYG4Gm+BCLxfyL4Q22zSgsxLbZvWbu7LiaQqui8sNhU4rofOg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">On Fri, Jun 3, 2022 at 6:42 AM Lidia <<a
href="mailto:lidia.varsh@mail.ioffe.ru"
moz-do-not-send="true" class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Dear Matt, Barry,</p>
<p>thank you for the information about openMP!</p>
<p>Now all processes are loaded well. But we see strange
behaviour in the running times across iterations; see the
description below. Could you please explain the
reason and how we can improve it?<br>
</p>
<p>We need to quickly solve a big (about 1e6 rows) square
sparse non-symmetric matrix many times (about 1e5 times)
consecutively. The matrix is constant at every iteration,
and the right-hand-side vector B changes slowly (we think
that its change at every iteration should be less than
0.001%). So we use each previous solution vector X as the
initial guess for the next iteration. An AMG preconditioner
and the GMRES solver are used.<br>
</p>
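In PETSc terms, reusing the previous X corresponds to the nonzero-initial-guess setting; a sketch of the options we mean (the executable name and rank count here are placeholders, not our actual script):

```shell
# Illustrative run; "./our_solver" stands in for our C++ driver.
mpiexec -n 36 ./our_solver \
  -ksp_type gmres \
  -pc_type gamg \
  -ksp_initial_guess_nonzero true \
  -ksp_monitor_true_residual
```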
<p>We have tested the code using a matrix with 631 000
rows over 15 consecutive iterations, using vector X
from the previous iteration. The right-hand-side vector B
and matrix A are constant during the whole run. The time
of the first iteration is large (about 2 seconds) and
quickly decreases in the following iterations (the average
time of the last iterations was about 0.00008 s). But some
iterations in the middle (#2 and #12) take a huge time of
0.999063 seconds (see the attached figure with the time
dynamics). This time of 0.999 seconds does not depend on
the size of the matrix or on the number of MPI processes,
and these time jumps also exist if we vary vector B. Why do
these time jumps appear, and how can we avoid them?</p>
</div>
</blockquote>
<div><br>
</div>
<div>PETSc is not taking this time. It must come from
somewhere else in your code. Notice that no iterations are
taken for any subsequent solves, so no operations other than
the residual norm check (and preconditioner application) are
being performed.</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>The ksp_monitor output for this run (covering 15
iterations) using 36 MPI processes, and a file with the
memory bandwidth information (testSpeed), are also
attached. We can provide our C++ script if needed.<br>
</p>
<p>Thanks a lot!<br>
</p>
Best,<br>
Lidiia<br>
<p><br>
</p>
<p><br>
</p>
<div>On 01.06.2022 21:14, Matthew Knepley wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">On Wed, Jun 1, 2022 at 1:43 PM Lidia
<<a href="mailto:lidia.varsh@mail.ioffe.ru"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Dear Matt,</p>
<p>Thank you for the rule of 10,000 variables
per process! We have run ex5 with a 1e4
x 1e4 matrix on our cluster and got good
performance scaling (see the figure
"performance.png": the solving
time in seconds versus the number of cores). We
have used the GAMG preconditioner (multithreaded: we
have added the option
"-pc_gamg_use_parallel_coarse_grid_solver")
and the GMRES solver. And we have set one OpenMP
thread for every MPI process. Now ex5 is
working well on many MPI processes! But the
run uses about 100 GB of RAM.<br>
</p>
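For reference, the run described above corresponds to a command line along these lines (the path and rank count are illustrative, not the exact command we used):

```shell
# Illustrative command; adjust the path to your PETSc build tree.
mpiexec -n 36 ./ex5 \
  -ksp_type gmres \
  -pc_type gamg \
  -pc_gamg_use_parallel_coarse_grid_solver \
  -ksp_monitor -log_view
```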
<p>How can we run ex5 using many OpenMP threads
without MPI? If we just change the run
command, the cores are not loaded normally:
usually just one core is loaded at 100% and
the others are idle. Sometimes all cores are
working at 100% for 1 second but then
become idle again for about 30 seconds. Can the
preconditioner use many threads, and how do we
activate this option?</p>
</div>
</blockquote>
<div><br>
</div>
<div>Maybe you could describe what you are trying to
accomplish? Threads and processes are not really
different, except for memory sharing. However,
sharing large complex data structures rarely
works. That is why they get partitioned and
operate effectively as distributed memory. You
would not really save memory by using</div>
<div>threads in this instance, if that is your goal.
This is detailed in the talks in this session (see
2016 PP Minisymposium on this page <a
href="https://cse.buffalo.edu/~knepley/relacs.html"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://cse.buffalo.edu/~knepley/relacs.html</a>).</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>The solving time (the time spent in the
solver) using 60 OpenMP threads is now 511
seconds, while using 60 MPI processes it is
13.19 seconds.</p>
<p>The ksp_monitor outputs for both cases (many OpenMP
threads and many MPI processes) are attached.</p>
<p><br>
</p>
<p>Thank you!</p>
Best,<br>
Lidia<br>
<div><br>
</div>
<div>On 31.05.2022 15:21, Matthew Knepley wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I have looked at the local
logs. First, you have run problems of size
12 and 24. As a rule of thumb, you need
10,000
<div>variables per process in order to see
good speedup.</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue,
May 31, 2022 at 8:19 AM Matthew Knepley
<<a href="mailto:knepley@gmail.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">knepley@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr">On Tue, May 31, 2022 at
7:39 AM Lidia <<a
href="mailto:lidia.varsh@mail.ioffe.ru"
target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>>
wrote:<br>
</div>
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Matt, Mark, thank you very much for
your answers!</p>
<p><br>
</p>
<p>Now we have run example #5 on
our computer cluster and on the
local server and again have not
seen any performance increase;
for an unclear reason, the running
times on the local server are
much better than on the cluster.</p>
</div>
</blockquote>
<div>I suspect that you are trying to
get speedup without increasing the
memory bandwidth:</div>
<div><br>
</div>
<div> <a
href="https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup"
target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">https://petsc.org/main/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup</a></div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt <br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Now we will try to run the PETSc
example #5 inside a Docker
container on our server and see
if the problem is in our
environment. I'll write to you with
the results of this test as soon
as we get them.</p>
<p>The ksp_monitor outputs for the
5th test in the current local
server configuration (for 2 and
4 MPI processes) and for the
cluster (for 1 and 3 MPI
processes) are attached.</p>
<p><br>
</p>
<p>And one more question:
potentially we can use 10 nodes
with 96 threads per node on
our cluster. Which combination
of MPI processes and OpenMP
threads do you think may be
the best for the 5th
example?<br>
</p>
<p>Thank you!<br>
</p>
<p><br>
</p>
Best,<br>
Lidiia<br>
<div><br>
</div>
<div>On 31.05.2022 05:42, Mark
Adams wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">And if you see
"NO" change in performance I
suspect the solver/matrix is
all on one processor.
<div>(PETSc does not use
threads by default so
threads should not change
anything).</div>
<div><br>
</div>
<div>As Matt said, it is best
to start with a PETSc
example that does something
like what you want (parallel
linear solve, see
src/ksp/ksp/tutorials for
examples), and then add your
code to it.</div>
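For example (assuming PETSC_DIR and PETSC_ARCH point at an existing PETSc build):

```shell
cd $PETSC_DIR/src/ksp/ksp/tutorials
make ex5                          # build one tutorial executable
mpiexec -n 4 ./ex5 -ksp_monitor   # run it on 4 MPI ranks
```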
<div>That way you get the
basic infrastructure in
place for you, which is
pretty obscure to the
uninitiated.</div>
<div><br>
</div>
<div>Mark</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr"
class="gmail_attr">On Mon,
May 30, 2022 at 10:18 PM
Matthew Knepley <<a
href="mailto:knepley@gmail.com"
target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">knepley@gmail.com</a>>
wrote:<br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr">On Mon, May
30, 2022 at 10:12 PM
Lidia <<a
href="mailto:lidia.varsh@mail.ioffe.ru"
target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">lidia.varsh@mail.ioffe.ru</a>> wrote:<br>
</div>
<div class="gmail_quote">
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">Dear
colleagues,<br>
<br>
Is here anyone who
have solved big sparse
linear matrices using
PETSC?<br>
</blockquote>
<div><br>
</div>
<div>There are lots of
publications with this
kind of data. Here is
one recent one: <a
href="https://arxiv.org/abs/2204.01722"
target="_blank"
moz-do-not-send="true"
class="moz-txt-link-freetext">https://arxiv.org/abs/2204.01722</a></div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
We have found NO
performance
improvement while
using more and more
MPI <br>
processes (1-2-3) and
OpenMP threads (from
1 to 72). Has
anyone <br>
faced this problem?
Does anyone know any
possible reasons for
such <br>
behaviour?<br>
</blockquote>
<div><br>
</div>
<div>Solver behavior is
dependent on the input
matrix. The only
general-purpose
solvers</div>
<div>are direct, but
they do not scale
linearly and have high
memory requirements.</div>
<div><br>
</div>
<div>Thus, in order to
make progress you will
have to be specific
about your matrices.</div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
We use an AMG
preconditioner and
the GMRES solver from
the KSP package, as our <br>
matrix is large (from
100 000 to 1e+6 rows
and columns), sparse,
<br>
non-symmetric, and
includes both positive
and negative values.
But the <br>
performance problems
also exist when using
CG solvers with
symmetric <br>
matrices.<br>
</blockquote>
<div><br>
</div>
<div>There are many
PETSc examples, such
as example 5 for the
Laplacian, that
exhibit</div>
<div>good scaling with
both AMG and GMG.</div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
Could anyone help us
set appropriate
options for the
preconditioner <br>
and solver? We currently
use default parameters;
maybe they are not the
best, <br>
but we do not know a
good combination. Or
maybe you could
suggest other <br>
preconditioner+solver
pairs for such tasks?<br>
<br>
I can provide more
information: the
matrices that we
solve, the C++ script <br>
that runs the solve
using PETSc, and any
statistics obtained
from our runs.<br>
</blockquote>
<div><br>
</div>
<div>First, please
provide a description
of the linear system,
and the output of</div>
<div><br>
</div>
<div> -ksp_view
-ksp_monitor_true_residual
-ksp_converged_reason
-log_view</div>
<div><br>
</div>
<div>for each test case.</div>
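<div>That is, something like (the executable name is a placeholder):</div>

```shell
mpiexec -n 4 ./your_app \
  -ksp_view \
  -ksp_monitor_true_residual \
  -ksp_converged_reason \
  -log_view
```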
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left:1px
solid
rgb(204,204,204);padding-left:1ex">
Thank you in advance!<br>
<br>
Best regards,<br>
Lidiia Varshavchik,<br>
Ioffe Institute, St.
Petersburg, Russia<br>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most
experimenters
take for
granted before
they begin
their
experiments is
infinitely
more
interesting
than any
results to
which their
experiments
lead.<br>
-- Norbert
Wiener</div>
<div><br>
</div>
<div><a
href="http://www.cse.buffalo.edu/~knepley/"
target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters
take for granted before
they begin their
experiments is infinitely
more interesting than any
results to which their
experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a
href="http://www.cse.buffalo.edu/~knepley/"
target="_blank"
moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take
for granted before they begin
their experiments is infinitely
more interesting than any
results to which their
experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a
href="http://www.cse.buffalo.edu/~knepley/"
target="_blank"
moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take for
granted before they begin their
experiments is infinitely more
interesting than any results to which
their experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a
href="http://www.cse.buffalo.edu/~knepley/"
target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take for granted before
they begin their experiments is infinitely more
interesting than any results to which their
experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a href="http://www.cse.buffalo.edu/~knepley/"
target="_blank" moz-do-not-send="true">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</body>
</html>