<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    It's twice the memory of the entire matrix (when stored on one
    process). I also just sent you the valgrind results, both for a
    serial run and a parallel run. The size on disk of the matrix I used
    is 20 GB. <br>
    In the serial run, valgrind shows a peak memory usage of 21 GB, while
    in the parallel run (with 4 processes) each process shows a peak
    memory usage of 10.8 GB.<br>
    <br>
    Best regards,<br>
    Michael<br>
    <br>
    <div class="moz-cite-prefix">On 07.10.21 17:55, Barry Smith wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:0EAF1EE7-C34D-4118-BF74-78E1D983EFFD@petsc.dev">
      <br class="">
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">On Oct 7, 2021, at 11:35 AM, Michael Werner <<a
              href="mailto:michael.werner@dlr.de" class=""
              moz-do-not-send="true">michael.werner@dlr.de</a>>
            wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <div class=""> Currently I'm using psutil to query every
              process for its memory usage and sum it up. However, the
              spike was only visible in top (I had a call to psutil
              right before and after A.load(viewer), and both reported
              only 50 GB of RAM usage). That's why I thought it might be
              directly tied to loading the matrix. However, I also had
              the problem that the computation crashed due to running
              out of memory while loading a matrix that should in theory
              fit into memory. In that case I would expect the OS to
              free unused memory immediately, right?<br class="">
              <br class="">
              Concerning Barry's questions: the matrix is a sparse
              matrix and is originally created sequentially as SEQAIJ.
              However, it is then loaded as MPIAIJ, and if I look at the
              memory usage of the various processes, they fill up one
              after another, just as described. Is the origin of the
              matrix somehow preserved in the binary file? I was under
              the impression that the binary format was agnostic to the
              number of processes? </div>
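            <div class=""><br class="">
            </div>
            <div class="">For reference, a minimal sketch of that
              per-rank measurement (assuming mpi4py is available
              alongside psutil; this is illustrative, not the exact
              script used):</div>
            <pre>import os
import psutil
from mpi4py import MPI

comm = MPI.COMM_WORLD
# resident set size of this rank, in GB
rss_gb = psutil.Process(os.getpid()).memory_info().rss / 1024**3
all_rss = comm.gather(rss_gb, root=0)
if comm.rank == 0:
    print("per-rank RSS (GB):", all_rss, " total:", sum(all_rss))</pre>
            <div class="">A point-in-time snapshot like this can miss a
              short-lived spike, which would be consistent with psutil
              reporting ~50 GB before and after A.load(viewer) while top
              showed a higher transient peak.</div>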
          </div>
        </blockquote>
        <div><br class="">
        </div>
         The file format is independent of the number of processes that
        created it.</div>
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div class="">I also varied the number of processes between
              1 and 60, as soon as I use more than one process I can
              observe the spike (and it's always twice the memory, no
              matter how many processes I'm using).<br class="">
            </div>
          </div>
        </blockquote>
        <div><br class="">
        </div>
          Twice the size of the entire matrix (when stored on one
        process) or twice the size of the resulting matrix stored on the
        first rank? The latter is exactly as expected, since rank 0 has
        to load the part of the matrix destined for the next rank and
        hence for a short time contains its own part of the matrix and
        the part of one other rank.</div>
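      <div><br class="">
      </div>
      <div>  Schematically, the pattern is something like the following
        (a conceptual sketch of what is described above, not the actual
        PETSc MatLoad code; the helper names are made up):</div>
      <pre>def load_and_distribute(read_part, send_to, n_ranks):
    # read_part(r) and send_to(r, data) are hypothetical helpers
    my_part = read_part(0)       # rank 0 keeps its own block of rows
    for r in range(1, n_ranks):
        other = read_part(r)     # transiently holds one extra block...
        send_to(r, other)        # ...until it is handed off to rank r
        del other
    return my_part</pre>
      <div>  So the transient overhead on rank 0 should be roughly one
        additional rank's share of the matrix, not a second copy of the
        entire matrix.</div>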
      <div><br class="">
      </div>
      <div>  Barry</div>
      <div><br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div class=""> <br class="">
              I also tried running Valgrind with the --tool=massif
              option. However, I don't know what to look for. I can send
              you the output file separately, if it helps.<br class="">
              <br class="">
              Best regards,<br class="">
              Michael <br class="">
              <br class="">
              <div class="moz-cite-prefix">On 07.10.21 16:09, Matthew
                Knepley wrote:<br class="">
              </div>
              <blockquote type="cite"
cite="mid:CAMYG4Gn3X-1ctjBigYUms40sHg0fNPgOnFcqYe+=B8aG4ZF+PQ@mail.gmail.com"
                class="">
                <div dir="ltr" class="">
                  <div dir="ltr" class="">On Thu, Oct 7, 2021 at 10:03
                    AM Barry Smith <<a href="mailto:bsmith@petsc.dev"
                      moz-do-not-send="true" class="">bsmith@petsc.dev</a>>
                    wrote:<br class="">
                  </div>
                  <div class="gmail_quote">
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex"><br class="">
                         How many ranks are you using? Is it a sparse
                      matrix with MPIAIJ? <br class="">
                      <br class="">
                         The intention is that for parallel runs the
                      first rank reads in its own part of the matrix,
                      then reads in the part of the next rank and sends
                      it, then reads the part of the third rank and
                      sends it, and so on. So there should not be too
                      much of a blip in memory usage. You can run
                      valgrind with the option for tracking memory
                      usage to see exactly where in the code the blip
                      occurs; it could be that a regression in the code
                      made it require more memory. Internal MPI buffers
                      might also explain some of the blip.<br class="">
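                      <br class="">
                         For example (commands are approximate; the
                      script name is a placeholder), massif writes one
                      massif.out.&lt;pid&gt; file per process, and
                      ms_print summarizes it:<br class="">
                      <pre>mpiexec -n 4 valgrind --tool=massif python your_script.py
ms_print massif.out.&lt;pid&gt;</pre>
                      The peak snapshot in the ms_print output shows
                      which call paths account for the memory at the
                      peak.<br class="">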
                    </blockquote>
                    <div class=""><br class="">
                    </div>
                    <div class="">Is it possible that we free the
                      memory, but the OS has just not given back that
                      memory for use yet? How are you measuring memory
                      usage?</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">  Thanks,</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">     Matt</div>
                    <div class=""> </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex">   Barry<br
                        class="">
                      <br class="">
                      <br class="">
                      > On Oct 7, 2021, at 9:50 AM, Michael Werner
                      <<a href="mailto:michael.werner@dlr.de"
                        target="_blank" moz-do-not-send="true" class="">michael.werner@dlr.de</a>>
                      wrote:<br class="">
                      > <br class="">
                      > Hello,<br class="">
                      > <br class="">
                      > I noticed that there is a peak in memory
                      consumption when I load an<br class="">
                      > existing matrix into PETSc. The matrix is
                      previously created by an<br class="">
                      > external program and saved in the PETSc
                      binary format.<br class="">
                      > The code I'm using in petsc4py is simple:<br
                        class="">
                      > <br class="">
                      > viewer =
                      PETSc.Viewer().createBinary(<path/to/existing/matrix>,
                      "r",<br class="">
                      > comm=PETSc.COMM_WORLD)<br class="">
                      > A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)<br
                        class="">
                      > A.load(viewer)<br class="">
                      > <br class="">
                      > When I run this code in serial, the memory
                      consumption of the process is<br class="">
                      > about 50GB RAM, similar to the file size of
                      the saved matrix. However,<br class="">
                      > if I run the code in parallel, for a few
                      seconds the memory consumption<br class="">
                      > of the process doubles to around 100GB RAM,
                      before dropping back down to<br class="">
                      > around 50GB RAM. So it seems as if, for some
                      reason, the matrix is<br class="">
                      > copied after it is read into memory. Is there
                      a way to avoid this<br class="">
                      > behaviour? Currently, it is a clear
                      bottleneck in my code.<br class="">
                      > <br class="">
                      > I tried setting the size of the matrix and
                      explicitly preallocating the<br class="">
                      > necessary NNZ (with A.setSizes(dim) and
                      A.setPreallocationNNZ(nnz),<br class="">
                      > respectively) before loading, but that didn't
                      help.<br class="">
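                      > For completeness, the attempted sequence looked
                      roughly like this (a sketch; dim and nnz stand for
                      the values described above):<br class="">
                      > <br class="">
                      > A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)<br class="">
                      > A.setSizes(dim)<br class="">
                      > A.setPreallocationNNZ(nnz)<br class="">
                      > A.load(viewer)<br class="">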
                      > <br class="">
                      > As mentioned above, I'm using petsc4py
                      together with PETSc-3.16 on a<br class="">
                      > Linux workstation.<br class="">
                      > <br class="">
                      > Best regards,<br class="">
                      > Michael Werner<br class="">
                      > <br class="">
                      > -- <br class="">
                      > <br class="">
                      >
                      ____________________________________________________<br
                        class="">
                      > <br class="">
                      > Deutsches Zentrum für Luft- und Raumfahrt
                      e.V. (DLR)<br class="">
                      > Institut für Aerodynamik und Strömungstechnik
                      | Bunsenstr. 10 | 37073 Göttingen<br class="">
                      > <br class="">
                      > Michael Werner <br class="">
                      > Telefon 0551 709-2627 | Telefax 0551 709-2811
                      | <a href="mailto:Michael.Werner@dlr.de"
                        target="_blank" moz-do-not-send="true" class="">Michael.Werner@dlr.de</a><br
                        class="">
                      > <a href="http://DLR.de" class=""
                        moz-do-not-send="true">DLR.de</a><br class="">
                      > <br class="">
                      <br class="">
                    </blockquote>
                  </div>
                  <br class="" clear="all">
                  <div class=""><br class="">
                  </div>
                  -- <br class="">
                  <div dir="ltr" class="gmail_signature">
                    <div dir="ltr" class="">
                      <div class="">
                        <div dir="ltr" class="">
                          <div class="">
                            <div dir="ltr" class="">
                              <div class="">What most experimenters take
                                for granted before they begin their
                                experiments is infinitely more
                                interesting than any results to which
                                their experiments lead.<br class="">
                                -- Norbert Wiener</div>
                              <div class=""><br class="">
                              </div>
                              <div class=""><a
                                  href="http://www.cse.buffalo.edu/~knepley/"
                                  target="_blank" moz-do-not-send="true"
                                  class="">https://www.cse.buffalo.edu/~knepley/</a><br
                                  class="">
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
              <br class="">
            </div>
          </div>
        </blockquote>
      </div>
      <br class="">
    </blockquote>
    <br>
  </body>
</html>