<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi Jed,<br>
      <br>
      I believe they are real cores. Anyway, I have attached the log
      summary for the 12/24/48 cores. I re-run a smaller case because
      the large problem can't run with 12cores.<br>
      <pre class="moz-signature" cols="72">Yours sincerely,

TAY wee-beng</pre>
      On 3/10/2012 5:59 PM, Jed Brown wrote:<br>
    </div>
    <blockquote
cite="mid:CAM9tzSm8sUXb3FtXwXT4VX9tputhAyL--iZ3n0VnOkOdsxYJfg@mail.gmail.com"
      type="cite">There is an inordinate amount of time being spent in
      VecScatterEnd(). That sometimes indicates a very bad partition.
      Also, are your "48 cores" real physical cores or just "logical
      cores" (look like cores to the operating system, usually
      advertised as "threads" by the vendor, nothing like cores in
      reality)? That can cause a huge load imbalance and very confusing
      results as over-subscribed threads compete for shared resources.
      Step it back to 24 threads and 12 threads, send log_summary for
      each.<br>
      <br>
      <div class="gmail_quote">On Wed, Oct 3, 2012 at 8:08 AM, TAY
        wee-beng <span dir="ltr"><<a moz-do-not-send="true"
            href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
        wrote:<br>
        <blockquote class="gmail_quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div bgcolor="#FFFFFF" text="#000000">
            <div class="im">
              <div>On 2/10/2012 2:43 PM, Jed Brown wrote:<br>
              </div>
              <blockquote type="cite">On Tue, Oct 2, 2012 at 8:35 AM,
                TAY wee-beng <span dir="ltr"><<a
                    moz-do-not-send="true"
                    href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
                wrote:<br>
                <div class="gmail_quote">
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    <div bgcolor="#FFFFFF" text="#000000">
                      <div>Hi,<br>
                        <br>
                        I have combined the momentum linear eqns
                        involving x,y,z into 1 large matrix. The Poisson
                        eqn is solved using HYPRE strcut format so it's
                        not included. I run the code for 50 timesteps
                        (hence 50 kspsolve) using 96 procs. The
                        log_summary is given below. I have some
                        questions:<br>
                        <br>
                        1. After combining the matrix, I should have
                        only 1 PETSc matrix. Why does it says there are
                        4 matrix, 12 vector etc? <br>
                      </div>
                    </div>
                  </blockquote>
                  <div><br>
                  </div>
                  <div>They are part of preconditioning. Are you sure
                    you're using Hypre for this? It looks like you are
                    using bjacobi/ilu.</div>
                  <div> </div>
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    <div bgcolor="#FFFFFF" text="#000000">
                      <div> <br>
                        2. I'm looking at the stages which take the
                        longest time. It seems that MatAssemblyBegin,
                        VecNorm, VecAssemblyBegin, VecScatterEnd have
                        very high ratios. The ratios of some others are
                        also not too good (~ 1.6 - 2). So are these
                        stages the reason why my code is not scaling
                        well? What can I do to improve it?<br>
                      </div>
                    </div>
                  </blockquote>
                  <div><br>
                  </div>
                  <div>3/4 of the solve time is evenly balanced between
                    MatMult, MatSolve, MatLUFactorNumeric, and
                    VecNorm+VecDot.</div>
                  <div><br>
                  </div>
                  <div>The high VecAssembly time might be due to
                    generating a lot of entries off-process?</div>
                  <div><br>
                  </div>
                  <div>In any case, this looks like an _extremely_ slow
                    network, perhaps it's misconfigured?</div>
                </div>
              </blockquote>
              <br>
            </div>
            My cluster is configured with 48 procs per node. I re-run
            the case, using only 48 procs, thus there's no need to pass
            over a 'slow' interconnect. I'm now also using GAMG and BCGS
            for the poisson and momentum eqn respectively. I have also
            separated the x,y,z component of the momentum eqn to 3
            separate linear eqns to debug the problem. <br>
            <br>
            Results show that stage "momentum_z" is taking a lot of
            time. I wonder if it has to do with the fact that I am
            partitioning my grids in the z direction. VecScatterEnd,
            MatMult are taking a lot of time. VecNormalize,
            VecScatterEnd, VecNorm, VecAssemblyBegin 's ratio are also
            not good.<br>
            <br>
            I wonder why a lot of entries are generated off-process.<br>
            <br>
            I create my RHS vector using:<br>
            <br>
            <i>call
VecCreateMPI(MPI_COMM_WORLD,ijk_xyz_end-ijk_xyz_sta,PETSC_DECIDE,b_rhs_semi_z,ierr)</i><br>
            <br>
            where ijk_xyz_sta and ijk_xyz_end are obtained from<br>
            <br>
            <i>call
              MatGetOwnershipRange(A_semi_z,ijk_xyz_sta,ijk_xyz_end,ierr)</i><br>
            <br>
            I then insert the values into the vector using:<br>
            <br>
            <i>call VecSetValues(b_rhs_semi_z , ijk_xyz_end -
              ijk_xyz_sta , (/ijk_xyz_sta : ijk_xyz_end - 1/) ,
              q_semi_vect_z(ijk_xyz_sta + 1 : ijk_xyz_end) ,
              INSERT_VALUES , ierr)</i><br>
            <br>
            What should I do to correct the problem?<br>
            <br>
            Thanks
            <div>
              <div class="h5"><br>
                <br>
                <blockquote type="cite">
                  <div class="gmail_quote">
                    <div> </div>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div> <br>
                          Btw, I insert matrix using:<br>
                          <br>
                          <i>do ijk=ijk_xyz_sta+1,ijk_xyz_end</i><i><br>
                          </i><i><br>
                          </i><i>    II = ijk - 1</i><i>    !Fortran
                            shift to 0-based</i><i><br>
                          </i><i>    </i><i><br>
                          </i><i>    call
MatSetValues(A_semi_xyz,1,II,7,int_semi_xyz(ijk,1:7),semi_mat_xyz(ijk,1:7),INSERT_VALUES,ierr)</i><i><br>
                          </i><i><br>
                          </i><i>end do</i><br>
                          <br>
                          where ijk_xyz_sta/ijk_xyz_end are the
                          starting/end index<br>
                          <br>
                          int_semi_xyz(ijk,1:7) stores the 7 column
                          global indices<br>
                          <br>
                          semi_mat_xyz has the corresponding values.<br>
                          <br>
                          and I insert vectors using:<br>
                          <br>
                          call
VecSetValues(b_rhs_semi_xyz,ijk_xyz_end_mz-ijk_xyz_sta_mz,(/ijk_xyz_sta_mz:ijk_xyz_end_mz-1/),q_semi_vect_xyz(ijk_xyz_sta_mz+1:ijk_xyz_end_mz),INSERT_VALUES,ierr)<br>
                          <br>
                          Thanks!<br>
                          <br>
                          <i><br>
                          </i><br>
                          <pre cols="72">Yours sincerely,

TAY wee-beng</pre>
                          <div>
                            <div> On 30/9/2012 11:30 PM, Jed Brown
                              wrote:<br>
                            </div>
                          </div>
                        </div>
                        <div>
                          <div>
                            <blockquote type="cite">
                              <p>You can measure the time spent in Hypre
                                via PCApply and PCSetUp, but you can't
                                get finer grained integrated profiling
                                because it was not set up that way.</p>
                              <div class="gmail_quote">On Sep 30, 2012
                                3:26 PM, "TAY wee-beng" <<a
                                  moz-do-not-send="true"
                                  href="mailto:zonexo@gmail.com"
                                  target="_blank">zonexo@gmail.com</a>>


                                wrote:<br type="attribution">
                                <blockquote class="gmail_quote"
                                  style="margin:0 0 0
                                  .8ex;border-left:1px #ccc
                                  solid;padding-left:1ex">
                                  <div bgcolor="#FFFFFF" text="#000000">
                                    <div>On 27/9/2012 1:44 PM, Matthew
                                      Knepley wrote:<br>
                                    </div>
                                    <blockquote type="cite">On Thu, Sep
                                      27, 2012 at 3:49 AM, TAY wee-beng
                                      <span dir="ltr"><<a
                                          moz-do-not-send="true"
                                          href="mailto:zonexo@gmail.com"
                                          target="_blank">zonexo@gmail.com</a>></span>
                                      wrote:<br>
                                      <div class="gmail_quote">
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex"> Hi,<br>
                                          <br>
                                          I'm doing a log summary for my
                                          3d cfd code. I have some
                                          questions:<br>
                                          <br>
                                          1. if I'm solving 3 linear
                                          equations using ksp, is the
                                          result given in the log
                                          summary the total of the 3
                                          linear eqns' performance? How
                                          can I get the performance for
                                          each individual eqn?<br>
                                        </blockquote>
                                        <div><br>
                                        </div>
                                        <div>Use logging stages: <a
                                            moz-do-not-send="true"
href="http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogStagePush.html"
                                            target="_blank">http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogStagePush.html</a></div>
                                        <div> </div>
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex"> 2. If
                                          I run my code for 10 time
                                          steps, does the log summary
                                          gives the total or avg
                                          performance/ratio?<br>
                                        </blockquote>
                                        <div><br>
                                        </div>
                                        <div>Total.</div>
                                        <div> </div>
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex"> 3.
                                          Besides PETSc, I'm also using
                                          HYPRE's native geometric MG
                                          (Struct) to solve my
                                          Cartesian's grid CFD poisson
                                          eqn. Is there any way I can
                                          use PETSc's log summary to get
                                          HYPRE's performance? If I use
                                          boomerAMG thru PETSc, can I
                                          get its performance?</blockquote>
                                        <div><br>
                                        </div>
                                        <div>If you mean flops, only if
                                          you count them yourself and
                                          tell PETSc using <a
                                            moz-do-not-send="true"
href="http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogFlops.html"
                                            target="_blank">http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Profiling/PetscLogFlops.html</a></div>
                                        <div><br>
                                        </div>
                                        <div>This is the disadvantage of
                                          using packages that do not
                                          properly monitor things :)</div>
                                        <div><br>
                                        </div>
                                        <div>    Matt</div>
                                        <div> </div>
                                      </div>
                                    </blockquote>
                                    So u mean if I use boomerAMG thru
                                    PETSc, there is no proper way of
                                    evaluating its performance, beside
                                    using PetscLogFlops?<br>
                                    <blockquote type="cite">
                                      <div class="gmail_quote">
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex"> <span><font
                                              color="#888888"><br>
                                              -- <br>
                                              Yours sincerely,<br>
                                              <br>
                                              TAY wee-beng<br>
                                              <br>
                                            </font></span></blockquote>
                                      </div>
                                      <br>
                                      <br clear="all">
                                      <div><br>
                                      </div>
                                      -- <br>
                                      What most experimenters take for
                                      granted before they begin their
                                      experiments is infinitely more
                                      interesting than any results to
                                      which their experiments lead.<br>
                                      -- Norbert Wiener<br>
                                    </blockquote>
                                    <br>
                                  </div>
                                </blockquote>
                              </div>
                            </blockquote>
                            <br>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                </blockquote>
                <br>
              </div>
            </div>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>