<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 3/11/2015 9:01 PM, Matthew Knepley

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAMYG4GnKGhFuSokeczdFFaBhMWhhgkES03OAUgFjO2ETJb4LHA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">On Tue, Nov 3, 2015 at 6:58 AM, TAY

            wee-beng <span dir="ltr">&lt;<a moz-do-not-send="true"
                href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> <br>

                <div>On 3/11/2015 8:52 PM, Matthew Knepley wrote:<br>

                </div>

                <blockquote type="cite">

                  <div dir="ltr">

                    <div class="gmail_extra">

                      <div class="gmail_quote">On Tue, Nov 3, 2015 at

                        6:49 AM, TAY wee-beng <span dir="ltr">&lt;<a
                            moz-do-not-send="true"
                            href="mailto:zonexo@gmail.com"
                            target="_blank">zonexo@gmail.com</a>&gt;</span>

                        wrote:<br>

                        <blockquote class="gmail_quote"

                          style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi,<br>

                          <br>

                          I tried and have attached the log.<br>

                          <br>

                          Yes, my Poisson equation has Neumann boundary
                          conditions. Do I need to specify a null space,
                          for example with KSPSetNullSpace or
                          MatNullSpaceCreate?</blockquote>

                        <div><br>

                        </div>

                        <div>Yes, you need to attach the constant null

                          space to the matrix.</div>

                        <div><br>

                        </div>

                        <div>  Thanks,</div>

                        <div><br>

                        </div>

                        <div>     Matt</div>

                      </div>

                    </div>

                  </div>

                </blockquote>

                Ok so can you point me to a suitable example so that I

                know which one to use specifically?<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div><a moz-do-not-send="true"

href="https://bitbucket.org/petsc/petsc/src/9ae8fd060698c4d6fc0d13188aca8a1828c138ab/src/snes/examples/tutorials/ex12.c?at=master&fileviewer=file-view-default#ex12.c-761">https://bitbucket.org/petsc/petsc/src/9ae8fd060698c4d6fc0d13188aca8a1828c138ab/src/snes/examples/tutorials/ex12.c?at=master&fileviewer=file-view-default#ex12.c-761</a><br>

            </div>

            <div><br>

            </div>

            <div>  Matt</div>

            <div> </div>

          </div>

        </div>

      </div>

    </blockquote>

    Oh ya,<br>

    <br>

    How do I call:<br>

    <br>

    call

    MatNullSpaceCreate(MPI_COMM_WORLD,PETSC_TRUE,0,NULL,nullsp,ierr)<br>

    <br>

    But it says NULL is not defined. How do I define it?<br>
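    <br>
    For context on why the constant null space matters here, the following is a minimal sketch in plain Python, not PETSc code. It assumes a hypothetical 1D pure-Neumann Laplacian (the helper names neumann_laplacian_1d, matvec and remove_constant are made up for illustration). It shows that the operator maps the constant vector to zero, and that subtracting the mean makes a right-hand side sum to zero, which is the consistency condition for this symmetric operator:<br>
<pre>
```python
def neumann_laplacian_1d(n):
    """Dense 1D Laplacian with pure Neumann ends; every row sums to zero."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i != 0:          # coupling to the left neighbour
            a[i][i - 1] = -1.0
            a[i][i] += 1.0
        if i != n - 1:      # coupling to the right neighbour
            a[i][i + 1] = -1.0
            a[i][i] += 1.0
    return a


def matvec(a, x):
    """Plain dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def remove_constant(v):
    """Project out the constant component by subtracting the mean."""
    mean = sum(v) / len(v)
    return [vi - mean for vi in v]


n = 5
a = neumann_laplacian_1d(n)
ones = [1.0] * n

# The constant vector lies in the null space: A * ones is the zero vector.
print(matvec(a, ones))

# An arbitrary right-hand side is generally inconsistent with a singular A;
# removing its mean makes it orthogonal to the constant null space.
b = [1.0, 2.0, 0.0, -1.0, 3.0]
b_consistent = remove_constant(b)
print(sum(b_consistent))
```
</pre>
    In PETSc terms, passing PETSC_TRUE as the second argument of MatNullSpaceCreate declares that the null space contains this constant vector, so no explicit vectors need to be supplied.<br>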

    <br>

    Thanks<br>

    <br>

    <br>

    <blockquote

cite="mid:CAMYG4GnKGhFuSokeczdFFaBhMWhhgkES03OAUgFjO2ETJb4LHA@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> Thanks.<br>

                <blockquote type="cite">

                  <div dir="ltr">

                    <div class="gmail_extra">

                      <div class="gmail_quote">

                        <div> </div>

                        <blockquote class="gmail_quote"

                          style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span><br>

                            Thank you<br>

                            <br>

                            Yours sincerely,<br>

                            <br>

                            TAY wee-beng<br>

                            <br>

                          </span>

                          <div>

                            <div> On 3/11/2015 12:45 PM, Barry Smith

                              wrote:<br>

                              <blockquote class="gmail_quote"

                                style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                <blockquote class="gmail_quote"

                                  style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                  On Nov 2, 2015, at 10:37 PM, TAY

                                  wee-beng &lt;<a moz-do-not-send="true"
                                    href="mailto:zonexo@gmail.com"
                                    target="_blank">zonexo@gmail.com</a>&gt;

                                  wrote:<br>

                                  <br>

                                  Hi,<br>

                                  <br>

                                  I tried :<br>

                                  <br>

                                  1. -poisson_pc_gamg_agg_nsmooths 1

                                  -poisson_pc_type gamg<br>

                                  <br>

                                  2. -poisson_pc_type gamg<br>

                                </blockquote>

                                    Run with

                                -poisson_ksp_monitor_true_residual

                                -poisson_ksp_converged_reason<br>

                                Does your Poisson problem have Neumann boundary
                                conditions? Do you have any zeros on the
                                diagonal of the matrix (you shouldn't)?<br>

                                <br>

                                   There may be something wrong with

                                your Poisson discretization that was
                                also messing up hypre.<br>

                                <br>

                                <br>

                                <br>

                                <blockquote class="gmail_quote"

                                  style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                  Both options give:<br>

                                  <br>

                                      1      0.00150000      0.00000000 

                                      0.00000000 1.00000000           

                                   NaN             NaN             NaN<br>

                                  M Diverged but why?, time =           

                                  2<br>

                                  reason =           -9<br>

                                  <br>

                                  How can I check what's wrong?<br>

                                  <br>

                                  Thank you<br>

                                  <br>

                                  Yours sincerely,<br>

                                  <br>

                                  TAY wee-beng<br>

                                  <br>

                                  On 3/11/2015 3:18 AM, Barry Smith

                                  wrote:<br>

                                  <blockquote class="gmail_quote"

                                    style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                        hypre is just not scaling well

                                    here. I do not know why. Since hypre
                                    is a black box for us, there is no
                                    way to determine the cause of the poor
                                    scaling.<br>

                                    <br>

                                        If you make the same two runs

                                    with -pc_type gamg there will be a

                                    lot more information in the log
                                    summary about which routines are
                                    scaling well or poorly.<br>

                                    <br>

                                       Barry<br>

                                    <br>

                                    <br>

                                    <br>

                                    <blockquote class="gmail_quote"

                                      style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                      On Nov 2, 2015, at 3:17 AM, TAY

                                      wee-beng &lt;<a
                                        moz-do-not-send="true"
                                        href="mailto:zonexo@gmail.com"
                                        target="_blank">zonexo@gmail.com</a>&gt;

                                      wrote:<br>

                                      <br>

                                      Hi,<br>

                                      <br>

                                      I have attached the 2 files.<br>

                                      <br>

                                      Thank you<br>

                                      <br>

                                      Yours sincerely,<br>

                                      <br>

                                      TAY wee-beng<br>

                                      <br>

                                      On 2/11/2015 2:55 PM, Barry Smith

                                      wrote:<br>

                                      <blockquote class="gmail_quote"

                                        style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                           Run (158/2)x(266/2)x(150/2)

                                        grid on 8 processes  and then

                                        (158)x(266)x(150) on 64

                                        processors  and send the two

                                        -log_summary results<br>

                                        <br>

                                           Barry<br>

                                        <br>

                                          <br>

                                        <blockquote class="gmail_quote"

                                          style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                          On Nov 2, 2015, at 12:19 AM,

                                          TAY wee-beng &lt;<a
                                            moz-do-not-send="true"
                                            href="mailto:zonexo@gmail.com"
                                            target="_blank">zonexo@gmail.com</a>&gt;

                                          wrote:<br>

                                          <br>

                                          Hi,<br>

                                          <br>

                                          I have attached the new

                                          results.<br>

                                          <br>

                                          Thank you<br>

                                          <br>

                                          Yours sincerely,<br>

                                          <br>

                                          TAY wee-beng<br>

                                          <br>

                                          On 2/11/2015 12:27 PM, Barry

                                          Smith wrote:<br>

                                          <blockquote

                                            class="gmail_quote"

                                            style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                               Run without the

                                            -momentum_ksp_view

                                            -poisson_ksp_view and send

                                            the new results<br>

                                            <br>

                                            <br>

                                               You can see from the log

                                            summary that the PCSetUp is

                                            taking a much smaller

                                            percentage of the time

                                            meaning that it is reusing

                                            the preconditioner and not

                                            rebuilding it each time.<br>

                                            <br>

                                            Barry<br>

                                            <br>

                                               Something makes no sense

                                            with the output: it gives<br>

                                            <br>

                                            KSPSolve             199 1.0

                                            2.3298e+03 1.0 5.20e+09 1.8

                                            3.8e+04 9.9e+05 5.0e+02

                                            90100 66100 24  90100 66100

                                            24   165<br>

                                            <br>

                                            90% of the time is in the
                                            solve, but there is no
                                            significant amount of time
                                            in other events of the code,
                                            which is just not possible.
                                            I hope it is due to your IO.<br>

                                            <br>

                                            <br>

                                            <br>

                                            <blockquote

                                              class="gmail_quote"

                                              style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                              On Nov 1, 2015, at 10:02

                                              PM, TAY wee-beng &lt;<a
                                                moz-do-not-send="true"
                                                href="mailto:zonexo@gmail.com"
                                                target="_blank">zonexo@gmail.com</a>&gt;

                                              wrote:<br>

                                              <br>

                                              Hi,<br>

                                              <br>

                                              I have attached the new

                                              run with 100 time steps

                                              for 48 and 96 cores.<br>

                                              <br>

                                              Only the Poisson equation's
                                              RHS changes; the LHS
                                              doesn't. So if I want to

                                              reuse the preconditioner,

                                              what must I do? Or what

                                              must I not do?<br>

                                              <br>

                                              Why does the number of

                                              processes increase so

                                              much? Is there something

                                              wrong with my coding?

                                              Seems to be so too for my

                                              new run.<br>

                                              <br>

                                              Thank you<br>

                                              <br>

                                              Yours sincerely,<br>

                                              <br>

                                              TAY wee-beng<br>

                                              <br>

                                              On 2/11/2015 9:49 AM,

                                              Barry Smith wrote:<br>

                                              <blockquote

                                                class="gmail_quote"

                                                style="margin:0px 0px

                                                0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                   If you are doing many

                                                time steps with the same

                                                linear solver then you

                                                MUST do your weak

                                                scaling studies with

                                                MANY time steps since

                                                the setup of AMG
                                                only takes place in the
                                                first time step. So run

                                                both 48 and 96 processes

                                                with the same large

                                                number of time steps.<br>

                                                <br>

                                                   Barry<br>

                                                <br>

                                                <br>

                                                <br>

                                                <blockquote

                                                  class="gmail_quote"

                                                  style="margin:0px 0px

                                                  0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                  On Nov 1, 2015, at

                                                  7:35 PM, TAY

                                                  wee-beng &lt;<a
                                                    moz-do-not-send="true"
href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>&gt;

                                                  wrote:<br>

                                                  <br>

                                                  Hi,<br>

                                                  <br>

                                                  Sorry, I forgot and used
                                                  the old a.out. I have
                                                  attached the new log
                                                  for 48 cores (log48),
                                                  together with the
                                                  96-core log (log96).<br>

                                                  <br>

                                                  Why does the number of

                                                  processes increase so

                                                  much? Is there

                                                  something wrong with

                                                  my coding?<br>

                                                  <br>

                                                  Only the Poisson equation's
                                                  RHS changes; the
                                                  LHS doesn't. So if I

                                                  want to reuse the

                                                  preconditioner, what

                                                  must I do? Or what

                                                  must I not do?<br>

                                                  <br>

                                                  Lastly, I only

                                                  simulated 2 time steps

                                                  previously. Now I ran
                                                  for 10 time steps
                                                  (log48_10). Is it

                                                  building the

                                                  preconditioner at

                                                  every timestep?<br>

                                                  <br>

                                                  Also, what about

                                                  momentum eqn? Is it

                                                  working well?<br>

                                                  <br>

                                                  I will try the gamg

                                                  later too.<br>

                                                  <br>

                                                  Thank you<br>

                                                  <br>

                                                  Yours sincerely,<br>

                                                  <br>

                                                  TAY wee-beng<br>

                                                  <br>

                                                  On 2/11/2015 12:30 AM,

                                                  Barry Smith wrote:<br>

                                                  <blockquote

                                                    class="gmail_quote"

                                                    style="margin:0px

                                                    0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                       You used gmres

                                                    with 48 processes

                                                    but richardson with

                                                    96. You need to be

                                                    careful and make

                                                    sure you don't

                                                    change the solvers

                                                    when you change the

                                                    number of processors

                                                    since you can get
                                                    very different,
                                                    inconsistent results.<br>

                                                    <br>

                                                        Anyways all the

                                                    time is being spent

                                                    in the BoomerAMG

                                                    algebraic multigrid

                                                    setup, and it is
                                                    scaling badly. When

                                                    you double the

                                                    problem size and

                                                    number of processes

                                                    it went from

                                                    3.2445e+01 to

                                                    4.3599e+02 seconds.<br>

                                                    <br>

                                                    PCSetUp             

                                                      3 1.0 3.2445e+01

                                                    1.0 9.58e+06 2.0

                                                    0.0e+00 0.0e+00

                                                    4.0e+00 62  8  0  0 

                                                    4  62  8  0  0  5   

                                                    11<br>

                                                    <br>

                                                    PCSetUp             

                                                      3 1.0 4.3599e+02

                                                    1.0 9.58e+06 2.0

                                                    0.0e+00 0.0e+00

                                                    4.0e+00 85 18  0  0 

                                                    6  85 18  0  0  6   

                                                     2<br>

                                                    <br>
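                                                    As a rough check of the two PCSetUp timings above, here is a small Python sketch. It assumes the first line is the 48-process run and the second the 96-process run, as the surrounding text implies; the function name weak_scaling_efficiency is hypothetical. Ideal weak scaling would keep the time constant, i.e. an efficiency of 1:<br>
<pre>
```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Ratio of baseline time to scaled-up time; 1.0 means ideal weak scaling."""
    return t_base / t_scaled


# PCSetUp wall-clock times (seconds) copied from the two log lines above.
t_small = 3.2445e+01   # assumed: 48 processes
t_large = 4.3599e+02   # assumed: 96 processes, doubled problem size

slowdown = t_large / t_small
efficiency = weak_scaling_efficiency(t_small, t_large)

print(f"slowdown: {slowdown:.1f}x")
print(f"weak-scaling efficiency: {efficiency:.1%}")
```
</pre>
                                                    A slowdown well above 1 for a doubled problem on doubled processes is what "scaling badly" refers to here.<br>
                                                    <br>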

                                                       Now is the

                                                    Poisson problem

                                                    changing at each

                                                    timestep or can you

                                                    use the same

                                                    preconditioner built

                                                    with BoomerAMG for

                                                    all the time steps?

                                                    Algebraic multigrid
                                                    has a large setup
                                                    time that often
                                                    doesn't matter if
                                                    you have many time
                                                    steps, but if you
                                                    have to rebuild it
                                                    each time step it is
                                                    too large.<br>

                                                    <br>

                                                       You might also

                                                    try -pc_type gamg

                                                    and see how PETSc's

                                                    algebraic multigrid

                                                    scales for your

                                                    problem/machine.<br>

                                                    <br>

                                                       Barry<br>

                                                    <br>

                                                    <br>

                                                    <br>

                                                    <blockquote

                                                      class="gmail_quote"

                                                      style="margin:0px

                                                      0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                      On Nov 1, 2015, at

                                                      7:30 AM, TAY

                                                      wee-beng &lt;<a
                                                        moz-do-not-send="true"
href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>&gt;

                                                      wrote:<br>

                                                      <br>

                                                      <br>

                                                      On 1/11/2015 10:00

                                                      AM, Barry Smith

                                                      wrote:<br>

                                                      <blockquote

                                                        class="gmail_quote"

                                                        style="margin:0px

                                                        0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                        <blockquote

                                                          class="gmail_quote"

                                                          style="margin:0px

                                                          0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                          On Oct 31,

                                                          2015, at 8:43

                                                          PM, TAY

                                                          wee-beng &lt;<a
moz-do-not-send="true" href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>&gt;

                                                          wrote:<br>

                                                          <br>

                                                          <br>

                                                          On 1/11/2015

                                                          12:47 AM,

                                                          Matthew

                                                          Knepley wrote:<br>

                                                          <blockquote

                                                          class="gmail_quote"

                                                          style="margin:0px

                                                          0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                          On Sat, Oct

                                                          31, 2015 at

                                                          11:34 AM, TAY

                                                          wee-beng<<a

moz-do-not-send="true" href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>> 

                                                          wrote:<br>

                                                          Hi,<br>

                                                          <br>

                                                          I understand

                                                          that as

                                                          mentioned in

                                                          the FAQ, due

                                                          to the

                                                          limitations in

                                                          memory, the

                                                          scaling is not

                                                          linear. So, I

                                                          am trying to

                                                          write a

                                                          proposal to

                                                          use a

                                                          supercomputer.<br>

                                                          Its specs are:<br>

                                                          Compute nodes:

                                                          82,944 nodes

                                                          (SPARC64

                                                          VIIIfx; 16GB

                                                          of memory per

                                                          node)<br>

                                                          <br>

                                                          8 cores /

                                                          processor<br>

                                                          Interconnect:

                                                          Tofu

                                                          (6-dimensional

                                                          mesh/torus)

                                                          Interconnect<br>

                                                          Each cabinet

                                                          contains 96

                                                          computing

                                                          nodes,<br>

                                                          One of the

                                                          requirements is

                                                          to give the

                                                          performance of

                                                          my current

                                                          code with my

                                                          current set of

                                                          data, and

                                                          there is a

                                                          formula to

                                                          calculate the

                                                          estimated

                                                          parallel

                                                          efficiency

                                                          when using the

                                                          new large set

                                                          of data<br>

                                                          There are 2

                                                          ways to give

                                                          performance:<br>

                                                          1. Strong

                                                          scaling, which

                                                          is defined as

                                                          how the

                                                          elapsed time

                                                          varies with

                                                          the number of

                                                          processors for

                                                          a fixed<br>

                                                          problem.<br>

                                                          2. Weak

                                                          scaling, which

                                                          is defined as

                                                          how the

                                                          elapsed time

                                                          varies with

                                                          the number of

                                                          processors for

                                                          a<br>

                                                          fixed problem

                                                          size per

                                                          processor.<br>

                                                          I ran my cases

                                                          with 48 and 96

                                                          cores with my

                                                          current

                                                          cluster,

                                                          giving 140 and

                                                          90 mins

                                                          respectively.

                                                          This is

                                                          classified as

                                                          strong

                                                          scaling.<br>

                                                          Cluster specs:<br>

                                                          CPU: AMD 6234

                                                          2.4GHz<br>

                                                          8 cores /

                                                          processor

                                                          (CPU)<br>

                                                          6 CPU / node<br>

                                                          So 48 cores /

                                                          node<br>

                                                          Not sure about

                                                          the memory /

                                                          node<br>

                                                          <br>

                                                          The parallel

                                                          efficiency

                                                          ‘En’ for a

                                                          given degree

                                                          of parallelism

                                                          ‘n’ indicates

                                                          how efficiently

                                                          the program is<br>

                                                          accelerated by

                                                          parallel

                                                          processing.

                                                          ‘En’ is given

                                                          by the

                                                          following

                                                          formulae.

                                                          Although their<br>

                                                          derivation

                                                          processes are

                                                          different

                                                          depending on

                                                          strong and

                                                          weak scaling,

                                                          derived

                                                          formulae are

                                                          the<br>

                                                          same.<br>

                                                           From the

                                                          estimated

                                                          time, my

                                                          parallel

                                                          efficiency

                                                          using 

                                                          Amdahl's law

                                                          on the current

                                                          old cluster

                                                          was 52.7%.<br>

                                                          So are my

                                                          results

                                                          acceptable?<br>

                                                          For the large

                                                          data set, if

                                                          using 2205

                                                          nodes

                                                          (2205X8cores),

                                                          my expected

                                                          parallel

                                                          efficiency is

                                                          only 0.5%. The

                                                          proposal

                                                          recommends

                                                          a value of >

                                                          50%.<br>

                                                          The problem

                                                          with this

                                                          analysis is

                                                          that the

                                                          estimated

                                                          serial

                                                          fraction from

                                                          Amdahl's Law 

                                                          changes as a

                                                          function<br>

                                                          of problem

                                                          size, so you

                                                          cannot take

                                                          the strong

                                                          scaling from

                                                          one problem

                                                          and apply it

                                                          to another

                                                          without a<br>

                                                          model of this

                                                          dependence.<br>
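For reference, a minimal sketch of the two-point Amdahl fit discussed above, using the 48-core and 96-core timings quoted in this thread. It assumes the model T(n) = a + b/n, where a is the time that does not shrink with more cores and b is the perfectly parallel work. The function names are illustrative only, and the proposal's own formula is not shown in the thread, so its figures (52.7% and 0.5%) need not match what this prints.

```python
def fit_amdahl(n1, t1, n2, t2):
    """Fit T(n) = a + b / n through two (cores, time) measurements.

    a is the non-scaling (serial) time, b is the perfectly parallel work,
    both in the same units as t1 and t2.
    """
    b = (t1 - t2) / (1.0 / n1 - 1.0 / n2)
    a = t1 - b / n1
    return a, b


def efficiency(a, b, n):
    """Parallel efficiency E_n = T(1) / (n * T(n)) under the fitted model."""
    t_one = a + b
    t_n = a + b / n
    return t_one / (n * t_n)


# Timings quoted in this thread: 48 cores -> 140 min, 96 cores -> 90 min.
a, b = fit_amdahl(48, 140.0, 96, 90.0)
serial_fraction = a / (a + b)
print("serial time (min):", a)
print("parallel work (core-min):", b)
print("serial fraction:", serial_fraction)

# Extrapolate the same fit to the core counts mentioned in the thread.
for n in (48, 96, 17640):
    print(n, "cores -> efficiency %.2f%%" % (100.0 * efficiency(a, b, n)))
```

The fit holds a and b fixed, which is exactly the assumption questioned above: on a larger problem both values change, so the extrapolated efficiency only describes the small problem run on more cores.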

                                                          <br>

                                                          Weak scaling

                                                          does model

                                                          changes with

                                                          problem size,

                                                          so I would

                                                          measure weak

                                                          scaling on

                                                          your current<br>

                                                          cluster, and

                                                          extrapolate to

                                                          the big

                                                          machine. I

                                                          realize that

                                                          this does not

                                                          make sense for

                                                          many

                                                          scientific<br>

                                                          applications,

                                                          but neither

                                                          does requiring

                                                          a certain

                                                          parallel

                                                          efficiency.<br>

                                                          </blockquote>

                                                          OK, I checked the

                                                          results for my

                                                          weak scaling;

                                                          it is even

                                                          worse for the

                                                          expected

                                                          parallel

                                                          efficiency.

                                                          From the

                                                          formula used,

                                                          it's obvious

                                                          it's extrapolating

                                                          some sort of

                                                          exponential

                                                          decrease. So

                                                          unless I can

                                                          achieve a

                                                          > 90% speed

                                                          up when I

                                                          double the

                                                          cores and

                                                          problem size

                                                          for my current

                                                          48/96 cores

                                                          setup,   

                                                           extrapolating

                                                          from about 96

                                                          nodes to

                                                          10,000 nodes

                                                          will give a

                                                          much lower

                                                          expected

                                                          parallel

                                                          efficiency for

                                                          the new case.<br>

                                                          <br>

                                                          However, it's

                                                          mentioned in

                                                          the FAQ that

                                                          due to memory

                                                          requirements,

                                                          it's

                                                          impossible to

                                                          get >90%

                                                          speed up when I

                                                          double the

                                                          cores and

                                                          problem size

                                                          (ie linear

                                                          increase in

                                                          performance),

                                                          which means

                                                          that I can't

                                                          get >90%

                                                          speed up when

                                                          I double the

                                                          cores and

                                                          problem size

                                                          for my current

                                                          48/96 cores

                                                          setup. Is that

                                                          so?<br>

                                                        </blockquote>

                                                           What is the

                                                        output of

                                                        -ksp_view

                                                        -log_summary on

                                                        the problem and

                                                        then on the

                                                        problem doubled

                                                        in size and

                                                        number of

                                                        processors?<br>

                                                        <br>

                                                           Barry<br>

                                                      </blockquote>

                                                      Hi,<br>

                                                      <br>

                                                      I have attached

                                                      the output<br>

                                                      <br>

                                                      48 cores: log48<br>

                                                      96 cores: log96<br>

                                                      <br>

                                                      There are 2

                                                      solvers - The

                                                      momentum linear

                                                      eqn uses bcgs,

                                                      while the Poisson

                                                      eqn uses hypre

                                                      BoomerAMG.<br>

                                                      <br>

                                                      Problem size

                                                      doubled from

                                                      158x266x150 to

                                                      158x266x300.<br>
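A quick check, as a sketch, that doubling both the grid and the core count keeps the work per core fixed, which is the premise of a weak-scaling run. The weak-scaling efficiency is taken here as T(base) / T(scaled), a common definition that the thread itself does not state. The timings passed to it are hypothetical placeholders, not values from the attached logs.

```python
def cells_per_core(nx, ny, nz, cores):
    """Average number of grid cells handled by each core."""
    return nx * ny * nz / cores


def weak_efficiency(t_base, t_scaled):
    """Weak-scaling efficiency: 1.0 means the run time did not grow."""
    return t_base / t_scaled


# Grid sizes and core counts quoted in this thread.
small = cells_per_core(158, 266, 150, 48)
large = cells_per_core(158, 266, 300, 96)
print("cells per core, 48 cores:", small)
print("cells per core, 96 cores:", large)

# Hypothetical timings (minutes), for illustration only.
t48_hypothetical = 100.0
t96_hypothetical = 125.0
print("weak efficiency:", weak_efficiency(t48_hypothetical, t96_hypothetical))
```

With the per-core work equal in both runs, any growth in run time comes from communication, solver iteration counts, or memory bandwidth rather than from extra cells per core.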

                                                      <blockquote

                                                        class="gmail_quote"

                                                        style="margin:0px

                                                        0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                        <blockquote

                                                          class="gmail_quote"

                                                          style="margin:0px

                                                          0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                          So is it fair

                                                          to say that

                                                          the main

                                                          problem does

                                                          not lie in my

                                                          programming

                                                          skills, but

                                                          rather the way

                                                          the linear

                                                          equations are

                                                          solved?<br>

                                                          <br>

                                                          Thanks.<br>

                                                          <blockquote

                                                          class="gmail_quote"

                                                          style="margin:0px

                                                          0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                                             Thanks,<br>

                                                          <br>

                                                                Matt<br>

                                                          Is it possible

                                                          for this type

                                                          of scaling in

                                                          PETSc

                                                          (>50%),

                                                          when using

                                                          17640 (2205X8)

                                                          cores?<br>

                                                          Btw, I do not

                                                          have access to

                                                          the system.<br>

                                                          <br>

                                                          <br>

                                                          <br>


                                                          <br>

                                                          <br>

                                                          <br>

                                                          -- <br>

                                                          What most

                                                          experimenters

                                                          take for

                                                          granted before

                                                          they begin

                                                          their

                                                          experiments is

                                                          infinitely

                                                          more

                                                          interesting

                                                          than any

                                                          results to

                                                          which their

                                                          experiments

                                                          lead.<br>

                                                          -- Norbert

                                                          Wiener<br>

                                                          </blockquote>

                                                        </blockquote>

                                                      </blockquote>

&lt;log48.txt&gt;&lt;log96.txt&gt;<br>

                                                    </blockquote>

                                                  </blockquote>

&lt;log48_10.txt&gt;&lt;log48.txt&gt;&lt;log96.txt&gt;<br>

                                                </blockquote>

                                              </blockquote>

&lt;log96_100.txt&gt;&lt;log48_100.txt&gt;<br>

                                            </blockquote>

                                          </blockquote>

&lt;log96_100_2.txt&gt;&lt;log48_100_2.txt&gt;<br>

                                        </blockquote>

                                      </blockquote>

&lt;log64_100.txt&gt;&lt;log8_100.txt&gt;<br>

                                    </blockquote>

                                  </blockquote>

                                </blockquote>

                              </blockquote>

                              <br>

                            </div>

                          </div>

                        </blockquote>

                      </div>

                      <br>

                      <br clear="all">

                      <span class=""><font color="#888888">

                          <div><br>

                          </div>


                        </font></span></div>

                  </div>

                </blockquote>

                <br>

              </div>

            </blockquote>

          </div>

          <br>

          <br clear="all">

          <div><br>

          </div>


        </div>

      </div>

    </blockquote>

    <br>

  </body>

</html>