<div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr">Il giorno Gio 15 Nov 2018, 20:35 Ivan via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov">petsc-users@mcs.anl.gov</a>> ha scritto:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p><span class="m_8077297459423839365m_-4222405733861688740gmail-im"> Matthew,</span></p>
<b><i><span class="m_8077297459423839365m_-4222405733861688740gmail-im">As I wrote
before, its not impossible. You could be directly calling PMI,
but I do not think you are doing that.</span></i></b>
<p><span class="m_8077297459423839365m_-4222405733861688740gmail-im">Could you precise
what is PMI? and how can we directly use it? It might be a key
to this mystery!<br></span></p></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">There are no mysteries. The blas is multithreaded</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><p><span class="m_8077297459423839365m_-4222405733861688740gmail-im">
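One quick way to test that hypothesis is to pin the BLAS to a single thread
and time the run again. A minimal sketch, assuming a threaded BLAS such as
OpenBLAS or MKL that honors the usual environment variables (they must be
set before numpy/petsc4py are imported):

    import os

    # Pin common threaded BLAS implementations to one thread. This has to
    # happen before numpy/petsc4py (and hence the BLAS) are loaded.
    os.environ["OMP_NUM_THREADS"] = "1"
    os.environ["OPENBLAS_NUM_THREADS"] = "1"
    os.environ["MKL_NUM_THREADS"] = "1"

    import numpy              # forces the BLAS to load with the limits above
    from petsc4py import PETSc  # then build the matrix and solve as usual

If htop then shows a single busy core and the runtime matches the
"sequential" machine, the 8-core activity was BLAS threading inside the
factorization, not MPI parallelism.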
> > Why do you think it's running on 8 processes?
>
> Well, we base our opinion on 3 points:
> 1) htop shows a load on all 8 processors
> 2) the system monitor shows the same behavior
> 3) time: 8 seconds vs 70 seconds, although the two PCs have very similar
>    configurations
>
> > I think it's much more likely that there are differences in the solver
> > (use -ksp_view to see exactly which solver was used) than that it is
> > parallelism.
>
> We actually use identical code. Or do you think that, independently of
> that, and of the fact that we set
> "ksp.getPC().setFactorSolverType('mumps')" in the code, KSP may solve the
> system of equations with a different solver?
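simple_code.py itself is not included in this excerpt. The following is only
a self-contained sketch of the kind of setup being discussed, assuming PETSc
was configured with MUMPS and using a small tridiagonal system as a stand-in:

    import sys
    import petsc4py
    petsc4py.init(sys.argv)            # let PETSc see options such as -ksp_view
    from petsc4py import PETSc

    # Small stand-in system (1D Laplacian), only so the sketch runs on its own.
    n = 1000
    A = PETSc.Mat().createAIJ([n, n], comm=PETSc.COMM_WORLD)
    A.setPreallocationNNZ(3)                                  # tridiagonal rows
    A.setOption(PETSc.Mat.Option.NEW_NONZERO_ALLOCATION_ERR, False)
    rstart, rend = A.getOwnershipRange()
    for i in range(rstart, rend):
        A.setValue(i, i, 2.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
        if i < n - 1:
            A.setValue(i, i + 1, -1.0)
    A.assemble()
    b = A.createVecRight()
    b.set(1.0)
    x = b.duplicate()

    ksp = PETSc.KSP().create(PETSc.COMM_WORLD)
    ksp.setOperators(A)
    ksp.setType('preonly')             # direct solve, no Krylov iterations
    pc = ksp.getPC()
    pc.setType('lu')                   # LU factorization...
    pc.setFactorSolverType('mumps')    # ...delegated to MUMPS
    ksp.setFromOptions()
    ksp.solve(b, x)
    ksp.view()                         # prints the solver that was really used

Running it as python3 simple_code.py -ksp_view (or keeping the ksp.view()
call) prints the KSP type, PC type, and factorization package, which settles
the question of which solver actually ran.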
>
> > Moreover, you would never ever ever see that much speedup on a laptop,
> > since all these computations are bandwidth limited.
>
> I agree with this point. But I would think that, given that his computer
> is *a bit* more powerful and that his code runs in parallel, we might see
> some acceleration. We have, for example, tested other, more physical
> codes and observed speedups of x4 - x6.
>
> Thank you for your contribution,
>
> Ivan
<div class="m_8077297459423839365moz-cite-prefix">On 15/11/2018 18:07, Matthew Knepley
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">On Thu, Nov 15, 2018 at 11:59 AM Ivan Voznyuk
<<a href="mailto:ivan.voznyuk.work@gmail.com" target="_blank" rel="noreferrer">ivan.voznyuk.work@gmail.com</a>>
wrote:<br>
</div>
> Hi Matthew,
>
> Does it mean that by using just the command python3 simple_code.py
> (without mpiexec) you cannot obtain a parallel execution?

As I wrote before, it's not impossible. You could be directly calling PMI,
but I do not think you are doing that.
> It's been 5 days that my colleague and I have been trying to understand
> how he managed to do so.
> It means that by simply running python3 simple_code.py he gets 8
> processors working.
> By the way, we added a few lines to his code:
>
>     rank = PETSc.COMM_WORLD.Get_rank()
>     size = PETSc.COMM_WORLD.Get_size()
>
> and we got rank = 0, size = 1.

This is MPI telling you that you are running on only 1 process.
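A minimal, self-contained version of that check, with the file name
check_comm.py used only as a placeholder:

    from petsc4py import PETSc

    rank = PETSc.COMM_WORLD.Get_rank()
    size = PETSc.COMM_WORLD.Get_size()
    print("rank", rank, "of", size)

    # python3 check_comm.py              -> one line, "rank 0 of 1"
    # mpiexec -n 8 python3 check_comm.py -> eight lines, ranks 0..7,
    #   provided mpiexec comes from the same MPI that PETSc was built with.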
> However, when the execution reaches KSP.solve(), it somehow keeps 8
> processors busy.

Why do you think it's running on 8 processes?
> This problem is solved on his PC in 5-8 sec (in parallel, using python3
> simple_code.py); on mine it takes 70-90 sec (sequentially, but with the
> same command python3 simple_code.py).

I think it's much more likely that there are differences in the solver
(use -ksp_view to see exactly which solver was used) than that it is
parallelism. Moreover, you would never ever ever see that much speedup on a
laptop, since all these computations are bandwidth limited.

  Thanks,

     Matt
> So, the conclusion is that on his computer this code works the same way
> as scipy: all of the code is executed in sequential mode, but when it
> comes to the solution of the system of linear equations, it runs on all
> available processors. All this with just python3 my_code.py (without any
> mpirun/mpiexec).
>
> Is it an exception / abnormal behavior? I mean, is it something irregular
> that you, the developers, have never seen?
>
> Thanks and have a good evening!
> Ivan
>
> P.S. I don't think I know the answer regarding scipy...

On Thu, Nov 15, 2018 at 2:39 PM Matthew Knepley <knepley@gmail.com> wrote:

On Thu, Nov 15, 2018 at 8:07 AM Ivan Voznyuk <ivan.voznyuk.work@gmail.com> wrote:
> Hi Matthew,
> Thanks for your reply!
>
> Let me clarify what I mean with a few questions:
>
> 1. In order to obtain a parallel execution of simple_code.py, do I need
> to launch it with mpiexec python3 simple_code.py, or can I just launch
> python3 simple_code.py?

mpiexec -n 2 python3 simple_code.py
> 2. This simple_code.py consists of 2 parts: a) preparation of the matrix,
> b) solving the system of linear equations with PETSc. If I launch mpirun
> (or mpiexec) -np 8 python3 simple_code.py, I suppose that I will
> basically obtain 8 matrices and 8 systems to solve. However, I need to
> prepare only one matrix, but launch this code in parallel on 8
> processors.

When you create the Mat object, you give it a communicator (here
PETSC_COMM_WORLD). That allows us to distribute the data. This is all
covered extensively in the manual and the online tutorials, as well as the
example code.
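A rough sketch of what that means in petsc4py: every rank passes the same
communicator, each rank owns (and fills) only its own block of rows, and
assembly moves any stray entries to the rank that owns them. The names below
are illustrative, not taken from simple_code.py:

    from petsc4py import PETSc

    comm = PETSc.COMM_WORLD
    n = 1000
    A = PETSc.Mat().createAIJ([n, n], comm=comm)  # one global matrix, not 8 copies
    A.setUp()

    rstart, rend = A.getOwnershipRange()          # this rank's rows
    print("rank", comm.getRank(), "of", comm.getSize(),
          "owns rows", rstart, "to", rend - 1)
    # Each rank calls A.setValue() only for rows rstart..rend-1, then every
    # rank calls A.assemble(); PETSc ships off-process entries to their owners.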
> In fact, attached you will find a similar code (scipy_code.py) with only
> one difference: the system of linear equations is solved with scipy. So
> when I solve it, I can clearly see that the solution is obtained in a
> parallel way. However, I do not use the command mpirun (or mpiexec). I
> just go with python3 scipy_code.py.
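The attached scipy_code.py is not part of this excerpt; a hypothetical
sketch of that kind of script, where any multi-core activity comes from the
threaded BLAS/LAPACK underneath scipy rather than from MPI:

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    # Build a sparse system and solve it with scipy's direct solver.
    n = 2000
    A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csc')
    b = np.ones(n)
    x = spla.spsolve(A, b)   # run with "python3 scipy_code.py", no mpiexec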
Why do you think it's running in parallel?

  Thanks,

     Matt
> In this case, the first part (creation of the sparse matrix) is not
> parallel, whereas the solution of the system is found in a parallel way.
> So my question is: do you think that it's possible to have the same
> behavior with PETSc? And what do I need for this?
>
> I am asking this because for my colleague it worked! It means that he
> launches simple_code.py on his computer using the command python3
> simple_code.py (and not mpirun/mpiexec python3 simple_code.py) and he
> obtains a parallel execution of the same code.
>
> Thanks for your help!
> Ivan

On Thu, Nov 15, 2018 at 11:54 AM Matthew Knepley <knepley@gmail.com> wrote:

On Thu, Nov 15, 2018 at 4:53 AM Ivan Voznyuk via petsc-users <petsc-users@mcs.anl.gov> wrote:
> Dear PETSc community,
>
> I have a question regarding the parallel execution of petsc4py.
>
> I have a simple code (attached as simple_code.py) which solves a system
> of linear equations Ax=b using petsc4py. To execute it, I use the command
> python3 simple_code.py, which yields sequential performance. With a
> colleague of mine, we launched this code on his computer, and this time
> the execution was parallel, although he used the same command python3
> simple_code.py (without mpirun or mpiexec).

I am not sure what you mean. To run MPI programs in parallel, you need a
launcher like mpiexec or mpirun. There are Python programs (like nemesis)
that use the launcher API directly (called PMI), but that is not part of
petsc4py.

  Thanks,

     Matt
> My configuration: Ubuntu x86_64 16.04, Intel Core i7, PETSc 3.10.2,
> PETSC_ARCH=arch-linux2-c-debug, petsc4py 3.10.0 in a virtualenv
>
> In order to parallelize it, I have already tried to:
> - use 2 different PCs
> - use Ubuntu 16.04 and 18.04
> - use different architectures (arch-linux2-c-debug, linux-gnu-c-debug,
>   etc.)
> - of course, use different configurations (my present config can be found
>   in the make.log that I attached here)
> - use MPI from both MPICH and Open MPI
>
> Nothing worked.
>
> Do you have any ideas?
>
> Thanks and have a good day,
> Ivan

--
Ivan VOZNYUK
PhD in Computational Electromagnetics

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/