<div dir="auto">600 unknowns is way too small to parallelize. You need at least 10,000 unknowns per MPI process to see a speedup: <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html#slowerparallel">https://www.mcs.anl.gov/petsc/documentation/faq.html#slowerparallel</a></div><div dir="auto"><br></div><div dir="auto">What problem are you solving? It sounds like you either compiled PETSc with debugging enabled or you just have a really terrible solver. Show us the output of -log_view.</div><div><br><div class="gmail_quote"><div>On Fri, Oct 20, 2017 at 12:47 AM Luca Verzeroli &lt;<a href="mailto:l.verzeroli@studenti.unibg.it">l.verzeroli@studenti.unibg.it</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div style="font-family:Calibri,sans-serif;font-size:11pt">Good morning,<br>For my thesis I'm working with GALILEO, one of the clusters owned by Cineca. <a href="http://www.hpc.cineca.it/hardware/galileo" target="_blank">http://www.hpc.cineca.it/hardware/galileo</a><br>My first question is: what is the best configuration for running PETSc on this kind of cluster? My code is a pure MPI program, and I would like to know whether it's better to use more nodes or more CPUs with mpirun.<br>This question comes from the speedup I see for my code on that cluster. I have a small problem: the global matrices are 600x600. Are they too small to see a speedup with more MPI processes? I notice that a single-core simulation and a multi-core one take about the same time (the multi-core run takes about a second longer). The real problem comes when I have to run multiple simulations of the same code with different parameters, so I would like to speed up the single simulation.<br>Any advice?</div></div></div><div><div><div style="font-family:Calibri,sans-serif;font-size:11pt"><br><br>Luca Verzeroli</div></div></div></blockquote></div></div>