<div dir="ltr">A quick test of a smaller system on my laptop with 25x25x25 mesh gives the following info.<br><div>The memory used keeps increasing from 1 to 3 CPUs, but slightly decreases with 4 CPUs.</div><div>On the other hand, 60x60x60 mesh (2.9M dofs) is also not a big system...</div><div><br></div><div><br></div><div>------------------------------------------------- 1 CPU -------------------------------------------------</div><div><div>Summary of Memory Usage in PETSc</div><div>Maximum (over computational time) process memory: total 4.7054e+09 max 4.7054e+09 min 4.7054e+09</div><div>Current process memory: total 4.7054e+09 max 4.7054e+09 min 4.7054e+09</div><div>Maximum (over computational time) space PetscMalloc()ed: total 1.6151e+09 max 1.6151e+09 min 1.6151e+09</div><div>Current space PetscMalloc()ed: total 7.7232e+04 max 7.7232e+04 min 7.7232e+04</div></div><div><br></div><div>------------------------------------------------- 2 CPU -------------------------------------------------<br></div><div><div>Summary of Memory Usage in PETSc</div><div>Maximum (over computational time) process memory: total 6.2389e+09 max 3.1275e+09 min 3.1113e+09</div><div>Current process memory: total 6.2389e+09 max 3.1275e+09 min 3.1113e+09</div><div>Maximum (over computational time) space PetscMalloc()ed: total 2.1589e+09 max 1.1193e+09 min 1.0397e+09</div><div>Current space PetscMalloc()ed: total 1.5446e+05 max 7.7232e+04 min 7.7232e+04</div></div><div><br></div><div>------------------------------------------------- 3 CPU -------------------------------------------------<br></div><div><div>Summary of Memory Usage in PETSc</div><div>Maximum (over computational time) process memory: total 7.7116e+09 max 1.9572e+09 min 1.8715e+09</div><div>Current process memory: total 7.7116e+09 max 1.9572e+09 min 1.8715e+09</div><div>Maximum (over computational time) space PetscMalloc()ed: total 2.1754e+09 max 5.8450e+08 min 5.0516e+08</div><div>Current space PetscMalloc()ed: total 3.0893e+05 max 
7.7232e+04 min 7.7232e+04</div></div><div><br></div><div>------------------------------------------------- 4 CPU -------------------------------------------------<br></div><div><div>Summary of Memory Usage in PETSc</div><div>Maximum (over computational time) process memory: total 7.1188e+09 max 2.4651e+09 min 2.2909e+09</div><div>Current process memory: total 7.1188e+09 max 2.4651e+09 min 2.2909e+09</div><div>Maximum (over computational time) space PetscMalloc()ed: total 2.1750e+09 max 7.6982e+08 min 6.5289e+08</div><div>Current space PetscMalloc()ed: total 2.3170e+05 max 7.7232e+04 min 7.7232e+04</div></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jun 8, 2016 at 11:51 AM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Signal 9 SIGKILL on batch systems usually means the process was killed because it ran out of time or ran out of memory.<br>
<br>
Perhaps there is something in the code that is unscalable and requires more memory with more processes. You can run on 1, 2, and 3 processes and measure the memory usage, for example with -memory_view, to see if it goes up with the number of processes.<br>
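<br>
To compare such runs, the -memory_view summary lines can be read with a small script. The following is only a minimal sketch in Python: the helper names parse_memory_view and consistent_with are hypothetical, the line layout is assumed to match the "label: total X max Y min Z" form shown in this thread, and the sample text is the 2-process summary copied from above.<br>

```python
import re

# Assumed line layout (taken from the summaries in this thread):
#   "<label>: total <float> max <float> min <float>"
LINE_RE = re.compile(r"^(.+?):\s+total\s+(\S+)\s+max\s+(\S+)\s+min\s+(\S+)\s*$")

# Sample: the 2-process summary quoted in this thread.
SAMPLE = """Summary of Memory Usage in PETSc
Maximum (over computational time) process memory: total 6.2389e+09 max 3.1275e+09 min 3.1113e+09
Current process memory: total 6.2389e+09 max 3.1275e+09 min 3.1113e+09
Maximum (over computational time) space PetscMalloc()ed: total 2.1589e+09 max 1.1193e+09 min 1.0397e+09
Current space PetscMalloc()ed: total 1.5446e+05 max 7.7232e+04 min 7.7232e+04
"""


def parse_memory_view(text):
    """Return {label: (total, max, min)} in bytes for each summary line found."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            label = m.group(1)
            stats[label] = (float(m.group(2)), float(m.group(3)), float(m.group(4)))
    return stats


def consistent_with(nprocs, total, per_max, per_min, rel_tol=1e-3):
    """True if `total` could be the sum of `nprocs` per-process values,
    each lying between `per_min` and `per_max` (small tolerance for rounding)."""
    lo = nprocs * per_min * (1.0 - rel_tol)
    hi = nprocs * per_max * (1.0 + rel_tol)
    return total >= lo and hi >= total


if __name__ == "__main__":
    for label, (total, per_max, per_min) in parse_memory_view(SAMPLE).items():
        # Report each total in GB so runs with different process counts are easy to compare.
        print(f"{label}: total {total / 1e9:.3f} GB, "
              f"max/process {per_max / 1e9:.3f} GB, min/process {per_min / 1e9:.3f} GB")
```

Running the same parser on the output of each run and comparing the "Maximum ... process memory" totals shows whether memory grows with the number of processes; consistent_with can be used to double-check that a summary block belongs to the process count it is labeled with.<br>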
<br>
Barry<br>
<div><div class="h5"><br>
<br>
<br>
> On Jun 8, 2016, at 11:41 AM, Xujun Zhao <<a href="mailto:xzhao99@gmail.com">xzhao99@gmail.com</a>> wrote:<br>
><br>
> Hi all,<br>
><br>
> I am running an FE Stokes solver with a Schur complement type PC on blues. The program runs well when the mesh is 40x40x40 (0.88M dofs), but when I use a 60x60x60 mesh, the program crashes and gives some errors, which look like a segmentation fault. The "strange" thing is that it runs well with 1 CPU and 2 CPUs, but fails on 4 or 8 CPUs. The log files are also attached. It seems that the global matrix and vector are assembled well, and the errors appear before calling KSPSolve().<br>
><br>
> By the way, I use the recent PETSc 3.7 dbg version. For libMesh I use both the dbg and opt versions, but neither gives useful information. Has anyone met such a situation before? Many thanks.<br>
</div></div>> <ex01_validation_test.o1386940><ex01_validation_test.o1387007><br>
<br>
</blockquote></div><br></div>