[petsc-users] MPI Iterative solver crash on HPC

Dave May dave.mayhem23 at gmail.com
Thu Jan 10 02:59:29 CST 2019


On Thu, 10 Jan 2019 at 08:55, Sal Am via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> I am not sure what exactly is wrong, as the error changes slightly every
> time I run it (without changing the parameters).
>

This likely implies that you have a memory error in your code (a memory
leak would not cause this behaviour).
I strongly suggest you make sure your code is free of memory errors.
You can do this using valgrind. See here

https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind

for an explanation of how to use valgrind.
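For example, a typical invocation under MPI follows the pattern described
there; the process count and solver options below are simply copied from your
job script and may need adjusting for your cluster:

mpiexec -n 32 valgrind --tool=memcheck -q --num-callers=20 \
    --log-file=valgrind.log.%p ./solveCSys -ksp_type bcgs -pc_type gamg

The %p in the log file name expands to the process id, so each MPI rank writes
its own log; check all of them for "Invalid read" / "Invalid write" reports.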


> I have attached the errors from the first two runs, and my code.
>
> Is there a memory leak somewhere? I have tried running it with
> -malloc_dump, but nothing gets printed out; however, when run with
> -log_view I see that a Viewer is created 4 times but destroyed only 3 times.
> As far as I can tell, I have destroyed each viewer once I no longer need it,
> so I am not sure where I went wrong. Could this be the reason why it keeps
> crashing? It crashes as soon as it reads the matrix, before it enters the
> solve phase (I have a print statement just before the solve starts that never prints).
>
> This is how I run it in the job script, on 2 nodes with 32 processes using
> the cluster's OpenMPI:
>
> mpiexec ./solveCSys -ksp_type bcgs -pc_type gamg -ksp_converged_reason
> -ksp_monitor_true_residual -log_view -ksp_error_if_not_converged
> -ksp_monitor -malloc_log -ksp_view
>
> The matrix:
> 25 947 279 x 25 947 279
> 2 122 821 366 non-zero elements
>
> Thanks and all the best
>
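On the viewer count reported by -log_view in the quoted message: every
PetscViewer that is created must be destroyed exactly once. As a rough sketch
(not taken from your attached code; the file name and the use of
MatSetFromOptions here are assumptions), the usual load-then-destroy pattern
looks like this:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* open a binary viewer for reading (file name is a placeholder) */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "matrix.dat",
                               FILE_MODE_READ, &viewer); CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
  ierr = MatSetFromOptions(A); CHKERRQ(ierr);
  ierr = MatLoad(A, viewer); CHKERRQ(ierr);

  /* the viewer is no longer needed once the matrix is loaded */
  ierr = PetscViewerDestroy(&viewer); CHKERRQ(ierr);

  /* ... KSP setup and solve would go here ... */

  ierr = MatDestroy(&A); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

That said, one undestroyed viewer is only a small leak; it would not by itself
explain a crash while reading the matrix, which is why checking with valgrind
as described above is the right next step.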