[petsc-users] [petsc-maint] petsc ksp solver hangs
Smith, Barry F.
bsmith at mcs.anl.gov
Sun Sep 29 11:24:33 CDT 2019
If you have TotalView or DDT or some other parallel debugger, you can wait until the run is "hanging" and then send a signal to one or more of the processes to stop them, and from that get the stack traces. You'll have to figure out how that is done for your debugger.
If you can start your 72-rank job in "interactive" mode, you can launch it with the options -start_in_debugger noxterm -debugger_nodes 0; then the debugger is started only on the first rank. Now wait until it hangs, do a control-C, and then you can type bt to get the traceback. A sketch of such a launch line is below.
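For instance, a hypothetical launch line (mpiexec and ./your_app are placeholders for your actual launcher and executable; keep whatever solver options you normally pass):

    mpiexec -n 72 ./your_app -start_in_debugger noxterm -debugger_nodes 0 <your usual options>

If the debugger PETSc starts this way is gdb (the usual default on Linux), then after the control-C you get a (gdb) prompt, where bt (or thread apply all bt) prints the stack showing where that rank is stuck.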
Barry
Note it is possible to run 72-rank jobs even on a laptop/workstation/non-cluster machine (so long as they don't use too much memory and don't take too long to reach the hang point), and then you can use the debugger as I indicated above, or attach a debugger directly to a hung rank as sketched below.
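If you do this on a workstation without a parallel debugger, a minimal sketch of attaching gdb to one hung rank by hand (assuming gdb is installed; your_app and <PID> are placeholders):

    ps aux | grep your_app    # find the PID of one of the hung ranks
    gdb -p <PID>              # attaching stops that process
    (gdb) bt                  # backtrace of where this rank is blocked
    (gdb) detach              # let the process continue when done

Comparing backtraces from a few different ranks usually shows which collective call some ranks have entered and others have not.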
> On Sep 28, 2019, at 5:32 AM, Michael Wick via petsc-maint <petsc-maint at mcs.anl.gov> wrote:
>
> I attached a debugger to my run. Interestingly, the code just hangs without throwing an error message. I use 72 processors. I turned on the KSP monitor, and I can see it hangs either at the beginning or at the end of a KSP iteration. I also used valgrind to debug my code on my local machine, and it does not detect any issue. I use fgmres + fieldsplit, which is really a standard option.
>
> Do you have any suggestions on what to do?
>
> On Fri, Sep 27, 2019 at 8:17 PM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:
> How many MPI ranks did you use? If it is done on your desktop, you can just attach a debugger to an MPI process to see what is going on.
>
> --Junchao Zhang
>
>
> On Fri, Sep 27, 2019 at 4:24 PM Michael Wick via petsc-maint <petsc-maint at mcs.anl.gov> wrote:
> Hi PETSc:
>
> I have been experiencing code stagnation at certain KSP iterations. This happens rather randomly, which means the code may stop in the middle of a KSP solve and hang there.
>
> I have used valgrind and it detects nothing. I just wonder if you have any suggestions.
>
> Thanks!!!
> M