[petsc-users] How to check where code hangs
Barry Smith
bsmith at mcs.anl.gov
Fri Oct 3 07:22:54 CDT 2014
You really have to use a debugger. When it is “hanging”, you interrupt it and type bt to get a backtrace. Or, if you are lucky, you can use TotalView. Unfortunately, getting access to a debugger on a batch system is often difficult. If your nodes have access to X windows and xterm, you can run the PETSc program with, for example,
-start_in_debugger -display $DISPLAY -debugger_nodes 0
to have only the 0th MPI process start up the debugger. You can also use -debugger_nodes 0,2,8 for example to start a debugger on 3 different MPI processes.
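As a rough sketch, a full launch line built from the options above might look like the following. The executable name ./my_petsc_app and the process count of 4 are hypothetical placeholders, and mpiexec -n is only the common launcher form; your site may use mpirun, srun, or something else. The script only prints the command so it can be checked before it is run.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own executable and process count.
APP=./my_petsc_app
NP=4

# Assumption: fall back to display :0 if DISPLAY is not set in this shell.
XDISPLAY="${DISPLAY:-:0}"

# The three PETSc options are the ones quoted in the message above;
# only MPI process 0 gets a debugger attached.
CMD="mpiexec -n $NP $APP -start_in_debugger -display $XDISPLAY -debugger_nodes 0"

# Print the command for inspection instead of launching it.
echo "$CMD"

# To actually launch, uncomment the next line:
# $CMD
```

Once the debugger window appears and the program stops making progress, interrupt it in the debugger and type bt, as described above.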
Barry
On Oct 3, 2014, at 2:13 AM, TAY wee-beng <zonexo at gmail.com> wrote:
> Hi,
>
> This question is not PETSc related, but I hope to get some answers from experienced users.
>
> I'm running an MPI code which uses 144 CPUs. It aborts after a short while. I'm trying to find out where it aborts exactly.
>
> However, being an MPI code, it seems quite difficult.
>
> I used:
>
> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *, "xxx"
>
> where xxx = 1,2,3 ....
>
> If it prints up to 3, I know it aborts between 3 and 4. However, it doesn't seem to work as expected.
>
> I wonder why.
>
> Also is there a better way to do this?
>
> --
> Thank you.
>
> Yours sincerely,
>
> TAY wee-beng
>