[petsc-users] Debugging MatAssemblyEnd
Dominik Szczerba
dominik at itis.ethz.ch
Fri Aug 26 04:19:57 CDT 2011
I seem to have had a classic deadlock: A was being assembled on some
processes while others were still elsewhere in the code. Adding a few
barriers seems to fix the problem, at least for the cases I currently have.
What I still don't see is the advantage of
MPI_Barrier(((PetscObject)A)->comm) over
MPI_Barrier(PETSC_COMM_WORLD); a short sketch of the difference is below.
Many thanks
Dominik
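
For reference, a minimal sketch (not from the thread) of the
barrier-before-assembly pattern; the helper name AssembleWithBarrier is
made up, and PetscObjectGetComm() is simply the public accessor for
((PetscObject)A)->comm. A barrier on A's own communicator involves exactly
the ranks that will call MatAssemblyBegin/End, whereas PETSC_COMM_WORLD
involves every rank and can itself hang if A lives on a subcommunicator.

  #include <petscmat.h>

  /* Hypothetical helper, for illustration only: synchronize the ranks
     that actually own A before assembling it. */
  PetscErrorCode AssembleWithBarrier(Mat A)
  {
    PetscErrorCode ierr;
    MPI_Comm       comm;

    ierr = PetscObjectGetComm((PetscObject)A,&comm);CHKERRQ(ierr);
    MPI_Barrier(comm);            /* only the ranks that own A participate */
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    return 0;
  }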
On Fri, Aug 26, 2011 at 11:01 AM, Matthew Knepley <knepley at gmail.com> wrote:
> On Fri, Aug 26, 2011 at 8:37 AM, Dominik Szczerba <dominik at itis.ethz.ch>
> wrote:
>>
>> > When you run in the debugger and break after it has obviously hung, are
>> > all
>> > processes stopped at the same place?
>>
>> Of course not, they are stuck at barriers elsewhere. Thanks for the
>> valuable question.
>>
>> > If you see an error condition, you can
>> > run
>> > CHKMEMQ;
>> > MPI_Barrier(((PetscObject)A)->comm);
>> > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
>> > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
>> > If it hangs, check where every process is stuck.
>>
>> I obviously seem to be missing some barriers. But why would I need
>> MPI_Barrier(((PetscObject)A)->comm) and not just
>> MPI_Barrier(PETSC_COMM_WORLD)? Would that only force a barrier for
>> A-related traffic?
>
> The idea here is the following:
> 1) We would like to isolate the mismatch in synchronizations.
> 2) We can place barriers in the code to delimit the sections which contain
>    the offending code, and also eliminate bugs in MatAssembly as a possible
>    source of problems.
> 3) Do you have any MPI code you wrote yourself in here?
> Matt
>
>>
>> Dominik
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
>
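
To make the bracketing idea above concrete, a rough sketch (assumed layout,
not code from the thread): put barriers on A's communicator around the
section that fills the matrix, then break in the debugger and see which
barrier each rank is sitting in; the section whose barriers the ranks
disagree on contains the mismatched call.

  MPI_Comm comm;
  PetscObjectGetComm((PetscObject)A,&comm);

  MPI_Barrier(comm);             /* every rank of A enters the fill section */
  /* ... application code that calls MatSetValues() on A ... */
  MPI_Barrier(comm);             /* every rank of A leaves the fill section */

  CHKMEMQ;                       /* check for memory corruption before assembly */
  MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);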