[petsc-dev] [issue1595] Issues of limited number of MPI communicators when having many instances of hypre boomerAMG with Moose
Jed Brown
jed at jedbrown.org
Wed Apr 4 15:31:34 CDT 2018
1. Are you saying that the ranks involved in the coarse solve depend on the matrix entries?
2. Okay, but this might degrade performance, right?
4. I think you are right that *if* all sends and receives have been
posted (which may be hard to guarantee if the user is using threads) and
MPI_ANY_SOURCE is not used, then there will be no deadlock or incorrect
message delivery. But those constraints are hard to communicate to the
user and to verify, and the consequences are dire: non-deterministic
deadlock, likely seen only at very large scale and especially subtle to
debug because it would arise from the interaction of seemingly distant
components, so users will blame the library. Libraries therefore need
to go out of their way to ensure it cannot happen.
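
Concretely, the failure mode looks like the following sketch. This is
illustrative code, not anything from hypre or PETSc; the tag value and
function names are invented. Two components share a communicator and
happen to pick the same tag, and a wildcard receive lets one consume a
message intended for the other:

    #include <mpi.h>

    enum { TAG = 7 };  /* accidental tag collision: user code and library */

    /* User code posts a wildcard receive: it matches ANY sender using
       TAG on this communicator, including the library's messages. */
    void user_post_recv(MPI_Comm comm, int *out, MPI_Request *req)
    {
      MPI_Irecv(out, 1, MPI_INT, MPI_ANY_SOURCE, TAG, comm, req);
    }

    /* Library exchange on the same communicator and tag: the send from
       rank 1 is intended for the library's receive on rank 0, but the
       user's wildcard receive above may match it first.  No deadlock is
       required for this to go wrong; delivery just depends on timing. */
    void library_exchange(MPI_Comm comm)
    {
      int rank, val = 123, got;
      MPI_Comm_rank(comm, &rank);
      if (rank == 1)
        MPI_Send(&val, 1, MPI_INT, 0, TAG, comm);
      else if (rank == 0)
        MPI_Recv(&got, 1, MPI_INT, 1, TAG, comm, MPI_STATUS_IGNORE);
    }

MPI_Comm_dup gives the library a separate matching context, so its
traffic can never match receives posted on the caller's communicator;
that isolation is the point of the dup-and-cache scheme discussed below.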
Rob Falgout hypre Tracker <hypre-support at llnl.gov> writes:
> Rob Falgout <rfalgout at llnl.gov> added the comment:
>
> Hi All,
>
> Some comments and questions:
>
> 1. MPI_Comm_create is used to create a subcommunicator containing only the currently active MPI tasks, so that the Allgather() happens only over that subset. I don't think we can create this once, attach it to a parent communicator, and reuse it in a different solve based on a different AMG setup, because the active tasks will likely differ. If the AMG setup is the same, the same communicator is reused. When the AMG Destroy() is called, that communicator is freed.
>
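For concreteness, the effect described above can be sketched as follows.
This uses MPI_Comm_split for brevity (the Comm_create in the thread
builds the same subset from a group), and the activity test is
illustrative, not hypre's actual criterion:

    #include <mpi.h>

    /* Restrict an Allgather to the ranks still active on this level.
       The test (local_num_rows > 0) stands in for whatever the AMG
       setup actually uses to decide activity. */
    void subset_allgather(MPI_Comm parent, int local_num_rows,
                          double my_val, double *gathered)
    {
      MPI_Comm subcomm;
      /* Collective over 'parent': inactive ranks pass MPI_UNDEFINED
         and receive MPI_COMM_NULL instead of a new communicator. */
      MPI_Comm_split(parent, local_num_rows > 0 ? 0 : MPI_UNDEFINED,
                     0, &subcomm);
      if (subcomm != MPI_COMM_NULL) {
        MPI_Allgather(&my_val, 1, MPI_DOUBLE, gathered, 1, MPI_DOUBLE,
                      subcomm);
        /* Freed immediately here; per the above, hypre keeps the
           subcommunicator until the AMG Destroy(). */
        MPI_Comm_free(&subcomm);
      }
    }
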
> 2. Ulrike has already provided a way to get around the Comm_create() by changing one of the AMG parameters. This is something that PETSc has control over and can do.
>
> 3. The idea of dup'ing a communicator, attaching it as an attribute to a parent communicator, and using it as hypre's internal communicator makes sense to me. I think it would require quite a lot of code changes to implement. My next comment is important to consider as well.
>
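The dup-and-attach pattern is roughly the following; the keyval and
function names are illustrative (PETSc uses a scheme along these lines
in PetscCommDuplicate()):

    #include <stdlib.h>
    #include <mpi.h>

    static int inner_keyval = MPI_KEYVAL_INVALID;

    /* Called by MPI when the parent communicator is freed: release
       the cached inner communicator along with it. */
    static int free_inner(MPI_Comm comm, int keyval, void *attr,
                          void *extra)
    {
      MPI_Comm *inner = (MPI_Comm *)attr;
      MPI_Comm_free(inner);
      free(inner);
      return MPI_SUCCESS;
    }

    /* Dup the parent once, cache the dup as an attribute, and return
       the same inner communicator for every solver built on it. */
    MPI_Comm get_inner_comm(MPI_Comm parent)
    {
      void     *attr;
      int       found;
      MPI_Comm *inner;

      if (inner_keyval == MPI_KEYVAL_INVALID)
        MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, free_inner,
                               &inner_keyval, NULL);

      MPI_Comm_get_attr(parent, inner_keyval, &attr, &found);
      if (found) return *(MPI_Comm *)attr;

      inner = malloc(sizeof *inner);
      MPI_Comm_dup(parent, inner);   /* one dup per parent, ever */
      MPI_Comm_set_attr(parent, inner_keyval, inner);
      return *inner;
    }

All traffic on the inner communicator is then isolated from the user's,
and creating further solvers on the same parent costs no additional dups.
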
> 4. Regarding pending point-to-point messages, you are right that this could create problems even without deadlock. I had not thought about this scenario. However, as long as all of the corresponding user send requests have also been issued (in a non-blocking manner), is there still really a problem here? MPI guarantees message ordering, so any receives that hypre posts from the same task with the same tag should not interfere with a receive the user has already posted (but that has not yet completed), right?
>
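The guarantee being invoked is MPI's non-overtaking rule: messages from
the same sender, on the same communicator, with the same tag are matched
in the order the sends were posted. A minimal illustration (values and
structure invented for the example):

    #include <mpi.h>

    /* Non-overtaking: rank 0's first receive always matches 'first'
       and the second matches 'second', because both sends come from
       the same rank on the same communicator with the same tag. */
    int main(int argc, char **argv)
    {
      int rank, first = 1, second = 2, a = 0, b = 0;
      MPI_Request r[2];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0) {
        MPI_Irecv(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &r[0]); /* gets 1 */
        MPI_Irecv(&b, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &r[1]); /* gets 2 */
        MPI_Waitall(2, r, MPI_STATUSES_IGNORE);
      } else if (rank == 1) {
        MPI_Send(&first,  1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        MPI_Send(&second, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
      }
      MPI_Finalize();
      return 0;
    }

As noted above, though, this only settles matters when no MPI_ANY_SOURCE
receives are pending: a wildcard receive is matched against messages
from all senders, and ordering across different senders is not defined.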
> -Rob
>
> ____________________________________________
> hypre Issue Tracker <hypre-support at llnl.gov>
> <http://cascb1.llnl.gov/hypre/issue1595>
> ____________________________________________