[petsc-dev] VecScatter scaling problem on KNL

Thu Mar 9 06:20:14 CST 2017

>    Ok, in this situation VecScatter cannot detect that it is an all-to-all, so it will generate a message from each process to each other process. Given my past experience with Cray MPI (why do they even have their own MPI when Intel provides one; in fact why does Cray even exist when they just take other people's products and put their name on them) I am not totally surprised if the Cray MPI chokes on this flood of messages.
>
>    1) Test with Intel MPI, perhaps they handle this case in a scalable way
>
>     2) If Intel MPI also produces poor performance then (interesting, how come on other systems in the past this wasn't a bottleneck for the code?) the easiest solution is to separate the operation into two parts. Use a VecScatterCreateToAll() to get all the data to all the processes and then use another (purely sequential) VecScatter to get the data from this intermediate buffer into the final destination vector, which has the "extra" locations for the boundary conditions.

Yes, this is what I think I will do. This sort of problem will
only get worse, so we might as well do it at some point, and I would bet
that we could just use Intel MPI for now to get this project moving.
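
For reference, the two-part scheme suggested in 2) can be modeled roughly as
below. This is only a sketch in plain Python, not PETSc code: the lists stand
in for per-process vectors, scatter_to_all() stands in for what a scatter
created with VecScatterCreateToAll() would do, and local_scatter() stands in
for the purely sequential VecScatter. The function names, the index lists,
and the placement of the "extra" boundary slots (first and last entry) are
all made up for illustration; the real layout depends on the application.

```python
def scatter_to_all(local_chunks):
    """Part 1 (models a to-all scatter): every rank ends up with the whole vector.

    local_chunks[r] is the piece of the global vector owned by rank r.
    Returns one private full copy per rank.
    """
    full = [value for chunk in local_chunks for value in chunk]
    return [list(full) for _ in local_chunks]


def local_scatter(full_copy, src_idx, dst_idx, dst_size, fill=0.0):
    """Part 2 (models a sequential scatter): dest[dst_idx[k]] = full_copy[src_idx[k]].

    No communication happens here; slots of dest that are not listed in
    dst_idx keep the fill value (these play the role of the "extra"
    boundary-condition locations).
    """
    dest = [fill] * dst_size
    for s, d in zip(src_idx, dst_idx):
        dest[d] = full_copy[s]
    return dest


# Hypothetical example: 3 ranks, 6 global entries, and a destination vector
# with 2 extra boundary slots (index 0 and index 7).
chunks = [[10.0, 11.0], [12.0, 13.0], [14.0, 15.0]]
copies = scatter_to_all(chunks)

src_idx = list(range(6))            # take every global entry ...
dst_idx = [i + 1 for i in src_idx]  # ... and shift it past the first boundary slot
dest_on_rank0 = local_scatter(copies[0], src_idx, dst_idx, dst_size=8)
print(dest_on_rank0)
```

The point of the split is that only part 1 involves communication, and it is
a pattern the library can recognize as a collective; part 2 is a local copy
driven by index lists.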