Hi,

I also contacted Kim McMahon of Cray about this. He told me this:

===========================================
Hi Justin,

Hmm. This performance looks pretty bad. This reminds me of some
MPI_Allgather performance results I gathered several months ago.
We found that the MPICH2 MPI_Allgather algorithm was not scaling
on Cray's XT systems. At that point, I re-wrote the MPI_Allgather
algorithm for XT. Given your data, it looks like I might need to
do the same for MPI_Allgatherv. As of our current release, we are
still using the MPICH2 algorithm for MPI_Allgatherv.

Is there any chance you can substitute an MPI_Allgather for your
MPI_Allgatherv calls?

I will look deeper into the Allgatherv performance in the next
few weeks. I'll let you know if I have something new for you to
try.

Thanks for bringing this to our attention.

-Kim
=================================================

I decided to change my test program to use MPI_Allgather and performed the
same tests. The results are attached; as you can see, the performance is
much better. I will try to figure out whether Cray has the new algorithm
implemented. This might also be a problem on XT.
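
For reference, below is a minimal sketch of the substitution Kim suggests:
every rank pads its contribution up to a global maximum count so that the
fixed-count MPI_Allgather can stand in for MPI_Allgatherv. The buffer names
and counts here are hypothetical, not taken from my actual test program.

/* Minimal sketch: replacing MPI_Allgatherv with MPI_Allgather by
 * padding each rank's contribution to a common maximum count.
 * Hypothetical example; names and sizes are illustrative only.
 * Compile with: mpicc allgather_sub.c -o allgather_sub */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, mycount, maxcount, i;
    double *sendbuf, *recvbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Per-rank contribution varies by a single element, as in my test. */
    mycount = 10 + (rank % 2);

    /* Agree on the largest contribution across all ranks. */
    MPI_Allreduce(&mycount, &maxcount, 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);

    /* Send buffer zero-padded up to maxcount. */
    sendbuf = (double *) calloc(maxcount, sizeof(double));
    for (i = 0; i < mycount; i++)
        sendbuf[i] = (double) rank;   /* real payload */

    recvbuf = (double *) malloc(size * maxcount * sizeof(double));

    /* Fixed counts everywhere, so plain MPI_Allgather applies; rank r's
     * real data lands at recvbuf[r * maxcount], followed by padding. */
    MPI_Allgather(sendbuf, maxcount, MPI_DOUBLE,
                  recvbuf, maxcount, MPI_DOUBLE, MPI_COMM_WORLD);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}

The cost is the extra padded traffic and the need to know the true per-rank
counts on the receive side, but with only one element of variation per
process that overhead should be negligible.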

Justin

On Mon, Nov 2, 2009 at 5:37 PM, Pavan Balaji <balaji@mcs.anl.gov> wrote:

> On 11/02/2009 06:27 PM, Justin Luitjens wrote:
> > Actually, I just realized I was wrong about what I said. Our code has
> > a lot of variation, but the allgather test that I wrote and timed in
> > the attached graph will only have a single element of variation
> > between each process. Is there an environment variable I can set to
> > force either the ring algorithm or the recursive-doubling algorithm,
> > to compare the times?
>
> No, there's no such provision (though we should probably add it).
>
> This does look like a performance issue. Can you create a ticket here:
> https://trac.mcs.anl.gov/projects/mpich2/newticket
>
> Thanks,
>
> -- Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji