<div dir="ltr"><div><div>Exactly!<br></div>My tests were performed on Kraken (Cray XT5) and a Cray XE6 and I was using iterative solvers on the coarse grid.<br><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
On 22 November 2013 00:10, Jed Brown <jedbrown@mcs.anl.gov> wrote:
> Dave May <dave.mayhem23@gmail.com> writes:
>
> > I argue it does matter, as I've seen runs on 32k cores where a huge amount
> > of time is spent in those global reductions. I can provide an
> > implementation which uses a sub comm (PCSemiRedundant) if someone thinks
> > doing reductions on fewer cores is beneficial.
>
> It doesn't matter much on Blue Gene, but is a big deal on older Crays.
> Aries seems to be in between. The default GAMG configuration doesn't do
> any reductions on the coarse grid, so the issue is moot. If an
> iterative coarse solver were used, I think we would be more motivated to
> put the coarse problem on a subcomm.
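
On the subcomm point: just to illustrate the idea with what is already in
PETSc, PCREDUNDANT can restrict the coarse solve to smaller groups of
processes. A rough sketch (the option names assume the usual prefix
chaining for PCMG and PCREDUNDANT, and the numbers are placeholders):

  mpiexec -n 32768 ./my_app -pc_type gamg \
      -mg_coarse_ksp_type preonly \
      -mg_coarse_pc_type redundant \
      -mg_coarse_pc_redundant_number 512 \
      -mg_coarse_redundant_ksp_type cg \
      -mg_coarse_redundant_pc_type jacobi

Each of the 512 groups then solves the full coarse problem redundantly on
32768/512 = 64 ranks, so the reductions inside the inner iterative solve
run over 64 cores rather than all 32768.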