<div dir="ltr"><div><div>Exactly!<br></div>My tests were performed on Kraken (Cray XT5) and a Cray XE6 and I was using iterative solvers on the coarse grid.<br><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">

On 22 November 2013 00:10, Jed Brown <span dir="ltr">&lt;<a href="mailto:jedbrown@mcs.anl.gov" target="_blank">jedbrown@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im">Dave May &lt;<a href="mailto:dave.mayhem23@gmail.com">dave.mayhem23@gmail.com</a>&gt; writes:<br>

<br>

> I argue it does matter as I've seen runs on 32k cores where a huge amount<br>

> of time is spent in those global reductions. I can provide an<br>

> implementation which uses a sub comm (PCSemiRedundant) if someone thinks<br>

> doing reductions on fewer cores is beneficial.<br>

<br>

</div>It doesn't matter much on Blue Gene, but is a big deal on older Crays.<br>

Aries seems to be in between.  The default GAMG configuration doesn't do<br>

any reductions in the coarse grid, so the issue is moot.  If an<br>

iterative coarse solver were used, I think we would be more motivated to<br>

put the coarse problem on a subcomm.<br>

</blockquote></div><br></div>