<div dir="ltr">This thread has lost the main developer, Ed ... now cc'ed along with the PI.<div><br></div><div>Ed and CS, I will forward a few messages from this thread.</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
On Thu, Dec 12, 2013 at 6:29 PM, Dominic Meiser <span dir="ltr">&lt;<a href="mailto:dmeiser@txcorp.com" target="_blank">dmeiser@txcorp.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Hi Karli,<div class="im"><br>
<br>
On 12/12/2013 02:50 PM, Karl Rupp wrote:<br>
</div></div><div class="im">
<blockquote type="cite"><br>
Hmm, this does not sound like something I would consider a good
fit for GPUs. With 16 MPI processes you have additional congestion
of the one or two GPUs per node, so you would have to rethink the
solution procedure as a whole.<br>
<br>
</blockquote></div>
Are you sure about that for Titan? Supposedly the K20Xs can handle
multiple MPI processes hitting a single GPU pretty well using
Hyper-Q. Paul has seen pretty good speedup with small GPU kernels
simply by over-subscribing each GPU with 4 MPI processes.<br>
<br>
See here:<br>
<a href="http://blogs.nvidia.com/blog/2012/08/23/unleash-legacy-mpi-codes-with-keplers-hyper-q/" target="_blank">http://blogs.nvidia.com/blog/2012/08/23/unleash-legacy-mpi-codes-with-keplers-hyper-q/</a><br>
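<br>
To make the over-subscription numbers concrete, here is a minimal sketch of the rank-to-GPU bookkeeping, assuming node-local MPI ranks are mapped to devices round-robin. The helpers assign_gpu and ranks_per_gpu are hypothetical names used only for illustration; they are not part of any MPI, CUDA, or PETSc API, and the sketch says nothing about how well Hyper-Q actually performs.<br>
<pre>
```python
from collections import Counter


def assign_gpu(local_rank, gpus_per_node):
    """Map a node-local MPI rank to a GPU index (assumed round-robin policy)."""
    if gpus_per_node == 0:
        raise ValueError("need at least one GPU per node")
    return local_rank % gpus_per_node


def ranks_per_gpu(ranks_per_node, gpus_per_node):
    """Count how many node-local ranks end up sharing each GPU."""
    return Counter(assign_gpu(r, gpus_per_node) for r in range(ranks_per_node))


if __name__ == "__main__":
    # 16 ranks on a node with one GPU: all 16 share device 0.
    print(dict(ranks_per_gpu(16, 1)))
    # 16 ranks on a node with two GPUs: 8 ranks per device.
    print(dict(ranks_per_gpu(16, 2)))
    # 4 ranks on a node with one GPU: the 4-way over-subscription case.
    print(dict(ranks_per_gpu(4, 1)))
```
</pre>
In a real run, each process would pass the resulting index to its device-selection call before launching kernels; whether the shared device keeps up is exactly the open question in this exchange.<br>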
<br>
<br>
Cheers,<br>
Dominic<span class="HOEnZb"><font color="#888888"><br>
<br>
<br>
<pre cols="72">--
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: <a href="tel:303-996-2036" value="+13039962036" target="_blank">303-996-2036</a>
Fax: <a href="tel:303-448-7756" value="+13034487756" target="_blank">303-448-7756</a>
<a href="http://www.txcorp.com" target="_blank">www.txcorp.com</a></pre>
</font></span></div>
</blockquote></div><br></div>