<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial">Hi, Matt.</div><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><br></div><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial">I get it. I only tested a very small system. I'll try a larger one.</div><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><br></div><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial">Zeng</div><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><br>On 2012-03-13 22:03:12, "Matthew Knepley" <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>> wrote:<br> <blockquote id="isReplyContent" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">On Tue, Mar 13, 2012 at 8:59 AM, Xiangze Zeng <span dir="ltr"><<a href="mailto:zengshixiangze@163.com">zengshixiangze@163.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="line-height:1.7;font-size:14px;font-family:arial"><div style="line-height:1.7;font-size:14px;font-family:arial"><div>Hi, Jed.</div><div>I added "printf" calls at the beginning and end of the code that sets the matrix values and computed the elapsed time. It is much longer than when I don't use the GPU. I am only guessing that the time is spent copying data. My PCTYPE is sor, with 2000 iterations. Do you have any suggestions about this?</div>
</div></div></blockquote><div><br></div><div>1) You do not have to guess. Use -log_summary, and there are explicit events for copying to the GPU</div><div><br></div><div>2) GPUs only really become effective for large systems due to this overhead. I suggest looking at the</div>
<div> performance and overhead as a function of system size.</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="line-height:1.7;font-size:14px;font-family:arial">
<div style="line-height:1.7;font-size:14px;font-family:arial"><div>Zeng</div><div><br></div>On 2012-03-13 20:12:09, "Jed Brown" <<a href="mailto:jedbrown@mcs.anl.gov" target="_blank">jedbrown@mcs.anl.gov</a>> wrote:<br>
<blockquote style="PADDING-LEFT:1ex;MARGIN:0px 0px 0px 0.8ex;BORDER-LEFT:#ccc 1px solid"><div class="gmail_quote">2012/3/13 Xiangze Zeng <span dir="ltr"><<a href="mailto:zengshixiangze@163.com" target="_blank">zengshixiangze@163.com</a>></span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>After configuring PETSc with --with-precision=single, I can run both ex19 and my own code. Good news! But it seems a lot of time is spent copying the data from the CPU to the GPU. </div><div></div></blockquote></div><br>
<div>How are you measuring? What preconditioner are you using and how many iterations are typically required?</div>
</blockquote></div></div><br><br><span title="neteasefooter"><span></span></span></blockquote></div><br><br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener<br>
</blockquote></div></div><br><br><span title="neteasefooter"><span id="netease_mail_footer"></span></span>