<div dir="ltr"><div dir="ltr">On Tue, Oct 19, 2021 at 9:18 PM Swarnava Ghosh <<a href="mailto:swarnava89@gmail.com">swarnava89@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thank you Junchao! Is it possible to determine how much time is being spent on data transfer from the CPU mem to the GPU mem from the log?</div></blockquote><div><br></div><div>Yes, it looks like these events report it:</div><div><br></div><div><p style="font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;margin:0px;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecCUDACopyTo        891 1.1 1.5322e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    842 6.23e+01    0 0.00e+00  0</span></p><p style="font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;margin:0px;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecCUDACopyFrom      891 1.1 1.5837e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0      0 0.00e+00  842 6.23e+01  0</span></p><p style="font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;margin:0px;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatCUSPARSCopyTo     891 1.1 1.5229e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0       0    842 1.93e+03    0 0.00e+00  0</span></p><br class="gmail-Apple-interchange-newline"></div><div>  Thanks,</div><div><br></div><div>     Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">************************************************************************************************************************</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">*** <span>            </span>WIDEN YOUR WINDOW TO 120 CHARACTERS.<span>  </span>Use 'enscript -r -fCourier9' to print this document<span>            </span>***</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">************************************************************************************************************************</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">---------------------------------------------- PETSc Performance Summary: ----------------------------------------------</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">/ccsopen/home/swarnava/MiniApp_xl_cu/bin/sq on a<span>  </span>named h49n15 with 4 processors, by swarnava Tue Oct 19 21:10:56 2021</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Using Petsc Release Version 3.15.0, Mar 30, 2021</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>                         </span>Max <span>      </span>Max/Min <span>    </span>Avg <span>      </span>Total</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Time (sec): <span>          </span>1.172e+02 <span>    </span>1.000 <span>  </span>1.172e+02</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Objects:<span>              </span>1.160e+02 <span>    </span>1.000 <span>  </span>1.160e+02</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Flop: <span>                </span>5.832e+10 <span>    </span>1.125 <span>  </span>5.508e+10<span>  </span>2.203e+11</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Flop/sec: <span>            </span>4.974e+08 <span>    </span>1.125 <span>  </span>4.698e+08<span>  </span>1.879e+09</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MPI Messages: <span>        </span>0.000e+00 <span>    </span>0.000 <span>  </span>0.000e+00<span>  </span>0.000e+00</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MPI Message Lengths:<span>  </span>0.000e+00 <span>    </span>0.000 <span>  </span>0.000e+00<span>  </span>0.000e+00</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MPI Reductions: <span>      </span>1.320e+02 <span>    </span>1.000</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>                            </span>e.g., VecAXPY() for real vectors of length N --> 2N flop</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>                            </span>and VecAXPY() for complex vectors of length N --> 8N flop</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Summary of Stages: <span>  </span>----- Time ------<span>  </span>----- Flop ------<span>  </span>--- Messages ---<span>  </span>-- Message Lengths --<span>  </span>-- Reductions --</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>                        </span>Avg <span>    </span>%Total <span>    </span>Avg <span>    </span>%Total<span>    </span>Count <span>  </span>%Total <span>    </span>Avg <span>        </span>%Total<span>    </span>Count <span>  </span>%Total</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span> </span>0:<span>      </span>Main Stage: 1.1725e+02 100.0%<span>  </span>2.2033e+11 100.0%<span>  </span>0.000e+00 <span>  </span>0.0%<span>  </span>0.000e+00<span>        </span>0.0%<span>  </span>1.140e+02<span>  </span>86.4%</span></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p>
<p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">------------------------------------------------------------------------------------------------------------------------</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">See the 'Profiling' chapter of the users' manual for details on interpreting output.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Phase summary info:</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Count: number of times phase was executed</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Time and Flop: Max - maximum over all processors</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>                  </span>Ratio - ratio of maximum to minimum over all processors</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Mess: number of messages sent</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>AvgLen: average message length (bytes)</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Reduct: number of global reductions</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Global: entire computation</span></p><p style="margin:0px;font:11px 
Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>      </span>%T - percent time in this phase <span>        </span>%F - percent flop in this phase</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>      </span>%M - percent messages in this phase <span>    </span>%L - percent message lengths in this phase</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>      </span>%R - percent reductions in this phase</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors)</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>CpuToGpu Count: total number of CPU to GPU copies per processor</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor)</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>GpuToCpu Count: total number of GPU to CPU copies per processor</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span 
style="font-variant-ligatures:no-common-ligatures"><span>   </span>GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor)</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>GPU %F: percent flops on GPU in this event</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">------------------------------------------------------------------------------------------------------------------------</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Event<span>                </span>Count<span>      </span>Time (sec) <span>    </span>Flop<span>                              </span>--- Global ---<span>  </span>--- Stage ----<span>  </span>Total <span>  </span>GPU<span>    </span>- CpuToGpu - <span>  </span>- GpuToCpu - GPU</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>                   </span>Max Ratio<span>  </span>Max <span>    </span>Ratio <span>  </span>Max<span>  </span>Ratio<span>  </span>Mess <span>  </span>AvgLen<span>  </span>Reduct<span>  </span>%T %F %M %L %R<span>  </span>%T %F %M %L %R Mflop/s Mflop/s Count <span>  </span>Size <span>  </span>Count <span>  </span>Size<span>  </span>%F</span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">
</span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">---------------------------------------------------------------------------------------------------------------------------------------------------------------</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">--- Event Stage 0: Main Stage</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">BuildTwoSided<span>          </span>2 1.0 6.2501e-03145.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">BuildTwoSidedF <span>        </span>2 1.0 6.2628e-03123.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecDot <span>            </span>89991 1.1 3.4663e+00 1.2 
1.67e+09 1.1 0.0e+00 0.0e+00 0.0e+00<span>  </span>3<span>  </span>3<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>3<span>  </span>3<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>1816<span>    </span>1841<span>      </span>0 0.00e+00 84992 6.80e-01 100</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecNorm<span>            </span>89991 1.1 5.5282e+00 1.2 1.67e+09 1.1 0.0e+00 0.0e+00 0.0e+00<span>  </span>4<span>  </span>3<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>4<span>  </span>3<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>1139<span>    </span>1148<span>      </span>0 0.00e+00 84992 6.80e-01 100</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecScale <span>          </span>89991 1.1 1.3902e+00 1.2 8.33e+08 1.1 0.0e+00 0.0e+00 0.0e+00<span>  </span>1<span>  </span>1<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>1<span>  </span>1<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2265<span>    </span>2343 <span>  </span>84992 6.80e-01<span>    </span>0 0.00e+00 100</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecCopy <span>          </span>178201 1.1 2.9825e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>2<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>2<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecSet<span>              </span>3589 1.1 1.0195e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  
</span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecAXPY <span>          </span>179091 1.1 2.7456e+00 1.2 3.32e+09 1.1 0.0e+00 0.0e+00 0.0e+00<span>  </span>2<span>  </span>6<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>2<span>  </span>6<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>4564<span>    </span>4739 <span>  </span>169142 1.35e+00<span>    </span>0 0.00e+00 100</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecCUDACopyTo<span>        </span>891 1.1 1.5322e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>    </span>842 6.23e+01<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">VecCUDACopyFrom<span>      </span>891 1.1 1.5837e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>  </span>842 6.23e+01<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">DMCreateMat<span>            </span>5 1.0 7.3491e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00<span>  </span>1<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>5 <span>  </span>1<span>  
</span>0<span>  </span>0<span>  </span>0<span>  </span>6 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">SFSetGraph <span>            </span>5 1.0 3.5016e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatMult<span>            </span>89991 1.1 2.0423e+00 1.2 5.08e+10 1.1 0.0e+00 0.0e+00 0.0e+00<span>  </span>2 87<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>2 87<span>  </span>0<span>  </span>0<span>  </span>0 94039 <span>  </span>105680 <span>  </span>1683 2.00e+03<span>    </span>0 0.00e+00 100</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatCopy<span>              </span>891 1.1 1.3600e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatConvert <span>            </span>2 1.0 1.0489e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>1<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>1<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      
</span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatScale <span>              </span>2 1.0 2.7950e-04 1.3 3.18e+05 1.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>4530 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatAssemblyBegin <span>      </span>7 1.0 6.3768e-0368.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>2 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatAssemblyEnd <span>        </span>7 1.0 7.9870e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>3 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>4 <span>    </span>0 <span>      </span>0<span>      </span>0 0.00e+00<span>    </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">MatCUSPARSCopyTo <span>    </span>891 1.1 1.5229e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0<span>  </span>0 <span>    </span>0 <span>      </span>0<span>    </span>842 1.93e+03<span> 
   </span>0 0.00e+00<span>  </span>0</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">---------------------------------------------------------------------------------------------------------------------------------------------------------------</span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">
</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Memory usage is given in bytes:</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Object Type<span>          </span>Creations <span>  </span>Destructions <span>    </span>Memory<span>  </span>Descendants' Mem.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Reports information only for process 0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">--- Event Stage 0: Main Stage</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span><br></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>              </span>Vector<span>    </span>69 <span>            </span>11<span>        </span>19112 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>    </span>Distributed Mesh <span>    </span>3<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>           </span>Index Set<span>    </span>12 <span>            </span>10 <span>      </span>187512 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span 
style="font-variant-ligatures:no-common-ligatures"><span>   </span>IS L to G Mapping <span>    </span>3<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Star Forest Graph<span>    </span>11<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>     </span>Discrete System <span>    </span>3<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>           </span>Weak Form <span>    </span>3<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Application Order <span>    </span>1<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>              </span>Matrix <span>    </span>8<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>       </span>Krylov Solver <span>    </span>1<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><span>      </span>Preconditioner <span>    </span>1<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span 
style="font-variant-ligatures:no-common-ligatures"><span>              </span>Viewer <span>    </span>1<span>              </span>0<span>            </span>0 <span>    </span>0.</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">========================================================================================================================</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Average time to get PetscTime(): 4.32e-08</span></p><p style="margin:0px;font:11px Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Average time for MPI_Barrier(): 9.94e-07</span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0);min-height:13px"><span style="font-variant-ligatures:no-common-ligatures"></span>
</p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Average time for zero size MPI_Send(): 4.20135e-05</span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures"><br></span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">Sincerely,</span></p><p style="margin:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;color:rgb(0,0,0)"><span style="font-variant-ligatures:no-common-ligatures">SG</span></p></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Oct 19, 2021 at 12:28 AM Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 10:56 PM Swarnava Ghosh <<a href="mailto:swarnava89@gmail.com" target="_blank">swarnava89@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I am trying to port parts of the following function to GPUs. Essentially, the lines of code between the two "TODO..." comments should be executed on the device. 
Here is the function:<div><br><div>PetscScalar CalculateSpectralNodesAndWeights(LSDFT_OBJ *pLsdft, int p, int LIp)<br>{<br><br>  PetscInt N_qp;      <br>  N_qp = pLsdft->N_qp;<br>  <br>  int k;<br>  PetscScalar *a, *b;<br>  k=0;<br><br>  PetscMalloc(sizeof(PetscScalar)*(N_qp+1), &a);<br>  PetscMalloc(sizeof(PetscScalar)*(N_qp+1), &b);<br><br>  /*<br>   * TODO: COPY a, b, pLsdft->Vk, pLsdft->Vkm1, pLsdft->Vkp1, pLsdft->LapPlusVeffOprloc, k,p,N_qp from HOST to DEVICE<br>   * DO THE FOLLOWING OPERATIONS ON DEVICE<br>   */<br>  <br>  //zero out vectors<br>  VecZeroEntries(pLsdft->Vk);<br>  VecZeroEntries(pLsdft->Vkm1);<br>  VecZeroEntries(pLsdft->Vkp1);<br><br>  VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES);   <br>  MatMult(pLsdft->LapPlusVeffOprloc,pLsdft->Vkm1,pLsdft->Vk);<br>  VecDot(pLsdft->Vkm1, pLsdft->Vk, &a[0]);<br>  VecAXPY(pLsdft->Vk, -a[0], pLsdft->Vkm1);<br>  VecNorm(pLsdft->Vk, NORM_2, &b[0]);<br>  VecScale(pLsdft->Vk, 1.0 / b[0]);<br>   <br>  for (k = 0; k < N_qp; k++) {<br>    MatMult(pLsdft->LapPlusVeffOprloc,pLsdft->Vk,pLsdft->Vkp1);<br>    VecDot(pLsdft->Vk, pLsdft->Vkp1, &a[k + 1]);<br>    VecAXPY(pLsdft->Vkp1, -a[k + 1], pLsdft->Vk);<br>    VecAXPY(pLsdft->Vkp1, -b[k], pLsdft->Vkm1);<br>    VecCopy(pLsdft->Vk, pLsdft->Vkm1);<br>    VecNorm(pLsdft->Vkp1, NORM_2, &b[k + 1]);<br>    VecCopy(pLsdft->Vkp1, pLsdft->Vk);<br>    VecScale(pLsdft->Vk, 1.0 / b[k + 1]);<br>  }<br><br>  /*<br>   * TODO: Copy back a, b, pLsdft->Vk, pLsdft->Vkm1, pLsdft->Vkp1, pLsdft->LapPlusVeffOprloc, k,p,N_qp from DEVICE to HOST<br>   */<br>  <br>  /*<br>   * Some operation with a, and b on HOST<br>   *<br>   */<br>  TridiagEigenVecSolve_NodesAndWeights(pLsdft, a, b, N_qp, LIp);  // operation on the host<br><br>  // free a,b<br>  PetscFree(a);<br>  PetscFree(b);</div><div><br>  return 0;<br>}<br><br></div><div>If I just use the command line options to set vectors Vk,Vkp1 and Vkm1 as cuda vectors and the matrix  LapPlusVeffOprloc as aijcusparse, will the lines of code between 
the two "TODO" comments be entirely executed on the device?</div></div></div></blockquote><div>Yes, except VecSetValue(pLsdft->Vkm1, p, 1.0, INSERT_VALUES), which is done on the CPU: the vector data is pulled down from the GPU to the CPU and the value is set there. Subsequent vector operations will push the updated vector data to the GPU again.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><br></div><div>Sincerely,</div><div>Swarnava </div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 10:13 PM Swarnava Ghosh <<a href="mailto:swarnava89@gmail.com" target="_blank">swarnava89@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thanks for the clarification, Junchao.<div><br></div><div>Sincerely,</div><div>Swarnava</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 10:08 PM Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 8:47 PM Swarnava Ghosh <<a href="mailto:swarnava89@gmail.com" target="_blank">swarnava89@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Junchao,<div><br><div>If I want to pass command line options as  -mymat_mat_type aijcusparse, should it be MatSetOptionsPrefix(A,"mymat"); or MatSetOptionsPrefix(A,"mymat_"); ? 
Could you please clarify?</div></div></div></blockquote><div> my fault, it should be MatSetOptionsPrefix(A,"mymat_"), as seen in mat/tests/ex62.c<br></div><div> Thanks</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><br></div><div>Sincerely,</div><div>Swarnava</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 9:23 PM Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div style="color:rgb(0,0,0);font-family:Menlo,Monaco,"Courier New",monospace;font-size:14px;line-height:21px;white-space:pre-wrap"><div><span style="color:rgb(121,94,38)">MatSetOptionsPrefix(A,"mymat")</span></div><div><div style="line-height:21px"><div><span style="color:rgb(121,94,38)">VecSetOptionsPrefix</span>(<span style="color:rgb(0,16,128)">v</span>,"myvec")</div><div><br></div></div></div></div><div><div dir="ltr"><div dir="ltr">--Junchao Zhang</div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 8:04 PM Chang Liu <<a href="mailto:cliu@pppl.gov" target="_blank">cliu@pppl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Junchao,<br>
<br>
Thank you for your answer. I tried MatConvert and it works. It did not <br>
work before because I had forgotten to convert a vector from mpi to mpicuda.<br>
<br>
For vectors, there is no VecConvert, so I have to call VecDuplicate, <br>
VecSetType and VecCopy. Is there an easier option?<br></blockquote><div> As Matt suggested, you could single out the matrix and vector with an options prefix and set their types on the command line:</div><div><br></div>MatSetOptionsPrefix(A,"mymat");<br>VecSetOptionsPrefix(v,"myvec");</div><div class="gmail_quote"><br></div><div class="gmail_quote">Then, -mymat_mat_type aijcusparse -myvec_vec_type cuda<br><div> </div><div>A simpler approach is to have the vector type set automatically by MatCreateVecs(A,&v,NULL)<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Chang<br>
<br>
On 10/18/21 5:23 PM, Junchao Zhang wrote:<br>
> <br>
> <br>
> On Mon, Oct 18, 2021 at 3:42 PM Chang Liu via petsc-users <br>
> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>
> <br>
>     Hi Matt,<br>
> <br>
>     I have a related question. In my code I have many matrices, and I want<br>
>     only one of them to live on the GPU, with the others staying in CPU memory.<br>
> <br>
>     I wonder if there is an easier way to copy an mpiaij matrix to<br>
>     mpiaijcusparse (in other words, copy the data to the GPU). I can think of<br>
>     creating a new mpiaijcusparse matrix and copying the data line by line,<br>
>     but I wonder if there is a better option.<br>
> <br>
>     I have tried MatCopy and MatConvert, but neither works.<br>
> <br>
> Did you use MatConvert(mat,matype,MAT_INPLACE_MATRIX,&mat)?<br>
> <br>
> <br>
>     Chang<br>
> <br>
>     On 10/17/21 7:50 PM, Matthew Knepley wrote:<br>
>      > On Sun, Oct 17, 2021 at 7:12 PM Swarnava Ghosh<br>
>      > <<a href="mailto:swarnava89@gmail.com" target="_blank">swarnava89@gmail.com</a>> wrote:<br>
>      ><br>
>      >     Do I need to convert the MATSEQBAIJ matrix to a CUDA matrix in code?<br>
>      ><br>
>      ><br>
>      > You would need a call to MatSetFromOptions() to take that type from the<br>
>      > command line, and not have the type hard-coded in your application.<br>
>      > It is generally a bad idea to hard code the implementation type.<br>
>      ><br>
>      >     If I do it from the command line, are the other MatVec calls also<br>
>      >     ported to CUDA? I have many MatVec calls in my code, but I<br>
>      >     specifically want to port just one call.<br>
>      ><br>
>      ><br>
>      > You can give that one matrix an options prefix to isolate it.<br>
>      ><br>
>      >    Thanks,<br>
>      ><br>
>      >       Matt<br>
>      ><br>
>      >     Sincerely,<br>
>      >     Swarnava<br>
>      ><br>
>      >     On Sun, Oct 17, 2021 at 7:07 PM Junchao Zhang<br>
>      >     <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>> wrote:<br>
>      ><br>
>      >         You can do that with command line options<br>
>      >         -mat_type aijcusparse -vec_type cuda<br>
>      ><br>
>      >         On Sun, Oct 17, 2021, 5:32 PM Swarnava Ghosh<br>
>      >         <<a href="mailto:swarnava89@gmail.com" target="_blank">swarnava89@gmail.com</a>> wrote:<br>
>      ><br>
>      >             Dear Petsc team,<br>
>      ><br>
>      >             I had a query regarding using CUDA to accelerate a matrix<br>
>      >             vector product.<br>
>      >             I have a sequential sparse matrix (MATSEQBAIJ type). I want<br>
>      >             to port a MatVec call onto GPUs. Is there any code/example I<br>
>      >             can look at?<br>
>      ><br>
>      >             Sincerely,<br>
>      >             SG<br>
>      ><br>
>      ><br>
>      ><br>
>      > --<br>
>      > What most experimenters take for granted before they begin their<br>
>      > experiments is infinitely more interesting than any results to which<br>
>      > their experiments lead.<br>
>      > -- Norbert Wiener<br>
>      ><br>
>      > <a href="https://www.cse.buffalo.edu/~knepley/" rel="noreferrer" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br>
> <br>
>     -- <br>
>     Chang Liu<br>
>     Staff Research Physicist<br>
>     +1 609 243 3438<br>
>     <a href="mailto:cliu@pppl.gov" target="_blank">cliu@pppl.gov</a><br>
>     Princeton Plasma Physics Laboratory<br>
>     100 Stellarator Rd, Princeton NJ 08540, USA<br>
> <br>
<br>
-- <br>
Chang Liu<br>
Staff Research Physicist<br>
+1 609 243 3438<br>
<a href="mailto:cliu@pppl.gov" target="_blank">cliu@pppl.gov</a><br>
Princeton Plasma Physics Laboratory<br>
100 Stellarator Rd, Princeton NJ 08540, USA<br>
</blockquote></div></div>
</blockquote></div>
</blockquote></div></div>
</blockquote></div>
</blockquote></div>
</blockquote></div></div>
</blockquote></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>