<div><div>Hello Barry, </div><div dir="auto"><br>
> In your runs, did you zero the matrix before calling MatSetValues() initially?<br>
> <br></div></div><div><div>
I only zero the matrix when using the ADD_VALUES insert mode; otherwise I don’t.<br></div></div><div><div><br>
<br>
> What kind of a processor is this? <br></div></div><div><div>
This is a 6-core Intel i7 processor at 3.9 GHz, if I remember correctly.<br><div dir="auto">
I also tried it on a 14-core Intel Xeon processor, using only 6 of the cores, and I saw a similar trend as nrow_buffer increases.</div></div></div><div><div><br>
> Would you be able to run the code under gprof, vtune, instruments, or some other profiling package that gives line-by-line information about time being spent? In particular I'd like to see the results for MatSetValues(), and for MatSetValues2() with 1 row, with 2 rows, and with 4 rows. <br>
> <br></div></div><div><div>I will try to do that, but I am currently traveling to a conference, so I cannot get back to you on that until Thursday.<br>
<br>
<br>
Regards,<br>
Kamra</div></div><div><div><br>
<br>
> On Jul 20, 2019, at 8:59 AM, Smith, Barry F. &lt;<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>&gt; wrote:<br>
> <br>
> In your runs, did you zero the matrix before calling MatSetValues() initially?<br>
> <br>
> <br>
>> MatSetValues() <br>
>> FillPetscMat 3.6594e+00<br>
>> MatSetValues2_MPIAIJ() with an nrow_buffer = 1<br>
>> FillPetscMat 13.3920e+00<br>
>> nrow_buffer = 2<br>
>> FillPetscMat_with_MatSetValues2 3.3321e+00<br>
>> nrow_buffer = 5<br>
>> FillPetscMat_with_MatSetValues2 2.8842e+00<br>
>> nrow_buffer = 10<br>
>> FillPetscMat_with_MatSetValues2 2.7669e+00<br>
>> nrow_buffer = 20<br>
>> FillPetscMat_with_MatSetValues2 2.6834e+00<br>
>> nrow_buffer = 50<br>
>> FillPetscMat_with_MatSetValues2 2.6862e+00<br>
>> nrow_buffer = 100<br>
>> FillPetscMat_with_MatSetValues2 2.6170e+00<br>
>> <br>
> <br>
> The call to MatSetValues() has a little bit of checking and then another call to MatSetValues_MPIAIJ(), so it is not surprising that going directly <br>
> to MatSetValues_MPIAIJ2() saves you a bit, but it is a larger savings than I would expect.<br>
> <br>
> I am greatly puzzled by the dramatic savings you get as you pass more rows to MatSetValues2(). As far as I can see, all you are saving is a function call and not much else, and that would NOT explain the huge time savings (function calls are extremely cheap compared to 0.5 seconds). The multi-row MatSetValues2() still has to do the same processing as with one call per row, so why is it so much faster?<br>
> <br>
> What kind of a processor is this? <br>
> <br>
> Would you be able to run the code under gprof, vtune, instruments, or some other profiling package that gives line-by-line information about time being spent? In particular I'd like to see the results for MatSetValues(), and for MatSetValues2() with 1 row, with 2 rows, and with 4 rows. <br>
> <br>
> <br>
> <br>
> Thanks<br>
> <br>
> Barry<br>
</div>
</div>
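<div>A rough way to check the argument in the quoted message, that function-call overhead alone cannot explain the savings, is to divide the measured time difference by the number of calls that buffering avoids. The sketch below is plain C and does not use PETSc. The function names and the row count passed in are hypothetical, because the thread does not state the matrix size. It also assumes each call passes a full buffer of rows.</div>

```c
/* Number of calls needed to insert nrows rows when each call passes
   up to nrow_buffer rows (integer ceiling division).
   Hypothetical helper; not part of PETSc. */
long calls_needed(long nrows, long nrow_buffer)
{
    return (nrows + nrow_buffer - 1) / nrow_buffer;
}

/* Time saved per avoided call when going from a small buffer to a
   large one, given the two measured fill times in seconds.
   nrows is an assumed value: the thread does not give the matrix size.
   Returns 0.0 when no calls are avoided. */
double implied_seconds_per_avoided_call(double t_small, double t_large,
                                        long nrows,
                                        long buf_small, long buf_large)
{
    long avoided = calls_needed(nrows, buf_small)
                 - calls_needed(nrows, buf_large);
    if (avoided == 0) {
        return 0.0;
    }
    return (t_small - t_large) / (double)avoided;
}
```

<div>For example, with the quoted timings for nrow_buffer = 2 (3.3321 s) and nrow_buffer = 100 (2.6170 s), and an assumed one million rows, implied_seconds_per_avoided_call(3.3321, 2.6170, 1000000L, 2, 100) comes out on the order of a microsecond per avoided call. A bare function call normally costs on the order of nanoseconds, so under that assumed row count something other than call overhead would have to account for the difference.</div>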