<div class="gmail_extra">On Wed, May 2, 2012 at 12:01 PM, Javier Fresno <span dir="ltr"><<a href="mailto:jfresno@infor.uva.es" target="_blank">jfresno@infor.uva.es</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
I have a very simple PETSc program that multiplies a matrix and a vector several times. It works fine, but it has some scalability issues. I run it on a shared-memory machine with 16 processors and it only runs 5 or 6 times faster (taking into account only the MatMult call). I have programmed the same algorithm in C and MPI and it shows a proper speedup (around 14 or 15). The matrices I use have millions of nonzero elements, so I think they are big enough.<br>
<br>
What can I do to get the same speedup as in the manual C version?<br>
<br>
I enclose an excerpt of the code. Thank you in advance.<br></blockquote><div><br></div><div>We need the output of -log_summary to say anything about performance. Also, we need to know what the</div><div>machine architecture is.</div>
<div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Javier<br>
<br>
<br>
<br>
/**<br>
* Main function<br>
*/<br>
int main(int argc, char ** argv){<br>
<br>
// Initialize PETSc<br>
PetscInitialize(&argc, &argv, (char *) 0, NULL);<br>
<br>
// Timers<br>
PetscLogDouble t_start, t_end;<br>
<br>
// File Viewer<br>
PetscViewer fd;<br>
PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix_file",FILE_MODE_READ,&fd);<br>
<br>
// M matrix<br>
Mat M;<br>
MatCreate(PETSC_COMM_WORLD,&M);<br>
MatSetFromOptions(M);<br>
MatLoad(M,fd);<br>
PetscViewerDestroy(&fd);<br>
MatAssemblyBegin(M,MAT_FINAL_ASSEMBLY);<br>
MatAssemblyEnd(M,MAT_FINAL_ASSEMBLY);<br>
<br>
PetscInt n, m, local_n, local_m;<br>
MatGetSize(M,&n,&m);<br>
MatGetLocalSize(M,&local_n,&local_m);<br>
<br>
// b and c vectors<br>
Vec b,c;<br>
VecCreate(PETSC_COMM_WORLD,&b);<br>
VecSetFromOptions(b);<br>
VecSetSizes(b,local_n,n);<br>
<br>
VecCreate(PETSC_COMM_WORLD,&c);<br>
VecSetFromOptions(c);<br>
VecSetSizes(c,local_n,n);<br>
<br>
init_vector_values(b);<br>
<br>
VecAssemblyBegin(b);<br>
VecAssemblyEnd(b);<br>
<br>
<br>
// Main computation<br>
PetscGetTime(&t_start);<br>
int i;<br>
for(i=0; i < iter/2; i++){<br>
MatMult(M,b,c);<br>
MatMult(M,c,b);<br>
}<br>
PetscGetTime(&t_end);<br>
<br>
PetscPrintf(PETSC_COMM_WORLD,"Comp time: %lf\n",t_end-t_start);<br>
<br>
PetscFinalize();<br>
<br>
return 0;<br>
}<br>
<br>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener<br>
</div>