[petsc-users] MatMult scalability problem

Mark F. Adams mark.adams at columbia.edu
Wed May 2 11:24:43 CDT 2012


Things to check are:

1) Run on a dedicated machine.

2) Are the matrices partitioned in the same way for both tests? (A quick check is sketched below.)

3) MatVec is memory-bandwidth limited.  A speedup of 14 or 15 on 16 shared-memory cores is great; 5 or 6 is bad.  My experience with the old IBM SPs was getting a speedup of 12, and the SPs had a great (and expensive) memory system.  I don't know what the state of the art is now, but memory bandwidth has generally been going down relative to processor speed, so I'd take another look at the number from your hand-written code as well.  (The second sketch below shows one way to get MatMult timings out of PETSc's own log.)
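For point 2, a minimal check (just a sketch, assuming the square matrix M from the excerpt below and nothing else) is to print each process's row range and local nonzero count and compare them with the decomposition your hand-written MPI code uses:

    PetscInt    rstart, rend;
    MatInfo     info;
    PetscMPIInt rank;

    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    MatGetOwnershipRange(M, &rstart, &rend);   /* local row range [rstart, rend) */
    MatGetInfo(M, MAT_LOCAL, &info);           /* info.nz_used = local nonzero count */
    PetscSynchronizedPrintf(PETSC_COMM_WORLD, "[%d] rows %D..%D, local nonzeros %g\n",
                            rank, rstart, rend, info.nz_used);
    PetscSynchronizedFlush(PETSC_COMM_WORLD);

If the row ranges or nonzero counts are badly unbalanced, or differ from what the MPI code uses, that alone can explain a large gap.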

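For point 3, PETSc's log is the quickest way to see what MatMult itself achieves. A minimal sketch, assuming the M, b, c and iter from the excerpt below: wrap the multiply loop in its own logging stage and run with -log_summary, which then reports MatMult time, flop rate and max/min ratios for that stage separately from the load/setup phase.

    PetscLogStage stage;
    PetscLogStageRegister("MatMult loop", &stage);

    PetscLogStagePush(stage);
    for (i = 0; i < iter/2; i++) {
        MatMult(M, b, c);
        MatMult(M, c, b);
    }
    PetscLogStagePop();
    /* run the program with -log_summary to get the per-event table */

A max/min time ratio well above 1 on the MatMult line points at load imbalance or machine contention rather than memory bandwidth.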
Mark

On May 2, 2012, at 12:01 PM, Javier Fresno wrote:

> 
> 
> I have a very simple PETSc program that multiplies a matrix and a vector several times. It works fine but it has some scalability issues. I run it on a shared-memory machine with 16 processors and it only runs 5 or 6 times faster (taking into account only the MatMult call). I have programmed the same algorithm in C with MPI and it shows a proper speedup (around 14 or 15). The matrices I use have millions of nonzero elements, so I think they are big enough.
> 
> What can I do to get the same speedup as in the manual C version?
> 
> I enclose an excerpt of the code. Thank you in advance.
> 
> Javier
> 
> 
> 
> #include <petscmat.h>
> 
> /**
>  * Main function
>  */
> int main(int argc, char **argv){
> 
>    // Initialize PETSc
>    PetscInitialize(&argc, &argv, (char *) 0, NULL);
> 
>    // Timers
>    PetscLogDouble t_start, t_end;
> 
>    // File viewer for the binary matrix file
>    PetscViewer fd;
>    PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix_file",FILE_MODE_READ,&fd);
> 
>    // M matrix (MatLoad returns an assembled matrix, so no MatAssembly calls are needed)
>    Mat M;
>    MatCreate(PETSC_COMM_WORLD,&M);
>    MatSetFromOptions(M);
>    MatLoad(M,fd);
>    PetscViewerDestroy(&fd);
> 
>    // Global and local sizes (the matrix is assumed square, so n == m)
>    PetscInt n, m, local_n, local_m;
>    MatGetSize(M,&n,&m);
>    MatGetLocalSize(M,&local_n,&local_m);
> 
>    // b and c vectors, laid out to match the matrix rows
>    // (sizes must be set before VecSetFromOptions)
>    Vec b,c;
>    VecCreate(PETSC_COMM_WORLD,&b);
>    VecSetSizes(b,local_n,n);
>    VecSetFromOptions(b);
> 
>    VecCreate(PETSC_COMM_WORLD,&c);
>    VecSetSizes(c,local_n,n);
>    VecSetFromOptions(c);
> 
>    init_vector_values(b);   // fills b; defined elsewhere in the full program
> 
>    VecAssemblyBegin(b);
>    VecAssemblyEnd(b);
> 
>    // Main computation ("iter" is defined elsewhere in the full program)
>    PetscGetTime(&t_start);
>    int i;
>    for(i=0; i<iter/2; i++){
>        MatMult(M,b,c);
>        MatMult(M,c,b);
>    }
>    PetscGetTime(&t_end);
> 
>    PetscPrintf(PETSC_COMM_WORLD,"Comp time: %lf\n",t_end-t_start);
> 
>    // Clean up
>    VecDestroy(&b);
>    VecDestroy(&c);
>    MatDestroy(&M);
> 
>    PetscFinalize();
> 
>    return 0;
> }
> 
> 
> 


