[petsc-users] Very poor speed up performance

Barry Smith bsmith at mcs.anl.gov
Mon Dec 20 12:36:34 CST 2010


  See http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#computers and in particular note the discussion on memory bandwidth. Once you start using multiple cores per CPU you will see very little additional speedup with Jacobi preconditioning, since it is very memory-bandwidth limited. In fact, pretty much all sparse iterative solvers are memory-bandwidth limited.
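
  As a rough illustration (a sketch, not PETSc's actual kernel), here is the inner loop of a CSR/AIJ-style matrix-vector product; with 8-byte values and 4-byte column indices, each nonzero streams roughly 12 bytes from memory while performing only 2 flops:

    /* Sketch only, not PETSc source: y = A*x for a CSR matrix with m rows.
       Per nonzero: one 8-byte value + one 4-byte column index loaded, 2 flops done. */
    void csr_matvec(int m, const int *rowptr, const int *colidx,
                    const double *val, const double *x, double *y)
    {
      for (int i = 0; i < m; i++) {
        double sum = 0.0;
        for (int j = rowptr[i]; j < rowptr[i+1]; j++)
          sum += val[j] * x[colidx[j]];   /* ~12 bytes moved per 2 flops */
        y[i] = sum;
      }
    }

  At roughly 6 bytes of memory traffic per flop, a socket that sustains, say, 10 GB/s of memory bandwidth (an assumed but typical figure) cannot run this kernel much faster than about 1.7 GFlop/s no matter how many of its cores participate, which is why the speedup flattens out once a few cores per CPU are busy. Running your case with -log_summary and comparing the MatMult times across core counts will show the same effect in the numbers.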

   Barry


On Dec 20, 2010, at 10:46 AM, Yongjun Chen wrote:

> 
> Hi everyone,
> 
> 
> 
> I use PETSc (version 3.1-p5) to solve a linear problem Ax=b. The matrix A and the right-hand-side vector b are read from files. The dimension of A is 1.2 million x 1.2 million. I am pretty sure the matrix A and vector b have been read correctly.
> 
> I compiled the program against an optimized build of PETSc (--with-debugging=0) and tested the speed-up on two servers, and I have found that the scaling is very poor.
> 
> Of the two servers, one has 4 CPUs with 4 cores per CPU, i.e., 16 cores in total, and the other has 4 CPUs with 12 cores per CPU, i.e., 48 cores in total.
> 
> On each of them, as the number of cores k increases from 1 to 8 (mpiexec -n k ./Solver_MPI -pc_type jacobi -ksp_type gmres), the speed-up increases from 1 to about 6; but as k increases further, from 9 up to 16 (first server) or 48 (second server), the speed-up first decreases and then stays at a constant value of about 5.0 (first server) or 4.5 (second server).
> 
> For comparison, the program LAMMPS scales excellently on these two servers.
> 
> Any comments are very much appreciated! Thanks!
> 
>  
> --------------------------------------------------------------------------------------------------------------------------
> 
> PS: the relevant code is as follows.
> 
> 
> 
> //first, read A and b from files
> ...
> //then assemble the matrix and the right-hand side
> 
>               ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>               ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>               ierr = VecAssemblyBegin(b); CHKERRQ(ierr);
>               ierr = VecAssemblyEnd(b); CHKERRQ(ierr);
> 
>               //mark A as symmetric and set up the Krylov solver
>               ierr = MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); CHKERRQ(ierr);
>               ierr = MatGetRowUpperTriangular(A); CHKERRQ(ierr);
>               ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr);
> 
>               ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);
>               ierr = KSPGetPC(ksp,&pc); CHKERRQ(ierr);
>               ierr = KSPSetTolerances(ksp,1.e-7,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT); CHKERRQ(ierr);
>               ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
> 
>               //solve, then view the solver and fetch the solution
>               ierr = KSPSolve(ksp,b,x); CHKERRQ(ierr);
>               ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>               ierr = KSPGetSolution(ksp,&x); CHKERRQ(ierr);
> 
>               ierr = VecAssemblyBegin(x); CHKERRQ(ierr);
>               ierr = VecAssemblyEnd(x); CHKERRQ(ierr);
> ...
> 


