[petsc-users] Why am I getting bad scalability for least squares problems?
Gaurish Telang
gaurish108 at gmail.com
Tue Jan 29 19:11:21 CST 2013
Hello,
I am trying to solve some over-determined / under-determined least-squares
systems in parallel using the LSQR routine of PETSc.
In spite of being able to solve the least-squares problem || Ax - b ||
_correctly_ with this Krylov method on multiple processors, I don't seem
to be achieving any scalability in wall-clock time
for the set of matrices that I am interested in.
Below I have listed the matrix sizes, the number of non-zeros in each
matrix, and the wall-clock times required for 10 iterations of the LSQR
Krylov method.
The timings are averaged over 50 runs of the solver, i.e. I
solve the least-squares problem 50 times (each run carrying out 10
iterations of LSQR)
and report the arithmetic mean of the recorded wall-clock times (i.e.
(sum of clock times)/50).
Matrix size: 28407 x 19899
Non-zeros: 725363
Wall-clock times:
1 proc: 0.094 s
2 proc: 0.087 s
3 proc: 0.089 s
4 proc: 0.093 s

Matrix size: 95194 x 155402
Non-zeros: 3877164
Wall-clock times:
1 proc: 0.23 s
2 proc: 0.21 s
3 proc: 0.22 s
4 proc: 0.23 s

Matrix size: 125878 x 207045
Non-zeros: 3717995
Wall-clock times:
1 proc: 0.24 s
2 proc: 0.22 s
3 proc: 0.23 s
4 proc: 0.24 s
I have other matrices that show similarly poor scaling as I increase the
processor count.
Please let me know what I can do to improve performance as the processor
count grows.
I suspect it is a data-layout problem, i.e. I may need to choose some other
PETSc data structure to represent the matrix and vector of my
least-squares problem.
Currently my Mat type is "mpiaij" and my Vec type is "mpi", which I set on
the command line when running the executable.
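For reference, I believe the same types could also be fixed in the source
rather than on the command line; a minimal sketch (assuming the usual
MatSetType()/VecSetType() calls, issued before MatLoad()/VecLoad()):

    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);   /* distributed AIJ (row-wise) storage */
    ierr = MatLoad(A,fd);CHKERRQ(ierr);

    ierr = VecCreate(PETSC_COMM_WORLD,&b);CHKERRQ(ierr);
    ierr = VecSetType(b,VECMPI);CHKERRQ(ierr);      /* distributed vector */
    ierr = VecLoad(b,fd);CHKERRQ(ierr);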
I run the executable with the terminal command:

mpirun -np <num-procs> ./test_parallel \
    -fmat <binary-mat-file> \
    -frhs <binary-vec-file> \
    -vec_type mpi \
    -mat_type mpiaij \
    -ksp_type lsqr \
    -ksp_max_it 10
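To see where the time is actually being spent across processes, I can also
collect PETSc's performance summary; a sketch of such a run (assuming the
-log_summary option of my PETSc version):

mpirun -np <num-procs> ./test_parallel \
    -fmat <binary-mat-file> \
    -frhs <binary-vec-file> \
    -vec_type mpi -mat_type mpiaij \
    -ksp_type lsqr -ksp_max_it 10 \
    -log_summary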
I have pasted the code I am using below for your reference.
Thank you,
Gaurish.
==============================================================================================================================
#include <petscksp.h>
#include <petscsys.h>
#include <petsctime.h>
#include <stdio.h>

static char help[] = "Reads a matrix and RHS from binary files and solves the least-squares problem with LSQR.\n";

#undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **args)
{
  Vec            x, b, residue;   /* approximate solution, RHS, residual */
  Mat            A;               /* linear system matrix */
  KSP            ksp;             /* linear solver context */
  PC             pc;              /* preconditioner context */
  PetscErrorCode ierr;
  PetscInt       m, n;            /* number of rows and columns of the matrix read in */
  PetscViewer    fd;
  PetscMPIInt    size;
  PetscBool      flg;
  char           fmat[PETSC_MAX_PATH_LEN];   /* matrix input file name */
  char           frhs[PETSC_MAX_PATH_LEN];   /* right-hand-side input file name */
  PetscInitialize(&argc,&args,(char *)0,help);
  ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
  ierr = PetscOptionsGetString(PETSC_NULL,"-fmat",fmat,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr);
  ierr = PetscOptionsGetString(PETSC_NULL,"-frhs",frhs,PETSC_MAX_PATH_LEN,&flg);CHKERRQ(ierr);

  /* Read in the matrix */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,fmat,FILE_MODE_READ,&fd);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  /* ierr = MatSetType(A,MATSEQAIJ); */
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatLoad(A,fd);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);

  /* Read in the right-hand side */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,frhs,FILE_MODE_READ,&fd);CHKERRQ(ierr);
  ierr = VecCreate(PETSC_COMM_WORLD,&b);CHKERRQ(ierr);
  ierr = VecSetFromOptions(b);CHKERRQ(ierr);
  ierr = VecLoad(b,fd);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&fd);CHKERRQ(ierr);
  /* Get the matrix size; used for setting the vector sizes */
  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  /* printf("The size of the matrix read in is %d x %d\n", m, n); */

  ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
  ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(x);CHKERRQ(ierr);
  ierr = VecCreate(PETSC_COMM_WORLD,&residue);CHKERRQ(ierr);
  ierr = VecSetSizes(residue,PETSC_DECIDE,m);CHKERRQ(ierr);
  ierr = VecSetFromOptions(residue);CHKERRQ(ierr);
  /* Set up the solver; the solver type is chosen at the command line */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCNONE);CHKERRQ(ierr);
  ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);

  /* Set runtime options, e.g.
       -ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
     These options override those specified above as long as
     KSPSetFromOptions() is called _after_ any other customization routines. */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

  /* Zero initial guess for the Krylov method */
  PetscScalar p = 0;
  ierr = VecSet(x,p);CHKERRQ(ierr);
  ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr);
  /* Solve the linear system 50 times, timing each solve */
  PetscLogDouble v1,v2,elapsed_time[50];
  int            i;
  for (i = 0; i < 50; ++i) {
    ierr = PetscGetTime(&v1);CHKERRQ(ierr);
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = PetscGetTime(&v2);CHKERRQ(ierr);
    elapsed_time[i] = v2 - v1;
    PetscPrintf(PETSC_COMM_WORLD,"[%d] Time for the solve: %g s\n",i,elapsed_time[i]);
  }

  /* Report the mean solve time over the 50 repetitions */
  PetscLogDouble sum = 0, amortized_time;
  for (i = 0; i < 50; ++i) {
    sum += elapsed_time[i];
  }
  amortized_time = sum/50;
  PetscPrintf(PETSC_COMM_WORLD,"\n\n***********************\nAmortized time for the solve: %g s\n***********************\n\n",amortized_time);
  /* View solver info; we could instead use the option -ksp_view to print
     this info to the screen at the conclusion of KSPSolve(). */
  ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  /* Clean up */
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&residue);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}
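
If it helps with profiling, the solve loop could also be wrapped in its own
logging stage so that the performance summary separates the solves from the
matrix/vector loading; a minimal sketch (assuming the PetscLogStage API),
placed around the timing loop above:

    PetscLogStage stage;
    ierr = PetscLogStageRegister("LSQR solves",&stage);CHKERRQ(ierr);  /* name the stage */
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);                     /* events below are logged in this stage */
    for (i = 0; i < 50; ++i) {
      ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    }
    ierr = PetscLogStagePop();CHKERRQ(ierr);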