[petsc-users] error when using ksplsqr for overdetermined system

Jin, Shuangshuang Shuangshuang.Jin at pnnl.gov
Sun Aug 5 15:49:03 CDT 2012


Hi, Matt, I replaced the inputs with a larger system (the A matrix has 27090 rows and 14959 columns). With -pc_type none set as a runtime option, the system can be solved correctly, but the new problem is that the solve is very slow: KSPSolve takes about 75 seconds for the lsqr problem. Is there a faster preconditioning type we can use here instead of "none"?
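For reference, the per-iteration residuals and the final convergence reason can be printed with PETSc's standard runtime options (no code changes needed), which should show where the 75 seconds go, e.g.:

mpiexec -n 1 ss -pc_type none -ksp_monitor -ksp_converged_reason

(Note that the KSPConvergedReason of -3 in the output below is KSP_DIVERGED_ITS, i.e. the solver ran to its 10000-iteration limit before meeting the tolerances.)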

[d3m956 at olympus tutorials]$ mpiexec -n 1 ss -pc_type none
Pre KSPSolve Elapsed Time: 0.195198
KSPSolve Elapsed Time: 74.788414
KSPGetIterationNumber 10000
KSPGetResidualNorm 17921.356837
KSPConvergedReason -3
KSP Object: 1 MPI processes
  type: lsqr
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: none
  linear system matrix = precond matrix:
  Matrix Object:   1 MPI processes
    type: seqaij
    rows=27090, cols=14959
    total: nonzeros=70903, allocated nonzeros=70903
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Overall Elapsed Time: 75.155196

Another question is how to run lsqr in parallel. I tried using 4 processors, for example, but it didn't speed anything up.

[d3m956 at olympus tutorials]$ mpiexec -n 4 ss -pc_type none
Pre KSPSolve Elapsed Time: 0.177819
Pre KSPSolve Elapsed Time: 0.180807
Pre KSPSolve Elapsed Time: 0.179783
Pre KSPSolve Elapsed Time: 0.181975
KSPSolve Elapsed Time: 73.107033
KSPGetIterationNumber 10000
KSPGetResidualNorm 17921.356837
KSPConvergedReason -3
KSP Object: 1 MPI processes
  type: lsqr
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: none
  linear system matrix = precond matrix:
  Matrix Object:   1 MPI processes
    type: seqaij
    rows=27090, cols=14959
    total: nonzeros=70903, allocated nonzeros=70903
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Overall Elapsed Time: 73.455209
KSPSolve Elapsed Time: 73.425921
KSPGetIterationNumber 10000
KSPGetResidualNorm 17921.356837
KSPConvergedReason -3
KSP Object: 1 MPI processes
  type: lsqr
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: none
  linear system matrix = precond matrix:
  Matrix Object:   1 MPI processes
    type: seqaij
    rows=27090, cols=14959
    total: nonzeros=70903, allocated nonzeros=70903
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Overall Elapsed Time: 73.771948
KSPSolve Elapsed Time: 76.629414
KSPGetIterationNumber 10000
KSPGetResidualNorm 17921.356837
KSPConvergedReason -3
KSPSolve Elapsed Time: 76.698510
KSPGetIterationNumber 10000
KSPGetResidualNorm 17921.356837
KSPConvergedReason -3
KSP Object: 1 MPI processes
  type: lsqr
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: none
  linear system matrix = precond matrix:
  Matrix Object:   1 MPI processes
    type: seqaij
    rows=27090, cols=14959
    total: nonzeros=70903, allocated nonzeros=70903
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Overall Elapsed Time: 77.174527
KSP Object: 1 MPI processes
  type: lsqr
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: none
  linear system matrix = precond matrix:
  Matrix Object:   1 MPI processes
    type: seqaij
    rows=27090, cols=14959
    total: nonzeros=70903, allocated nonzeros=70903
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines
Overall Elapsed Time: 77.247951

Is there anything I need to do in the code to parallelize the KSP lsqr solver, other than specifying the mpiexec -n <NumOfProcessors> option on the command line?
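In the output above, each rank reports "1 MPI processes" and a seqaij matrix, which indicates every process is assembling and solving the entire system by itself. Besides launching with mpiexec, the matrix and vectors have to be created on PETSC_COMM_WORLD rather than PETSC_COMM_SELF so PETSc distributes the rows across ranks. A minimal sketch of the matrix part, assuming PETSc 3.3 and that each rank can see the full CSR arrays:

  Mat      A;
  PetscInt rstart,rend,i;
  MatCreate(PETSC_COMM_WORLD,&A);
  MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,27090,14959);
  MatSetFromOptions(A);
  MatSetUp(A);
  /* each rank inserts only the rows it owns */
  MatGetOwnershipRange(A,&rstart,&rend);
  for (i=rstart; i<rend; i++) {
    /* MatSetValues(A,1,&i,...) with this rank's slice of the CSR data */
  }
  MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);

The vectors b and x likewise need PETSC_COMM_WORLD with global sizes 27090 and 14959 (e.g. VecSetSizes(b,PETSC_DECIDE,27090)); KSPView should then report 4 MPI processes and an mpiaij matrix.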

Thanks,
Shuangshuang


On 8/3/12 4:35 PM, "Matthew Knepley" <knepley at gmail.com> wrote:

On Fri, Aug 3, 2012 at 5:38 PM, Jin, Shuangshuang <Shuangshuang.Jin at pnnl.gov> wrote:
Hi, all, I'm trying to use ksplsqr to solve an overdetermined system.
I have read the following threads regarding the topic:
http://lists.mcs.anl.gov/pipermail/petsc-users/2011-January/007763.html
http://lists.mcs.anl.gov/pipermail/petsc-users/2010-August/006784.html

I defined a 4x3 matrix A and a right-hand-side vector b in sparse CSR format with 1-based indexing (row i's entries occupy positions irow[i] through irow[i+1]-1 of icol and ival):
irow: 1 3 5 6 9
icol: 1 2 1 3 2 1 2 3
ival: 1 2 2 5 3 1 4 1
rhs: 1 2 3 4

A = 1 2 0
    2 0 5
    0 3 0
    1 4 1
b = 1
    2
    3
    4

I wrote the following code to solve this system:

static char help[]="Reading in a matrix\n";

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include "petscvec.h" /* This enables us to use vectors. */
#include "petscmat.h" /* This enables us to use matrices. It includes the petscvec header file. */
#include "petscksp.h" /* Now we can solve linear systems. Solvers used are KSP. */

int main(int argc, char **argv)
{
  /* Declarations of matrix A and vectors b, x */
  Vec b,x;
  Mat A;
  FILE *fp;

  PetscInt n=4,m=3,nz=8,index;
  PetscErrorCode ierr;
  PetscInt i,j;
  PetscInt irow[n+1],icol[nz]; /* variable-length arrays sized by n and nz above */
  PetscScalar ival[nz],rhs[n];
  PetscInt *cnt,*col;

  KSP ksp;

  /* This part is needed for the help flag supplied at run-time*/
  ierr = PetscInitialize(&argc,&argv,(char*)0,help);CHKERRQ(ierr);
  /* caution: irow[], rhs[] above were sized with n=4, so -n must not be used to change n here */
  ierr = PetscOptionsGetInt(PETSC_NULL,"-n",&n,PETSC_NULL);CHKERRQ(ierr);

/*===================================================================================*/

  fp=fopen("irow.txt","r");
  if (fp==NULL)
  {
    fprintf(stderr, "Cannot open file\n");
    exit(1);
  }

  for (i = 0; i < n+1; i++)
  {
    if (fscanf(fp,"%d", &irow[i]) != 1)
    {
      fprintf(stderr, "Failed to read irow vector[%d]\n", i);
      exit(1);
    }
  }
  fclose(fp);
  for (i = 0; i < n+1; i++) printf("irow[%d]=%d\n",i,irow[i]);

  fp=fopen("icol.txt","r");
  if (fp==NULL)
  {
    fprintf(stderr, "Cannot open file\n");
    exit(1);
  }

  for (i = 0; i < nz; i++)
  {
    if (fscanf(fp,"%d", &icol[i]) != 1)
    {
      fprintf(stderr, "Failed to read icol vector[%d]\n", i);
      exit(1);
    }
  }
  fclose(fp);
  for (i = 0; i < nz; i++) printf("icol[%d]=%d\n",i,icol[i]);

  fp=fopen("ival.txt","r");
  if (fp==NULL)
  {
    fprintf(stderr, "Cannot open file\n");
    exit(1);
  }

  for (i = 0; i < nz; i++)
  {
    if (fscanf(fp,"%lf", &ival[i]) != 1)
    {
      fprintf(stderr, "Failed to read ival vector[%d]\n", i);
      exit(1);
    }
  }
  fclose(fp);
  for (i = 0; i < nz; i++) printf("ival[%d]=%lf\n",i,ival[i]);

/*===================================================================================*/

  // determine number of nonzeros per row in the new matrix
  PetscMalloc((n+1)*sizeof(PetscInt),&cnt);
  for (i=0;i<n;i++) {
    cnt[i]=irow[i+1]-irow[i];
  }
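  /* the fifth argument of MatCreateSeqAIJ below preallocates cnt[i] nonzeros
     for row i, so no mallocs should be needed during the MatSetValues calls */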
  MatCreateSeqAIJ(PETSC_COMM_SELF,n,m,0,cnt,&A);
  MatSetOption(A,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE);
  // copy over the matrix entries from the inputs
  for (i=0;i<n;i++) {
    PetscMalloc(cnt[i]*sizeof(PetscInt),&col);
    /* convert this row's 1-based column indices to 0-based */
    for (j=0;j<cnt[i];j++) {
      col[j]=icol[irow[i]-1+j]-1;
    }
    /* insert the whole row at once; ival[irow[i]-1] is its first entry */
    MatSetValues(A,1,&i,cnt[i],col,&ival[irow[i]-1],INSERT_VALUES);
    PetscFree(col);
  }
  MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
  ierr=PetscPrintf(PETSC_COMM_WORLD,"Matrix A:\n");CHKERRQ(ierr);
  MatView(A,PETSC_VIEWER_STDOUT_SELF);

/*===================================================================================*/

  /* Use options from the terminal to create a vector that is type shared or mpi. */
  ierr = VecCreate(PETSC_COMM_WORLD,&b);CHKERRQ(ierr); /* Vector creation */
  ierr = VecSetSizes(b,PETSC_DECIDE,n);CHKERRQ(ierr); /* Setting the vector size */
  ierr = VecSetFromOptions(b);CHKERRQ(ierr); /* Setting the vector type (shared, mpi etc) */
  /* The second argument is a PETSc scalar value.*/
  ierr = VecSet(b,0);CHKERRQ(ierr);

  /* Reading in the RHS vector from the file rhs.txt */
  fp=fopen("rhs.txt","r");
  if (fp==NULL)
  {
    fprintf(stderr, "Cannot open file\n");
    exit(1);
  }

  for (i = 0; i < n; i++)
  {
    if (fscanf(fp,"%lf", &rhs[i]) != 1)
    {
      fprintf(stderr, "Failed to read rhs vector[%d]\n", i);
      exit(1);
    }
  }
  fclose(fp);

  index=0;
  /* Putting b into final form */
  for (i=0; i<n; ++i)
  {
    /* One insertion per step. That's what the 1 in the second argument stands for */
    ierr= VecSetValues(b,1,&index,&rhs[i],INSERT_VALUES);CHKERRQ(ierr);
    index=index+1;
  } /* The third and fourth arguments are addresses. The fifth argument is the insert mode (INSERT_VALUES or ADD_VALUES). */
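  /* An equivalent, shorter form (a sketch): all n entries can be inserted
     with a single VecSetValues call, given a 0..n-1 index array:
       PetscInt ix[n];
       for (i=0; i<n; i++) ix[i] = i;
       ierr = VecSetValues(b,n,ix,rhs,INSERT_VALUES);CHKERRQ(ierr);
  */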

  /* Assembling the vector. */
  ierr= VecAssemblyBegin(b);CHKERRQ(ierr);
  ierr=VecAssemblyEnd(b);CHKERRQ(ierr);

  /* Viewing the changed vector. */
  ierr=PetscPrintf(PETSC_COMM_WORLD,"Vector b:\n");CHKERRQ(ierr);
  ierr=VecView(b,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

/*===================================================================================*/

  /* solve the ksp lsqr system */
  VecCreate(PETSC_COMM_WORLD,&x);
  VecSetSizes(x,PETSC_DECIDE,m);
  VecSetFromOptions(x);

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetType(ksp, KSPLSQR);
  KSPSetTolerances(ksp,1.0e-6,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);
  KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER);
  KSPSetFromOptions(ksp);
  KSPSolve(ksp, b, x);

  PetscInt num_iters;
  ierr = KSPGetIterationNumber(ksp, &num_iters);CHKERRQ(ierr);

  PetscReal rnorm;
  ierr = KSPGetResidualNorm(ksp, &rnorm); CHKERRQ(ierr);

  KSPConvergedReason reason;
  ierr = KSPGetConvergedReason(ksp, &reason); CHKERRQ(ierr);

  printf("KSPGetIterationNumber %d\n",num_iters);
  printf("KSPGetResidualNorm %f\n",rnorm);
  printf("KSPConvergedReason %d\n",reason);

  VecView(x,PETSC_VIEWER_STDOUT_WORLD);
  KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);

  /*
  Destroy any objects created.
  */
  ierr=VecDestroy(&b);CHKERRQ(ierr);
  ierr=VecDestroy(&x);CHKERRQ(ierr);
  ierr=MatDestroy(&A);CHKERRQ(ierr);
  ierr=KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr=PetscFree(cnt);CHKERRQ(ierr);
  ierr=PetscFinalize();CHKERRQ(ierr);

  return 0;
}

This code reads the input data and forms the A matrix and b vector correctly, as I can see by viewing them in the output. However, I get an "Invalid argument!" error, as shown below:

[d3m956 at olympus tutorials]$ ./ss
irow[0]=1
irow[1]=3
irow[2]=5
irow[3]=6
irow[4]=9
icol[0]=1
icol[1]=2
icol[2]=1
icol[3]=3
icol[4]=2
icol[5]=1
icol[6]=2
icol[7]=3
ival[0]=1.000000
ival[1]=2.000000
ival[2]=2.000000
ival[3]=5.000000
ival[4]=3.000000
ival[5]=1.000000
ival[6]=4.000000
ival[7]=1.000000
Matrix A:
Matrix Object: 1 MPI processes
  type: seqaij
row 0: (0, 1)  (1, 2)
row 1: (0, 2)  (2, 5)
row 2: (1, 3)
row 3: (0, 1)  (1, 4)  (2, 1)
Vector b:
Vector Object: 1 MPI processes
  type: seq
1
2
3
4
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Invalid argument!
[0]PETSC ERROR: Must be square matrix, rows 4 columns 3!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 0, Tue Jun  5 14:20:42 CDT 2012
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: ./ss on a arch-linu named olympus.local by d3m956 Fri Aug  3 15:34:07 2012
[0]PETSC ERROR: Libraries linked from /pic/projects/mca/ss/PETSC/petsc-3.3-p0/arch-linux2-c-debug/lib
[0]PETSC ERROR: Configure run at Thu Jun 14 17:00:19 2012
[0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack --download-mpich
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: MatGetOrdering() line 228 in src/mat/order/sorder.c
[0]PETSC ERROR: PCSetUp_ILU() line 206 in src/ksp/pc/impls/factor/ilu/ilu.c

It is ILU, the default preconditioner, that requires a square matrix. Try -pc_type none

   Matt
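
For completeness, the same choice can be set in the code itself rather than on the command line; a minimal sketch, where ksp is the KSP object from the code above:

  PC pc;
  KSPGetPC(ksp,&pc);
  PCSetType(pc,PCNONE);

(KSPSetFromOptions can still be called afterwards so that -pc_type on the command line keeps working.)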

[0]PETSC ERROR: PCSetUp() line 832 in src/ksp/pc/interface/precon.c
[0]PETSC ERROR: KSPSetUp() line 278 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: KSPSolve() line 402 in src/ksp/ksp/interface/itfunc.c
KSPGetIterationNumber 0
KSPGetResidualNorm 0.000000
KSPConvergedReason 0
Vector Object: 1 MPI processes
  type: seq
0
0
0
KSP Object: 1 MPI processes
  type: lsqr
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: ilu
    ILU: out-of-place factorization
    0 levels of fill
    tolerance for zero pivot 2.22045e-14
    using diagonal shift to prevent zero pivot
    matrix ordering: natural
  linear system matrix = precond matrix:
  Matrix Object:   1 MPI processes
    type: seqaij
    rows=4, cols=3
    total: nonzeros=8, allocated nonzeros=8
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines

I don't understand why the lsqr solver requires a square matrix: "Must be square matrix, rows 4 columns 3!"

Can anyone look into the problem for me?

Thanks a lot!

Shuangshuang





