[petsc-users] error when using ksplsqr for overdetermined system
Shri
abhyshr at mcs.anl.gov
Sun Aug 5 20:22:23 CDT 2012
i) The problem you are trying to solve did not converge, look at the number of KSP iterations (10000 is the default max)
and the residual norm.
ii) You are not solving the problem in parallel but solving the same instance of the problem on each processor. Here's what you have in your code.
comm = MPI_COMM_SELF;
I suggest storing your matrix in binary format first and then using MatLoad() http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatLoad.html to load your matrix in parallel, and then calling KSP.
The communicator should be MPI_COMM_WORLD and not MPI_COMM_SELF. As a start, you can modify
the example http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex27.c.html to suit what you are trying to do.
Btw, is this a power system state estimation problem that you are trying to solve?
Shri
----- Original Message -----
> Hi, Mat, I replaced the inputs with a larger system (the A matrix has
> 27090 rows and 14959 columns). With the -pc_type none set up as the
> runtime option, the system can be solved correctly, but the new
> problem is the solving time is very slow, taking about 75 seconds to
> run KSPSolve for the lsqr problem. Is there any faster preconditioning
> type we can use here instead of "none"?
>
> [d3m956 at olympus tutorials]$ mpiexec -n 1 ss -pc_type none
> Pre KSPSolve Elapsed Time: 0.195198
> KSPSolve Elapsed Time: 74.788414
> KSPGetIterationNumber 10000
> KSPGetResidualNorm 17921.356837
> KSPConvergedReason -3
> KSP Object: 1 MPI processes
> type: lsqr
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> left preconditioning
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: none
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=27090, cols=14959
> total: nonzeros=70903, allocated nonzeros=70903
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Overall Elapsed Time: 75.155196
>
> Another question is how to run lsqr in parallel. I tried to use 4
> processors for example, but it didn't help to speed up.
>
> [d3m956 at olympus tutorials]$ mpiexec -n 4 ss -pc_type none
> Pre KSPSolve Elapsed Time: 0.177819
> Pre KSPSolve Elapsed Time: 0.180807
> Pre KSPSolve Elapsed Time: 0.179783
> Pre KSPSolve Elapsed Time: 0.181975
> KSPSolve Elapsed Time: 73.107033
> KSPGetIterationNumber 10000
> KSPGetResidualNorm 17921.356837
> KSPConvergedReason -3
> KSP Object: 1 MPI processes
> type: lsqr
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> left preconditioning
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: none
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=27090, cols=14959
> total: nonzeros=70903, allocated nonzeros=70903
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Overall Elapsed Time: 73.455209
> KSPSolve Elapsed Time: 73.425921
> KSPGetIterationNumber 10000
> KSPGetResidualNorm 17921.356837
> KSPConvergedReason -3
> KSP Object: 1 MPI processes
> type: lsqr
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> left preconditioning
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: none
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=27090, cols=14959
> total: nonzeros=70903, allocated nonzeros=70903
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Overall Elapsed Time: 73.771948
> KSPSolve Elapsed Time: 76.629414
> KSPGetIterationNumber 10000
> KSPGetResidualNorm 17921.356837
> KSPConvergedReason -3
> KSPSolve Elapsed Time: 76.698510
> KSPGetIterationNumber 10000
> KSPGetResidualNorm 17921.356837
> KSPConvergedReason -3
> KSP Object: 1 MPI processes
> type: lsqr
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> left preconditioning
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: none
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=27090, cols=14959
> total: nonzeros=70903, allocated nonzeros=70903
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Overall Elapsed Time: 77.174527
> KSP Object: 1 MPI processes
> type: lsqr
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> left preconditioning
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: none
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=27090, cols=14959
> total: nonzeros=70903, allocated nonzeros=70903
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> Overall Elapsed Time: 77.247951
>
> Is there anything I need to do in the code to parallelize the ksp lsqr
> solver other than specify the mpiexec -n <NumOfPrcessors> option in
> the command line?
>
> Thanks,
> Shuangshuang
>
>
> On 8/3/12 4:35 PM, "Matthew Knepley" <knepley at gmail.com> wrote:
>
> On Fri, Aug 3, 2012 at 5:38 PM, Jin, Shuangshuang
> <Shuangshuang.Jin at pnnl.gov> wrote:
> Hi, all, I'm trying to use ksplsqr to solve an overdetermined system.
> I have read the following threads regarding the topic:
> http://lists.mcs.anl.gov/pipermail/petsc-users/2011-January/007763.html
> <http://lists.mcs.anl.gov/pipermail/petsc-users/2011-January/007763.html>
> http://lists.mcs.anl.gov/pipermail/petsc-users/2010-August/006784.html
> <http://lists.mcs.anl.gov/pipermail/petsc-users/2010-August/006784.html>
>
> I defined a 4*3 matrix A and a vector b in sparse CSR format:
> irow: 1 3 5 6 9
> icol: 1 2 1 3 2 1 2 3
> ival: 1 2 2 5 3 1 4 1
> rhs: 1 2 3 4
>
> A = 1 2 0
> 2 0 5
> 0 3 0
> 1 4 1
> b = 1
> 2
> 3
> 4
>
> I wrote the following code to solve this system:
>
> static char help[]="Reading in a matrix\n";
>
> #include<stdio.h>
> #include<math.h>
> #include<stdlib.h>
> #include "petscvec.h" /* This enables us to use vectors. */
> #include "petscmat.h" /* This enables us to use Matrices. It includes
> the petscvec header file*/
> #include "petscksp.h" /* Now we can solve linear systems. Solvers used
> are KSP. */
>
> int main(int argc, char **argv)
> {
> /* Declaration of Matrix A and some vectors x*/
> Vec b,x;
> Mat A;
> FILE *fp;
>
> MPI_Comm comm;
> PetscInt n=4,m=3,nz=8,index;
> PetscErrorCode ierr;
> PetscInt i,j;
> comm = MPI_COMM_SELF;
> PetscInt irow[n+1],icol[nz];
> PetscScalar ival[nz],rhs[n];
> PetscInt *cnt,*col;
>
> KSP ksp;
>
> /* This part is needed for the help flag supplied at run-time*/
> ierr = PetscInitialize(&argc,&argv,(char*)0,help);CHKERRQ(ierr);
> ierr =
> PetscOptionsGetInt(PETSC_NULL,"-n",&n,PETSC_NULL);CHKERRQ(ierr);
>
> /*===================================================================================*/
>
> fp=fopen("irow.txt","r");
> if (fp==NULL)
> {
> fprintf(stderr, "Cannot open file\n");
> exit(1);
> }
>
> for (i = 0; i < n+1; i++)
> {
> if (fscanf(fp,"%d", &irow[i]) != 1)
> {
> fprintf(stderr, "Failed to read irow vector[%d]\n", i);
> exit(1);
> }
> }
> fclose(fp);
> for (i = 0; i < n+1; i++) printf("irow[%d]=%d\n",i,irow[i]);
>
> fp=fopen("icol.txt","r");
> if (fp==NULL)
> {
> fprintf(stderr, "Cannot open file\n");
> exit(1);
> }
>
> for (i = 0; i < nz; i++)
> {
> if (fscanf(fp,"%d", &icol[i]) != 1)
> {
> fprintf(stderr, "Failed to read icol vector[%d]\n", i);
> exit(1);
> }
> }
> fclose(fp);
> for (i = 0; i < nz; i++) printf("icol[%d]=%d\n",i,icol[i]);
>
> fp=fopen("ival.txt","r");
> if (fp==NULL)
> {
> fprintf(stderr, "Cannot open file\n");
> exit(1);
> }
>
> for (i = 0; i < nz; i++)
> {
> if (fscanf(fp,"%lf", &ival[i]) != 1)
> {
> fprintf(stderr, "Failed to read ival vector[%d]\n", i);
> exit(1);
> }
> }
> fclose(fp);
> for (i = 0; i < nz; i++) printf("ival[%d]=%lf\n",i,ival[i]);
>
> /*===================================================================================*/
>
> // determine number of nonzeros per row in the new matrix
> PetscMalloc((n+1)*sizeof(PetscInt),&cnt);
> for (i=0;i<n;i++) {
> cnt[i]=irow[i+1]-irow[i];
> }
> MatCreateSeqAIJ(PETSC_COMM_SELF,n,m,0,cnt,&A);
> MatSetOption(A,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE);
> // copy over the matrix entries from the inputs
> for (i=0;i<n;i++) {
> PetscMalloc(cnt[i]*sizeof(PetscInt),&col);
> for (j=0;j<cnt[i];j++) {
> col[j]=icol[irow[i]+j-1]-1;
> }
> for (j=irow[i]-1;j<irow[i]-1+cnt[i];j++)
> MatSetValues(A,1,&i,cnt[i],col,&ival[irow[i]-1],INSERT_VALUES);
> }
> MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
> ierr=PetscPrintf(PETSC_COMM_WORLD,"Matrix A:\n");CHKERRQ(ierr);
> MatView(A,PETSC_VIEWER_STDOUT_SELF);
>
> /*===================================================================================*/
>
> /* Use options from the terminal to create a vector that is type
> shared or mpi. */
> ierr = VecCreate(PETSC_COMM_WORLD,&b);CHKERRQ(ierr); /* Vector
> creation */
> ierr = VecSetSizes(b,PETSC_DECIDE,n);CHKERRQ(ierr); /* Setting the
> vector size */
> ierr = VecSetFromOptions(b);CHKERRQ(ierr); /* Setting the vector type
> (shared, mpi etc) */
> /* The second argument is a PETSc scalar value.*/
> ierr = VecSet(b,0);CHKERRQ(ierr);
>
> /* Reading in the RHS vector. */
> /* Reading in the matrix from the file matrix.txt */
> //fp=fopen("b1.mat","r");
> fp=fopen("rhs.txt","r");
> if (fp==NULL)
> {
> fprintf(stderr, "Cannot open file\n");
> exit(1);
> }
>
> for (i = 0; i < n; i++)
> {
> if (fscanf(fp,"%lf", &rhs[i]) != 1)
> {
> fprintf(stderr, "Failed to read rhs vector[%d]\n", i);
> exit(1);
> }
> }
> fclose(fp);
>
> index=0;
> /*Putting x into final form */
> for (i=0; i<n; ++i)
> {
> /* Oneinsertion per step. Thats what the 1 in second argument stands
> for */
> ierr= VecSetValues(b,1,&index,&rhs[i],INSERT_VALUES);CHKERRQ(ierr);
> index=index+1;
> } /* The third and fourth arguments are addresses. The fifth argument
> is IORA */
>
> /* Assembling the vector. */
> ierr= VecAssemblyBegin(b);CHKERRQ(ierr);
> ierr=VecAssemblyEnd(b);CHKERRQ(ierr);
>
> /* Viewing the changed vector. */
> ierr=PetscPrintf(PETSC_COMM_WORLD,"Vector b:\n");CHKERRQ(ierr);
> ierr=VecView(b,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
>
> /*===================================================================================*/
>
> /* solve kspslqr system */
> VecCreate(PETSC_COMM_WORLD,&x);
> VecSetSizes(x,PETSC_DECIDE,m);
> VecSetFromOptions(x);
>
> KSPCreate(PETSC_COMM_WORLD, &ksp);
> KSPSetType(ksp, KSPLSQR);
> KSPSetTolerances(ksp,1.0e-6,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);
> KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER);
> KSPSetFromOptions(ksp);
> KSPSolve(ksp, b, x);
>
> PetscInt num_iters;
> KSPGetIterationNumber(ksp, &num_iters); CHKERRQ(ierr);
>
> PetscReal rnorm;
> ierr = KSPGetResidualNorm(ksp, &rnorm); CHKERRQ(ierr);
>
> KSPConvergedReason reason;
> ierr = KSPGetConvergedReason(ksp, &reason); CHKERRQ(ierr);
>
> printf("KSPGetIterationNumber %d\n",num_iters);
> printf("KSPGetResidualNorm %f\n",rnorm);
> printf("KSPConvergedReason %d\n",reason);
>
> //VecView(x,PETSC_VIEWER_STDOUT_WORLD);
> //PetscViewerASCIIOpen(PETSC_COMM_WORLD, "x.data", &viewer);
> VecView(x,PETSC_VIEWER_STDOUT_WORLD);
> KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);
>
> /*
> Destroy any objects created.
> */
> ierr=VecDestroy(&b);CHKERRQ(ierr);
> ierr=MatDestroy(&A);CHKERRQ(ierr);
> ierr=PetscFinalize();CHKERRQ(ierr);
>
> return 0;
> }
>
> This code read the inputs data and formed the A and b matrix correctly
> as I can view them in the output. However, I got "Invalid argument!"
> error as shown below:
>
> [d3m956 at olympus tutorials]$ ./ss
> irow[0]=1
> irow[1]=3
> irow[2]=5
> irow[3]=6
> irow[4]=9
> icol[0]=1
> icol[1]=2
> icol[2]=1
> icol[3]=3
> icol[4]=2
> icol[5]=1
> icol[6]=2
> icol[7]=3
> ival[0]=1.000000
> ival[1]=2.000000
> ival[2]=2.000000
> ival[3]=5.000000
> ival[4]=3.000000
> ival[5]=1.000000
> ival[6]=4.000000
> ival[7]=1.000000
> Matrix A:
> Matrix Object: 1 MPI processes
> type: seqaij
> row 0: (0, 1) (1, 2)
> row 1: (0, 2) (2, 5)
> row 2: (1, 3)
> row 3: (0, 1) (1, 4) (2, 1)
> Vector b:
> Vector Object: 1 MPI processes
> type: seq
> 1
> 2
> 3
> 4
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Must be square matrix, rows 4 columns 3!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 0, Tue Jun 5
> 14:20:42 CDT 2012
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ./ss on a arch-linu named olympus.local by d3m956 Fri
> Aug 3 15:34:07 2012
> [0]PETSC ERROR: Libraries linked from
> /pic/projects/mca/ss/PETSC/petsc-3.3-p0/arch-linux2-c-debug/lib
> [0]PETSC ERROR: Configure run at Thu Jun 14 17:00:19 2012
> [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran
> --download-f-blas-lapack --download-mpich
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: MatGetOrdering() line 228 in src/mat/order/sorder.c
> [0]PETSC ERROR: PCSetUp_ILU() line 206 in
> src/ksp/pc/impls/factor/ilu/ilu.c
>
> It is ILU, the default preconditioner, that requires a square matrix.
> Try -pc_type none
>
> Matt
>
> [0]PETSC ERROR: PCSetUp() line 832 in src/ksp/pc/interface/precon.c
> [0]PETSC ERROR: KSPSetUp() line 278 in src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: KSPSolve() line 402 in src/ksp/ksp/interface/itfunc.c
> KSPGetIterationNumber 0
> KSPGetResidualNorm 0.000000
> KSPConvergedReason 0
> Vector Object: 1 MPI processes
> type: seq
> 0
> 0
> 0
> KSP Object: 1 MPI processes
> type: lsqr
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> left preconditioning
> using UNPRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: ilu
> ILU: out-of-place factorization
> 0 levels of fill
> tolerance for zero pivot 2.22045e-14
> using diagonal shift to prevent zero pivot
> matrix ordering: natural
> linear system matrix = precond matrix:
> Matrix Object: 1 MPI processes
> type: seqaij
> rows=4, cols=3
> total: nonzeros=8, allocated nonzeros=8
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> I don't understand why the lsqr solver requires a square matrix "Must
> be square matrix, rows 4 columns 3!"
>
> Can anyone look into the problem for me?
>
> Thanks a lot!
>
> Shuangshuang
More information about the petsc-users
mailing list