[petsc-users] strange and slow behavior when using KSPSolve

Barry Smith bsmith at mcs.anl.gov
Thu Mar 31 11:04:45 CDT 2011


  One thing that pops up immediately (though it shouldn't normally be a big deal): call VecScatterCreateToAll() only once, up at the top, like the KSP.  You can run with -malloc and add PetscMallocDump() at the end of each outer loop to see if more and more memory is being used.
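
  For instance, something along these lines (just a sketch; the outer nonlinear loop and the counter nsteps are assumed, the other names come from the code below):

    VecScatter ctx;
    Vec        x_all;
    PetscInt   step;

    /* create the scatter context and x_all once, up at the top, like the KSP */
    ierr = VecScatterCreateToAll(x_,&ctx,&x_all);CHKERRQ(ierr);

    for (step = 0; step < nsteps; step++) {
      /* ... reassemble A_ and b, KSPSolve(), scatter x_ into x_all ... */

      /* with -malloc, report the memory that is still allocated after this pass */
      ierr = PetscMallocDump(PETSC_NULL);CHKERRQ(ierr);
    }

    /* destroy the scatter context and x_all only once, at the very end */
    ierr = VecScatterDestroy(ctx);CHKERRQ(ierr);
    ierr = VecDestroy(x_all);CHKERRQ(ierr);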

  After the KSPSolve(), call KSPGetIterationNumber() to see if the number of iterations required is getting much worse with later solves. Also call KSPGetConvergedReason() and KSPGetResidualNorm() to see if it is oversolving or undersolving the linear system relative to earlier solves.
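
  Something along these lines right after the solve (a sketch, using the names from the code below):

    PetscInt           its;
    PetscReal          rnorm;
    KSPConvergedReason reason;

    ierr = KSPSolve(ksp_,b,x_);CHKERRQ(ierr);
    ierr = KSPGetIterationNumber(ksp_,&its);CHKERRQ(ierr);
    ierr = KSPGetConvergedReason(ksp_,&reason);CHKERRQ(ierr);
    ierr = KSPGetResidualNorm(ksp_,&rnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,"iterations %D  reason %D  residual norm %G\n",its,(PetscInt)reason,rnorm);CHKERRQ(ierr);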

  

   Barry

On Mar 31, 2011, at 10:19 AM, Alejandro Marcos Aragón wrote:

> Hi all,
> 
> I am currently using PETSc to solve a system of linear equations in parallel, so I created a wrapper class that interfaces my C++ code with the PETSc library.
> 
> Since the problem I'm trying to solve is nonlinear, I assemble the matrix once and reuse its structure (the sparsity pattern stays the same). The problem is that after the system has been solved very quickly a few times, a solve suddenly takes a considerable amount of time, and I really don't understand why this is happening.
> 
> This is an extract of the timing output showing the issue I have when solving the system with 2 processes:
> 
> 
> [0] Inside PETSc solver:7.15256e-06 s
> [0]   Got ownership range... 3.21865e-05 s
> [1] Inside PETSc solver:1.5974e-05 s
> [1]   Got ownership range... 4.00543e-05 s
> [0]   Zeroed-out sparse matrix entries... 0.000444174 s
> [1]   Zeroed-out sparse matrix entries... 0.000442982 s
> [1]   Filled sparse matrix... 0.0772262 s
> [0]   Filled sparse matrix... 0.081254 s
> [1]   Sparse matrix assembled... 0.0897892 s
> [0]   Sparse matrix assembled... 0.0954142 s
> [0]   Vectors created... 0.0955811 s
> [1]   Vectors created... 0.095592 s
> [0]   Right hand side set... 0.0965121 s
> [1]   Right hand side set... 0.0965312 s
> [0]   Right-hand-side vector assembled... 0.096647 s
> [1]   Right-hand-side vector assembled... 0.096657 s
> [0]   System solved... 32.6188 s
> [1]   System solved... 32.6188 s
> [0]   Solution vector scattered... 32.6192 s
> [1]   Solution vector scattered... 32.6192 s
> [0]   Solution vector copied... 32.6215 s
> [1]   Solution vector copied... 32.6215 s
> [0]   Temporary data structures destroyed... 32.6216 s
> [1]   Temporary data structures destroyed... 32.6216 s
> 
> 
> 
> So you can see that it takes a considerable amount of time to solve the system at this point, whereas this is what I usually get:
> 
> [0]   Right-hand-side vector assembled... 0.089427 s
> [1]   Right-hand-side vector assembled... 0.089438 s
> [0]   System solved... 0.383509 s
> [1]   System solved... 0.38352 s
> 
> 
> Does anyone know why this is happening? Thanks in advance for any insight on this.
> 
> Alejandro M. Aragón
> 
> P.S.: The code I use is given below:
> 
> 	struct PETSc_solver {
> 		
>         typedef double value_type;
> 		typedef sparse_vector<value_type> sparse_vector_type;
> 		typedef sparse_matrix<value_type> sparse_matrix_type;
>         
>         int argc_;
>         char **args_;
>         Mat A_;              //!< linear system matrix
>         Vec x_;              //!< Solution vector
>         KSP ksp_;            //!< linear solver context
>         bool allocated_;
>         
> 		PETSc_solver() : argc_(), args_(NULL), allocated_(false) {
>         
>             PetscInitialize(&argc_,&args_,NULL,NULL);
>             PetscErrorCode ierr = MatCreate(PETSC_COMM_WORLD,&A_);CHKERR(ierr);
>             ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_);CHKERR(ierr);
>         }
>         PETSc_solver(int argc, char** args) : argc_(argc), args_(args), allocated_(false) {
>         
>             PetscInitialize(&argc_,&args_,NULL,NULL);
>             PetscErrorCode ierr = MatCreate(PETSC_COMM_WORLD,&A_);CHKERR(ierr);
>             ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_);CHKERR(ierr);
>         }
>         
>         //! Overload operator() to solve system of linear equations
> 		sparse_vector_type operator()(const sparse_matrix_type& AA, const sparse_vector_type& bb);
> 
>         ~PETSc_solver() {
>             
>             PetscErrorCode ierr = MatDestroy(A_);CHKERR(ierr);
>             ierr = KSPDestroy(ksp_);CHKERR(ierr);
>             ierr = VecDestroy(x_);CHKERR(ierr);
>         }
>     };
> 
> 
> 
> PETSc_solver::sparse_vector_type PETSc_solver::operator()(const PETSc_solver::sparse_matrix_type& AA, const PETSc_solver::sparse_vector_type& bb) {
>     
>     assert(AA.rows() == bb.size());
>     
>     Vec            x,b;      /* approx solution, RHS */
>     PetscInt       Istart,Iend;
>     PetscErrorCode ierr;
>     
>     
> #ifdef VERBOSE
>     cpputils::ctimer timer;
>     MPIPRINT("Inside PETSc solver:" );
>     cout<<timer<<endl;
> #endif
> 
>     ierr = MatSetFromOptions(A_);CHKERR(ierr);
>     
>     if (!allocated_) {
>         
> #ifdef VERBOSE
>         MPIPRINT("  Allocating memory... ");
>         cout<<timer<<endl;
> #endif
>         
>         
>         ierr = MatSetSizes(A_,PETSC_DECIDE,PETSC_DECIDE, AA.rows(), AA.columns());CHKERR(ierr);
>         
>         // get number of non-zeros in both diagonal and non-diagonal portions of the matrix
>         std::pair<size_t,size_t> nz = AA.non_zero_off_diagonal();
>         ierr = MatMPIAIJSetPreallocation(A_, nz.first, PETSC_NULL, nz.second, PETSC_NULL); CHKERR(ierr);
>         ierr = MatSeqAIJSetPreallocation(A_, nz.first, PETSC_NULL); CHKERR(ierr);
>         
> #ifdef VERBOSE
>         cout<<"Ok! "<<timer<<endl;
> #endif
>     }
>     
>     /* 
>      Currently, all PETSc parallel matrix formats are partitioned by
>      contiguous chunks of rows across the processors.  Determine which
>      rows of the matrix are locally owned. 
>      */
>     ierr = MatGetOwnershipRange(A_,&Istart,&Iend);CHKERR(ierr);
>     
> #ifdef VERBOSE
>     MPIPRINT("  Got ownership range... ");
>     cout<<timer<<endl;
> #endif
>         
>     // zero-out matrix
>     MatZeroEntries(A_);
>     
> #ifdef VERBOSE
>     MPIPRINT("  Zeroed-out sparse matrix entries... ");
>     cout<<timer<<endl;
> #endif
>     
>     for (sparse_matrix_type::const_hash_iterator it = AA.map_.begin(); it != AA.map_.end(); ++it) {
>         
>         // get subscripts
>         std::pair<size_t,size_t> subs = AA.unhash(it->first);
>         int row = subs.first;
>         int col = subs.second;
>         ierr = MatSetValues(A_, 1, &row, 1, &col, &it->second, ADD_VALUES);CHKERR(ierr);
>     }
>     
> #ifdef VERBOSE
>     MPIPRINT("  Filled sparse matrix... ");
>     cout<<timer<<endl;
> #endif
>     
>     /* 
>      Assemble matrix, using the 2-step process:
>      MatAssemblyBegin(), MatAssemblyEnd()
>      Computations can be done while messages are in transition
>      by placing code between these two statements.
>      */
>     ierr = MatAssemblyBegin(A_,MAT_FINAL_ASSEMBLY);CHKERR(ierr);
>     ierr = MatAssemblyEnd(A_,MAT_FINAL_ASSEMBLY);CHKERR(ierr);
>     
> #ifdef VERBOSE
>     MPIPRINT("  Sparse matrix assembled... ");
>     cout<<timer<<endl;
> #endif
>     
>     ierr = MatSetOption(A_, MAT_NEW_NONZERO_LOCATIONS, PETSC_FALSE);CHKERR(ierr);
>     
>     
>     /* Create parallel vectors. */
>     ierr = VecCreate(PETSC_COMM_WORLD,&b);CHKERR(ierr);
>     ierr = VecSetSizes(b,PETSC_DECIDE, bb.size());CHKERR(ierr);
>     ierr = VecSetFromOptions(b);CHKERR(ierr);
>     
>     if (!allocated_) {
>         
>         ierr = VecDuplicate(b,&x_);CHKERR(ierr); 
>         
>         // set flag to true
>         allocated_ = true;
>     }
>     
>     VecZeroEntries(x_);    
>     
> #ifdef VERBOSE
>     MPIPRINT("  Vectors created... ");
>     cout<<timer<<endl;
> #endif
>     
>     /* Set right-hand-side vector */
>     for (sparse_vector_type::const_hash_iterator it = bb.map_.begin(); it != bb.map_.end(); ++it) {
>         int row = it->first;
>         ierr = VecSetValues(b, 1, &row, &it->second, ADD_VALUES);CHKERR(ierr);
>     }
>     
> #ifdef VERBOSE
>     MPIPRINT("  Right hand side set... ");
>     cout<<timer<<endl;
> #endif
>     
>     /* 
>      Assemble vector, using the 2-step process:
>      VecAssemblyBegin(), VecAssemblyEnd()
>      Computations can be done while messages are in transition
>      by placing code between these two statements.
>      */
>     ierr = VecAssemblyBegin(b);CHKERR(ierr);
>     ierr = VecAssemblyEnd(b);CHKERR(ierr);
>     
> #ifdef VERBOSE
>     MPIPRINT("  Right-hand-side vector assembled... ");
>     cout<<timer<<endl;
> #endif
>     
>     /* 
>      Set operators. Here the matrix that defines the linear system
>      also serves as the preconditioning matrix.
>      */
>     ierr = KSPSetOperators(ksp_,A_,A_,SAME_NONZERO_PATTERN);CHKERR(ierr);
>     
>     
>     /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
>      Solve the linear system
>      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */    
>     ierr = KSPSolve(ksp_,b,x_);CHKERR(ierr);
>     
> #ifdef VERBOSE
>     MPIPRINT("  System solved... ");
>     cout<<timer<<endl;
> #endif
>     
>     
>     /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
>      Check solution and clean up
>      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
>      
>     VecScatter ctx;
>     Vec x_all;
>     
>     ierr = VecScatterCreateToAll(x_, &ctx, &x_all);CHKERR(ierr);
>     ierr = VecScatterBegin(ctx,x_,x_all,INSERT_VALUES,SCATTER_FORWARD);CHKERR(ierr);
>     ierr = VecScatterEnd(ctx,x_,x_all,INSERT_VALUES,SCATTER_FORWARD);CHKERR(ierr);
>     
> #ifdef VERBOSE
>     MPIPRINT("  Solution vector scattered... ");
>     cout<<timer<<endl;
> #endif
>     
>     /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
>      Get values from solution and store them in the object that will be
>      returned
>      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
>     
>     sparse_vector_type xx(bb.size());
>     
>     /* Set solution vector */
>     for (sparse_vector_type::const_hash_iterator it = bb.map_.begin(); it != bb.map_.end(); ++it) {
>         int row = it->first;
>         ierr = VecGetValues(x_all, 1, &row, &xx[row]);CHKERR(ierr);
>     }
>     
> #ifdef VERBOSE
>     MPIPRINT("  Solution vector copied... ");
>     cout<<timer<<endl;
> #endif
>    
>     ierr = VecDestroy(b);CHKERR(ierr);
>     ierr = VecDestroy(x_all);CHKERR(ierr);
>     
> #ifdef VERBOSE
>     MPIPRINT("  Temporary data structures destroyed... ");
>     cout<<timer<<endl;
> #endif
>     
>     return xx;
> }
> 


