From jerome.snho at gmail.com Wed Jan 14 23:04:44 2009 From: jerome.snho at gmail.com (jerome ho) Date: Thu, 15 Jan 2009 13:04:44 +0800 Subject: Increasing convergence rate Message-ID: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> Hi I'm trying to simulate with different solvers in order to have range of options. The matrix is sparse and square, and eventually will be used for parallel simulation. These are the results: boomeramg+minres: 388MB in 1min (8 iterations) icc+cg: 165MB in 30min (>5000 iterations) bjacobi+cg: 201MB in 50min (>5000 iterations) There are several others but it takes >5000 iterations to achieve the boomeramg+minres result. Boomeramg result is good, but takes too high memory. I wonder if there're any options, or any better precondition+solver combination that I should use in order to improve the runtime of the non-boomeramng simulation. Jerome -- From jed at 59A2.org Thu Jan 15 03:29:09 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 15 Jan 2009 00:29:09 -0900 Subject: Increasing convergence rate In-Reply-To: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> Message-ID: <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> On Wed, Jan 14, 2009 at 20:04, jerome ho wrote: > boomeramg+minres: 388MB in 1min (8 iterations) > icc+cg: 165MB in 30min (>5000 iterations) > bjacobi+cg: 201MB in 50min (>5000 iterations) Note that in serial, bjacobi is just whatever -sub_pc_type is (ilu by default). In parallel, it's always worth trying -pc_type asm as an alternative to bjacobi. You can frequently make the incomplete factorization stronger by using multiple levels (-pc_factor_levels N), but it will use more memory. It looks like multigrid works well for your problem so it will likely be very hard for a traditional method to compete. To reduce memory usage in BoomerAMG, try these options -pc_hypre_boomeramg_truncfactor <0>: Truncation factor for interpolation (0=no truncation) (None) -pc_hypre_boomeramg_P_max <0>: Max elements per row for interpolation operator ( 0=unlimited ) (None) -pc_hypre_boomeramg_agg_nl <0>: Number of levels of aggressive coarsening (None) -pc_hypre_boomeramg_agg_num_paths <1>: Number of paths for aggressive coarsening (None) -pc_hypre_boomeramg_strong_threshold <0.25>: Threshold for being strongly connected (None) For 3D problems, the manual suggests setting strong_threshold to 0.5. It's also worth trying ML, especially for vector problems. Jed From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Thu Jan 15 06:59:41 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Thu, 15 Jan 2009 13:59:41 +0100 Subject: MatCreateMPIAIJWithSplitArrays In-Reply-To: References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> Message-ID: <20090115135941.vomtsudj7yo0ww8c@webmail.ec-nantes.fr> Thank you I used PETSc 2.3.3 regards, Jarunan -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE Barry Smith a ??crit? : > > You should be able to use MatCreateMPIAIJWithSplitArrays(), > MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() > from Fortran. Are you using PETSc 3.0.0? > > The arguments for MatCreateMPIAIJWithArrays() or > MatMPIAIJSetPreallocationCSR() have the same meaning > (in fact MatCreateMPIAIJWithArrays() essentially calls > MatCreateMPIAIJWithArrays()). 
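For reference, a minimal C sketch of the two equivalent calling sequences Barry describes
(the Fortran sequence quoted in this thread is analogous); m, n and the CSR arrays i, j, v
are placeholders for the local row/column counts and the local CSR data:

    /* assumes #include <petscmat.h>; error checking omitted */
    Mat A, B;

    /* one-call form */
    MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, m, n,
                              PETSC_DETERMINE, PETSC_DETERMINE, i, j, v, &A);

    /* create / set-type / preallocate form, using the same arrays */
    MatCreate(PETSC_COMM_WORLD, &B);
    MatSetSizes(B, m, n, PETSC_DETERMINE, PETSC_DETERMINE);
    MatSetType(B, MATMPIAIJ);
    MatMPIAIJSetPreallocationCSR(B, i, j, v);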
> > Barry > > On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: > >> Oh, I could not use MatCreateMPIAIJWithArrays either but the >> mechanism below works. >> >> call MatCreate(PETSC_COMM_WORLD,D,ierr) >> call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, >> $ ierr) >> call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix >> call MatSetFromOptions(D,ierr) >> call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) >> >> Where pointer is start-row indices a >> Column is local column indices >> v is value >> >> Is there the different beteween the start-row indices in >> MatMPIAIJSetPreallocationCSR and row indices in >> MatCreateMPIAIJWithArrays ? >> >> >> >> Regards, >> Jarunan >> >> >> >> >> Hello, >> >> To define a matrix with arrays, I cannot use >> MatCreateMPIAIJWithSplitArrays in my program which is written in >> Fortran: >> >> call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, >> $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, >> $ oColumn,ov,D,ierr) >> >> The error is >> F:246: undefined reference to `matcreatempiaijwithsplitarrays_' >> >> I could use MatCreateMPIAIJWithArrays but the off diagonal values >> are missing with this command. >> >> I would be appreciate for any advice. Thank you before hand. >> >> Regards, >> Jarunan >> >> >> >> >> -- >> Jarunan PANYASANTISUK >> MSc. in Computational Mechanics >> Erasmus Mundus Master Program >> Ecole Centrale de Nantes >> 1, rue de la no?, 44321 NANTES, FRANCE >> >> >> >> >> > > From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Thu Jan 15 09:42:40 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Thu, 15 Jan 2009 16:42:40 +0100 Subject: MatCreateMPIAIJWithSplitArrays In-Reply-To: References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> Message-ID: <20090115164240.uo0edtyqa5ss444s@webmail.ec-nantes.fr> When I create a matrix with MatCreateMPIAIJWithSplitArrays, as it doesn't copy the values so I have to use MatSetValues to set the internal value? -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE Barry Smith a ??crit? : > > You should be able to use MatCreateMPIAIJWithSplitArrays(), > MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() > from Fortran. Are you using PETSc 3.0.0? > > The arguments for MatCreateMPIAIJWithArrays() or > MatMPIAIJSetPreallocationCSR() have the same meaning > (in fact MatCreateMPIAIJWithArrays() essentially calls > MatCreateMPIAIJWithArrays()). > > Barry > > On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: > >> Oh, I could not use MatCreateMPIAIJWithArrays either but the >> mechanism below works. >> >> call MatCreate(PETSC_COMM_WORLD,D,ierr) >> call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, >> $ ierr) >> call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix >> call MatSetFromOptions(D,ierr) >> call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) >> >> Where pointer is start-row indices a >> Column is local column indices >> v is value >> >> Is there the different beteween the start-row indices in >> MatMPIAIJSetPreallocationCSR and row indices in >> MatCreateMPIAIJWithArrays ? 
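In both routines that first index argument is a CSR row-pointer array: entry k is the offset
into the column/value arrays where local row k starts, and the final entry is the total number
of local nonzeros. A small made-up illustration in C (0-based indices, as the C interface uses),
for three local rows holding 2, 1 and 3 nonzeros:

    PetscInt    i[] = {0, 2, 3, 6};         /* row pointers, length (local rows)+1      */
    PetscInt    j[] = {0, 3, 1, 0, 2, 4};   /* column indices, length = local nonzeros  */
    PetscScalar v[] = {1, 2, 3, 4, 5, 6};   /* the matching values                      */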
>> >> >> >> Regards, >> Jarunan >> >> >> >> >> Hello, >> >> To define a matrix with arrays, I cannot use >> MatCreateMPIAIJWithSplitArrays in my program which is written in >> Fortran: >> >> call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, >> $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, >> $ oColumn,ov,D,ierr) >> >> The error is >> F:246: undefined reference to `matcreatempiaijwithsplitarrays_' >> >> I could use MatCreateMPIAIJWithArrays but the off diagonal values >> are missing with this command. >> >> I would be appreciate for any advice. Thank you before hand. >> >> Regards, >> Jarunan >> >> >> >> >> -- >> Jarunan PANYASANTISUK >> MSc. in Computational Mechanics >> Erasmus Mundus Master Program >> Ecole Centrale de Nantes >> 1, rue de la no?, 44321 NANTES, FRANCE >> >> >> >> >> > > From knepley at gmail.com Thu Jan 15 10:06:52 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Jan 2009 10:06:52 -0600 Subject: MatCreateMPIAIJWithSplitArrays In-Reply-To: <20090115164240.uo0edtyqa5ss444s@webmail.ec-nantes.fr> References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> <20090115164240.uo0edtyqa5ss444s@webmail.ec-nantes.fr> Message-ID: On Thu, Jan 15, 2009 at 9:42 AM, Panyasantisuk Jarunan < Jarunan.Panyasantisuk at eleves.ec-nantes.fr> wrote: > When I create a matrix with MatCreateMPIAIJWithSplitArrays, as it doesn't > copy the values so I have to use MatSetValues to set the internal value? 1) You should upgrade to 3.0.0 2) You should not have to call MatSetValues(). It will use the arrays you provide. Matt > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > Barry Smith a ?(c)crit? : > > >> You should be able to use MatCreateMPIAIJWithSplitArrays(), >> MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() >> from Fortran. Are you using PETSc 3.0.0? >> >> The arguments for MatCreateMPIAIJWithArrays() or >> MatMPIAIJSetPreallocationCSR() have the same meaning >> (in fact MatCreateMPIAIJWithArrays() essentially calls >> MatCreateMPIAIJWithArrays()). >> >> Barry >> >> On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: >> >> Oh, I could not use MatCreateMPIAIJWithArrays either but the mechanism >>> below works. >>> >>> call MatCreate(PETSC_COMM_WORLD,D,ierr) >>> call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, >>> $ ierr) >>> call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix >>> call MatSetFromOptions(D,ierr) >>> call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) >>> >>> Where pointer is start-row indices a >>> Column is local column indices >>> v is value >>> >>> Is there the different beteween the start-row indices in >>> MatMPIAIJSetPreallocationCSR and row indices in MatCreateMPIAIJWithArrays >>> ? >>> >>> >>> >>> Regards, >>> Jarunan >>> >>> >>> >>> >>> Hello, >>> >>> To define a matrix with arrays, I cannot use >>> MatCreateMPIAIJWithSplitArrays in my program which is written in Fortran: >>> >>> call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, >>> $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, >>> $ oColumn,ov,D,ierr) >>> >>> The error is >>> F:246: undefined reference to `matcreatempiaijwithsplitarrays_' >>> >>> I could use MatCreateMPIAIJWithArrays but the off diagonal values are >>> missing with this command. >>> >>> I would be appreciate for any advice. Thank you before hand. 
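As a minimal C sketch of what Matt's reply above says (m, n and the six arrays are placeholders;
the d* triple is the CSR data of the diagonal block, the o* triple that of the off-diagonal block,
in the same argument order as the Fortran call quoted here):

    /* assumes #include <petscmat.h> */
    Mat A;
    MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD, m, n,
                                   PETSC_DETERMINE, PETSC_DETERMINE,
                                   di, dj, dv, oi, oj, ov, &A);
    /* no MatSetValues() pass is needed afterwards: the matrix uses these
       arrays directly, so they must stay allocated as long as A is in use */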
>>> >>> Regards, >>> Jarunan >>> >>> >>> >>> >>> -- >>> Jarunan PANYASANTISUK >>> MSc. in Computational Mechanics >>> Erasmus Mundus Master Program >>> Ecole Centrale de Nantes >>> 1, rue de la no?, 44321 NANTES, FRANCE >>> >>> >>> >>> >>> >>> >> >> > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Hung.V.Nguyen at usace.army.mil Thu Jan 15 10:24:58 2009 From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS) Date: Thu, 15 Jan 2009 10:24:58 -0600 Subject: Stopping criteria In-Reply-To: References: Message-ID: Hello Matt, >however I would first check your matrix using -pc_type lu -ksp_type preonly to make sure its not singular. I got the error message below while running with option above. Do I have to build a matrix with type of seqaij/seqbaij to run with the -pc_type lu option? Thanks, -Hung np .. 138524 np .. 143882 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: --------------------- Error Message ------------------------------------ [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message ------------------------------------ [13]PETSC ERROR: --------------------- Error Message ------------------------------------ --------------------- Error Message ------------------------------------ [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for this object type! [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for this object type! No support for this operation for this object type! [12]PETSC ERROR: Matrix type mpiaij symbolic LU! Matrix type mpiaij symbolic LU! [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij symbolic LU! [13]PETSC ERROR: ------------------------------------------------------------------------ -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, January 14, 2009 2:05 PM To: petsc-users at mcs.anl.gov Subject: Re: Stopping criteria On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS wrote: Hello All, I tried to solve an ill-conditioned system using cg with Jacobi preconditioned. The KSP solver was stopping due to diverged reason within a few iterations. Is there a way to keep KSP solver running until max_it? There is no way to continue CG here because it gets a zero divisor, and interprets this as an indefinite matrix. You can try GMRES, however I would first check your matrix using -pc_type lu -ksp_type preonly to make sure its not singular. 
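A quick way to run that check with the test driver used in this thread is a one-process run,
for example (option names as in the quoted advice; the error log above shows the built-in LU
does not handle an MPIAIJ matrix, which the replies below address):

    aprun -n 1 ./test_matrix_read -ksp_type preonly -pc_type lu -ksp_converged_reason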
Matt Thanks, -hung hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type jacobi -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor -ksp_converged_reason 0 KSP Residual norm 1.379074550666e+04 1 KSP Residual norm 7.252034661743e+03 2 KSP Residual norm 7.302184771313e+03 3 KSP Residual norm 1.162244351275e+04 4 KSP Residual norm 7.912531765659e+03 5 KSP Residual norm 4.094706251487e+03 6 KSP Residual norm 5.486131070301e+03 7 KSP Residual norm 6.367904529202e+03 8 KSP Residual norm 6.312767173219e+03 Linear solve did not converge due to DIVERGED_INDEFINITE_MAT iterations 9 Time in PETSc solver: 0.452695 seconds The number of iteration = 9 The solution residual error = 6.312767e+03 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Thu Jan 15 10:40:07 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Jan 2009 10:40:07 -0600 Subject: Stopping criteria In-Reply-To: References: Message-ID: On Thu, Jan 15, 2009 at 10:24 AM, Nguyen, Hung V ERDC-ITL-MS < Hung.V.Nguyen at usace.army.mil> wrote: > Hello Matt, > > >however I would first check your matrix using -pc_type lu -ksp_type > preonly > to make sure its not singular. > > I got the error message below while running with option above. Do I have to > build a matrix with type of seqaij/seqbaij to run with the -pc_type lu > option? 1) Either run on a single process, or 2) Install a parallel LU, such as SuperLU --download-superlu Matt > > Thanks, > > -Hung > > > > np .. 138524 > np .. 143882 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: > --------------------- Error Message ------------------------------------ > [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [13]PETSC ERROR: --------------------- Error Message > ------------------------------------ > --------------------- Error Message ------------------------------------ > [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for this > object type! > [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for this > object type! > No support for this operation for this object type! > [12]PETSC ERROR: Matrix type mpiaij symbolic LU! > Matrix type mpiaij symbolic LU! > [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij > symbolic LU! > [13]PETSC ERROR: > ------------------------------------------------------------------------ > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] > On > Behalf Of Matthew Knepley > Sent: Wednesday, January 14, 2009 2:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Stopping criteria > > On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS > wrote: > > > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason > within a > few iterations. Is there a way to keep KSP solver running until > max_it? > > > There is no way to continue CG here because it gets a zero divisor, and > interprets this as an indefinite matrix. You can try GMRES, however I would > first check your matrix using -pc_type lu -ksp_type preonly to make sure > its > not singular. 
> > Matt > > > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg > -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor > -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > > > > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jan 15 10:40:07 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Jan 2009 10:40:07 -0600 Subject: Stopping criteria In-Reply-To: References: Message-ID: On Thu, Jan 15, 2009 at 10:24 AM, Nguyen, Hung V ERDC-ITL-MS < Hung.V.Nguyen at usace.army.mil> wrote: > Hello Matt, > > >however I would first check your matrix using -pc_type lu -ksp_type > preonly > to make sure its not singular. > > I got the error message below while running with option above. Do I have to > build a matrix with type of seqaij/seqbaij to run with the -pc_type lu > option? 1) Either run on a single process, or 2) Install a parallel LU, such as SuperLU --download-superlu Matt > > Thanks, > > -Hung > > > > np .. 138524 > np .. 143882 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: > --------------------- Error Message ------------------------------------ > [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [13]PETSC ERROR: --------------------- Error Message > ------------------------------------ > --------------------- Error Message ------------------------------------ > [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for this > object type! > [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for this > object type! > No support for this operation for this object type! > [12]PETSC ERROR: Matrix type mpiaij symbolic LU! > Matrix type mpiaij symbolic LU! > [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij > symbolic LU! > [13]PETSC ERROR: > ------------------------------------------------------------------------ > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] > On > Behalf Of Matthew Knepley > Sent: Wednesday, January 14, 2009 2:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Stopping criteria > > On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS > wrote: > > > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason > within a > few iterations. 
Is there a way to keep KSP solver running until > max_it? > > > There is no way to continue CG here because it gets a zero divisor, and > interprets this as an indefinite matrix. You can try GMRES, however I would > first check your matrix using -pc_type lu -ksp_type preonly to make sure > its > not singular. > > Matt > > > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg > -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor > -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > > > > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Hung.V.Nguyen at usace.army.mil Thu Jan 15 10:47:49 2009 From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS) Date: Thu, 15 Jan 2009 10:47:49 -0600 Subject: Stopping criteria In-Reply-To: <383ade90901141314q482b8ac1v9726c8fa16736827@mail.gmail.com> References: <383ade90901141314q482b8ac1v9726c8fa16736827@mail.gmail.com> Message-ID: Hello, > CG does not work for indefinite matrices, the natural thing to try is -ksp_type minres. You can also try a nonsymmetric KSP which gives you more choices for preconditioning, although good preconditioning for an indefinite matrix generally uses problem-specific information. Where does this matrix come from? How scalable does the solver need to be? This matrix is from CFD application and supposed to be SPD. The CFD used cg solver with jacobi preconditioner. It took a large number of iterations for the true residual norm drops by about 6 orders of magnitude. Thank you for the info. -hung -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Jed Brown Sent: Wednesday, January 14, 2009 3:15 PM To: petsc-users at mcs.anl.gov Subject: Re: Stopping criteria On Wed, Jan 14, 2009 at 10:54, Nguyen, Hung V ERDC-ITL-MS wrote: > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 CG does not work for indefinite matrices, the natural thing to try is -ksp_type minres. You can also try a nonsymmetric KSP which gives you more choices for preconditioning, although good preconditioning for an indefinite matrix generally uses problem-specific information. Where does this matrix come from? How scalable does the solver need to be? Jed From hzhang at mcs.anl.gov Thu Jan 15 11:08:38 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 15 Jan 2009 11:08:38 -0600 (CST) Subject: Stopping criteria In-Reply-To: References: Message-ID: >> >>> however I would first check your matrix using -pc_type lu -ksp_type >> preonly >> to make sure its not singular. 
>> >> I got the error message below while running with option above. Do I have to >> build a matrix with type of seqaij/seqbaij to run with the -pc_type lu >> option? > > > 1) Either run on a single process, or > > 2) Install a parallel LU, such as SuperLU --download-superlu SuperLU is a sequential package. Use SperLU_DIST or MUMPS. Configure petsc with '--download-superlu_dist' or '--download-mumps --download-scalapack --download-blacs' Hong > > >> >> Thanks, >> >> -Hung >> >> >> >> np .. 138524 >> np .. 143882 >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: >> --------------------- Error Message ------------------------------------ >> [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [13]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> --------------------- Error Message ------------------------------------ >> [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for this >> object type! >> [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for this >> object type! >> No support for this operation for this object type! >> [12]PETSC ERROR: Matrix type mpiaij symbolic LU! >> Matrix type mpiaij symbolic LU! >> [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij >> symbolic LU! >> [13]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] >> On >> Behalf Of Matthew Knepley >> Sent: Wednesday, January 14, 2009 2:05 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: Stopping criteria >> >> On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS >> wrote: >> >> >> >> Hello All, >> >> I tried to solve an ill-conditioned system using cg with Jacobi >> preconditioned. The KSP solver was stopping due to diverged reason >> within a >> few iterations. Is there a way to keep KSP solver running until >> max_it? >> >> >> There is no way to continue CG here because it gets a zero divisor, and >> interprets this as an indefinite matrix. You can try GMRES, however I would >> first check your matrix using -pc_type lu -ksp_type preonly to make sure >> its >> not singular. >> >> Matt >> >> >> >> Thanks, >> >> -hung >> >> hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg >> -pc_type >> jacobi >> -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor >> -ksp_converged_reason >> 0 KSP Residual norm 1.379074550666e+04 >> 1 KSP Residual norm 7.252034661743e+03 >> 2 KSP Residual norm 7.302184771313e+03 >> 3 KSP Residual norm 1.162244351275e+04 >> 4 KSP Residual norm 7.912531765659e+03 >> 5 KSP Residual norm 4.094706251487e+03 >> 6 KSP Residual norm 5.486131070301e+03 >> 7 KSP Residual norm 6.367904529202e+03 >> 8 KSP Residual norm 6.312767173219e+03 >> Linear solve did not converge due to DIVERGED_INDEFINITE_MAT >> iterations 9 >> Time in PETSc solver: 0.452695 seconds >> The number of iteration = 9 >> The solution residual error = 6.312767e+03 >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. 
>> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From hzhang at mcs.anl.gov Thu Jan 15 11:08:38 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 15 Jan 2009 11:08:38 -0600 (CST) Subject: Stopping criteria In-Reply-To: References: Message-ID: >> >>> however I would first check your matrix using -pc_type lu -ksp_type >> preonly >> to make sure its not singular. >> >> I got the error message below while running with option above. Do I have to >> build a matrix with type of seqaij/seqbaij to run with the -pc_type lu >> option? > > > 1) Either run on a single process, or > > 2) Install a parallel LU, such as SuperLU --download-superlu SuperLU is a sequential package. Use SperLU_DIST or MUMPS. Configure petsc with '--download-superlu_dist' or '--download-mumps --download-scalapack --download-blacs' Hong > > >> >> Thanks, >> >> -Hung >> >> >> >> np .. 138524 >> np .. 143882 >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: >> --------------------- Error Message ------------------------------------ >> [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [13]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> --------------------- Error Message ------------------------------------ >> [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for this >> object type! >> [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for this >> object type! >> No support for this operation for this object type! >> [12]PETSC ERROR: Matrix type mpiaij symbolic LU! >> Matrix type mpiaij symbolic LU! >> [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij >> symbolic LU! >> [13]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] >> On >> Behalf Of Matthew Knepley >> Sent: Wednesday, January 14, 2009 2:05 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: Stopping criteria >> >> On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS >> wrote: >> >> >> >> Hello All, >> >> I tried to solve an ill-conditioned system using cg with Jacobi >> preconditioned. The KSP solver was stopping due to diverged reason >> within a >> few iterations. Is there a way to keep KSP solver running until >> max_it? >> >> >> There is no way to continue CG here because it gets a zero divisor, and >> interprets this as an indefinite matrix. You can try GMRES, however I would >> first check your matrix using -pc_type lu -ksp_type preonly to make sure >> its >> not singular. 
>> >> Matt >> >> >> >> Thanks, >> >> -hung >> >> hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg >> -pc_type >> jacobi >> -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor >> -ksp_converged_reason >> 0 KSP Residual norm 1.379074550666e+04 >> 1 KSP Residual norm 7.252034661743e+03 >> 2 KSP Residual norm 7.302184771313e+03 >> 3 KSP Residual norm 1.162244351275e+04 >> 4 KSP Residual norm 7.912531765659e+03 >> 5 KSP Residual norm 4.094706251487e+03 >> 6 KSP Residual norm 5.486131070301e+03 >> 7 KSP Residual norm 6.367904529202e+03 >> 8 KSP Residual norm 6.312767173219e+03 >> Linear solve did not converge due to DIVERGED_INDEFINITE_MAT >> iterations 9 >> Time in PETSc solver: 0.452695 seconds >> The number of iteration = 9 >> The solution residual error = 6.312767e+03 >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu Jan 15 12:05:27 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Jan 2009 12:05:27 -0600 Subject: Stopping criteria In-Reply-To: References: Message-ID: <388EAC75-484A-459C-89A3-1D71A4E66790@mcs.anl.gov> When something so basic goes wrong. CG kicks out because the matrix is not symmetric positive definite, you need to step back and understand more about the matrix before worrying about running in parallel fast. So, yes, you should run sequentially to "play around" with the matrix. You can use a MatType of MATAIJ in MatSetType() to have it automatically be SEQAIJ on one process and MPIAIJ on more than one. or MATBAIJ for BAIJ format. You can then try direct solver, for example, trivially to begin to understand the matrix. You can also use MatIsSymmetric() to determine if it truly is symmetric or something went wrong in assembly. You can also try -pc_type none and see what happens. Barry On Jan 15, 2009, at 10:24 AM, Nguyen, Hung V ERDC-ITL-MS wrote: > Hello Matt, > >> however I would first check your matrix using -pc_type lu -ksp_type >> preonly > to make sure its not singular. > > I got the error message below while running with option above. Do I > have to > build a matrix with type of seqaij/seqbaij to run with the -pc_type lu > option? > > Thanks, > > -Hung > > > > np .. 138524 > np .. 143882 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: > --------------------- Error Message > ------------------------------------ > [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [13]PETSC ERROR: --------------------- Error Message > ------------------------------------ > --------------------- Error Message > ------------------------------------ > [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for > this > object type! > [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for > this > object type! > No support for this operation for this object type! > [12]PETSC ERROR: Matrix type mpiaij symbolic LU! > Matrix type mpiaij symbolic LU! > [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij > symbolic LU! 
> [13]PETSC ERROR: > ------------------------------------------------------------------------ > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] On > Behalf Of Matthew Knepley > Sent: Wednesday, January 14, 2009 2:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Stopping criteria > > On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS > wrote: > > > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason > within a > few iterations. Is there a way to keep KSP solver running until > max_it? > > > There is no way to continue CG here because it gets a zero divisor, > and > interprets this as an indefinite matrix. You can try GMRES, however > I would > first check your matrix using -pc_type lu -ksp_type preonly to make > sure its > not singular. > > Matt > > > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor > -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > > > > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their > experiments > lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu Jan 15 12:05:27 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Jan 2009 12:05:27 -0600 Subject: Stopping criteria In-Reply-To: References: Message-ID: <388EAC75-484A-459C-89A3-1D71A4E66790@mcs.anl.gov> When something so basic goes wrong. CG kicks out because the matrix is not symmetric positive definite, you need to step back and understand more about the matrix before worrying about running in parallel fast. So, yes, you should run sequentially to "play around" with the matrix. You can use a MatType of MATAIJ in MatSetType() to have it automatically be SEQAIJ on one process and MPIAIJ on more than one. or MATBAIJ for BAIJ format. You can then try direct solver, for example, trivially to begin to understand the matrix. You can also use MatIsSymmetric() to determine if it truly is symmetric or something went wrong in assembly. You can also try -pc_type none and see what happens. Barry On Jan 15, 2009, at 10:24 AM, Nguyen, Hung V ERDC-ITL-MS wrote: > Hello Matt, > >> however I would first check your matrix using -pc_type lu -ksp_type >> preonly > to make sure its not singular. > > I got the error message below while running with option above. Do I > have to > build a matrix with type of seqaij/seqbaij to run with the -pc_type lu > option? > > Thanks, > > -Hung > > > > np .. 138524 > np .. 
143882 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: > --------------------- Error Message > ------------------------------------ > [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [13]PETSC ERROR: --------------------- Error Message > ------------------------------------ > --------------------- Error Message > ------------------------------------ > [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for > this > object type! > [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for > this > object type! > No support for this operation for this object type! > [12]PETSC ERROR: Matrix type mpiaij symbolic LU! > Matrix type mpiaij symbolic LU! > [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij > symbolic LU! > [13]PETSC ERROR: > ------------------------------------------------------------------------ > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] On > Behalf Of Matthew Knepley > Sent: Wednesday, January 14, 2009 2:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Stopping criteria > > On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS > wrote: > > > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason > within a > few iterations. Is there a way to keep KSP solver running until > max_it? > > > There is no way to continue CG here because it gets a zero divisor, > and > interprets this as an indefinite matrix. You can try GMRES, however > I would > first check your matrix using -pc_type lu -ksp_type preonly to make > sure its > not singular. > > Matt > > > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor > -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > > > > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their > experiments > lead. > -- Norbert Wiener > From Hung.V.Nguyen at usace.army.mil Thu Jan 15 13:41:19 2009 From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS) Date: Thu, 15 Jan 2009 13:41:19 -0600 Subject: Stopping criteria In-Reply-To: <388EAC75-484A-459C-89A3-1D71A4E66790@mcs.anl.gov> References: <388EAC75-484A-459C-89A3-1D71A4E66790@mcs.anl.gov> Message-ID: Thank you for the info which is very helpful. -Hung -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Thursday, January 15, 2009 12:05 PM To: PETSc users list Cc: petsc-users at mcs.anl.gov Subject: Re: Stopping criteria When something so basic goes wrong. 
CG kicks out because the matrix is not symmetric positive definite, you need to step back and understand more about the matrix before worrying about running in parallel fast. So, yes, you should run sequentially to "play around" with the matrix. You can use a MatType of MATAIJ in MatSetType() to have it automatically be SEQAIJ on one process and MPIAIJ on more than one. or MATBAIJ for BAIJ format. You can then try direct solver, for example, trivially to begin to understand the matrix. You can also use MatIsSymmetric() to determine if it truly is symmetric or something went wrong in assembly. You can also try -pc_type none and see what happens. Barry On Jan 15, 2009, at 10:24 AM, Nguyen, Hung V ERDC-ITL-MS wrote: > Hello Matt, > >> however I would first check your matrix using -pc_type lu -ksp_type >> preonly > to make sure its not singular. > > I got the error message below while running with option above. Do I > have to build a matrix with type of seqaij/seqbaij to run with the > -pc_type lu option? > > Thanks, > > -Hung > > > > np .. 138524 > np .. 143882 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: > --------------------- Error Message > ------------------------------------ > [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [13]PETSC ERROR: --------------------- Error Message > ------------------------------------ > --------------------- Error Message > ------------------------------------ > [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for > this object type! > [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for > this object type! > No support for this operation for this object type! > [12]PETSC ERROR: Matrix type mpiaij symbolic LU! > Matrix type mpiaij symbolic LU! > [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij > symbolic LU! > [13]PETSC ERROR: > ---------------------------------------------------------------------- > -- > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov > ] On > Behalf Of Matthew Knepley > Sent: Wednesday, January 14, 2009 2:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Stopping criteria > > On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS > wrote: > > > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason > within a > few iterations. Is there a way to keep KSP solver running until > max_it? > > > There is no way to continue CG here because it gets a zero divisor, > and interprets this as an indefinite matrix. You can try GMRES, > however I would first check your matrix using -pc_type lu -ksp_type > preonly to make sure its not singular. 
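A minimal C sketch of the sequential checks Barry suggests (A, b and x are assumed to be an
already-assembled matrix and vectors on a single process; PetscTruth is the boolean type of
this PETSc generation):

    /* assumes #include <petscksp.h> */
    PetscTruth symmetric;
    KSP        ksp;

    MatIsSymmetric(A, 0.0, &symmetric);      /* tol 0.0 asks for exact symmetry */

    KSPCreate(PETSC_COMM_SELF, &ksp);
    KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);
    KSPSetFromOptions(ksp);                  /* picks up -ksp_type preonly -pc_type lu,
                                                -pc_type none, etc. from the command line */
    KSPSolve(ksp, b, x);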
> > Matt > > > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor > -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From Hung.V.Nguyen at usace.army.mil Thu Jan 15 13:41:19 2009 From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS) Date: Thu, 15 Jan 2009 13:41:19 -0600 Subject: Stopping criteria In-Reply-To: <388EAC75-484A-459C-89A3-1D71A4E66790@mcs.anl.gov> References: <388EAC75-484A-459C-89A3-1D71A4E66790@mcs.anl.gov> Message-ID: Thank you for the info which is very helpful. -Hung -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Thursday, January 15, 2009 12:05 PM To: PETSc users list Cc: petsc-users at mcs.anl.gov Subject: Re: Stopping criteria When something so basic goes wrong. CG kicks out because the matrix is not symmetric positive definite, you need to step back and understand more about the matrix before worrying about running in parallel fast. So, yes, you should run sequentially to "play around" with the matrix. You can use a MatType of MATAIJ in MatSetType() to have it automatically be SEQAIJ on one process and MPIAIJ on more than one. or MATBAIJ for BAIJ format. You can then try direct solver, for example, trivially to begin to understand the matrix. You can also use MatIsSymmetric() to determine if it truly is symmetric or something went wrong in assembly. You can also try -pc_type none and see what happens. Barry On Jan 15, 2009, at 10:24 AM, Nguyen, Hung V ERDC-ITL-MS wrote: > Hello Matt, > >> however I would first check your matrix using -pc_type lu -ksp_type >> preonly > to make sure its not singular. > > I got the error message below while running with option above. Do I > have to build a matrix with type of seqaij/seqbaij to run with the > -pc_type lu option? > > Thanks, > > -Hung > > > > np .. 138524 > np .. 143882 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: [8]PETSC ERROR: > --------------------- Error Message > ------------------------------------ > [4]PETSC ERROR: [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [13]PETSC ERROR: --------------------- Error Message > ------------------------------------ > --------------------- Error Message > ------------------------------------ > [15]PETSC ERROR: [13]PETSC ERROR: No support for this operation for > this object type! > [12]PETSC ERROR: [15]PETSC ERROR: No support for this operation for > this object type! > No support for this operation for this object type! > [12]PETSC ERROR: Matrix type mpiaij symbolic LU! > Matrix type mpiaij symbolic LU! 
> [13]PETSC ERROR: [15]PETSC ERROR: [12]PETSC ERROR: Matrix type mpiaij > symbolic LU! > [13]PETSC ERROR: > ---------------------------------------------------------------------- > -- > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov > ] On > Behalf Of Matthew Knepley > Sent: Wednesday, January 14, 2009 2:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Stopping criteria > > On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS > wrote: > > > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason > within a > few iterations. Is there a way to keep KSP solver running until > max_it? > > > There is no way to continue CG here because it gets a zero divisor, > and interprets this as an indefinite matrix. You can try GMRES, > however I would first check your matrix using -pc_type lu -ksp_type > preonly to make sure its not singular. > > Matt > > > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor > -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT > iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From tim.kroeger at cevis.uni-bremen.de Fri Jan 16 06:58:39 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Fri, 16 Jan 2009 13:58:39 +0100 (CET) Subject: VecGetLocalSize() Message-ID: Dear PETSc team, If I create a vector using VecCreateGhost() and then query the local size using VerGetLocalSize(), will the resulting number then include the number of ghost values or not? In either case, how can I ask a vector about the number of ghost values that are stored locally? I assume that if I know this number, I can use the ISLocalToGlobalMapping*() functions to get the global indices of the ghost values. I also assume that the values supplied by VecGetOwnershipRange() do not include ghost cells (since it wouldn't be a consecutive range otherwise). Are these assumptions correct? (My conjecture is that VecGetLocalSize() does include the ghost cells while I can get the number without the ghost cells by subtracting the two numbers that VecGetOwnershipRange() supplies.) Best Regards, Tim -- Dr. 
Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From knepley at gmail.com Fri Jan 16 09:09:44 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 Jan 2009 09:09:44 -0600 Subject: VecGetLocalSize() In-Reply-To: References: Message-ID: On Fri, Jan 16, 2009 at 6:58 AM, Tim Kroeger < tim.kroeger at cevis.uni-bremen.de> wrote: > Dear PETSc team, > > If I create a vector using VecCreateGhost() and then query the local size > using VerGetLocalSize(), will the resulting number then include the number > of ghost values or not? In either case, how can I ask a vector about the > number of ghost values that are stored locally? No, it will have the local size without ghosts. You can get the number of ghosts from the size of the local form VecGhostGetLocalForm() or from the size of the LocalToGlobalMapping. > > I assume that if I know this number, I can use the > ISLocalToGlobalMapping*() functions to get the global indices of the ghost > values. > > I also assume that the values supplied by VecGetOwnershipRange() do not > include ghost cells (since it wouldn't be a consecutive range otherwise). Yes. Matt > > Are these assumptions correct? > > (My conjecture is that VecGetLocalSize() does include the ghost cells > while I can get the number without the ghost cells by subtracting the > two numbers that VecGetOwnershipRange() supplies.) > > Best Regards, > > Tim > > -- > Dr. Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.kroeger at cevis.uni-bremen.de Mon Jan 19 04:18:20 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Mon, 19 Jan 2009 11:18:20 +0100 (CET) Subject: VecGetLocalSize() In-Reply-To: References: Message-ID: Dear Matt, On Fri, 16 Jan 2009, Matthew Knepley wrote: > On Fri, Jan 16, 2009 at 6:58 AM, Tim Kroeger < > tim.kroeger at cevis.uni-bremen.de> wrote: > >> If I create a vector using VecCreateGhost() and then query the local size >> using VerGetLocalSize(), will the resulting number then include the number >> of ghost values or not? In either case, how can I ask a vector about the >> number of ghost values that are stored locally? > > No, it will have the local size without ghosts. You can get the number of > ghosts from the size of the local form VecGhostGetLocalForm() > or from the size of the LocalToGlobalMapping. Thank you very much. My next question: How do I obtain the vector's LocalToGlobalMapping? I thought there would be a function like VecGhostGetLocalToGlobalMapping(), but that doesn't exist. Can you help me? Best Regards, Tim -- Dr. 
Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Mon Jan 19 06:46:41 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Mon, 19 Jan 2009 13:46:41 +0100 Subject: Installation petsc-3.0.0 Message-ID: <20090119134641.k9ng4eydlrswc0kg@webmail.ec-nantes.fr> Hi, I have upgraded Petsc to 3.0.0. For the configuration(1), which worked with petsc-2.3.3, I got C++ error. Anyway I could do make but make test failed for Fortran example(2). Do you have any idea? I would be appreciate for any advise. Thank you very much. Regards, Jarunan - - - - (1) - - - - ./config/configure.py --with-cc=/usr/local/mpich-ifc-ssh/bin/mpicc --with-fc=/usr/local/mpich-ifc-ssh/bin/mpif90 --download-f-blas-lapack=1 --download-hypre=1 --download-ml=1 --with-cxx=icpc --with-mpi-dir=/usr/local/mpich-ifc-ssh --with-shared=0 ================================================================================= Configuring PETSc to compile on your system ================================================================================= ================================================================================= Warning: [with-mpi-dir] option is used along with options: ['with-cc', 'with-fc', 'with-cxx'] This prevents configure from picking up MPI compilers from specified mpi-dir. Sugest using *only* [with-mpi-dir] option - and no other compiler option. This way - mpi compilers from /usr/local/mpich-ifc-ssh are used. ================================================================================= TESTING: CxxMPICheck from config.packages.MPI(config/BuildSystem/config/packages/MPI.py:598) ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- C++ error! MPI_Finalize() could not be located! - - - - (2) - - - - make test Running test examples to verify correct installation C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 MPI processes Graphics example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html 0 - : Could not convert index 1140850688 into a pointer The index may be an incorrect argument. Possible sources of this problem are a missing "include 'mpif.h'", a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD) or a misspelled user variable for an MPI object (e.g., com instead of comm). [0] Aborting program ! [0] Aborting program! p0_31302: p4_error: : 9039 Completed test examples -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Mon Jan 19 06:46:42 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Mon, 19 Jan 2009 13:46:42 +0100 Subject: Installation petsc-3.0.0 Message-ID: <20090119134642.r2yhybcteou8wwww@webmail.ec-nantes.fr> Hi, I have upgraded Petsc to 3.0.0. 
For the configuration(1), which worked with petsc-2.3.3, I got C++ error. Anyway I could do make but make test failed for Fortran example(2). Do you have any idea? I would be appreciate for any advise. Thank you very much. Regards, Jarunan - - - - (1) - - - - ./config/configure.py --with-cc=/usr/local/mpich-ifc-ssh/bin/mpicc --with-fc=/usr/local/mpich-ifc-ssh/bin/mpif90 --download-f-blas-lapack=1 --download-hypre=1 --download-ml=1 --with-cxx=icpc --with-mpi-dir=/usr/local/mpich-ifc-ssh --with-shared=0 ================================================================================= Configuring PETSc to compile on your system ================================================================================= ================================================================================= Warning: [with-mpi-dir] option is used along with options: ['with-cc', 'with-fc', 'with-cxx'] This prevents configure from picking up MPI compilers from specified mpi-dir. Sugest using *only* [with-mpi-dir] option - and no other compiler option. This way - mpi compilers from /usr/local/mpich-ifc-ssh are used. ================================================================================= TESTING: CxxMPICheck from config.packages.MPI(config/BuildSystem/config/packages/MPI.py:598) ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- C++ error! MPI_Finalize() could not be located! - - - - (2) - - - - make test Running test examples to verify correct installation C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 MPI processes Graphics example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html 0 - : Could not convert index 1140850688 into a pointer The index may be an incorrect argument. Possible sources of this problem are a missing "include 'mpif.h'", a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD) or a misspelled user variable for an MPI object (e.g., com instead of comm). [0] Aborting program ! [0] Aborting program! p0_31302: p4_error: : 9039 Completed test examples -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Mon Jan 19 06:43:08 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Mon, 19 Jan 2009 13:43:08 +0100 Subject: Installation In-Reply-To: References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> <20090115164240.uo0edtyqa5ss444s@webmail.ec-nantes.fr> Message-ID: <20090119134308.ebht7dxck1vk0o80@webmail.ec-nantes.fr> Hi, I have upgraded Petsc to 3.0.0. For the configuration(1) I got C++ error. Anyway I could do make but make test failed for Fortran example(2). Do you have any idea? I would be appreciate for any advise. Thank you very much. 
Regards, Jarunan - - - - (1) - - - - ./config/configure.py --with-cc=/usr/local/mpich-ifc-ssh/bin/mpicc --with-fc=/usr/local/mpich-ifc-ssh/bin/mpif90 --download-f-blas-lapack=1 --download-hypre=1 --download-ml=1 --with-cxx=icpc --with-mpi-dir=/usr/local/mpich-ifc-ssh --with-shared=0 ================================================================================= Configuring PETSc to compile on your system ================================================================================= ================================================================================= Warning: [with-mpi-dir] option is used along with options: ['with-cc', 'with-fc', 'with-cxx'] This prevents configure from picking up MPI compilers from specified mpi-dir. Sugest using *only* [with-mpi-dir] option - and no other compiler option. This way - mpi compilers from /usr/local/mpich-ifc-ssh are used. ================================================================================= TESTING: CxxMPICheck from config.packages.MPI(config/BuildSystem/config/packages/MPI.py:598) ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- C++ error! MPI_Finalize() could not be located! - - - - (2) - - - - make test Running test examples to verify correct installation C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 MPI processes Graphics example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html 0 - : Could not convert index 1140850688 into a pointer The index may be an incorrect argument. Possible sources of this problem are a missing "include 'mpif.h'", a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD) or a misspelled user variable for an MPI object (e.g., com instead of comm). [0] Aborting program ! [0] Aborting program! p0_31302: p4_error: : 9039 Completed test examples -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE Matthew Knepley a ??crit? : > On Thu, Jan 15, 2009 at 9:42 AM, Panyasantisuk Jarunan < > Jarunan.Panyasantisuk at eleves.ec-nantes.fr> wrote: > >> When I create a matrix with MatCreateMPIAIJWithSplitArrays, as it doesn't >> copy the values so I have to use MatSetValues to set the internal value? > > > 1) You should upgrade to 3.0.0 > > 2) You should not have to call MatSetValues(). It will use the arrays you > provide. > > Matt > > >> >> -- >> Jarunan PANYASANTISUK >> MSc. in Computational Mechanics >> Erasmus Mundus Master Program >> Ecole Centrale de Nantes >> 1, rue de la no?, 44321 NANTES, FRANCE >> >> >> >> Barry Smith a ?(c)crit? : >> >> >>> You should be able to use MatCreateMPIAIJWithSplitArrays(), >>> MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() >>> from Fortran. Are you using PETSc 3.0.0? >>> >>> The arguments for MatCreateMPIAIJWithArrays() or >>> MatMPIAIJSetPreallocationCSR() have the same meaning >>> (in fact MatCreateMPIAIJWithArrays() essentially calls >>> MatCreateMPIAIJWithArrays()). 
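A minimal C sketch of the CSR-based creation path described in the quoted text, with made-up numerical data; the point is only the meaning of the three arrays (start-row offsets for the local rows, global column indices, values), which is the same for MatCreateMPIAIJWithArrays() and MatMPIAIJSetPreallocationCSR(). The calls follow the PETSc 3.0.0-era C API; later releases changed some signatures (MatDestroy(), for instance, now takes a pointer).

#include "petscmat.h"

int main(int argc, char **argv)
{
  Mat            A;
  PetscInt       m    = 2;                      /* rows owned by this process   */
  PetscInt       ia[] = {0, 2, 4};              /* CSR row offsets, length m+1  */
  PetscInt       ja[] = {0, 1, 0, 1};           /* global column indices        */
  PetscScalar    va[] = {4.0, -1.0, -1.0, 4.0}; /* values, same length as ja    */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);

  /* One call builds the parallel AIJ matrix directly from the local CSR data. */
  ierr = MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD, m, m,
                                   PETSC_DETERMINE, PETSC_DETERMINE,
                                   ia, ja, va, &A);CHKERRQ(ierr);

  /* Equivalent two-step form, matching the Fortran fragment used elsewhere in
     this thread: MatCreate / MatSetSizes / MatSetType(A, MATMPIAIJ) /
     MatSetFromOptions followed by MatMPIAIJSetPreallocationCSR(A, ia, ja, va). */

  ierr = MatView(A, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = MatDestroy(A);CHKERRQ(ierr);           /* MatDestroy(&A) in newer PETSc */
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}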
>>> >>> Barry >>> >>> On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: >>> >>> Oh, I could not use MatCreateMPIAIJWithArrays either but the mechanism >>>> below works. >>>> >>>> call MatCreate(PETSC_COMM_WORLD,D,ierr) >>>> call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, >>>> $ ierr) >>>> call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix >>>> call MatSetFromOptions(D,ierr) >>>> call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) >>>> >>>> Where pointer is start-row indices a >>>> Column is local column indices >>>> v is value >>>> >>>> Is there the different beteween the start-row indices in >>>> MatMPIAIJSetPreallocationCSR and row indices in >>>> MatCreateMPIAIJWithArrays >>>> ? >>>> >>>> >>>> >>>> Regards, >>>> Jarunan >>>> >>>> >>>> >>>> >>>> Hello, >>>> >>>> To define a matrix with arrays, I cannot use >>>> MatCreateMPIAIJWithSplitArrays in my program which is written in >>>> Fortran: >>>> >>>> call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, >>>> $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, >>>> $ oColumn,ov,D,ierr) >>>> >>>> The error is >>>> F:246: undefined reference to `matcreatempiaijwithsplitarrays_' >>>> >>>> I could use MatCreateMPIAIJWithArrays but the off diagonal values are >>>> missing with this command. >>>> >>>> I would be appreciate for any advice. Thank you before hand. >>>> >>>> Regards, >>>> Jarunan >>>> >>>> >>>> >>>> >>>> -- >>>> Jarunan PANYASANTISUK >>>> MSc. in Computational Mechanics >>>> Erasmus Mundus Master Program >>>> Ecole Centrale de Nantes >>>> 1, rue de la no?, 44321 NANTES, FRANCE >>>> >>>> >>>> >>>> >>>> >>>> >>> >>> >> >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Mon Jan 19 09:19:09 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 19 Jan 2009 09:19:09 -0600 Subject: VecGetLocalSize() In-Reply-To: References: Message-ID: There isn't a routine for accessing the local to global mapping. But you can access it as Vec x; x->mapping Barry On Jan 19, 2009, at 4:18 AM, Tim Kroeger wrote: > Dear Matt, > > On Fri, 16 Jan 2009, Matthew Knepley wrote: > >> On Fri, Jan 16, 2009 at 6:58 AM, Tim Kroeger < >> tim.kroeger at cevis.uni-bremen.de> wrote: >> >>> If I create a vector using VecCreateGhost() and then query the >>> local size >>> using VerGetLocalSize(), will the resulting number then include >>> the number >>> of ghost values or not? In either case, how can I ask a vector >>> about the >>> number of ghost values that are stored locally? >> >> No, it will have the local size without ghosts. You can get the >> number of >> ghosts from the size of the local form VecGhostGetLocalForm() >> or from the size of the LocalToGlobalMapping. > > Thank you very much. My next question: How do I obtain the vector's > LocalToGlobalMapping? I thought there would be a function like > VecGhostGetLocalToGlobalMapping(), but that doesn't exist. Can you > help me? > > Best Regards, > > Tim > > -- > Dr. 
Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > From balay at mcs.anl.gov Mon Jan 19 09:49:29 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 19 Jan 2009 09:49:29 -0600 (CST) Subject: Installation petsc-3.0.0 In-Reply-To: <20090119134642.r2yhybcteou8wwww@webmail.ec-nantes.fr> References: <20090119134642.r2yhybcteou8wwww@webmail.ec-nantes.fr> Message-ID: On Mon, 19 Jan 2009, Panyasantisuk Jarunan wrote: > Hi, > > I have upgraded Petsc to 3.0.0. For the configuration(1), which worked > with petsc-2.3.3, I got C++ error. Anyway I could do make but make test > failed for Fortran example(2). > > Do you have any idea? I would be appreciate for any advise. Thank you > very much. > > Regards, > Jarunan > > - - - - (1) - - - - > > ./config/configure.py --with-cc=/usr/local/mpich-ifc-ssh/bin/mpicc > --with-fc=/usr/local/mpich-ifc-ssh/bin/mpif90 > --download-f-blas-lapack=1 --download-hypre=1 --download-ml=1 > --with-cxx=icpc --with-mpi-dir=/usr/local/mpich-ifc-ssh --with-shared=0 Perhaps you should be using '--with-cxx=/usr/local/mpich-ifc-ssh/bin/mpicxx' ? > ================================================================================= > Configuring PETSc to compile on your system > ================================================================================= > ================================================================================= > Warning: [with-mpi-dir] option is used along with options: ['with-cc', > 'with-fc', 'with-cxx'] This > prevents configure from picking up MPI compilers from specified mpi-dir. > Sugest using *only* [with-mpi-dir] option - and no other compiler option. > This way - mpi compilers from /usr/local/mpich-ifc-ssh are used. > ================================================================================= > TESTING: CxxMPICheck from > config.packages.MPI(config/BuildSystem/config/packages/MPI.py:598) > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > --------------------------------------------------------------------------------------- > C++ error! MPI_Finalize() could not be located! Looks like configure did not complete. So how were you able to build libraries and - run the tests? If you still have isues - send the corresponding configure.log to petsc-maint at mcs.anl.gov. Satish > > > > - - - - (2) - - - - > > make test > > Running test examples to verify correct installation > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 > MPI process > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 > MPI processes > Graphics example src/snes/examples/tutorials/ex19 run successfully with > 1 MPI process > Error running Fortran example src/snes/examples/tutorials/ex5f with 1 > MPI process > See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > 0 - : Could not convert index 1140850688 into a pointer > The index may be an incorrect argument. > Possible sources of this problem are a missing "include 'mpif.h'", > a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD) > or a misspelled user variable for an MPI object (e.g., > com instead of comm). > [0] Aborting program ! > [0] Aborting program! 
> p0_31302: p4_error: : 9039 > Completed test examples > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > > > From balay at mcs.anl.gov Mon Jan 19 09:49:53 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 19 Jan 2009 09:49:53 -0600 (CST) Subject: Installation petsc-3.0.0 In-Reply-To: <20090119134641.k9ng4eydlrswc0kg@webmail.ec-nantes.fr> References: <20090119134641.k9ng4eydlrswc0kg@webmail.ec-nantes.fr> Message-ID: replied to the other copy of this e-mail. satish On Mon, 19 Jan 2009, Panyasantisuk Jarunan wrote: > Hi, > > I have upgraded Petsc to 3.0.0. For the configuration(1), which worked > with petsc-2.3.3, I got C++ error. Anyway I could do make but make test > failed for Fortran example(2). > > Do you have any idea? I would be appreciate for any advise. Thank you > very much. > > Regards, > Jarunan > > - - - - (1) - - - - > > ./config/configure.py --with-cc=/usr/local/mpich-ifc-ssh/bin/mpicc > --with-fc=/usr/local/mpich-ifc-ssh/bin/mpif90 > --download-f-blas-lapack=1 --download-hypre=1 --download-ml=1 > --with-cxx=icpc --with-mpi-dir=/usr/local/mpich-ifc-ssh --with-shared=0 > ================================================================================= > Configuring PETSc to compile on your system > ================================================================================= > ================================================================================= > Warning: [with-mpi-dir] option is used along with options: ['with-cc', > 'with-fc', 'with-cxx'] This > prevents configure from picking up MPI compilers from specified mpi-dir. > Sugest using *only* [with-mpi-dir] option - and no other compiler option. > This way - mpi compilers from /usr/local/mpich-ifc-ssh are used. > ================================================================================= > TESTING: CxxMPICheck from > config.packages.MPI(config/BuildSystem/config/packages/MPI.py:598) > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > --------------------------------------------------------------------------------------- > C++ error! MPI_Finalize() could not be located! > > > > - - - - (2) - - - - > > make test > > Running test examples to verify correct installation > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 > MPI process > C/C++ example src/snes/examples/tutorials/ex19 run successfully with 2 > MPI processes > Graphics example src/snes/examples/tutorials/ex19 run successfully with > 1 MPI process > Error running Fortran example src/snes/examples/tutorials/ex5f with 1 > MPI process > See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html > 0 - : Could not convert index 1140850688 into a pointer > The index may be an incorrect argument. > Possible sources of this problem are a missing "include 'mpif.h'", > a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD) > or a misspelled user variable for an MPI object (e.g., > com instead of comm). > [0] Aborting program ! > [0] Aborting program! > p0_31302: p4_error: : 9039 > Completed test examples > > -- > Jarunan PANYASANTISUK > MSc. 
in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > > > From xy2102 at columbia.edu Thu Jan 1 06:55:35 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Thu, 01 Jan 2009 07:55:35 -0500 Subject: Math function "ln" in Petsc In-Reply-To: References: <535919.87151.qm@web36201.mail.mud.yahoo.com> <20081231112430.mbh33xdc04s0o0w0@cubmail.cc.columbia.edu> <20081231153036.ex23wo4l280o8skg@cubmail.cc.columbia.edu> <20081231190906.g6ph2cnqko4ckkgg@cubmail.cc.columbia.edu> <20081231221447.azw0svwiisgcks8g@cubmail.cc.columbia.edu> Message-ID: <20090101075535.jdwr2x22kgos04k0@cubmail.cc.columbia.edu> Hey,Barry, Thanks a lot! Rebecca Quoting Barry Smith : > > It is called PetscLogScalar() > > Barry > > On Dec 31, 2008, at 9:14 PM, (Rebecca) Xuefei YUAN wrote: > >> I know that there is a PetscExpScalar, but it seems that there is >> no such a function like PetscLnScalar. >> >> Thanks! >> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From xy2102 at columbia.edu Thu Jan 1 07:08:47 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Thu, 01 Jan 2009 08:08:47 -0500 Subject: Math function "ln" in Petsc In-Reply-To: References: <535919.87151.qm@web36201.mail.mud.yahoo.com> <20081231112430.mbh33xdc04s0o0w0@cubmail.cc.columbia.edu> <20081231153036.ex23wo4l280o8skg@cubmail.cc.columbia.edu> <20081231190906.g6ph2cnqko4ckkgg@cubmail.cc.columbia.edu> <20081231221447.azw0svwiisgcks8g@cubmail.cc.columbia.edu> Message-ID: <20090101080847.jmuhxciiowww0g44@cubmail.cc.columbia.edu> Hey,Barry, However, I was not able to find PetscLogScalar function in the Petscmath.h file and PetscLogScalar does not work... Thanks, Rebecca Quoting Barry Smith : > > It is called PetscLogScalar() > > Barry > > On Dec 31, 2008, at 9:14 PM, (Rebecca) Xuefei YUAN wrote: > >> I know that there is a PetscExpScalar, but it seems that there is >> no such a function like PetscLnScalar. >> >> Thanks! >> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From xy2102 at columbia.edu Thu Jan 1 07:16:41 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Thu, 01 Jan 2009 08:16:41 -0500 Subject: Math function "ln" in Petsc In-Reply-To: References: <535919.87151.qm@web36201.mail.mud.yahoo.com> <20081231112430.mbh33xdc04s0o0w0@cubmail.cc.columbia.edu> <20081231153036.ex23wo4l280o8skg@cubmail.cc.columbia.edu> <20081231190906.g6ph2cnqko4ckkgg@cubmail.cc.columbia.edu> <20081231221447.azw0svwiisgcks8g@cubmail.cc.columbia.edu> Message-ID: <20090101081641.32r94rsu848kkksw@cubmail.cc.columbia.edu> Hey,Barry, I add the following into petscmath.h file, and it works now. Thanks very much! # define PetscLogScalar(a) log(a) Rebecca Quoting Barry Smith : > > It is called PetscLogScalar() > > Barry > > On Dec 31, 2008, at 9:14 PM, (Rebecca) Xuefei YUAN wrote: > >> I know that there is a PetscExpScalar, but it seems that there is >> no such a function like PetscLnScalar. >> >> Thanks! 
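For reference, a tiny self-contained check of the wrappers under discussion, written as a C sketch against petscmath.h (pulled in through petsc.h); it assumes a standard real-scalar build, with PetscLogScalar() as the natural logarithm and PetscExpScalar() as its inverse.

#include "petsc.h"

int main(int argc, char **argv)
{
  PetscScalar    x = 2.0, y;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);

  /* exp(ln x) should recover x up to rounding */
  y = PetscExpScalar(PetscLogScalar(x));

  ierr = PetscPrintf(PETSC_COMM_WORLD, "x = %g  exp(ln x) = %g\n",
                     (double)PetscRealPart(x), (double)PetscRealPart(y));CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}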
>> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Thu Jan 1 09:09:14 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Jan 2009 09:09:14 -0600 Subject: Math function "ln" in Petsc In-Reply-To: <20090101081641.32r94rsu848kkksw@cubmail.cc.columbia.edu> References: <535919.87151.qm@web36201.mail.mud.yahoo.com> <20081231112430.mbh33xdc04s0o0w0@cubmail.cc.columbia.edu> <20081231153036.ex23wo4l280o8skg@cubmail.cc.columbia.edu> <20081231190906.g6ph2cnqko4ckkgg@cubmail.cc.columbia.edu> <20081231221447.azw0svwiisgcks8g@cubmail.cc.columbia.edu> <20090101081641.32r94rsu848kkksw@cubmail.cc.columbia.edu> Message-ID: I do not see this problem with petsc-3.0.0 or petsc-dev. Are you using something strange like complex numbers or singe precision storage or quad floating point? Barry On Jan 1, 2009, at 7:16 AM, (Rebecca) Xuefei YUAN wrote: > Hey,Barry, > > I add the following into petscmath.h file, and it works now. Thanks > very much! > > # define PetscLogScalar(a) log(a) > > Rebecca > > Quoting Barry Smith : > >> >> It is called PetscLogScalar() >> >> Barry >> >> On Dec 31, 2008, at 9:14 PM, (Rebecca) Xuefei YUAN wrote: >> >>> I know that there is a PetscExpScalar, but it seems that there is >>> no such a function like PetscLnScalar. >>> >>> Thanks! >>> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From recrusader at gmail.com Thu Jan 1 18:45:01 2009 From: recrusader at gmail.com (Yujie) Date: Thu, 1 Jan 2009 16:45:01 -0800 Subject: about MatGetOwnershipRange Message-ID: <7ff0ee010901011645g47b70dbay30d7ec1eedad25ef@mail.gmail.com> I got a submatrix B from A in parallel mode. Because the rows numbers of B in some processors of the cluster become 0, I want to use MatGetOwnershipRange() to get the range in each processor. I have checked this function, the description is as folllows: " MatGetOwnershipRange Returns the range of matrix rows owned by this processor, assuming that the matrix is laid out with the first n1 rows on the first processor, the next n2 rows on the second, etc. For certain parallel layouts this range may not be well defined. " There is an assumption in this function, I am wondering whether I can use this function here. if not, how to do it? thanks a lot. Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jan 1 18:59:50 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Jan 2009 18:59:50 -0600 Subject: about MatGetOwnershipRange In-Reply-To: <7ff0ee010901011645g47b70dbay30d7ec1eedad25ef@mail.gmail.com> References: <7ff0ee010901011645g47b70dbay30d7ec1eedad25ef@mail.gmail.com> Message-ID: <6B693146-D063-4E0D-9A80-C361F63F481C@mcs.anl.gov> For all Mats that PETSc provides including MatMFFD and MatShell() the assumption is satisfied. Barry On Jan 1, 2009, at 6:45 PM, Yujie wrote: > I got a submatrix B from A in parallel mode. Because the rows > numbers of B in some processors of the cluster become 0, I want to > use MatGetOwnershipRange() to get the range in each processor. I > have checked this function, the description is as folllows: > " > MatGetOwnershipRange > Returns the range of matrix rows owned by this processor, assuming > that the matrix is laid out with the first n1 rows on the first > processor, the next n2 rows on the second, etc. 
For certain parallel > layouts this range may not be well defined. > " > There is an assumption in this function, I am wondering whether I > can use this function here. if not, how to do it? thanks a lot. > > Yujie From recrusader at gmail.com Thu Jan 1 19:45:36 2009 From: recrusader at gmail.com (Yujie) Date: Thu, 1 Jan 2009 17:45:36 -0800 Subject: about MatGetOwnershipRange In-Reply-To: <6B693146-D063-4E0D-9A80-C361F63F481C@mcs.anl.gov> References: <7ff0ee010901011645g47b70dbay30d7ec1eedad25ef@mail.gmail.com> <6B693146-D063-4E0D-9A80-C361F63F481C@mcs.anl.gov> Message-ID: <7ff0ee010901011745n6d614a5ck24e776bdf34f65d3@mail.gmail.com> thank you very much, Barry On Thu, Jan 1, 2009 at 4:59 PM, Barry Smith wrote: > > For all Mats that PETSc provides including MatMFFD and MatShell() the > assumption is satisfied. > > Barry > > > On Jan 1, 2009, at 6:45 PM, Yujie wrote: > > I got a submatrix B from A in parallel mode. Because the rows numbers of B >> in some processors of the cluster become 0, I want to use >> MatGetOwnershipRange() to get the range in each processor. I have checked >> this function, the description is as folllows: >> " >> MatGetOwnershipRange >> Returns the range of matrix rows owned by this processor, assuming that >> the matrix is laid out with the first n1 rows on the first processor, the >> next n2 rows on the second, etc. For certain parallel layouts this range may >> not be well defined. >> " >> There is an assumption in this function, I am wondering whether I can use >> this function here. if not, how to do it? thanks a lot. >> >> Yujie >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 13:41:55 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 11:41:55 -0800 Subject: how to extract a subvector? Message-ID: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> Like MatGetSubMatrix(), whether is there a function to get a subvector in parallel mode? I have checked some scatter functios, they don't likely work. thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jan 2 13:46:36 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 Jan 2009 13:46:36 -0600 Subject: how to extract a subvector? In-Reply-To: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> Message-ID: VecScatter is for this purpose. Rational: extracting subparts of vectors for ghost points etc takes place many times in a simulation; maybe millions. Thus separating it into a set-up followed by many uses is a worthwhile optimization. Extracting submatrices occur much less often in a simulation, maybe tens, hundreds or thousands of times so it is not worth the extra complexity of having separate set-up followed by many uses. One could argue that uniformity of design means we should have handled matrices with a MatScatter concept to parallel the Vec approach, but it is too late now :-). Barry On Jan 2, 2009, at 1:41 PM, Yujie wrote: > Like MatGetSubMatrix(), whether is there a function to get a > subvector in parallel mode? I have checked some scatter functios, > they don't likely work. thanks a lot. > > Regards, > > Yujie > From recrusader at gmail.com Fri Jan 2 13:59:38 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 11:59:38 -0800 Subject: how to extract a subvector? 
In-Reply-To: References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> Message-ID: <7ff0ee010901021159k6a4b3fe6r4b8c54fb59ba3bf8@mail.gmail.com> thank you very much, Barry. I made a misunderstanding about the parameters "ix" and "iy". Further question is after finishing scatter, the subvector will be redistributed? I mean like matrix, because I get a submatrix, the rows in some processes will become zero, is the subvector the same with the matrix? thanks. Yujie On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith wrote: > > VecScatter is for this purpose. > > Rational: extracting subparts of vectors for ghost points etc takes place > many times in a simulation; maybe millions. > Thus separating it into a set-up followed by many uses is a worthwhile > optimization. Extracting submatrices occur > much less often in a simulation, maybe tens, hundreds or thousands of times > so it is not worth the extra complexity > of having separate set-up followed by many uses. One could argue that > uniformity of design means we should have > handled matrices with a MatScatter concept to parallel the Vec approach, > but it is too late now :-). > > Barry > > On Jan 2, 2009, at 1:41 PM, Yujie wrote: > > Like MatGetSubMatrix(), whether is there a function to get a subvector in >> parallel mode? I have checked some scatter functios, they don't likely work. >> thanks a lot. >> >> Regards, >> >> Yujie >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 14:09:31 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 12:09:31 -0800 Subject: the performance about combining several matrices into one matrix Message-ID: <7ff0ee010901021209v523880d4o60bfe2302c257077@mail.gmail.com> I am trying to combine several matrices into one matrix in parallel mode for MPIDense. Assuming combing A1, A2 into B I would like to use "MatGetArray() gets the local matrix of A1 and A2, MatSetValues() inserts the local matrix into B" I have checked the mail list, I got the following discussion "We do not have such function. You can create a new matrix C of size m x (n1 + n2), and do a loop: for each row: MatGetRow(A, row,...) MatSetValues(C,1,&row,...) MatGetRow(B, ) MatSetValues(C,1,&row,...) or create A and B with the size m x (n1 + n2), and get B=A+B by calling MatAXPY(), Hong - Hide quoted text - On Wed, 19 Sep 2007, Alejandro Garzon wrote: > > Hi, Is there a function to combine two matrices, sizes m x n1 and m x n2, in a > single matrix size m x (n1 + n2) or the analog case for rows? Thanks. > > Alejandro. > > " To the first method Hong proposed, MatGetRow() just get the local row of the matrix, right? it should be slower than getting the local marix regarding MatSetValues(), right? Do you have any comments about the above and any better advice for matrix combination? thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 14:21:35 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 12:21:35 -0800 Subject: how to extract a subvector? In-Reply-To: References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> Message-ID: <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> Dear Barry: I have a new question about the parameter "iy"(new index set of subvector). To parallel vector, how to provide "ix" and "iy"? Just providing the local index subset for "ix" and "iy"? 
if it is, it is a little difficult to let the local vector know its global position in the new subvector? If the user needs to provide the global position of local vector for "iy". some MPI communication should be needed, it looks like not a good method. In exacting submatrix, the function hides this problem. could you give me any comments? thanks a lot. Regards, Yujie On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith wrote: > > VecScatter is for this purpose. > > Rational: extracting subparts of vectors for ghost points etc takes place > many times in a simulation; maybe millions. > Thus separating it into a set-up followed by many uses is a worthwhile > optimization. Extracting submatrices occur > much less often in a simulation, maybe tens, hundreds or thousands of times > so it is not worth the extra complexity > of having separate set-up followed by many uses. One could argue that > uniformity of design means we should have > handled matrices with a MatScatter concept to parallel the Vec approach, > but it is too late now :-). > > Barry > > On Jan 2, 2009, at 1:41 PM, Yujie wrote: > > Like MatGetSubMatrix(), whether is there a function to get a subvector in >> parallel mode? I have checked some scatter functios, they don't likely work. >> thanks a lot. >> >> Regards, >> >> Yujie >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jan 2 19:15:14 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 Jan 2009 19:15:14 -0600 Subject: how to extract a subvector? In-Reply-To: <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> Message-ID: <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> The ix and iy are always global indices based on where the vector lives. If you are going from a parallel to parallel vector then both indices are "global" Barry On Jan 2, 2009, at 2:21 PM, Yujie wrote: > Dear Barry: > > I have a new question about the parameter "iy"(new index set of > subvector). To parallel vector, how to provide "ix" and "iy"? > > Just providing the local index subset for "ix" and "iy"? if it is, > it is a little difficult to let the local vector know its global > position in the new subvector? If the user needs to provide the > global position of local vector for "iy". some MPI communication > should be needed, it looks like not a good method. In exacting > submatrix, the function hides this problem. could you give me any > comments? thanks a lot. > > Regards, > > Yujie > > > On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith > wrote: > > VecScatter is for this purpose. > > Rational: extracting subparts of vectors for ghost points etc > takes place many times in a simulation; maybe millions. > Thus separating it into a set-up followed by many uses is a > worthwhile optimization. Extracting submatrices occur > much less often in a simulation, maybe tens, hundreds or thousands > of times so it is not worth the extra complexity > of having separate set-up followed by many uses. One could argue > that uniformity of design means we should have > handled matrices with a MatScatter concept to parallel the Vec > approach, but it is too late now :-). > > Barry > > > On Jan 2, 2009, at 1:41 PM, Yujie wrote: > > Like MatGetSubMatrix(), whether is there a function to get a > subvector in parallel mode? I have checked some scatter functios, > they don't likely work. thanks a lot. 
> > Regards, > > Yujie > > > From recrusader at gmail.com Fri Jan 2 19:31:32 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 17:31:32 -0800 Subject: how to extract a subvector? In-Reply-To: <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> Message-ID: <7ff0ee010901021731v2de0258erad9f874c7d6b7f1e@mail.gmail.com> You mean I need to let all processes have a copy of the whole global index and to use this function? thanks On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: > > The ix and iy are always global indices based on where the vector lives. > > If you are going from a parallel to parallel vector then both indices are > "global" > > Barry > > > On Jan 2, 2009, at 2:21 PM, Yujie wrote: > > Dear Barry: >> >> I have a new question about the parameter "iy"(new index set of >> subvector). To parallel vector, how to provide "ix" and "iy"? >> >> Just providing the local index subset for "ix" and "iy"? if it is, it is a >> little difficult to let the local vector know its global position in the new >> subvector? If the user needs to provide the global position of local vector >> for "iy". some MPI communication should be needed, it looks like not a good >> method. In exacting submatrix, the function hides this problem. could you >> give me any comments? thanks a lot. >> >> Regards, >> >> Yujie >> >> >> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith wrote: >> >> VecScatter is for this purpose. >> >> Rational: extracting subparts of vectors for ghost points etc takes place >> many times in a simulation; maybe millions. >> Thus separating it into a set-up followed by many uses is a worthwhile >> optimization. Extracting submatrices occur >> much less often in a simulation, maybe tens, hundreds or thousands of >> times so it is not worth the extra complexity >> of having separate set-up followed by many uses. One could argue that >> uniformity of design means we should have >> handled matrices with a MatScatter concept to parallel the Vec approach, >> but it is too late now :-). >> >> Barry >> >> >> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >> >> Like MatGetSubMatrix(), whether is there a function to get a subvector in >> parallel mode? I have checked some scatter functios, they don't likely work. >> thanks a lot. >> >> Regards, >> >> Yujie >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 19:58:27 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 17:58:27 -0800 Subject: distribution of submatrix and subvector Re: how to extract a subvector? Message-ID: <7ff0ee010901021758l1c20dad9vebc204be1d96ab05@mail.gmail.com> Dear Barry: When using MatGetSubmatrix() to get a submatrix in parallel mode, to my knowledge, this function will exact the rows and cols for submatrix. This submatrix is not redistributed. For example, if there is zero row and zero col in a process, the row and col of this submatrix in this process are zero, right? However, to get a subvector using Vec scatter, the user needs to create the subvector using MPI_COMM, that is new distribution for subvector is generated. Just regarding only rows or cols of the submatrix and the subvector, their distribution should be different even if using the same index to get them, right? thanks a lot. 
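Since the layout question is easy to settle empirically, here is a hedged C sketch of that experiment: build a small parallel AIJ matrix, extract a submatrix, and print each process's ownership range of the result. The index choices are arbitrary, the MatGetSubMatrix() signature (with the csize argument) is the PETSc 3.0.0 one quoted later in this thread, the iscol argument is gathered onto every process with ISAllGather() as the later messages also discuss, and newer PETSc versions renamed and reworked these routines.

#include "petscmat.h"

int main(int argc, char **argv)
{
  Mat            A, B;
  IS             rows, cols;
  PetscInt       rstart, rend, nloc, i, *idx;
  PetscMPIInt    rank;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);

  /* A: any assembled parallel matrix; here a 12x12 AIJ identity for brevity. */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 12, 12);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    PetscScalar one = 1.0;
    ierr = MatSetValues(A, 1, &i, 1, &i, &one, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Each process asks for every second row it owns (global indices). */
  nloc = 0;
  ierr = PetscMalloc((rend - rstart) * sizeof(PetscInt), &idx);CHKERRQ(ierr);
  for (i = rstart; i < rend; i += 2) idx[nloc++] = i;
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nloc, idx, &rows);CHKERRQ(ierr);
  ierr = PetscFree(idx);CHKERRQ(ierr);

  /* Same index set gathered onto every process, used for the columns. */
  ierr = ISAllGather(rows, &cols);CHKERRQ(ierr);

  ierr = MatGetSubMatrix(A, rows, cols, PETSC_DECIDE, MAT_INITIAL_MATRIX, &B);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(B, &rstart, &rend);CHKERRQ(ierr);
  ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,
           "[%d] owns rows %d..%d of the submatrix\n", rank, rstart, rend);CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);

  ierr = ISDestroy(rows);CHKERRQ(ierr);
  ierr = ISDestroy(cols);CHKERRQ(ierr);
  ierr = MatDestroy(B);CHKERRQ(ierr);
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}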
Regards, Yujie On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: > > The ix and iy are always global indices based on where the vector lives. > > If you are going from a parallel to parallel vector then both indices are > "global" > > Barry > > > On Jan 2, 2009, at 2:21 PM, Yujie wrote: > > Dear Barry: >> >> I have a new question about the parameter "iy"(new index set of >> subvector). To parallel vector, how to provide "ix" and "iy"? >> >> Just providing the local index subset for "ix" and "iy"? if it is, it is a >> little difficult to let the local vector know its global position in the new >> subvector? If the user needs to provide the global position of local vector >> for "iy". some MPI communication should be needed, it looks like not a good >> method. In exacting submatrix, the function hides this problem. could you >> give me any comments? thanks a lot. >> >> Regards, >> >> Yujie >> >> >> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith wrote: >> >> VecScatter is for this purpose. >> >> Rational: extracting subparts of vectors for ghost points etc takes place >> many times in a simulation; maybe millions. >> Thus separating it into a set-up followed by many uses is a worthwhile >> optimization. Extracting submatrices occur >> much less often in a simulation, maybe tens, hundreds or thousands of >> times so it is not worth the extra complexity >> of having separate set-up followed by many uses. One could argue that >> uniformity of design means we should have >> handled matrices with a MatScatter concept to parallel the Vec approach, >> but it is too late now :-). >> >> Barry >> >> >> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >> >> Like MatGetSubMatrix(), whether is there a function to get a subvector in >> parallel mode? I have checked some scatter functios, they don't likely work. >> thanks a lot. >> >> Regards, >> >> Yujie >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 2 20:19:24 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Jan 2009 20:19:24 -0600 Subject: how to extract a subvector? In-Reply-To: <7ff0ee010901021731v2de0258erad9f874c7d6b7f1e@mail.gmail.com> References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> <7ff0ee010901021731v2de0258erad9f874c7d6b7f1e@mail.gmail.com> Message-ID: No, you use the global indices ONLY for the values needed on this processor. I would also note that there is an extensive discussion of this IN the manual. Matt On Fri, Jan 2, 2009 at 7:31 PM, Yujie wrote: > You mean I need to let all processes have a copy of the whole global index > and to use this function? thanks > > > On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: > >> >> The ix and iy are always global indices based on where the vector lives. >> >> If you are going from a parallel to parallel vector then both indices >> are "global" >> >> Barry >> >> >> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >> >> Dear Barry: >>> >>> I have a new question about the parameter "iy"(new index set of >>> subvector). To parallel vector, how to provide "ix" and "iy"? >>> >>> Just providing the local index subset for "ix" and "iy"? if it is, it is >>> a little difficult to let the local vector know its global position in the >>> new subvector? If the user needs to provide the global position of local >>> vector for "iy". 
some MPI communication should be needed, it looks like not >>> a good method. In exacting submatrix, the function hides this problem. could >>> you give me any comments? thanks a lot. >>> >>> Regards, >>> >>> Yujie >>> >>> >>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith wrote: >>> >>> VecScatter is for this purpose. >>> >>> Rational: extracting subparts of vectors for ghost points etc takes >>> place many times in a simulation; maybe millions. >>> Thus separating it into a set-up followed by many uses is a worthwhile >>> optimization. Extracting submatrices occur >>> much less often in a simulation, maybe tens, hundreds or thousands of >>> times so it is not worth the extra complexity >>> of having separate set-up followed by many uses. One could argue that >>> uniformity of design means we should have >>> handled matrices with a MatScatter concept to parallel the Vec approach, >>> but it is too late now :-). >>> >>> Barry >>> >>> >>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>> >>> Like MatGetSubMatrix(), whether is there a function to get a subvector in >>> parallel mode? I have checked some scatter functios, they don't likely work. >>> thanks a lot. >>> >>> Regards, >>> >>> Yujie >>> >>> >>> >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Jan 2 20:20:25 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Jan 2009 20:20:25 -0600 Subject: distribution of submatrix and subvector Re: how to extract a subvector? In-Reply-To: <7ff0ee010901021758l1c20dad9vebc204be1d96ab05@mail.gmail.com> References: <7ff0ee010901021758l1c20dad9vebc204be1d96ab05@mail.gmail.com> Message-ID: On Fri, Jan 2, 2009 at 7:58 PM, Yujie wrote: > Dear Barry: > > When using MatGetSubmatrix() to get a submatrix in parallel mode, to my > knowledge, this function will exact the rows and cols for submatrix. This > submatrix is not redistributed. For example, if there is zero row and zero > col in a process, the row and col of this submatrix in this process are > zero, right? > However, to get a subvector using Vec scatter, the user needs to create the > subvector using MPI_COMM, that is new distribution for subvector is > generated. Just regarding only rows or cols of the submatrix and the > subvector, their distribution should be different even if using the same > index to get them, right? thanks a lot. No. They work exactly the same way. Matt > > Regards, > Yujie > > On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: > >> >> The ix and iy are always global indices based on where the vector lives. >> >> If you are going from a parallel to parallel vector then both indices >> are "global" >> >> Barry >> >> >> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >> >> Dear Barry: >>> >>> I have a new question about the parameter "iy"(new index set of >>> subvector). To parallel vector, how to provide "ix" and "iy"? >>> >>> Just providing the local index subset for "ix" and "iy"? if it is, it is >>> a little difficult to let the local vector know its global position in the >>> new subvector? If the user needs to provide the global position of local >>> vector for "iy". some MPI communication should be needed, it looks like not >>> a good method. In exacting submatrix, the function hides this problem. could >>> you give me any comments? thanks a lot. 
>>> >>> Regards, >>> >>> Yujie >>> >>> >>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith wrote: >>> >>> VecScatter is for this purpose. >>> >>> Rational: extracting subparts of vectors for ghost points etc takes >>> place many times in a simulation; maybe millions. >>> Thus separating it into a set-up followed by many uses is a worthwhile >>> optimization. Extracting submatrices occur >>> much less often in a simulation, maybe tens, hundreds or thousands of >>> times so it is not worth the extra complexity >>> of having separate set-up followed by many uses. One could argue that >>> uniformity of design means we should have >>> handled matrices with a MatScatter concept to parallel the Vec approach, >>> but it is too late now :-). >>> >>> Barry >>> >>> >>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>> >>> Like MatGetSubMatrix(), whether is there a function to get a subvector in >>> parallel mode? I have checked some scatter functios, they don't likely work. >>> thanks a lot. >>> >>> Regards, >>> >>> Yujie >>> >>> >>> >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 22:06:05 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 20:06:05 -0800 Subject: how to extract a subvector? In-Reply-To: References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> <7ff0ee010901021731v2de0258erad9f874c7d6b7f1e@mail.gmail.com> Message-ID: <7ff0ee010901022006s5c27b0ddtc1ca820c964ef213@mail.gmail.com> Dear Matthew: Assuming the parent vector is 0-9 and there are 3 processes (0,1,2; 3,4,5; 6,7,8,9), Now the subvector contains 2; 4; 8, 9. I should create "ix" in 3 processores with MPI_COMM using arraies {2}, {4}, {8,9} respectively? create "iy" using arraies {0},{1},{2,3} respectively? If it is, I think it is a little difficult to get "iy". Since the user needs to communicate between processors to confirm the global position of the local subvector. thanks a lot. Regards, Yujie On Fri, Jan 2, 2009 at 6:19 PM, Matthew Knepley wrote: > No, you use the global indices ONLY for the values needed on this > processor. I would > also note that there is an extensive discussion of this IN the manual. > > Matt > > > On Fri, Jan 2, 2009 at 7:31 PM, Yujie wrote: > >> You mean I need to let all processes have a copy of the whole global index >> and to use this function? thanks >> >> >> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >> >>> >>> The ix and iy are always global indices based on where the vector >>> lives. >>> >>> If you are going from a parallel to parallel vector then both indices >>> are "global" >>> >>> Barry >>> >>> >>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>> >>> Dear Barry: >>>> >>>> I have a new question about the parameter "iy"(new index set of >>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>> >>>> Just providing the local index subset for "ix" and "iy"? if it is, it is >>>> a little difficult to let the local vector know its global position in the >>>> new subvector? If the user needs to provide the global position of local >>>> vector for "iy". some MPI communication should be needed, it looks like not >>>> a good method. In exacting submatrix, the function hides this problem. 
could >>>> you give me any comments? thanks a lot. >>>> >>>> Regards, >>>> >>>> Yujie >>>> >>>> >>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>> wrote: >>>> >>>> VecScatter is for this purpose. >>>> >>>> Rational: extracting subparts of vectors for ghost points etc takes >>>> place many times in a simulation; maybe millions. >>>> Thus separating it into a set-up followed by many uses is a worthwhile >>>> optimization. Extracting submatrices occur >>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>> times so it is not worth the extra complexity >>>> of having separate set-up followed by many uses. One could argue that >>>> uniformity of design means we should have >>>> handled matrices with a MatScatter concept to parallel the Vec approach, >>>> but it is too late now :-). >>>> >>>> Barry >>>> >>>> >>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>> >>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>> in parallel mode? I have checked some scatter functios, they don't likely >>>> work. thanks a lot. >>>> >>>> Regards, >>>> >>>> Yujie >>>> >>>> >>>> >>>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 22:08:53 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 20:08:53 -0800 Subject: distribution of submatrix and subvector Re: how to extract a subvector? In-Reply-To: References: <7ff0ee010901021758l1c20dad9vebc204be1d96ab05@mail.gmail.com> Message-ID: <7ff0ee010901022008h176a35dcpb0e6388fc9e12f89@mail.gmail.com> Dear Matthew: They have the same distribution according to the distribution I have mentioned about submatrix? Could you give me more details? thanks a lot. Regards, Yujie On Fri, Jan 2, 2009 at 6:20 PM, Matthew Knepley wrote: > On Fri, Jan 2, 2009 at 7:58 PM, Yujie wrote: > >> Dear Barry: >> >> When using MatGetSubmatrix() to get a submatrix in parallel mode, to my >> knowledge, this function will exact the rows and cols for submatrix. This >> submatrix is not redistributed. For example, if there is zero row and zero >> col in a process, the row and col of this submatrix in this process are >> zero, right? >> However, to get a subvector using Vec scatter, the user needs to create >> the subvector using MPI_COMM, that is new distribution for subvector is >> generated. Just regarding only rows or cols of the submatrix and the >> subvector, their distribution should be different even if using the same >> index to get them, right? thanks a lot. > > > No. They work exactly the same way. > > Matt > > >> >> Regards, >> Yujie >> >> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >> >>> >>> The ix and iy are always global indices based on where the vector >>> lives. >>> >>> If you are going from a parallel to parallel vector then both indices >>> are "global" >>> >>> Barry >>> >>> >>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>> >>> Dear Barry: >>>> >>>> I have a new question about the parameter "iy"(new index set of >>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>> >>>> Just providing the local index subset for "ix" and "iy"? if it is, it is >>>> a little difficult to let the local vector know its global position in the >>>> new subvector? If the user needs to provide the global position of local >>>> vector for "iy". 
some MPI communication should be needed, it looks like not >>>> a good method. In exacting submatrix, the function hides this problem. could >>>> you give me any comments? thanks a lot. >>>> >>>> Regards, >>>> >>>> Yujie >>>> >>>> >>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>> wrote: >>>> >>>> VecScatter is for this purpose. >>>> >>>> Rational: extracting subparts of vectors for ghost points etc takes >>>> place many times in a simulation; maybe millions. >>>> Thus separating it into a set-up followed by many uses is a worthwhile >>>> optimization. Extracting submatrices occur >>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>> times so it is not worth the extra complexity >>>> of having separate set-up followed by many uses. One could argue that >>>> uniformity of design means we should have >>>> handled matrices with a MatScatter concept to parallel the Vec approach, >>>> but it is too late now :-). >>>> >>>> Barry >>>> >>>> >>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>> >>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>> in parallel mode? I have checked some scatter functios, they don't likely >>>> work. thanks a lot. >>>> >>>> Regards, >>>> >>>> Yujie >>>> >>>> >>>> >>>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Jan 2 22:13:10 2009 From: dave.mayhem23 at gmail.com (Dave May) Date: Sat, 3 Jan 2009 15:13:10 +1100 Subject: how to extract a subvector? In-Reply-To: <7ff0ee010901022006s5c27b0ddtc1ca820c964ef213@mail.gmail.com> References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> <7ff0ee010901021731v2de0258erad9f874c7d6b7f1e@mail.gmail.com> <7ff0ee010901022006s5c27b0ddtc1ca820c964ef213@mail.gmail.com> Message-ID: <956373f0901022013h1a849308o1524902065c5ffb0@mail.gmail.com> Your ix array looks fine. If you are happy ordering the sub vector as you've indicated, just pass in PETSC_NULL for iy when you define the VecScatter. This will indicate to just insert/add the entries for the sub vector in order from lowest global index largest global index. Cheers, Dave On Sat, Jan 3, 2009 at 3:06 PM, Yujie wrote: > Dear Matthew: > > Assuming the parent vector is 0-9 and there are 3 processes (0,1,2; 3,4,5; > 6,7,8,9), > Now the subvector contains 2; 4; 8, 9. > I should create "ix" in 3 processores with MPI_COMM using arraies {2}, {4}, > {8,9} respectively? > create "iy" using arraies {0},{1},{2,3} respectively? > If it is, I think it is a little difficult to get "iy". Since the user > needs to communicate between processors to confirm the global position of > the local subvector. thanks a lot. > > Regards, > Yujie > > > On Fri, Jan 2, 2009 at 6:19 PM, Matthew Knepley wrote: > >> No, you use the global indices ONLY for the values needed on this >> processor. I would >> also note that there is an extensive discussion of this IN the manual. >> >> Matt >> >> >> On Fri, Jan 2, 2009 at 7:31 PM, Yujie wrote: >> >>> You mean I need to let all processes have a copy of the whole global >>> index and to use this function? 
thanks >>> >>> >>> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >>> >>>> >>>> The ix and iy are always global indices based on where the vector >>>> lives. >>>> >>>> If you are going from a parallel to parallel vector then both indices >>>> are "global" >>>> >>>> Barry >>>> >>>> >>>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>>> >>>> Dear Barry: >>>>> >>>>> I have a new question about the parameter "iy"(new index set of >>>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>>> >>>>> Just providing the local index subset for "ix" and "iy"? if it is, it >>>>> is a little difficult to let the local vector know its global position in >>>>> the new subvector? If the user needs to provide the global position of local >>>>> vector for "iy". some MPI communication should be needed, it looks like not >>>>> a good method. In exacting submatrix, the function hides this problem. could >>>>> you give me any comments? thanks a lot. >>>>> >>>>> Regards, >>>>> >>>>> Yujie >>>>> >>>>> >>>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>>> wrote: >>>>> >>>>> VecScatter is for this purpose. >>>>> >>>>> Rational: extracting subparts of vectors for ghost points etc takes >>>>> place many times in a simulation; maybe millions. >>>>> Thus separating it into a set-up followed by many uses is a worthwhile >>>>> optimization. Extracting submatrices occur >>>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>>> times so it is not worth the extra complexity >>>>> of having separate set-up followed by many uses. One could argue that >>>>> uniformity of design means we should have >>>>> handled matrices with a MatScatter concept to parallel the Vec >>>>> approach, but it is too late now :-). >>>>> >>>>> Barry >>>>> >>>>> >>>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>>> >>>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>>> in parallel mode? I have checked some scatter functios, they don't likely >>>>> work. thanks a lot. >>>>> >>>>> Regards, >>>>> >>>>> Yujie >>>>> >>>>> >>>>> >>>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Jan 2 22:20:56 2009 From: recrusader at gmail.com (Yujie) Date: Fri, 2 Jan 2009 20:20:56 -0800 Subject: how to extract a subvector? In-Reply-To: <956373f0901022013h1a849308o1524902065c5ffb0@mail.gmail.com> References: <7ff0ee010901021141q6643846fj44897d2865da6582@mail.gmail.com> <7ff0ee010901021221y2ad26d8dxca1da64400af4027@mail.gmail.com> <8687D003-D2A6-49EB-921A-7EEBAE080DC2@mcs.anl.gov> <7ff0ee010901021731v2de0258erad9f874c7d6b7f1e@mail.gmail.com> <7ff0ee010901022006s5c27b0ddtc1ca820c964ef213@mail.gmail.com> <956373f0901022013h1a849308o1524902065c5ffb0@mail.gmail.com> Message-ID: <7ff0ee010901022020p1d1ccb62mb1517bd253607178@mail.gmail.com> Dear Dave: thank you very much for your help :). Regards, Yujie On Fri, Jan 2, 2009 at 8:13 PM, Dave May wrote: > Your ix array looks fine. > > If you are happy ordering the sub vector as you've indicated, just pass in > PETSC_NULL for iy when you define the VecScatter. This will indicate to just > insert/add the entries for the sub vector in order from lowest global index > largest global index. 
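Putting Barry's point (ix and iy are global indices), Matt's point (each process lists only the entries it wants), and Dave's PETSC_NULL-for-iy suggestion together for the concrete 0..9 example above, here is a minimal C sketch; it assumes exactly 3 MPI processes and PETSc 3.0.0-era calling sequences (newer versions pass pointers to the Destroy routines and an extra copy-mode argument to ISCreateGeneral).

#include "petscvec.h"

/* Run on 3 processes: gathers global entries {2}, {4}, {8,9} of a length-10
   parallel vector into a length-4 parallel subvector, as in the example
   discussed above.  ix holds global indices of the parent vector; iy is
   PETSC_NULL, so the entries land in the subvector in ascending order. */
int main(int argc, char **argv)
{
  Vec            x, sub;
  IS             ix;
  VecScatter     ctx;
  PetscMPIInt    rank;
  PetscInt       nloc[3]  = {3, 3, 4};      /* layout of the parent vector      */
  PetscInt       nsub[3]  = {1, 1, 2};      /* entries each process contributes */
  PetscInt       want0[1] = {2}, want1[1] = {4}, want2[2] = {8, 9};
  PetscInt       *want[3] = {want0, want1, want2};
  PetscInt       i, rstart, rend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);

  ierr = VecCreateMPI(PETSC_COMM_WORLD, nloc[rank], 10, &x);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(x, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    PetscScalar v = (PetscScalar)i;         /* x_i = i, easy to check */
    ierr = VecSetValues(x, 1, &i, &v, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(x);CHKERRQ(ierr);

  ierr = VecCreateMPI(PETSC_COMM_WORLD, nsub[rank], PETSC_DETERMINE, &sub);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nsub[rank], want[rank], &ix);CHKERRQ(ierr);

  ierr = VecScatterCreate(x, ix, sub, PETSC_NULL, &ctx);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx, x, sub, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx, x, sub, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

  ierr = VecView(sub, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);  /* 2, 4, 8, 9 */

  ierr = VecScatterDestroy(ctx);CHKERRQ(ierr);
  ierr = ISDestroy(ix);CHKERRQ(ierr);
  ierr = VecDestroy(sub);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}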
> > Cheers, > Dave > > > > On Sat, Jan 3, 2009 at 3:06 PM, Yujie wrote: > >> Dear Matthew: >> >> Assuming the parent vector is 0-9 and there are 3 processes (0,1,2; 3,4,5; >> 6,7,8,9), >> Now the subvector contains 2; 4; 8, 9. >> I should create "ix" in 3 processores with MPI_COMM using arraies {2}, >> {4}, {8,9} respectively? >> create "iy" using arraies {0},{1},{2,3} respectively? >> If it is, I think it is a little difficult to get "iy". Since the user >> needs to communicate between processors to confirm the global position of >> the local subvector. thanks a lot. >> >> Regards, >> Yujie >> >> >> On Fri, Jan 2, 2009 at 6:19 PM, Matthew Knepley wrote: >> >>> No, you use the global indices ONLY for the values needed on this >>> processor. I would >>> also note that there is an extensive discussion of this IN the manual. >>> >>> Matt >>> >>> >>> On Fri, Jan 2, 2009 at 7:31 PM, Yujie wrote: >>> >>>> You mean I need to let all processes have a copy of the whole global >>>> index and to use this function? thanks >>>> >>>> >>>> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >>>> >>>>> >>>>> The ix and iy are always global indices based on where the vector >>>>> lives. >>>>> >>>>> If you are going from a parallel to parallel vector then both indices >>>>> are "global" >>>>> >>>>> Barry >>>>> >>>>> >>>>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>>>> >>>>> Dear Barry: >>>>>> >>>>>> I have a new question about the parameter "iy"(new index set of >>>>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>>>> >>>>>> Just providing the local index subset for "ix" and "iy"? if it is, it >>>>>> is a little difficult to let the local vector know its global position in >>>>>> the new subvector? If the user needs to provide the global position of local >>>>>> vector for "iy". some MPI communication should be needed, it looks like not >>>>>> a good method. In exacting submatrix, the function hides this problem. could >>>>>> you give me any comments? thanks a lot. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Yujie >>>>>> >>>>>> >>>>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>>>> wrote: >>>>>> >>>>>> VecScatter is for this purpose. >>>>>> >>>>>> Rational: extracting subparts of vectors for ghost points etc takes >>>>>> place many times in a simulation; maybe millions. >>>>>> Thus separating it into a set-up followed by many uses is a >>>>>> worthwhile optimization. Extracting submatrices occur >>>>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>>>> times so it is not worth the extra complexity >>>>>> of having separate set-up followed by many uses. One could argue that >>>>>> uniformity of design means we should have >>>>>> handled matrices with a MatScatter concept to parallel the Vec >>>>>> approach, but it is too late now :-). >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>>>> >>>>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>>>> in parallel mode? I have checked some scatter functios, they don't likely >>>>>> work. thanks a lot. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Yujie >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From recrusader at gmail.com Sat Jan 3 12:52:42 2009 From: recrusader at gmail.com (Yujie) Date: Sat, 3 Jan 2009 10:52:42 -0800 Subject: further about MatGetSubMatrix() Re: distribution of submatrix and subvector Re: how to extract a subvector? Message-ID: <7ff0ee010901031052r681106b1l79e4aac221504f91@mail.gmail.com> Dear Matthew: I have checked the description of MatGetSubMatrix() and ISAllGather(). I am wondering if isrow is created by MPI_COMM, does it work when iscol is obtained based on PETSC_COMM_SELF? thanks. MatGetSubMatrix(Mat mat,IS isrow,IS iscol,PetscInt csize,MatReuse cll,Mat *newmat) Regards, Yujie On Fri, Jan 2, 2009 at 6:20 PM, Matthew Knepley wrote: > On Fri, Jan 2, 2009 at 7:58 PM, Yujie wrote: > >> Dear Barry: >> >> When using MatGetSubmatrix() to get a submatrix in parallel mode, to my >> knowledge, this function will exact the rows and cols for submatrix. This >> submatrix is not redistributed. For example, if there is zero row and zero >> col in a process, the row and col of this submatrix in this process are >> zero, right? >> However, to get a subvector using Vec scatter, the user needs to create >> the subvector using MPI_COMM, that is new distribution for subvector is >> generated. Just regarding only rows or cols of the submatrix and the >> subvector, their distribution should be different even if using the same >> index to get them, right? thanks a lot. > > > No. They work exactly the same way. > > Matt > > >> >> Regards, >> Yujie >> >> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >> >>> >>> The ix and iy are always global indices based on where the vector >>> lives. >>> >>> If you are going from a parallel to parallel vector then both indices >>> are "global" >>> >>> Barry >>> >>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>> >>> Dear Barry: >>>> >>>> I have a new question about the parameter "iy"(new index set of >>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>> >>>> Just providing the local index subset for "ix" and "iy"? if it is, it is >>>> a little difficult to let the local vector know its global position in the >>>> new subvector? If the user needs to provide the global position of local >>>> vector for "iy". some MPI communication should be needed, it looks like not >>>> a good method. In exacting submatrix, the function hides this problem. could >>>> you give me any comments? thanks a lot. >>>> >>>> Regards, >>>> >>>> Yujie >>>> >>>> >>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>> wrote: >>>> >>>> VecScatter is for this purpose. >>>> >>>> Rational: extracting subparts of vectors for ghost points etc takes >>>> place many times in a simulation; maybe millions. >>>> Thus separating it into a set-up followed by many uses is a worthwhile >>>> optimization. Extracting submatrices occur >>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>> times so it is not worth the extra complexity >>>> of having separate set-up followed by many uses. One could argue that >>>> uniformity of design means we should have >>>> handled matrices with a MatScatter concept to parallel the Vec approach, >>>> but it is too late now :-). >>>> >>>> Barry >>>> >>>> >>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>> >>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>> in parallel mode? I have checked some scatter functios, they don't likely >>>> work. thanks a lot. 
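A sketch of the usage being asked about here, with all names illustrative; it assumes the 2.3.3/3.0.0-era signature quoted above and that iscol is the sequential IS produced by ISAllGather(), as in the question, so verify the requirements for your matrix type and PETSc version:

#include "petscmat.h"

/* Extract the submatrix of A whose rows are listed per process in rows[] and
   whose columns are listed per process in cols[].  isrow stays on the parallel
   communicator; the columns are combined with ISAllGather(), so iscol is a
   sequential IS holding every requested column. */
PetscErrorCode GetSubMatrixGatheredCols(Mat A,PetscInt nrows,const PetscInt rows[],
                                        PetscInt ncols,const PetscInt cols[],Mat *B)
{
  IS             isrow,iscol_local,iscol;
  PetscErrorCode ierr;

  ierr = ISCreateGeneral(PETSC_COMM_WORLD,nrows,rows,&isrow);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD,ncols,cols,&iscol_local);CHKERRQ(ierr);
  ierr = ISAllGather(iscol_local,&iscol);CHKERRQ(ierr);   /* every process now lists all columns */
  /* PETSC_DECIDE: let PETSc choose the local column layout of B */
  ierr = MatGetSubMatrix(A,isrow,iscol,PETSC_DECIDE,MAT_INITIAL_MATRIX,B);CHKERRQ(ierr);
  ierr = ISDestroy(iscol_local);CHKERRQ(ierr);
  ierr = ISDestroy(iscol);CHKERRQ(ierr);
  ierr = ISDestroy(isrow);CHKERRQ(ierr);
  return 0;
}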
>>>> >>>> Regards, >>>> >>>> Yujie >>>> >>>> >>>> >>>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jan 3 12:57:12 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 3 Jan 2009 12:57:12 -0600 Subject: further about MatGetSubMatrix() Re: distribution of submatrix and subvector Re: how to extract a subvector? In-Reply-To: <7ff0ee010901031052r681106b1l79e4aac221504f91@mail.gmail.com> References: <7ff0ee010901031052r681106b1l79e4aac221504f91@mail.gmail.com> Message-ID: On Sat, Jan 3, 2009 at 12:52 PM, Yujie wrote: > Dear Matthew: > > I have checked the description of MatGetSubMatrix() and ISAllGather(). I am > wondering if isrow is created by MPI_COMM, does it work when iscol is > obtained based on PETSC_COMM_SELF? thanks. > > MatGetSubMatrix(Mat mat,IS isrow,IS iscol,PetscInt csize,MatReuse cll,Mat > *newmat) > 1) We do not check the comm of IS in this function 2) If we did we would issue an error code which would be checked by CHKERRQ and report the mismatch. This allows experimentation with the code to uncover these answers. Matt > > Regards, > > Yujie > > On Fri, Jan 2, 2009 at 6:20 PM, Matthew Knepley wrote: > >> On Fri, Jan 2, 2009 at 7:58 PM, Yujie wrote: >> >>> Dear Barry: >>> >>> When using MatGetSubmatrix() to get a submatrix in parallel mode, to my >>> knowledge, this function will exact the rows and cols for submatrix. This >>> submatrix is not redistributed. For example, if there is zero row and zero >>> col in a process, the row and col of this submatrix in this process are >>> zero, right? >>> However, to get a subvector using Vec scatter, the user needs to create >>> the subvector using MPI_COMM, that is new distribution for subvector is >>> generated. Just regarding only rows or cols of the submatrix and the >>> subvector, their distribution should be different even if using the same >>> index to get them, right? thanks a lot. >> >> >> No. They work exactly the same way. >> >> Matt >> >> >>> >>> Regards, >>> Yujie >>> >>> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >>> >>>> >>>> The ix and iy are always global indices based on where the vector >>>> lives. >>>> >>>> If you are going from a parallel to parallel vector then both indices >>>> are "global" >>>> >>>> Barry >>>> >>>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>>> >>>> Dear Barry: >>>>> >>>>> I have a new question about the parameter "iy"(new index set of >>>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>>> >>>>> Just providing the local index subset for "ix" and "iy"? if it is, it >>>>> is a little difficult to let the local vector know its global position in >>>>> the new subvector? If the user needs to provide the global position of local >>>>> vector for "iy". some MPI communication should be needed, it looks like not >>>>> a good method. In exacting submatrix, the function hides this problem. could >>>>> you give me any comments? thanks a lot. >>>>> >>>>> Regards, >>>>> >>>>> Yujie >>>>> >>>>> >>>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>>> wrote: >>>>> >>>>> VecScatter is for this purpose. >>>>> >>>>> Rational: extracting subparts of vectors for ghost points etc takes >>>>> place many times in a simulation; maybe millions. 
>>>>> Thus separating it into a set-up followed by many uses is a worthwhile >>>>> optimization. Extracting submatrices occur >>>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>>> times so it is not worth the extra complexity >>>>> of having separate set-up followed by many uses. One could argue that >>>>> uniformity of design means we should have >>>>> handled matrices with a MatScatter concept to parallel the Vec >>>>> approach, but it is too late now :-). >>>>> >>>>> Barry >>>>> >>>>> >>>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>>> >>>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>>> in parallel mode? I have checked some scatter functios, they don't likely >>>>> work. thanks a lot. >>>>> >>>>> Regards, >>>>> >>>>> Yujie >>>>> >>>>> >>>>> >>>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sat Jan 3 13:51:15 2009 From: recrusader at gmail.com (Yujie) Date: Sat, 3 Jan 2009 11:51:15 -0800 Subject: further about MatGetSubMatrix() Re: distribution of submatrix and subvector Re: how to extract a subvector? In-Reply-To: References: <7ff0ee010901031052r681106b1l79e4aac221504f91@mail.gmail.com> Message-ID: <7ff0ee010901031151g74d0f2b3ue3f8a0f813f7015c@mail.gmail.com> Dear Matthew: thanks again. sorry for posting lots of equestions. Regards, Yujie On Sat, Jan 3, 2009 at 10:57 AM, Matthew Knepley wrote: > On Sat, Jan 3, 2009 at 12:52 PM, Yujie wrote: > >> Dear Matthew: >> >> I have checked the description of MatGetSubMatrix() and ISAllGather(). I >> am wondering if isrow is created by MPI_COMM, does it work when iscol is >> obtained based on PETSC_COMM_SELF? thanks. >> >> MatGetSubMatrix(Mat mat,IS isrow,IS iscol,PetscInt csize,MatReuse cll,Mat >> *newmat) >> > > 1) We do not check the comm of IS in this function > > 2) If we did we would issue an error code which would be checked by CHKERRQ > and report the mismatch. This > allows experimentation with the code to uncover these answers. > > Matt > > >> >> Regards, >> >> Yujie >> >> On Fri, Jan 2, 2009 at 6:20 PM, Matthew Knepley wrote: >> >>> On Fri, Jan 2, 2009 at 7:58 PM, Yujie wrote: >>> >>>> Dear Barry: >>>> >>>> When using MatGetSubmatrix() to get a submatrix in parallel mode, to my >>>> knowledge, this function will exact the rows and cols for submatrix. This >>>> submatrix is not redistributed. For example, if there is zero row and zero >>>> col in a process, the row and col of this submatrix in this process are >>>> zero, right? >>>> However, to get a subvector using Vec scatter, the user needs to create >>>> the subvector using MPI_COMM, that is new distribution for subvector is >>>> generated. Just regarding only rows or cols of the submatrix and the >>>> subvector, their distribution should be different even if using the same >>>> index to get them, right? thanks a lot. >>> >>> >>> No. They work exactly the same way. >>> >>> Matt >>> >>> >>>> >>>> Regards, >>>> Yujie >>>> >>>> On Fri, Jan 2, 2009 at 5:15 PM, Barry Smith wrote: >>>> >>>>> >>>>> The ix and iy are always global indices based on where the vector >>>>> lives. 
>>>>> >>>>> If you are going from a parallel to parallel vector then both indices >>>>> are "global" >>>>> >>>>> Barry >>>>> >>>>> On Jan 2, 2009, at 2:21 PM, Yujie wrote: >>>>> >>>>> Dear Barry: >>>>>> >>>>>> I have a new question about the parameter "iy"(new index set of >>>>>> subvector). To parallel vector, how to provide "ix" and "iy"? >>>>>> >>>>>> Just providing the local index subset for "ix" and "iy"? if it is, it >>>>>> is a little difficult to let the local vector know its global position in >>>>>> the new subvector? If the user needs to provide the global position of local >>>>>> vector for "iy". some MPI communication should be needed, it looks like not >>>>>> a good method. In exacting submatrix, the function hides this problem. could >>>>>> you give me any comments? thanks a lot. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Yujie >>>>>> >>>>>> >>>>>> On Fri, Jan 2, 2009 at 11:46 AM, Barry Smith >>>>>> wrote: >>>>>> >>>>>> VecScatter is for this purpose. >>>>>> >>>>>> Rational: extracting subparts of vectors for ghost points etc takes >>>>>> place many times in a simulation; maybe millions. >>>>>> Thus separating it into a set-up followed by many uses is a >>>>>> worthwhile optimization. Extracting submatrices occur >>>>>> much less often in a simulation, maybe tens, hundreds or thousands of >>>>>> times so it is not worth the extra complexity >>>>>> of having separate set-up followed by many uses. One could argue that >>>>>> uniformity of design means we should have >>>>>> handled matrices with a MatScatter concept to parallel the Vec >>>>>> approach, but it is too late now :-). >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> On Jan 2, 2009, at 1:41 PM, Yujie wrote: >>>>>> >>>>>> Like MatGetSubMatrix(), whether is there a function to get a subvector >>>>>> in parallel mode? I have checked some scatter functios, they don't likely >>>>>> work. thanks a lot. >>>>>> >>>>>> Regards, >>>>>> >>>>>> Yujie >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Jan 6 11:52:45 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 6 Jan 2009 09:52:45 -0800 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() Message-ID: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> Dear PETSc developers: I am trying to use MatGetArray() and MatSetValues() to combine several MPIDense matrices into one matrix. At the beginning, I use MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3-p8 version) to get the start, end row and column. I can calculate the local rows and columns. I also use MatGetLocalSize() to confirm the accuracy. However, I always find some data loses in the combined matrix. And then, I try to use MatDenseGetLocalMatrix() to get the lcoal matrix and output it. I find column information by MatGetLocalSize() is not consistent with by MatDenseGetLocalMatrix(), is it bug? could you give me some advice? thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Jan 6 13:33:54 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Jan 2009 13:33:54 -0600 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() In-Reply-To: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> Message-ID: On Tue, Jan 6, 2009 at 11:52 AM, Yujie wrote: > Dear PETSc developers: > > I am trying to use MatGetArray() and MatSetValues() to combine several > MPIDense matrices into one matrix. At the beginning, I use > > MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3-p8 > version) to get the start, end row and column. I can calculate the local > rows and columns. > > I also use MatGetLocalSize() to confirm the accuracy. However, I always > find some data loses in the combined matrix. > > And then, I try to use MatDenseGetLocalMatrix() to get the lcoal matrix and > output it. I find column information by MatGetLocalSize() is not consistent > with by MatDenseGetLocalMatrix(), is it bug? could you give me some advice? > thanks a lot. > What information? Matt > Regards, > > Yujie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Jan 6 13:43:02 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 6 Jan 2009 11:43:02 -0800 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() In-Reply-To: References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> Message-ID: <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> Dear Matthew: Two processors are used. The matrix dimension is 105*108; MatGetOwnershipRange() proc1: 0->56 proc2: 56->105 Mat->camp.rstart; Mat->cmap.rend proc1: 0->54 proc2: 54->108 MatGetLocalSize() proc1: row 56 col 54 porc2: row 49 col 54 MatDenseGetLocalMatrix() proc1: 56*108 proc2: 49*108 thanks. Yujie On Tue, Jan 6, 2009 at 11:33 AM, Matthew Knepley wrote: > On Tue, Jan 6, 2009 at 11:52 AM, Yujie wrote: > >> Dear PETSc developers: >> >> I am trying to use MatGetArray() and MatSetValues() to combine several >> MPIDense matrices into one matrix. At the beginning, I use >> >> MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3-p8 >> version) to get the start, end row and column. I can calculate the local >> rows and columns. >> >> I also use MatGetLocalSize() to confirm the accuracy. However, I always >> find some data loses in the combined matrix. >> >> And then, I try to use MatDenseGetLocalMatrix() to get the lcoal matrix >> and output it. I find column information by MatGetLocalSize() is not >> consistent with by MatDenseGetLocalMatrix(), is it bug? could you give me >> some advice? thanks a lot. >> > What information? > > Matt > > >> Regards, >> >> Yujie >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Jan 6 13:45:05 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Jan 2009 13:45:05 -0600 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() In-Reply-To: <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> Message-ID: On Tue, Jan 6, 2009 at 1:43 PM, Yujie wrote: > Dear Matthew: > > Two processors are used. The matrix dimension is 105*108; > > MatGetOwnershipRange() > > proc1: 0->56 > > proc2: 56->105 > > Mat->camp.rstart; Mat->cmap.rend > > proc1: 0->54 > > proc2: 54->108 > > MatGetLocalSize() > > proc1: row 56 col 54 > > porc2: row 49 col 54 > > MatDenseGetLocalMatrix() > > proc1: 56*108 > > proc2: 49*108 > > Since PETSc matrices are all stored row-wise, even if columns are assigned to one process for the other, the storage is divided by row. This GetLocalMatrix() returns all the rows associated with a given process. Matt > thanks. > > Yujie > > On Tue, Jan 6, 2009 at 11:33 AM, Matthew Knepley wrote: > >> On Tue, Jan 6, 2009 at 11:52 AM, Yujie wrote: >> >>> Dear PETSc developers: >>> >>> I am trying to use MatGetArray() and MatSetValues() to combine several >>> MPIDense matrices into one matrix. At the beginning, I use >>> >>> MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3-p8 >>> version) to get the start, end row and column. I can calculate the local >>> rows and columns. >>> >>> I also use MatGetLocalSize() to confirm the accuracy. However, I always >>> find some data loses in the combined matrix. >>> >>> And then, I try to use MatDenseGetLocalMatrix() to get the lcoal matrix >>> and output it. I find column information by MatGetLocalSize() is not >>> consistent with by MatDenseGetLocalMatrix(), is it bug? could you give me >>> some advice? thanks a lot. >>> >> What information? >> >> Matt >> >> >>> Regards, >>> >>> Yujie >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Jan 6 13:55:41 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 6 Jan 2009 11:55:41 -0800 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() In-Reply-To: References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> Message-ID: <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> I am sorry, Matthew. I can't understand what you said. You mean it is not bug? However, practically, the dimension of local matrix in proc1 is 56*108? if it is, how to obtain its accurate dimension? thanks a lot. Regards, Yujie On Tue, Jan 6, 2009 at 11:45 AM, Matthew Knepley wrote: > On Tue, Jan 6, 2009 at 1:43 PM, Yujie wrote: > >> Dear Matthew: >> >> Two processors are used. 
The matrix dimension is 105*108; >> >> MatGetOwnershipRange() >> >> proc1: 0->56 >> >> proc2: 56->105 >> >> Mat->camp.rstart; Mat->cmap.rend >> >> proc1: 0->54 >> >> proc2: 54->108 >> >> MatGetLocalSize() >> >> proc1: row 56 col 54 >> >> porc2: row 49 col 54 >> >> MatDenseGetLocalMatrix() >> >> proc1: 56*108 >> >> proc2: 49*108 >> > > Since PETSc matrices are all stored row-wise, even if columns are assigned > to one process > for the other, the storage is divided by row. This GetLocalMatrix() returns > all the rows > associated with a given process. > > Matt > > >> thanks. >> >> Yujie >> >> On Tue, Jan 6, 2009 at 11:33 AM, Matthew Knepley wrote: >> >>> On Tue, Jan 6, 2009 at 11:52 AM, Yujie wrote: >>> >>>> Dear PETSc developers: >>>> >>>> I am trying to use MatGetArray() and MatSetValues() to combine several >>>> MPIDense matrices into one matrix. At the beginning, I use >>>> >>>> MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3-p8 >>>> version) to get the start, end row and column. I can calculate the local >>>> rows and columns. >>>> >>>> I also use MatGetLocalSize() to confirm the accuracy. However, I always >>>> find some data loses in the combined matrix. >>>> >>>> And then, I try to use MatDenseGetLocalMatrix() to get the lcoal matrix >>>> and output it. I find column information by MatGetLocalSize() is not >>>> consistent with by MatDenseGetLocalMatrix(), is it bug? could you give me >>>> some advice? thanks a lot. >>>> >>> What information? >>> >>> Matt >>> >>> >>>> Regards, >>>> >>>> Yujie >>>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 6 14:30:56 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Jan 2009 14:30:56 -0600 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() In-Reply-To: <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> Message-ID: On Tue, Jan 6, 2009 at 1:55 PM, Yujie wrote: > I am sorry, Matthew. I can't understand what you said. You mean it is not > bug? However, practically, the dimension of local matrix in proc1 is 56*108? > if it is, how to obtain its accurate dimension? thanks a lot. > For rowwise storage, the number of columns is always the global number. Matt > Regards, > > Yujie > > On Tue, Jan 6, 2009 at 11:45 AM, Matthew Knepley wrote: > >> On Tue, Jan 6, 2009 at 1:43 PM, Yujie wrote: >> >>> Dear Matthew: >>> >>> Two processors are used. The matrix dimension is 105*108; >>> >>> MatGetOwnershipRange() >>> >>> proc1: 0->56 >>> >>> proc2: 56->105 >>> >>> Mat->camp.rstart; Mat->cmap.rend >>> >>> proc1: 0->54 >>> >>> proc2: 54->108 >>> >>> MatGetLocalSize() >>> >>> proc1: row 56 col 54 >>> >>> porc2: row 49 col 54 >>> >>> MatDenseGetLocalMatrix() >>> >>> proc1: 56*108 >>> >>> proc2: 49*108 >>> >> >> Since PETSc matrices are all stored row-wise, even if columns are assigned >> to one process >> for the other, the storage is divided by row. 
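To make the distinction concrete (names illustrative; the numbers are the ones reported above for a 105 x 108 MPIDENSE matrix on two processes):

#include "petscmat.h"

/* Compare what the size queries report with what is actually stored locally. */
PetscErrorCode ReportLayout(Mat A)
{
  PetscInt       M,N,m,n,rstart,rend;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&M,&N);CHKERRQ(ierr);                   /* global: 105 x 108 */
  ierr = MatGetLocalSize(A,&m,&n);CHKERRQ(ierr);              /* first process: m = 56, n = 54;
                                                                 n is the local length of x in
                                                                 y = A*x, not the width of the
                                                                 stored block */
  ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr); /* first process owns rows 0..55 */
  /* the dense block stored on a process is m x N (56 x 108 on the first
     process): all global columns of the locally owned rows */
  ierr = PetscPrintf(PETSC_COMM_SELF,"local rows %d, reported local cols %d, stored block %d x %d\n",
                     (int)m,(int)n,(int)m,(int)N);CHKERRQ(ierr);
  return 0;
}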
This GetLocalMatrix() >> returns all the rows >> associated with a given process. >> >> Matt >> >> >>> thanks. >>> >>> Yujie >>> >>> On Tue, Jan 6, 2009 at 11:33 AM, Matthew Knepley wrote: >>> >>>> On Tue, Jan 6, 2009 at 11:52 AM, Yujie wrote: >>>> >>>>> Dear PETSc developers: >>>>> >>>>> I am trying to use MatGetArray() and MatSetValues() to combine several >>>>> MPIDense matrices into one matrix. At the beginning, I use >>>>> >>>>> MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3-p8 >>>>> version) to get the start, end row and column. I can calculate the local >>>>> rows and columns. >>>>> >>>>> I also use MatGetLocalSize() to confirm the accuracy. However, I always >>>>> find some data loses in the combined matrix. >>>>> >>>>> And then, I try to use MatDenseGetLocalMatrix() to get the lcoal matrix >>>>> and output it. I find column information by MatGetLocalSize() is not >>>>> consistent with by MatDenseGetLocalMatrix(), is it bug? could you give me >>>>> some advice? thanks a lot. >>>>> >>>> What information? >>>> >>>> Matt >>>> >>>> >>>>> Regards, >>>>> >>>>> Yujie >>>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 6 15:15:32 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Jan 2009 15:15:32 -0600 Subject: MatGetLocalSize() and MatDenseGetLocalMatrix() In-Reply-To: <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> Message-ID: <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> The cmap (and even rmap) information is NOT always directly related to how/what matrix parts are stored where. Given y = A x the rmap information of A matches the map information of y, the cmap information matches the map information of x. This is what the map information provides. It does not tell you the dimensions of the local part of A, sometimes the local part of A is one matrix, sometimes it is 2 sometimes it does not exist. You are trying to add meaning to stuff in PETSc that is sometimes true as if it were always true. Barry On Jan 6, 2009, at 1:55 PM, Yujie wrote: > I am sorry, Matthew. I can't understand what you said. You mean it > is not bug? However, practically, the dimension of local matrix in > proc1 is 56*108? if it is, how to obtain its accurate dimension? > thanks a lot. > > Regards, > > Yujie > > > On Tue, Jan 6, 2009 at 11:45 AM, Matthew Knepley > wrote: > On Tue, Jan 6, 2009 at 1:43 PM, Yujie wrote: > Dear Matthew: > > Two processors are used. 
The matrix dimension is 105*108; > > MatGetOwnershipRange() > > proc1: 0->56 > > proc2: 56->105 > > Mat->camp.rstart; Mat->cmap.rend > > proc1: 0->54 > > proc2: 54->108 > > MatGetLocalSize() > > proc1: row 56 col 54 > > porc2: row 49 col 54 > > MatDenseGetLocalMatrix() > > proc1: 56*108 > > proc2: 49*108 > > > Since PETSc matrices are all stored row-wise, even if columns are > assigned to one process > for the other, the storage is divided by row. This GetLocalMatrix() > returns all the rows > associated with a given process. > > Matt > > thanks. > > Yujie > > > On Tue, Jan 6, 2009 at 11:33 AM, Matthew Knepley > wrote: > On Tue, Jan 6, 2009 at 11:52 AM, Yujie wrote: > Dear PETSc developers: > > I am trying to use MatGetArray() and MatSetValues() to combine > several MPIDense matrices into one matrix. At the beginning, I use > > MatGetOwnershipRange() and Mat->camp.rstart; Mat->cmap.rend (2.3.3- > p8 version) to get the start, end row and column. I can calculate > the local rows and columns. > > I also use MatGetLocalSize() to confirm the accuracy. However, I > always find some data loses in the combined matrix. > > And then, I try to use MatDenseGetLocalMatrix() to get the lcoal > matrix and output it. I find column information by MatGetLocalSize() > is not consistent with by MatDenseGetLocalMatrix(), is it bug? could > you give me some advice? thanks a lot. > > What information? > > Matt > > Regards, > > Yujie > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From xy2102 at columbia.edu Tue Jan 6 16:44:16 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Tue, 06 Jan 2009 17:44:16 -0500 Subject: PetscValidHeaderSpecific() In-Reply-To: <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> Message-ID: <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> Hi, I did have error information as at the end of this email, and I found the error was a function called PetscValidHeaderSpecific(), is there anything wrong with it? Thanks a lot! R [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 1! [0]PETSC ERROR: ---------------------------------------- [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 0, Fri Dec 19 22:02:38 CST 2008 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ---------------------------------------- [0]PETSC ERROR: ./oreggt on a cygwin-c- named YUAN_WORK by Rebecca Tue Jan 6 17:29:59 2009 [0]PETSC ERROR: Libraries linked from /home/Rebecca/soft/petsc-3.0.0-p0/cygwin-c-debug/lib [0]PETSC ERROR: Configure run at Mon Jan 5 13:50:14 2009 [0]PETSC ERROR: Configure options --with-cc=gcc --download-f-blas-lapack=1 --download-mpich=1 --useThreads=0 --with-shared=0 [0]PETSC ERROR: ---------------------------------------- [0]PETSC ERROR: MatDestroy() line 766 in src/mat/interface/matrix.c [0]PETSC ERROR: PCDestroy() line 88 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPDestroy() line 647 in src/ksp/ksp/interface/itfun.c [0]PETSC ERROR: SNESDestroy() line 1378 in src/snes/interface/snes.c [0]PETSC ERROR: PetscObjectDestroy() line 172 in src/sys/objects/destroy.c [0]PETSC ERROR: DMMGDestroy() line 220 in src/snes/utils/damg.c [0]PETSC ERROR: main() line 55 in oreggt.c application called MPI_Abort(MPI_COMM_WORLD,62) - process 0[unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD,62) - process 0 From bsmith at mcs.anl.gov Tue Jan 6 16:48:58 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Jan 2009 16:48:58 -0600 Subject: PetscValidHeaderSpecific() In-Reply-To: <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> Message-ID: <36CE42C4-4397-494F-9F3F-8833A96CD254@mcs.anl.gov> The most likely cause is memory corruption. The best way to solve this problem is to run on linux using valgrind (valgrind.org), the second best is to run with -malloc_debug and see if any messages come out Barry On Jan 6, 2009, at 4:44 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I did have error information as at the end of this email, and I > found the error was a function called PetscValidHeaderSpecific(), is > there anything wrong with it? > > Thanks a lot! > > R > > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ---------------------------------------- > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 0, Fri Dec 19 > 22:02:38 CST 2008 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: ---------------------------------------- > [0]PETSC ERROR: ./oreggt on a cygwin-c- named YUAN_WORK by Rebecca > Tue Jan 6 17:29:59 2009 > [0]PETSC ERROR: Libraries linked from /home/Rebecca/soft/petsc-3.0.0- > p0/cygwin-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Jan 5 13:50:14 2009 > [0]PETSC ERROR: Configure options --with-cc=gcc --download-f-blas- > lapack=1 --download-mpich=1 --useThreads=0 --with-shared=0 > [0]PETSC ERROR: ---------------------------------------- > [0]PETSC ERROR: MatDestroy() line 766 in src/mat/interface/matrix.c > [0]PETSC ERROR: PCDestroy() line 88 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPDestroy() line 647 in src/ksp/ksp/interface/itfun.c > [0]PETSC ERROR: SNESDestroy() line 1378 in src/snes/interface/snes.c > [0]PETSC ERROR: PetscObjectDestroy() line 172 in src/sys/objects/ > destroy.c > [0]PETSC ERROR: DMMGDestroy() line 220 in src/snes/utils/damg.c > [0]PETSC ERROR: main() line 55 in oreggt.c > application called MPI_Abort(MPI_COMM_WORLD,62) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD,62) - process 0 > > From xy2102 at columbia.edu Tue Jan 6 18:28:26 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Tue, 06 Jan 2009 19:28:26 -0500 Subject: PetscValidHeaderSpecific() In-Reply-To: <36CE42C4-4397-494F-9F3F-8833A96CD254@mcs.anl.gov> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> <36CE42C4-4397-494F-9F3F-8833A96CD254@mcs.anl.gov> Message-ID: <20090106192826.u6ly4uajsokc8gck@cubmail.cc.columbia.edu> There is no message coming out when I run with -malloc_debug. I am using cygwin, any suggestions on that? Thanks! Rebecca Quoting Barry Smith : > > The most likely cause is memory corruption. The best way to solve > this problem is to run on linux using > valgrind (valgrind.org), the second best is to run with -malloc_debug > and see if any messages come out > > Barry > > On Jan 6, 2009, at 4:44 PM, (Rebecca) Xuefei YUAN wrote: > >> Hi, >> >> I did have error information as at the end of this email, and I >> found the error was a function called PetscValidHeaderSpecific(), >> is there anything wrong with it? >> >> Thanks a lot! >> >> R >> >> [0]PETSC ERROR: Invalid argument! >> [0]PETSC ERROR: Wrong type of object: Parameter # 1! >> [0]PETSC ERROR: ---------------------------------------- >> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 0, Fri Dec 19 >> 22:02:38 CST 2008 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: ---------------------------------------- >> [0]PETSC ERROR: ./oreggt on a cygwin-c- named YUAN_WORK by Rebecca >> Tue Jan 6 17:29:59 2009 >> [0]PETSC ERROR: Libraries linked from >> /home/Rebecca/soft/petsc-3.0.0-p0/cygwin-c-debug/lib >> [0]PETSC ERROR: Configure run at Mon Jan 5 13:50:14 2009 >> [0]PETSC ERROR: Configure options --with-cc=gcc >> --download-f-blas-lapack=1 --download-mpich=1 --useThreads=0 >> --with-shared=0 >> [0]PETSC ERROR: ---------------------------------------- >> [0]PETSC ERROR: MatDestroy() line 766 in src/mat/interface/matrix.c >> [0]PETSC ERROR: PCDestroy() line 88 in src/ksp/pc/interface/precon.c >> [0]PETSC ERROR: KSPDestroy() line 647 in src/ksp/ksp/interface/itfun.c >> [0]PETSC ERROR: SNESDestroy() line 1378 in src/snes/interface/snes.c >> [0]PETSC ERROR: PetscObjectDestroy() line 172 in src/sys/objects/destroy.c >> [0]PETSC ERROR: DMMGDestroy() line 220 in src/snes/utils/damg.c >> [0]PETSC ERROR: main() line 55 in oreggt.c >> application called MPI_Abort(MPI_COMM_WORLD,62) - process 0[unset]: >> aborting job: >> application called MPI_Abort(MPI_COMM_WORLD,62) - process 0 >> >> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From recrusader at gmail.com Tue Jan 6 18:53:02 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 6 Jan 2009 16:53:02 -0800 Subject: MatTranspose_MPIDense illegal read? Message-ID: <7ff0ee010901061653j2cdbef50i9fcc27d83a0663ac@mail.gmail.com> According to Barry's advice, I use Valgrind to check my codes. I find there is an problem in MatTranspose_MPIDense(). It breaks my codes. The following is the information. could you give me some advice? thanks a lot ==5821== ==5821== Invalid read of size 4 ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, MatReuse, _p_Mat**) (matrix.c:5744) ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) ==5821== by 0x806B5B1: main (myproject.C:293) ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 alloc'd ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char const*, char const*, char const*, void**) (mal.c:40) ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char const*, char const*, char const*, void**) (mtr.c:194) ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c:1875) ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) (dense.c:1858) ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense (mpidense.c:1175) ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) (mpidense.c:1312) ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) (mpidense.c:979) ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) ==5821== by 0x806B5B1: main (myproject.C:293) Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Jan 6 19:19:26 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Jan 2009 19:19:26 -0600 Subject: PetscValidHeaderSpecific() In-Reply-To: <20090106192826.u6ly4uajsokc8gck@cubmail.cc.columbia.edu> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> <36CE42C4-4397-494F-9F3F-8833A96CD254@mcs.anl.gov> <20090106192826.u6ly4uajsokc8gck@cubmail.cc.columbia.edu> Message-ID: On Tue, Jan 6, 2009 at 6:28 PM, (Rebecca) Xuefei YUAN wrote: > There is no message coming out when I run with -malloc_debug. I am using > cygwin, any suggestions on that? You would need to run the debugger, but that usually takes a lot of looking around. I recommend running your code on a Linux machine to find the memory overwrite. Matt > > Thanks! > > Rebecca > > Quoting Barry Smith : > > >> The most likely cause is memory corruption. The best way to solve >> this problem is to run on linux using >> valgrind (valgrind.org), the second best is to run with -malloc_debug >> and see if any messages come out >> >> Barry >> >> On Jan 6, 2009, at 4:44 PM, (Rebecca) Xuefei YUAN wrote: >> >> Hi, >>> >>> I did have error information as at the end of this email, and I found >>> the error was a function called PetscValidHeaderSpecific(), is there >>> anything wrong with it? >>> >>> Thanks a lot! >>> >>> R >>> >>> [0]PETSC ERROR: Invalid argument! >>> [0]PETSC ERROR: Wrong type of object: Parameter # 1! >>> [0]PETSC ERROR: ---------------------------------------- >>> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 0, Fri Dec 19 >>> 22:02:38 CST 2008 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: ---------------------------------------- >>> [0]PETSC ERROR: ./oreggt on a cygwin-c- named YUAN_WORK by Rebecca Tue >>> Jan 6 17:29:59 2009 >>> [0]PETSC ERROR: Libraries linked from >>> /home/Rebecca/soft/petsc-3.0.0-p0/cygwin-c-debug/lib >>> [0]PETSC ERROR: Configure run at Mon Jan 5 13:50:14 2009 >>> [0]PETSC ERROR: Configure options --with-cc=gcc >>> --download-f-blas-lapack=1 --download-mpich=1 --useThreads=0 >>> --with-shared=0 >>> [0]PETSC ERROR: ---------------------------------------- >>> [0]PETSC ERROR: MatDestroy() line 766 in src/mat/interface/matrix.c >>> [0]PETSC ERROR: PCDestroy() line 88 in src/ksp/pc/interface/precon.c >>> [0]PETSC ERROR: KSPDestroy() line 647 in src/ksp/ksp/interface/itfun.c >>> [0]PETSC ERROR: SNESDestroy() line 1378 in src/snes/interface/snes.c >>> [0]PETSC ERROR: PetscObjectDestroy() line 172 in >>> src/sys/objects/destroy.c >>> [0]PETSC ERROR: DMMGDestroy() line 220 in src/snes/utils/damg.c >>> [0]PETSC ERROR: main() line 55 in oreggt.c >>> application called MPI_Abort(MPI_COMM_WORLD,62) - process 0[unset]: >>> aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD,62) - process 0 >>> >>> >>> > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 6 19:45:36 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Jan 2009 19:45:36 -0600 Subject: PetscValidHeaderSpecific() In-Reply-To: <20090106192826.u6ly4uajsokc8gck@cubmail.cc.columbia.edu> References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> <36CE42C4-4397-494F-9F3F-8833A96CD254@mcs.anl.gov> <20090106192826.u6ly4uajsokc8gck@cubmail.cc.columbia.edu> Message-ID: If you send your code (and makefile and any data it needs) to petsc-maint at mcs.anl.gov and how to run it; we'll run it in valgrind. Frankly cygwin is a terrible system for source code development. It is really worth your effort to make your machine dual boot and install linux also; the time you spend getting linux going will be far more than gained by the improvement in development time and frustration. Barry If other people need to run your code on Windows, you can still do the development on Linux and only build on cygwin when it is very stable. On Jan 6, 2009, at 6:28 PM, (Rebecca) Xuefei YUAN wrote: > There is no message coming out when I run with -malloc_debug. I am > using cygwin, any suggestions on that? > > Thanks! > > Rebecca > > Quoting Barry Smith : > >> >> The most likely cause is memory corruption. The best way to solve >> this problem is to run on linux using >> valgrind (valgrind.org), the second best is to run with -malloc_debug >> and see if any messages come out >> >> Barry >> >> On Jan 6, 2009, at 4:44 PM, (Rebecca) Xuefei YUAN wrote: >> >>> Hi, >>> >>> I did have error information as at the end of this email, and I >>> found the error was a function called PetscValidHeaderSpecific(), >>> is there anything wrong with it? >>> >>> Thanks a lot! >>> >>> R >>> >>> [0]PETSC ERROR: Invalid argument! >>> [0]PETSC ERROR: Wrong type of object: Parameter # 1! >>> [0]PETSC ERROR: ---------------------------------------- >>> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 0, Fri Dec 19 >>> 22:02:38 CST 2008 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> [0]PETSC ERROR: ---------------------------------------- >>> [0]PETSC ERROR: ./oreggt on a cygwin-c- named YUAN_WORK by >>> Rebecca Tue Jan 6 17:29:59 2009 >>> [0]PETSC ERROR: Libraries linked from /home/Rebecca/soft/ >>> petsc-3.0.0-p0/cygwin-c-debug/lib >>> [0]PETSC ERROR: Configure run at Mon Jan 5 13:50:14 2009 >>> [0]PETSC ERROR: Configure options --with-cc=gcc --download-f-blas- >>> lapack=1 --download-mpich=1 --useThreads=0 --with-shared=0 >>> [0]PETSC ERROR: ---------------------------------------- >>> [0]PETSC ERROR: MatDestroy() line 766 in src/mat/interface/matrix.c >>> [0]PETSC ERROR: PCDestroy() line 88 in src/ksp/pc/interface/precon.c >>> [0]PETSC ERROR: KSPDestroy() line 647 in src/ksp/ksp/interface/ >>> itfun.c >>> [0]PETSC ERROR: SNESDestroy() line 1378 in src/snes/interface/snes.c >>> [0]PETSC ERROR: PetscObjectDestroy() line 172 in src/sys/objects/ >>> destroy.c >>> [0]PETSC ERROR: DMMGDestroy() line 220 in src/snes/utils/damg.c >>> [0]PETSC ERROR: main() line 55 in oreggt.c >>> application called MPI_Abort(MPI_COMM_WORLD,62) - process >>> 0[unset]: aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD,62) - process 0 >>> >>> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From bsmith at mcs.anl.gov Tue Jan 6 19:52:18 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Jan 2009 19:52:18 -0600 Subject: MatTranspose_MPIDense illegal read? In-Reply-To: <7ff0ee010901061653j2cdbef50i9fcc27d83a0663ac@mail.gmail.com> References: <7ff0ee010901061653j2cdbef50i9fcc27d83a0663ac@mail.gmail.com> Message-ID: <03EAB50D-75BF-4137-8801-84BB1E6D0898@mcs.anl.gov> Looks like this is some old version of PETSc. The line numbers don't match PETSc 3.0.0 Please switch to 3.0.0 and see if the problem occurs and send the valgrind output again Please switch to petsc-maint at mcs.anl.gov for this. Barry I don't see a problem with the MatTranspose_ but there is a bad read in the MatGetSubMatrix On Jan 6, 2009, at 6:53 PM, Yujie wrote: > According to Barry's advice, I use Valgrind to check my codes. I > find there is an problem in MatTranspose_MPIDense(). It breaks my > codes. The following is the information. could you give me some > advice? 
thanks a lot > ==5821== > ==5821== Invalid read of size 4 > ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, > _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) > ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, > MatReuse, _p_Mat**) (matrix.c:5744) > ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 > alloc'd > ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) > ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char > const*, char const*, char const*, void**) (mal.c:40) > ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char > const*, char const*, char const*, void**) (mtr.c:194) > ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c: > 1875) > ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) > (dense.c:1858) > ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense > (mpidense.c:1175) > ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) > (mpidense.c:1312) > ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) > (mpidense.c:979) > ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) > ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > > Regards, > Yujie > From recrusader at gmail.com Tue Jan 6 20:06:58 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 6 Jan 2009 18:06:58 -0800 Subject: MatTranspose_MPIDense illegal read? In-Reply-To: <03EAB50D-75BF-4137-8801-84BB1E6D0898@mcs.anl.gov> References: <7ff0ee010901061653j2cdbef50i9fcc27d83a0663ac@mail.gmail.com> <03EAB50D-75BF-4137-8801-84BB1E6D0898@mcs.anl.gov> Message-ID: <7ff0ee010901061806g1ffdced4t958646c0f5847d5e@mail.gmail.com> Dear Barry: I will try to do it according to your advice. I tried to incearse and reduce the processor number, some cases are broken, others aren't. This invalid read problem always exists. If the erros exist, the following is the broken information, Invalid read can lead to the following errors? thanks. 
"[6]PETSC ERROR: ------------------------------------------------------------------------ p2_15964: p4_error: net_recv read: probable EOF on socket: 1 rm_l_2_15965: (354.724282) net_send: could not write to fd=5, errno = 32" ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ==5821== ==5821== Invalid read of size 4 ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, MatReuse, _p_Mat**) (matrix.c:5744) ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) ==5821== by 0x806B5B1: main (myproject.C:293) ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 alloc'd ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char const*, char const*, char const*, void**) (mal.c:40) ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char const*, char const*, char const*, void**) (mtr.c:194) ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c:1875) ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) (dense.c:1858) ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense (mpidense.c:1175) ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) (mpidense.c:1312) ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) (mpidense.c:979) ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) ==5821== by 0x806B5B1: main (myproject.C:293) [6]PETSC ERROR: ------------------------------------------------------------------------ p2_15964: p4_error: net_recv read: probable EOF on socket: 1 rm_l_2_15965: (354.724282) net_send: could not write to fd=5, errno = 32 ==5821== ==5821== Invalid read of size 4 ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, MatReuse, _p_Mat**) (matrix.c:5744) ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) ==5821== by 0x806B5B1: main (myproject.C:293) ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 alloc'd ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char const*, char const*, char const*, void**) (mal.c:40) ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char const*, char const*, char const*, void**) (mtr.c:194) ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c:1875) ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) (dense.c:1858) ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense (mpidense.c:1175) ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) (mpidense.c:1312) ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) (mpidense.c:979) ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) ==5821== by 0x806B5B1: main 
(myproject.C:293) [6]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[6]PETSCERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors [6]PETSC ERROR: likely location of problem given in stack below [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [6]PETSC ERROR: INSTEAD the line number of the start of the function [6]PETSC ERROR: is given. [6]PETSC ERROR: [6] MatGetSubMatrix_MPIDense line 218 src/mat/impls/dense/mpi/mpidense.c [6]PETSC ERROR: [6] MatGetSubMatrix line 5721 src/mat/interface/matrix.c [6]PETSC ERROR: --------------------- Error Message ------------------------------------ [6]PETSC ERROR: Signal received! [6]PETSC ERROR: ------------------------------------------------------------------------ [6]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b [6]PETSC ERROR: See docs/changes/index.html for recent updates. [6]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [6]PETSC ERROR: See docs/index.html for manual pages. [6]PETSC ERROR: ------------------------------------------------------------------------ [6]PETSC ERROR: /data4/ylu/codes/new/examples/myproj/myproj-dbg on a linux named wolf29.cluster1.isl.org by yujie Tue Jan 6 16:20:55 2009 [6]PETSC ERROR: Libraries linked from /home/yujie/codes/petsc-2.3.3-p8/lib/linux [6]PETSC ERROR: Configure run at Fri Apr 11 17:24:03 2008 [6]PETSC ERROR: Configure options --with-mpi-dir=/home/yujie/mpich127/ --with-clanguage=C++ --with-debugging=1 --with-shared=1 --with-hypre=1 --with-spooles=1 --download-spooles=1 --with-superlu_dist=1 --download-superlu_dist=1 [6]PETSC ERROR: ------------------------------------------------------------------------ [6]PETSC ERROR: User provided function() line 0 in unknown directory unknown file p6_11637: p4_error: : 59 rm_l_6_11638: (333.669678) net_send: could not write to fd=5, errno = 32 On Tue, Jan 6, 2009 at 5:52 PM, Barry Smith wrote: > > Looks like this is some old version of PETSc. The line numbers don't > match PETSc 3.0.0 > Please switch to 3.0.0 and see if the problem occurs and send the valgrind > output again > > Please switch to petsc-maint at mcs.anl.gov for this. > > Barry > > I don't see a problem with the MatTranspose_ but there is a bad read in > the MatGetSubMatrix > > On Jan 6, 2009, at 6:53 PM, Yujie wrote: > > According to Barry's advice, I use Valgrind to check my codes. I find >> there is an problem in MatTranspose_MPIDense(). It breaks my codes. The >> following is the information. could you give me some advice? 
thanks a lot >> ==5821== >> ==5821== Invalid read of size 4 >> ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, _p_IS*, >> int, MatReuse, _p_Mat**) (mpidense.c:251) >> ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, >> MatReuse, _p_Mat**) (matrix.c:5744) >> ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) >> ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) >> ==5821== by 0x806B5B1: main (myproject.C:293) >> ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 >> alloc'd >> ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) >> ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char const*, >> char const*, char const*, void**) (mal.c:40) >> ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char >> const*, char const*, char const*, void**) (mtr.c:194) >> ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c:1875) >> ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) >> (dense.c:1858) >> ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense >> (mpidense.c:1175) >> ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) >> (mpidense.c:1312) >> ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) >> (mpidense.c:979) >> ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) >> ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) >> ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C:1927) >> ==5821== by 0x806B5B1: main (myproject.C:293) >> >> Regards, >> Yujie >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 6 21:29:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Jan 2009 21:29:01 -0600 Subject: MatTranspose_MPIDense illegal read? In-Reply-To: <7ff0ee010901061806g1ffdced4t958646c0f5847d5e@mail.gmail.com> References: <7ff0ee010901061653j2cdbef50i9fcc27d83a0663ac@mail.gmail.com> <03EAB50D-75BF-4137-8801-84BB1E6D0898@mcs.anl.gov> <7ff0ee010901061806g1ffdced4t958646c0f5847d5e@mail.gmail.com> Message-ID: <59AB366A-FF17-481F-9678-D22342D5127B@mcs.anl.gov> You need to reproduce the problem with petsc-3.0.0 the line number 251 below is meaningless for 3.0.0 so doesn't help us at all. Libraries linked from /home/yujie/codes/petsc-2.3.3-p8/lib/linux This could easily be a bug in PETSc here but we cannot find bugs by just looking at the code; we need at least to know exactly where it happens. And we're not going to debug old versions of PETSc. Please continue this thread only on petsc-maint at mcs.anl.gov Barry On Jan 6, 2009, at 8:06 PM, Yujie wrote: > Dear Barry: > > I will try to do it according to your advice. I tried to incearse > and reduce the processor number, some cases are broken, others > aren't. This invalid read problem always exists. If the erros exist, > the following is the broken information, > > Invalid read can lead to the following errors? thanks. 
> > "[6]PETSC ERROR: > ------------------------------------------------------------------------ > p2_15964: p4_error: net_recv read: probable EOF on socket: 1 > rm_l_2_15965: (354.724282) net_send: could not write to fd=5, errno > = 32" > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > ==5821== > ==5821== Invalid read of size 4 > ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, > _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) > ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, > MatReuse, _p_Mat**) (matrix.c:5744) > ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 > alloc'd > ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) > ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char > const*, char const*, char const*, void**) (mal.c:40) > ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char > const*, char const*, char const*, void**) (mtr.c:194) > ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c: > 1875) > ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) > (dense.c:1858) > ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense > (mpidense.c:1175) > ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) > (mpidense.c:1312) > ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) > (mpidense.c:979) > ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) > ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > > [6]PETSC ERROR: > ------------------------------------------------------------------------ > p2_15964: p4_error: net_recv read: probable EOF on socket: 1 > rm_l_2_15965: (354.724282) net_send: could not write to fd=5, errno > = 32 > > > ==5821== > ==5821== Invalid read of size 4 > ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, > _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) > ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, > MatReuse, _p_Mat**) (matrix.c:5744) > ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 > alloc'd > ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) > ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char > const*, char const*, char const*, void**) (mal.c:40) > ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char > const*, char const*, char const*, void**) (mtr.c:194) > ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c: > 1875) > ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) > (dense.c:1858) > ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense > (mpidense.c:1175) > ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) > (mpidense.c:1312) > ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) > (mpidense.c:979) > ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) > 
==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > > [6]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > [6]PETSC ERROR: Try option -start_in_debugger or - > on_error_attach_debugger > [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal > [6]PETSC ERROR: or try http://valgrind.org on linux or man > libgmalloc on Apple to find memory corruption errors > [6]PETSC ERROR: likely location of problem given in stack below > [6]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [6]PETSC ERROR: INSTEAD the line number of the start of the function > [6]PETSC ERROR: is given. > [6]PETSC ERROR: [6] MatGetSubMatrix_MPIDense line 218 src/mat/impls/ > dense/mpi/mpidense.c > [6]PETSC ERROR: [6] MatGetSubMatrix line 5721 src/mat/interface/ > matrix.c > [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [6]PETSC ERROR: Signal received! > [6]PETSC ERROR: > ------------------------------------------------------------------------ > [6]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 > 17:03:40 CST 2007 HG revision: > 414581156e67e55c761739b0deb119f7590d0f4b > [6]PETSC ERROR: See docs/changes/index.html for recent updates. > [6]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [6]PETSC ERROR: See docs/index.html for manual pages. > [6]PETSC ERROR: > ------------------------------------------------------------------------ > [6]PETSC ERROR: /data4/ylu/codes/new/examples/myproj/myproj-dbg on a > linux named wolf29.cluster1.isl.org by yujie Tue Jan 6 16:20:55 2009 > [6]PETSC ERROR: Libraries linked from /home/yujie/codes/petsc-2.3.3- > p8/lib/linux > [6]PETSC ERROR: Configure run at Fri Apr 11 17:24:03 2008 > [6]PETSC ERROR: Configure options --with-mpi-dir=/home/yujie/ > mpich127/ --with-clanguage=C++ --with-debugging=1 --with-shared=1 -- > with-hypre=1 --with-spooles=1 --download-spooles=1 --with- > superlu_dist=1 --download-superlu_dist=1 > [6]PETSC ERROR: > ------------------------------------------------------------------------ > [6]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > p6_11637: p4_error: : 59 > rm_l_6_11638: (333.669678) net_send: could not write to fd=5, errno > = 32 > > > > On Tue, Jan 6, 2009 at 5:52 PM, Barry Smith > wrote: > > Looks like this is some old version of PETSc. The line numbers > don't match PETSc 3.0.0 > Please switch to 3.0.0 and see if the problem occurs and send the > valgrind output again > > Please switch to petsc-maint at mcs.anl.gov for this. > > Barry > > I don't see a problem with the MatTranspose_ but there is a bad > read in the MatGetSubMatrix > > > On Jan 6, 2009, at 6:53 PM, Yujie wrote: > > According to Barry's advice, I use Valgrind to check my codes. I > find there is an problem in MatTranspose_MPIDense(). It breaks my > codes. The following is the information. could you give me some > advice? 
thanks a lot > ==5821== > ==5821== Invalid read of size 4 > ==5821== at 0x5143617: MatGetSubMatrix_MPIDense(_p_Mat*, _p_IS*, > _p_IS*, int, MatReuse, _p_Mat**) (mpidense.c:251) > ==5821== by 0x50C2D71: MatGetSubMatrix(_p_Mat*, _p_IS*, _p_IS*, int, > MatReuse, _p_Mat**) (matrix.c:5744) > ==5821== by 0x805BCE8: mySystem::MakeA(int) (mySystem.C:518) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > ==5821== Address 0x8f06c64 is 12 bytes after a block of size 472,272 > alloc'd > ==5821== at 0x401AC01: malloc (vg_replace_malloc.c:207) > ==5821== by 0x558312D: PetscMallocAlign(unsigned int, int, char > const*, char const*, char const*, void**) (mal.c:40) > ==5821== by 0x558DA83: PetscTrMallocDefault(unsigned int, int, char > const*, char const*, char const*, void**) (mtr.c:194) > ==5821== by 0x50386A7: MatSeqDenseSetPreallocation_SeqDense (dense.c: > 1875) > ==5821== by 0x5038469: MatSeqDenseSetPreallocation(_p_Mat*, double*) > (dense.c:1858) > ==5821== by 0x514BC4B: MatMPIDenseSetPreallocation_MPIDense > (mpidense.c:1175) > ==5821== by 0x514C889: MatMPIDenseSetPreallocation(_p_Mat*, double*) > (mpidense.c:1312) > ==5821== by 0x514AD8B: MatTranspose_MPIDense(_p_Mat*, _p_Mat**) > (mpidense.c:979) > ==5821== by 0x50B2D85: MatTranspose(_p_Mat*, _p_Mat**) (matrix.c:3603) > ==5821== by 0x805BCAA: mySystem::MakeA(int) (mySystem.C:514) > ==5821== by 0x80615A6: mySystem::Source(mySystem*, int) (mySystem.C: > 1927) > ==5821== by 0x806B5B1: main (myproject.C:293) > > Regards, > Yujie > > > From rafaelsantoscoelho at gmail.com Thu Jan 8 17:42:15 2009 From: rafaelsantoscoelho at gmail.com (Rafael Santos Coelho) Date: Thu, 8 Jan 2009 21:42:15 -0200 Subject: Help with adding a new routine to the MATMFFD module Message-ID: <3b6f83d40901081542k1bd6d82m89c70de57f79f087@mail.gmail.com> Hello everyone, I've implemented a new routine for computing the differencing parameter "h" used with the finite-difference based matrix-free Jacobian by the MATMFFD module. This new approach, which I labeled "avg", can be found in the article entitled "Jacobian-free Newton-Krylov methods: a survey of approaches and applications", authored by D. A. Knoll and D. E. Keyes. It has the following formula: h = b * {[S / (n * || a ||)] + 1} - "b" is the square root of the machine round-off (10^-6 for 64-bit double precision); - "n" is the problem dimension (number of unknowns); - "a" is the vector in the Jacobian-vector product that is being approximated (J(U)a); - "S" = sum[i = 1, n] |Ui| is the sum of the absolute values of the solution vector entries. Well, the programming part is all done and correct, but I'm experiecing difficulties to (statically) incorporate the routine in the PETSc API. I've done everything that I could, but whenever I try to test my code by running an example with the -mat_mffd_type avg command-line option, I get this error message: {PETSC_DIR}/petsc-2.3.3-p15/src/snes/examples/tutorials$ mpirun -np 2 ./ex5 -snes_mf -mat_mffd_type avg [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: Unknown type. Check for miss-spelling or missing external package needed for type! [1]PETSC ERROR: Unknown MatMFFD type avg given! 
[1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 2.3.3, Patch 15, Tue Sep 23 10:02:49 CDT 2008 HG revision: 31306062cd1a6f6a2496fccb4878f485c9b91760 [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [1]PETSC ERROR: See docs/index.html for manual pages. [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: ./ex5 on a linux-gnu named hepburn by rafael Thu Jan 8 19:52:36 2009 [1]PETSC ERROR: Libraries linked from /home/rafael/ufes/petsc-2.3.3-p15/lib/linux-gnu-c-debug [1]PETSC ERROR: Configure run at Sun Nov 23 02:11:51 2008 [1]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx --with-fc=0 --download-c-blas-lapack=1 --with-shared=0 [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: MatMFFDSetType() line 107 in src/mat/impls/mffd/mffd.c [1]PETSC ERROR: MatMFFDSetFromOptions() line 493 in src/mat/impls/mffd/mffd.c [1]PETSC ERROR: SNESSetUp() line 1194 in src/snes/interface/snes.c [1]PETSC ERROR: SNESSolve() line 1855 in src/snes/interface/snes.c [1]PETSC ERROR: main() line 209 in src/snes/examples/tutorials/ex5.c [hepburn:22928] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 86 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing external package needed for type! [0]PETSC ERROR: Unknown MatMFFD type avg given! [0]PETSC ERROR: ------------------------------------------------------------------------ So, here's what I did: 1. Put the source code (avg.c) in the {PETSC_DIR}/src/mat/impls/mffd/ directory (when writing the avg.c, I basically followed the same code structure in the wp.c file); 2. Added "avg.c" and "avg.o" to {PETSC_DIR}/src/mat/impls/mffd/makefile; 3. Added the macro "#define MATMFFD_AVG "avg"" to the petscmat.h file; 4. Called MatMFFDRegisterDynamic(MATMFFD_AVG, path, "MatMFFDCreate_AVG", MatMFFDCreate_AVG) inside the MatMFFDRegisterAll routine in the mfregis.c file. What step am I missing here? Why is this not working? Thanks in advance! Rafael -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jan 8 17:51:42 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 8 Jan 2009 17:51:42 -0600 Subject: Help with adding a new routine to the MATMFFD module In-Reply-To: <3b6f83d40901081542k1bd6d82m89c70de57f79f087@mail.gmail.com> References: <3b6f83d40901081542k1bd6d82m89c70de57f79f087@mail.gmail.com> Message-ID: <8B5DE7AC-AAFC-4B5B-855F-C8DC70663682@mcs.anl.gov> You are doing everything right; there must be a glitch along the way. First run in the debugger and put a breakpoint in MatMFFDRegisterAll() make sure the code gets there. Then step through while it does the dynamic registration of the new method and make sure everything looks reasonable along the way. Barry On Jan 8, 2009, at 5:42 PM, Rafael Santos Coelho wrote: > Hello everyone, > > I've implemented a new routine for computing the differencing > parameter "h" used with the finite-difference based matrix-free > Jacobian by the MATMFFD module. This new approach, which I labeled > "avg", can be found in the article entitled "Jacobian-free Newton- > Krylov methods: a survey of approaches and applications", authored > by D. A. Knoll and D. E. Keyes. 
It has the following formula: > > h = b * {[S / (n * || a ||)] + 1} > > ? "b" is the square root of the machine round-off (10^-6 for 64-bit > double precision); > ? "n" is the problem dimension (number of unknowns); > ? "a" is the vector in the Jacobian-vector product that is being > approximated (J(U)a); > ? "S" = sum[i = 1, n] |Ui| is the sum of the absolute values of the > solution vector entries. > Well, the programming part is all done and correct, but I'm > experiecing difficulties to (statically) incorporate the routine in > the PETSc API. I've done everything that I could, but whenever I try > to test my code by running an example with the -mat_mffd_type avg > command-line option, I get this error message: > > {PETSC_DIR}/petsc-2.3.3-p15/src/snes/examples/tutorials$ mpirun -np > 2 ./ex5 -snes_mf -mat_mffd_type avg > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: Unknown type. Check for miss-spelling or missing > external package needed for type! > [1]PETSC ERROR: Unknown MatMFFD type avg given! > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 2.3.3, Patch 15, Tue Sep 23 > 10:02:49 CDT 2008 HG revision: > 31306062cd1a6f6a2496fccb4878f485c9b91760 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: ./ex5 on a linux-gnu named hepburn by rafael Thu > Jan 8 19:52:36 2009 > [1]PETSC ERROR: Libraries linked from /home/rafael/ufes/petsc-2.3.3- > p15/lib/linux-gnu-c-debug > [1]PETSC ERROR: Configure run at Sun Nov 23 02:11:51 2008 > [1]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx > --with-fc=0 --download-c-blas-lapack=1 --with-shared=0 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: MatMFFDSetType() line 107 in src/mat/impls/mffd/mffd.c > [1]PETSC ERROR: MatMFFDSetFromOptions() line 493 in src/mat/impls/ > mffd/mffd.c > [1]PETSC ERROR: SNESSetUp() line 1194 in src/snes/interface/snes.c > [1]PETSC ERROR: SNESSolve() line 1855 in src/snes/interface/snes.c > [1]PETSC ERROR: main() line 209 in src/snes/examples/tutorials/ex5.c > [hepburn:22928] MPI_ABORT invoked on rank 1 in communicator > MPI_COMM_WORLD with errorcode 86 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing > external package needed for type! > [0]PETSC ERROR: Unknown MatMFFD type avg given! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > So, here's what I did: > > ? Put the source code (avg.c) in the {PETSC_DIR}/src/mat/impls/ > mffd/ directory (when writing the avg.c, I basically followed the > same code structure in the wp.c file); > ? Added "avg.c" and "avg.o" to {PETSC_DIR}/src/mat/impls/mffd/ > makefile; > ? Added the macro "#define MATMFFD_AVG "avg"" to the petscmat.h file; > ? Called MatMFFDRegisterDynamic(MATMFFD_AVG, path, > "MatMFFDCreate_AVG", MatMFFDCreate_AVG) inside the > MatMFFDRegisterAll routine in the mfregis.c file. > What step am I missing here? Why is this not working? Thanks in > advance! 
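(For reference, the arithmetic for this "avg" choice of h needs nothing beyond public Vec operations. A minimal sketch follows; the helper name, the hard-coded b = 1.e-6 quoted above, and the guard against a zero ||a|| are illustrative only, and wiring it into a MatMFFDCreate_AVG implementation still follows the structure of wp.c.)

#include "petscvec.h"

/* Sketch: h = b * ( S / (n * ||a||_2) + 1 ), with S = sum_i |U_i|. */
PetscErrorCode ComputeHAvg(Vec U, Vec a, PetscReal *h)
{
  PetscErrorCode ierr;
  PetscReal      b = 1.e-6;   /* sqrt of machine round-off, 64-bit doubles */
  PetscReal      S, norm_a;
  PetscInt       n;

  PetscFunctionBegin;
  ierr = VecGetSize(U, &n); CHKERRQ(ierr);
  ierr = VecNorm(U, NORM_1, &S); CHKERRQ(ierr);       /* sum of |U_i| */
  ierr = VecNorm(a, NORM_2, &norm_a); CHKERRQ(ierr);
  if (norm_a == 0.0) norm_a = 1.0;                    /* avoid dividing by zero */
  *h = b * (S / (n * norm_a) + 1.0);
  PetscFunctionReturn(0);
}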
> > Rafael > From rafaelsantoscoelho at gmail.com Thu Jan 8 18:00:40 2009 From: rafaelsantoscoelho at gmail.com (Rafael Santos Coelho) Date: Thu, 8 Jan 2009 22:00:40 -0200 Subject: Help with adding a new routine to the MATMFFD module In-Reply-To: <8B5DE7AC-AAFC-4B5B-855F-C8DC70663682@mcs.anl.gov> References: <3b6f83d40901081542k1bd6d82m89c70de57f79f087@mail.gmail.com> <8B5DE7AC-AAFC-4B5B-855F-C8DC70663682@mcs.anl.gov> Message-ID: <3b6f83d40901081600i22df3cc1pee3c2a9e1b26ac8f@mail.gmail.com> Hi Barry, > > First run in the debugger and put a breakpoint in MatMFFDRegisterAll() > make sure the code gets there. > Then step through while it does the dynamic registration of the new method > and make sure everything > looks reasonable along the way. > Could you help me out with that? It's that I've never used PETSc in debugging mode before. Rafael -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelsantoscoelho at gmail.com Thu Jan 8 18:07:06 2009 From: rafaelsantoscoelho at gmail.com (Rafael Santos Coelho) Date: Thu, 8 Jan 2009 22:07:06 -0200 Subject: Help with adding a new routine to the MATMFFD module In-Reply-To: <3b6f83d40901081600i22df3cc1pee3c2a9e1b26ac8f@mail.gmail.com> References: <3b6f83d40901081542k1bd6d82m89c70de57f79f087@mail.gmail.com> <8B5DE7AC-AAFC-4B5B-855F-C8DC70663682@mcs.anl.gov> <3b6f83d40901081600i22df3cc1pee3c2a9e1b26ac8f@mail.gmail.com> Message-ID: <3b6f83d40901081607x52cd0009uc0c7013aca3de312@mail.gmail.com> Hi, Barry, I found the error, it works perfectly now! Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Fri Jan 9 06:54:16 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Fri, 09 Jan 2009 07:54:16 -0500 Subject: PetscValidHeaderSpecific() In-Reply-To: References: <7ff0ee010901060952x36351800n2320e1b07bfaa97e@mail.gmail.com> <7ff0ee010901061143r644dc28ave81735eceba6bfc8@mail.gmail.com> <7ff0ee010901061155n1eb38558y1a7813da6e95d00b@mail.gmail.com> <0B98A1F4-78E4-4A21-8535-F60BE994308B@mcs.anl.gov> <20090106174416.yz475c4ow0g04g8o@cubmail.cc.columbia.edu> <36CE42C4-4397-494F-9F3F-8833A96CD254@mcs.anl.gov> <20090106192826.u6ly4uajsokc8gck@cubmail.cc.columbia.edu> Message-ID: <20090109075416.7nms942o6cow0ckc@cubmail.cc.columbia.edu> I installed ubuntu, and there is no problem running under it. Thanks a lot! Rebecca Quoting Barry Smith : > > If you send your code (and makefile and any data it needs) to > petsc-maint at mcs.anl.gov and how to run it; we'll run it in > valgrind. > > Frankly cygwin is a terrible system for source code development. It > is really worth your effort to make > your machine dual boot and install linux also; the time you spend > getting linux going will be far more than > gained by the improvement in development time and frustration. > > Barry > > If other people need to run your code on Windows, you can still do the > development on Linux and only build > on cygwin when it is very stable. > > On Jan 6, 2009, at 6:28 PM, (Rebecca) Xuefei YUAN wrote: > >> There is no message coming out when I run with -malloc_debug. I am >> using cygwin, any suggestions on that? >> >> Thanks! >> >> Rebecca >> >> Quoting Barry Smith : >> >>> >>> The most likely cause is memory corruption. 
The best way to solve >>> this problem is to run on linux using >>> valgrind (valgrind.org), the second best is to run with -malloc_debug >>> and see if any messages come out >>> >>> Barry >>> >>> On Jan 6, 2009, at 4:44 PM, (Rebecca) Xuefei YUAN wrote: >>> >>>> Hi, >>>> >>>> I did have error information as at the end of this email, and I >>>> found the error was a function called PetscValidHeaderSpecific(), >>>> is there anything wrong with it? >>>> >>>> Thanks a lot! >>>> >>>> R >>>> >>>> [0]PETSC ERROR: Invalid argument! >>>> [0]PETSC ERROR: Wrong type of object: Parameter # 1! >>>> [0]PETSC ERROR: ---------------------------------------- >>>> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 0, Fri Dec 19 >>>> 22:02:38 CST 2008 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>> [0]PETSC ERROR: ---------------------------------------- >>>> [0]PETSC ERROR: ./oreggt on a cygwin-c- named YUAN_WORK by >>>> Rebecca Tue Jan 6 17:29:59 2009 >>>> [0]PETSC ERROR: Libraries linked from >>>> /home/Rebecca/soft/petsc-3.0.0-p0/cygwin-c-debug/lib >>>> [0]PETSC ERROR: Configure run at Mon Jan 5 13:50:14 2009 >>>> [0]PETSC ERROR: Configure options --with-cc=gcc >>>> --download-f-blas-lapack=1 --download-mpich=1 --useThreads=0 >>>> --with-shared=0 >>>> [0]PETSC ERROR: ---------------------------------------- >>>> [0]PETSC ERROR: MatDestroy() line 766 in src/mat/interface/matrix.c >>>> [0]PETSC ERROR: PCDestroy() line 88 in src/ksp/pc/interface/precon.c >>>> [0]PETSC ERROR: KSPDestroy() line 647 in src/ksp/ksp/interface/itfun.c >>>> [0]PETSC ERROR: SNESDestroy() line 1378 in src/snes/interface/snes.c >>>> [0]PETSC ERROR: PetscObjectDestroy() line 172 in src/sys/objects/destroy.c >>>> [0]PETSC ERROR: DMMGDestroy() line 220 in src/snes/utils/damg.c >>>> [0]PETSC ERROR: main() line 55 in oreggt.c >>>> application called MPI_Abort(MPI_COMM_WORLD,62) - process >>>> 0[unset]: aborting job: >>>> application called MPI_Abort(MPI_COMM_WORLD,62) - process 0 >>>> >>>> >> >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Wed Jan 14 07:15:44 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Wed, 14 Jan 2009 14:15:44 +0100 Subject: MatCreateMPIAIJWithSplitArrays Message-ID: <20090114141544.pe72wc8ve4yf4c4g@webmail.ec-nantes.fr> Hello, To define a matrix with arrays, I cannot use MatCreateMPIAIJWithSplitArrays in my program which is written in Fortran: call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, $ oColumn,ov,D,ierr) The error is F:246: undefined reference to `matcreatempiaijwithsplitarrays_' I could use MatCreateMPIAIJWithArrays but the off diagonal values are missing with this command. I would be appreciate for any advice. Thank you before hand. Regards, Jarunan -- Jarunan PANYASANTISUK MSc. 
in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Wed Jan 14 07:38:34 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Wed, 14 Jan 2009 14:38:34 +0100 Subject: MatCreateMPIAIJWithSplitArrays Message-ID: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> Oh, I could not use MatCreateMPIAIJWithArrays either but the mechanism below works. call MatCreate(PETSC_COMM_WORLD,D,ierr) call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, $ ierr) call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix call MatSetFromOptions(D,ierr) call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) Where pointer is start-row indices a Column is local column indices v is value Is there the different beteween the start-row indices in MatMPIAIJSetPreallocationCSR and row indices in MatCreateMPIAIJWithArrays ? Regards, Jarunan Hello, To define a matrix with arrays, I cannot use MatCreateMPIAIJWithSplitArrays in my program which is written in Fortran: call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, $ oColumn,ov,D,ierr) The error is F:246: undefined reference to `matcreatempiaijwithsplitarrays_' I could use MatCreateMPIAIJWithArrays but the off diagonal values are missing with this command. I would be appreciate for any advice. Thank you before hand. Regards, Jarunan -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE From bsmith at mcs.anl.gov Wed Jan 14 08:30:10 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 Jan 2009 08:30:10 -0600 Subject: MatCreateMPIAIJWithSplitArrays In-Reply-To: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> Message-ID: You should be able to use MatCreateMPIAIJWithSplitArrays(), MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() from Fortran. Are you using PETSc 3.0.0? The arguments for MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() have the same meaning (in fact MatCreateMPIAIJWithArrays() essentially calls MatCreateMPIAIJWithArrays()). Barry On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: > Oh, I could not use MatCreateMPIAIJWithArrays either but the > mechanism below works. > > call MatCreate(PETSC_COMM_WORLD,D,ierr) > call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, > $ ierr) > call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix > call MatSetFromOptions(D,ierr) > call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) > > Where pointer is start-row indices a > Column is local column indices > v is value > > Is there the different beteween the start-row indices in > MatMPIAIJSetPreallocationCSR and row indices in > MatCreateMPIAIJWithArrays ? > > > > Regards, > Jarunan > > > > > Hello, > > To define a matrix with arrays, I cannot use > MatCreateMPIAIJWithSplitArrays in my program which is written in > Fortran: > > call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, > $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, > $ oColumn,ov,D,ierr) > > The error is > F:246: undefined reference to `matcreatempiaijwithsplitarrays_' > > I could use MatCreateMPIAIJWithArrays but the off diagonal values > are missing with this command. > > I would be appreciate for any advice. Thank you before hand. 
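To make the array arguments concrete, here is a small sketch in C (a hypothetical 4x4 tridiagonal matrix owned entirely by one process; with several processes the same three arrays describe only the locally owned rows, and the column indices are still given in the global numbering):

#include "petscmat.h"

PetscErrorCode BuildExample(MPI_Comm comm, Mat *A)
{
  PetscErrorCode ierr;
  /* i[] holds offsets into j[]/v[] (i[0]=0, i[m]=nnz), not row numbers;
     j[] holds column indices; v[] the matching values.               */
  PetscInt    i[5]  = {0, 2, 5, 8, 10};
  PetscInt    j[10] = {0,1,  0,1,2,  1,2,3,  2,3};
  PetscScalar v[10] = {2,-1,  -1,2,-1,  -1,2,-1,  -1,2};

  PetscFunctionBegin;
  ierr = MatCreateMPIAIJWithArrays(comm, 4, 4, PETSC_DETERMINE, PETSC_DETERMINE,
                                   i, j, v, A); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The "start-row indices" are thus just those offsets, the same thing MatMPIAIJSetPreallocationCSR() expects. This routine also has to reshuffle the entries into diagonal and off-diagonal blocks, so the arrays are copied and can be freed or reused afterwards.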
> > Regards, > Jarunan > > > > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > > From tim.kroeger at cevis.uni-bremen.de Wed Jan 14 11:25:18 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Wed, 14 Jan 2009 18:25:18 +0100 (CET) Subject: VecCreateGhost Message-ID: Dear PETSc team, When I create a vector using VecCreateGhost(), and later on I want to access the value of one of the ghost cells, and all I know is the *global* index of that value, what is the correct thing to do? I understand that the ghost values are stored at the end of the vector and that this is done in the order that I used when creating the vector, but do I have to remember that order myself, or is there some method to query the local index corresponding to a global index? Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From knepley at gmail.com Wed Jan 14 12:03:24 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Jan 2009 12:03:24 -0600 Subject: VecCreateGhost In-Reply-To: References: Message-ID: On Wed, Jan 14, 2009 at 11:25 AM, Tim Kroeger < tim.kroeger at cevis.uni-bremen.de> wrote: > Dear PETSc team, > > When I create a vector using VecCreateGhost(), and later on I want to > access the value of one of the ghost cells, and all I know is the *global* > index of that value, what is the correct thing to do? I understand that the > ghost values are stored at the end of the vector and that this is done in > the order that I used when creating the vector, but do I have to remember > that order myself, or is there some method to query the local index > corresponding to a global index? Unfortunately, we never create the inverse mapping. Matt > > Best Regards, > > Tim > > -- > Dr. Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59a2.org Wed Jan 14 12:52:15 2009 From: jed at 59a2.org (Jed Brown) Date: Wed, 14 Jan 2009 09:52:15 -0900 Subject: VecCreateGhost In-Reply-To: References: Message-ID: <383ade90901141052v6cf99c31r3aa1fdd1df14dfdd@mail.gmail.com> On Wed, Jan 14, 2009 at 09:03, Matthew Knepley wrote: > wrote: >> When I create a vector using VecCreateGhost(), and later on I want to >> access the value of one of the ghost cells, and all I know is the *global* >> index of that value, what is the correct thing to do? I understand that the >> ghost values are stored at the end of the vector and that this is done in >> the order that I used when creating the vector, but do I have to remember >> that order myself, or is there some method to query the local index >> corresponding to a global index? > > Unfortunately, we never create the inverse mapping. 
To elaborate on this, the natural thing in your application is that user code (yours or the Libmesh DoF map) never sees the global numbering except through the LocalToGlobalMapping. It uses the local numbering with the local forms (VecGhostGetLocalForm) and VecSetValuesLocal, MatSetValuesLocal, etc. I don't know how this interacts with the parallel adaptive refinement model, but after refinement you'll be creating new vectors anyway. Jed From Hung.V.Nguyen at usace.army.mil Wed Jan 14 13:54:45 2009 From: Hung.V.Nguyen at usace.army.mil (Nguyen, Hung V ERDC-ITL-MS) Date: Wed, 14 Jan 2009 13:54:45 -0600 Subject: Stopping criteria Message-ID: Hello All, I tried to solve an ill-conditioned system using cg with Jacobi preconditioned. The KSP solver was stopping due to diverged reason within a few iterations. Is there a way to keep KSP solver running until max_it? Thanks, -hung hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type jacobi -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor -ksp_converged_reason 0 KSP Residual norm 1.379074550666e+04 1 KSP Residual norm 7.252034661743e+03 2 KSP Residual norm 7.302184771313e+03 3 KSP Residual norm 1.162244351275e+04 4 KSP Residual norm 7.912531765659e+03 5 KSP Residual norm 4.094706251487e+03 6 KSP Residual norm 5.486131070301e+03 7 KSP Residual norm 6.367904529202e+03 8 KSP Residual norm 6.312767173219e+03 Linear solve did not converge due to DIVERGED_INDEFINITE_MAT iterations 9 Time in PETSc solver: 0.452695 seconds The number of iteration = 9 The solution residual error = 6.312767e+03 From knepley at gmail.com Wed Jan 14 14:05:19 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Jan 2009 14:05:19 -0600 Subject: Stopping criteria In-Reply-To: References: Message-ID: On Wed, Jan 14, 2009 at 1:54 PM, Nguyen, Hung V ERDC-ITL-MS < Hung.V.Nguyen at usace.army.mil> wrote: > > Hello All, > > I tried to solve an ill-conditioned system using cg with Jacobi > preconditioned. The KSP solver was stopping due to diverged reason within a > few iterations. Is there a way to keep KSP solver running until max_it? There is no way to continue CG here because it gets a zero divisor, and interprets this as an indefinite matrix. You can try GMRES, however I would first check your matrix using -pc_type lu -ksp_type preonly to make sure its not singular. Matt > > Thanks, > > -hung > > hvnguyen:jade23% aprun -n 16 ./test_matrix_read -ksp_type cg -pc_type > jacobi > -ksp_rtol 1.0e-15 -ksp_max_it 50000 -ksp_monitor -ksp_converged_reason > 0 KSP Residual norm 1.379074550666e+04 > 1 KSP Residual norm 7.252034661743e+03 > 2 KSP Residual norm 7.302184771313e+03 > 3 KSP Residual norm 1.162244351275e+04 > 4 KSP Residual norm 7.912531765659e+03 > 5 KSP Residual norm 4.094706251487e+03 > 6 KSP Residual norm 5.486131070301e+03 > 7 KSP Residual norm 6.367904529202e+03 > 8 KSP Residual norm 6.312767173219e+03 > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT iterations 9 > Time in PETSc solver: 0.452695 seconds > The number of iteration = 9 > The solution residual error = 6.312767e+03 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at 59a2.org Wed Jan 14 15:14:40 2009 From: jed at 59a2.org (Jed Brown) Date: Wed, 14 Jan 2009 12:14:40 -0900 Subject: Stopping criteria In-Reply-To: References: Message-ID: <383ade90901141314q482b8ac1v9726c8fa16736827@mail.gmail.com> On Wed, Jan 14, 2009 at 10:54, Nguyen, Hung V ERDC-ITL-MS wrote: > Linear solve did not converge due to DIVERGED_INDEFINITE_MAT iterations 9 CG does not work for indefinite matrices, the natural thing to try is -ksp_type minres. You can also try a nonsymmetric KSP which gives you more choices for preconditioning, although good preconditioning for an indefinite matrix generally uses problem-specific information. Where does this matrix come from? How scalable does the solver need to be? Jed From niko.karin at gmail.com Tue Jan 20 16:12:01 2009 From: niko.karin at gmail.com (Karin&NiKo) Date: Tue, 20 Jan 2009 17:12:01 -0500 Subject: Feti Message-ID: Dear Petsc users, There is an implementation of the FETI iterative substructuring method in src/ksp/pc/impls/is/feti that does not compile. It seems that with some work, it could be made to run again. Before doing it, I just wanted to know if anyone had already done it. Thanks, Nicolas -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 20 16:20:17 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 20 Jan 2009 16:20:17 -0600 Subject: Feti In-Reply-To: References: Message-ID: <06C745EE-9195-486D-9325-3028FB38B289@mcs.anl.gov> On Jan 20, 2009, at 4:12 PM, Karin&NiKo wrote: > Dear Petsc users, > > There is an implementation of the FETI iterative substructuring > method in src/ksp/pc/impls/is/feti that does not compile. It seems > that with some work, it could be made to run again. Before doing it, > I just wanted to know if anyone had already done it. Nope. What is there is the remnants of what someone monkeyed around with a bunch of years ago. You are welcome to play around with it, there is nothing beside what is there. and we don't remember anything about it so cannot answer questions any better than you could. Note: this stuff is very old and uses something called SLES which is basically just a KSP. You might look at the NN preconditioner, my original hope was to try to reuse some of the infrastructure common to the NN (the pcis.c stuff) to implement the FETI. This is cool stuff but somewhat challenging. Barry > > > Thanks, > > Nicolas From DOMI0002 at ntu.edu.sg Tue Jan 20 18:03:01 2009 From: DOMI0002 at ntu.edu.sg (#DOMINIC DENVER JOHN CHANDAR#) Date: Wed, 21 Jan 2009 08:03:01 +0800 Subject: OpenMPI vs MPICH2 Message-ID: Dear PETSc Users, Has anyone observed any specific improvement in computational speed while using the OpenMPI implementation compared to MPICH2 ? ( Or the other way around ?).. Could it be problem dependent if there is a difference in speed ? Regards, Dominic -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Jan 20 18:25:49 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 20 Jan 2009 18:25:49 -0600 (CST) Subject: OpenMPI vs MPICH2 In-Reply-To: References: Message-ID: On Wed, 21 Jan 2009, #DOMINIC DENVER JOHN CHANDAR# wrote: > Dear PETSc Users, > > Has anyone observed any specific improvement in computational speed > while using the OpenMPI implementation compared to MPICH2 ? ( Or the > other way around ?).. Could it be problem dependent if there is a > difference in speed ? I don't have any comparison results. However here is my opinion. 
Based on the hardware & network you have - you would have to check with both impls [they do have mailing lists to help] - the best way to configure & install it for that give hardware [i.e SMP?] & network. [for eg: mpich configure supports options like --with-device --with-channel - that results in different types of codes being used]. Once you the optimal install of each of these impls - you would have to do a comparision for the app that matters. Satish From recrusader at gmail.com Tue Jan 20 19:27:38 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 20 Jan 2009 17:27:38 -0800 Subject: MatGetSubMatrix_MPIDense() Message-ID: <7ff0ee010901201727r343634c5g8df06f803563990a@mail.gmail.com> Dear Barry and Matthew: I always fingure out what's wrong with MatGetSubMatrix_MPIDense() in my application. Because I use other pacakges based on PETSc, it is difficult to change my PETSc2.3.3 to 3.0.0. I have to debug my codes to find something. Now, I have confirmed the question is in "*bv++ = av[irow[j] - rstart];" of (PETSC2.3.3) /* Now extract the data pointers and do the copy, column at a time */ 245: newmatd = (Mat_MPIDense*)newmat->data; 246: bv = ((Mat_SeqDense *)newmatd->A->data)->v; 247: 248: for (i=0; idata; 222: bv = ((Mat_SeqDense *)newmatd->A->data)->v; 223: 224: for (i=0; iA->data)->lda*icol[i]; 226: for (j=0; jA->data)->lda"? After I revise the codes in PETsc2.3.3, the error disappears. However, the results from my codes become wrong. Could you give me some advice? thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jan 20 19:34:46 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 Jan 2009 19:34:46 -0600 Subject: MatGetSubMatrix_MPIDense() In-Reply-To: <7ff0ee010901201727r343634c5g8df06f803563990a@mail.gmail.com> References: <7ff0ee010901201727r343634c5g8df06f803563990a@mail.gmail.com> Message-ID: That fixes a bug since the the data is actually stored with leading dimension lda, rather than nlrows. Matt On Tue, Jan 20, 2009 at 7:27 PM, Yujie wrote: > Dear Barry and Matthew: > > I always fingure out what's wrong with MatGetSubMatrix_MPIDense() in my > application. Because I use other pacakges based on PETSc, it is difficult to > change my PETSc2.3.3 to 3.0.0. I have to debug my codes to find something. > > Now, I have confirmed the question is in "*bv++ = av[irow[j] - rstart];" of > > (PETSC2.3.3) > > /* Now extract the data pointers and do the copy, column at a time */ > 245: newmatd = (Mat_MPIDense*)newmat->data; > 246: bv = ((Mat_SeqDense *)newmatd->A->data)->v; > 247: > 248: for (i=0; i 249: av = v + nlrows*icol[i]; > 250: for (j=0; j 251: *bv++ = av[irow[j] - rstart]; > 252: } > 253: } > > The codes generate an error of segmentation violation. > > I have checkec PETsc3.0.0 version. the corresponding codes are: > > 220: /* Now extract the data pointers and do the copy, column at a time */ > 221: newmatd = (Mat_MPIDense*)newmat->data; > 222: bv = ((Mat_SeqDense *)newmatd->A->data)->v; > 223: > 224: for (i=0; i 225: av = v + ((Mat_SeqDense *)newmatd->A->data)->lda*icol[i]; > 226: for (j=0; j 227: *bv++ = av[irow[j] - rstart]; > 228: } > 229: } > > why change "nlrows" to "((Mat_SeqDense *)newmatd->A->data)->lda"? After I > revise the codes in PETsc2.3.3, the error disappears. However, the results > from my codes become wrong. Could you give me some advice? thanks a lot. 
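To spell out what that means: the local block of an MPIDense matrix is stored column by column, and consecutive columns are lda entries apart, which need not equal the number of local rows. Element (i,j) lives at v[i + j*lda], so stepping between columns with nlrows only happens to work when lda == nlrows. A small stand-alone illustration in plain C (not PETSc internals):

#include <stdio.h>

/* Column-major storage with a leading dimension: the 3x2 matrix
 *   [ 1 4 ]
 *   [ 2 5 ]
 *   [ 3 6 ]
 * packed so that columns are LDA = 4 apart (one padding slot per
 * column, shown as 0). Element (i,j) is v[i + j*LDA].            */
int main(void)
{
  enum { NROWS = 3, NCOLS = 2, LDA = 4 };
  double v[LDA * NCOLS] = {1, 2, 3, 0,    /* column 0 + padding */
                           4, 5, 6, 0};   /* column 1 + padding */
  int i, j;

  for (i = 0; i < NROWS; i++) {
    for (j = 0; j < NCOLS; j++) printf("%g ", v[i + j * LDA]);
    printf("\n");
  }
  /* Using NROWS instead of LDA as the column stride would read
   * v[i + j*NROWS]: wrong entries, and eventually a read past the
   * end of the buffer, whenever LDA != NROWS.                    */
  return 0;
}

That stride mismatch is the same kind of out-of-bounds access that valgrind flagged in the earlier MatGetSubMatrix_MPIDense trace.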
> > Regards, > > Yujie > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From recrusader at gmail.com Tue Jan 20 20:02:57 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 20 Jan 2009 18:02:57 -0800 Subject: MatGetSubMatrix_MPIDense() In-Reply-To: References: <7ff0ee010901201727r343634c5g8df06f803563990a@mail.gmail.com> Message-ID: <7ff0ee010901201802k7cf146fasa63964a95b326de2@mail.gmail.com> Dear Matthew: You mean lda is correct in these codes even for PETSc2.3.3? I need to confirm this, because the results become wrong using lda. However, I use nlrows and can get right results. thanks. Regards, Yujie On Tue, Jan 20, 2009 at 5:34 PM, Matthew Knepley wrote: > That fixes a bug since the the data is actually stored with leading > dimension lda, rather > than nlrows. > > Matt > > On Tue, Jan 20, 2009 at 7:27 PM, Yujie wrote: > > Dear Barry and Matthew: > > > > I always fingure out what's wrong with MatGetSubMatrix_MPIDense() in my > > application. Because I use other pacakges based on PETSc, it is difficult > to > > change my PETSc2.3.3 to 3.0.0. I have to debug my codes to find > something. > > > > Now, I have confirmed the question is in "*bv++ = av[irow[j] - rstart];" > of > > > > (PETSC2.3.3) > > > > /* Now extract the data pointers and do the copy, column at a time */ > > 245: newmatd = (Mat_MPIDense*)newmat->data; > > 246: bv = ((Mat_SeqDense *)newmatd->A->data)->v; > > 247: > > 248: for (i=0; i > 249: av = v + nlrows*icol[i]; > > 250: for (j=0; j > 251: *bv++ = av[irow[j] - rstart]; > > 252: } > > 253: } > > > > The codes generate an error of segmentation violation. > > > > I have checkec PETsc3.0.0 version. the corresponding codes are: > > > > 220: /* Now extract the data pointers and do the copy, column at a time > */ > > 221: newmatd = (Mat_MPIDense*)newmat->data; > > 222: bv = ((Mat_SeqDense *)newmatd->A->data)->v; > > 223: > > 224: for (i=0; i > 225: av = v + ((Mat_SeqDense *)newmatd->A->data)->lda*icol[i]; > > 226: for (j=0; j > 227: *bv++ = av[irow[j] - rstart]; > > 228: } > > 229: } > > > > why change "nlrows" to "((Mat_SeqDense *)newmatd->A->data)->lda"? After I > > revise the codes in PETsc2.3.3, the error disappears. However, the > results > > from my codes become wrong. Could you give me some advice? thanks a lot. > > > > Regards, > > > > Yujie > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Jan 20 20:05:11 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 20 Jan 2009 18:05:11 -0800 Subject: MatGetSubMatrix_MPIDense() In-Reply-To: References: <7ff0ee010901201727r343634c5g8df06f803563990a@mail.gmail.com> Message-ID: <7ff0ee010901201805m44ea0816y5eb4f54e9eac9e07@mail.gmail.com> The bug is strange. When I increase or reduce the processors, the bug will appear with certain processors number :(. On Tue, Jan 20, 2009 at 5:34 PM, Matthew Knepley wrote: > That fixes a bug since the the data is actually stored with leading > dimension lda, rather > than nlrows. > > Matt > > On Tue, Jan 20, 2009 at 7:27 PM, Yujie wrote: > > Dear Barry and Matthew: > > > > I always fingure out what's wrong with MatGetSubMatrix_MPIDense() in my > > application. 
Because I use other pacakges based on PETSc, it is difficult > to > > change my PETSc2.3.3 to 3.0.0. I have to debug my codes to find > something. > > > > Now, I have confirmed the question is in "*bv++ = av[irow[j] - rstart];" > of > > > > (PETSC2.3.3) > > > > /* Now extract the data pointers and do the copy, column at a time */ > > 245: newmatd = (Mat_MPIDense*)newmat->data; > > 246: bv = ((Mat_SeqDense *)newmatd->A->data)->v; > > 247: > > 248: for (i=0; i > 249: av = v + nlrows*icol[i]; > > 250: for (j=0; j > 251: *bv++ = av[irow[j] - rstart]; > > 252: } > > 253: } > > > > The codes generate an error of segmentation violation. > > > > I have checkec PETsc3.0.0 version. the corresponding codes are: > > > > 220: /* Now extract the data pointers and do the copy, column at a time > */ > > 221: newmatd = (Mat_MPIDense*)newmat->data; > > 222: bv = ((Mat_SeqDense *)newmatd->A->data)->v; > > 223: > > 224: for (i=0; i > 225: av = v + ((Mat_SeqDense *)newmatd->A->data)->lda*icol[i]; > > 226: for (j=0; j > 227: *bv++ = av[irow[j] - rstart]; > > 228: } > > 229: } > > > > why change "nlrows" to "((Mat_SeqDense *)newmatd->A->data)->lda"? After I > > revise the codes in PETsc2.3.3, the error disappears. However, the > results > > from my codes become wrong. Could you give me some advice? thanks a lot. > > > > Regards, > > > > Yujie > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jerome.snho at gmail.com Wed Jan 21 00:52:18 2009 From: jerome.snho at gmail.com (jerome ho) Date: Wed, 21 Jan 2009 14:52:18 +0800 Subject: Increasing convergence rate In-Reply-To: <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> Message-ID: <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> Thanks for your advice. Using the different boomeramg options helps to trade off between speed and memory. Right now, I've distribute the matrix to 2 processors. However, when solving, the parallel version takes a longer time with more iteration counts. I enabled -ksp_monitor and they seems to converge at a different rate, although using the same options. Is there a reason for this? The matrix formed by the serial and parallel are the same. Jerome On Thu, Jan 15, 2009 at 5:29 PM, Jed Brown wrote: > On Wed, Jan 14, 2009 at 20:04, jerome ho wrote: >> boomeramg+minres: 388MB in 1min (8 iterations) >> icc+cg: 165MB in 30min (>5000 iterations) >> bjacobi+cg: 201MB in 50min (>5000 iterations) > > Note that in serial, bjacobi is just whatever -sub_pc_type is (ilu by > default). In parallel, it's always worth trying -pc_type asm as an > alternative to bjacobi. You can frequently make the incomplete > factorization stronger by using multiple levels (-pc_factor_levels N), > but it will use more memory. It looks like multigrid works well for > your problem so it will likely be very hard for a traditional method > to compete. 
> > To reduce memory usage in BoomerAMG, try these options > > -pc_hypre_boomeramg_truncfactor <0>: Truncation factor for > interpolation (0=no truncation) (None) > -pc_hypre_boomeramg_P_max <0>: Max elements per row for > interpolation operator ( 0=unlimited ) (None) > -pc_hypre_boomeramg_agg_nl <0>: Number of levels of aggressive > coarsening (None) > -pc_hypre_boomeramg_agg_num_paths <1>: Number of paths for > aggressive coarsening (None) > -pc_hypre_boomeramg_strong_threshold <0.25>: Threshold for being > strongly connected (None) > > For 3D problems, the manual suggests setting strong_threshold to 0.5. > > It's also worth trying ML, especially for vector problems. > > Jed > From jed at 59A2.org Wed Jan 21 01:14:48 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 20 Jan 2009 22:14:48 -0900 Subject: Increasing convergence rate In-Reply-To: <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> Message-ID: <383ade90901202314r43e1513cg255f5e1b79d6b1ab@mail.gmail.com> On Tue, Jan 20, 2009 at 21:52, jerome ho wrote: > Right now, I've distribute the matrix to 2 processors. However, when > solving, the parallel version takes a longer time > with more iteration counts. > I enabled -ksp_monitor and they seems to converge at a different rate, > although using the same options. > Is there a reason for this? Most preconditioners are not the same in parallel, including these implementations of AMG. At a minimum, the smoother is using a block Jacobi version of SOR or ILU. As you add processes beyond 2, the increase in iteration count is usually very minor. If you are using multiple cores, the per-core floating point performance will also be worse due to the memory bandwidth bottleneck. That may contribute to the poor parallel performance you are seeing. Jed From ahusborde at enscpb.fr Wed Jan 21 08:35:26 2009 From: ahusborde at enscpb.fr (Etienne Ahusborde) Date: Wed, 21 Jan 2009 15:35:26 +0100 Subject: PETSc interface for staggered and structured grid? Message-ID: <497732AE.4060205@enscpb.fr> Hello, I am a new user of PETSc. My goal is to solve linear systems coming from the Navier-Stokes equations. Components of the velocity field are coupled each other, so that, in 2D, the matrix is composed of four blocks (two diagonal blocks, one for each component, and two extra-diagonal blocks for the coupling between u and v). Discretization is made on a staggered and structured grid (MAC grid, one grid for each component). After having read the user manual, I an wondering if there is a way to take benefit of the structured grid as the Semi-Structured-Grid System interface of Hypre, or if I have to use the parallel AIJ Sparse matrices. We have understood that the distributed arrays are rather adapted to scalar equations (in comparison with coupled vectorial equations). In your opinion, what is the best way to treat our problem? Kind regards Etienne Ahusborde From hzhang at mcs.anl.gov Wed Jan 21 08:57:32 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 21 Jan 2009 08:57:32 -0600 (CST) Subject: PETSc interface for staggered and structured grid? In-Reply-To: <497732AE.4060205@enscpb.fr> References: <497732AE.4060205@enscpb.fr> Message-ID: You may take a look at ~petsc/src/snes/examples/tutorials/ex19.c which solves a Nonlinear driven cavity over structured 2d grid. 
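The central object in that example is the DA. A minimal sketch of setting one up with several unknowns per grid point (the sizes and field names below are placeholders, and this assumes the petsc-3.0.0 DACreate2d calling sequence):

#include "petscda.h"

/* 2d distributed array with 3 unknowns (u, v, p) per grid point and a
   one-point ghost region; DAGetMatrix then returns a matrix with the
   matching parallel layout and stencil-based preallocation.         */
PetscErrorCode SetupDA(MPI_Comm comm, DA *da, Mat *A)
{
  PetscErrorCode ierr;
  PetscInt       M = 64, N = 64;   /* global grid, placeholder sizes */

  PetscFunctionBegin;
  ierr = DACreate2d(comm, DA_NONPERIODIC, DA_STENCIL_STAR,
                    M, N, PETSC_DECIDE, PETSC_DECIDE,
                    3,             /* dof per node: u, v, p */
                    1,             /* stencil width          */
                    PETSC_NULL, PETSC_NULL, da); CHKERRQ(ierr);
  ierr = DASetFieldName(*da, 0, "u"); CHKERRQ(ierr);
  ierr = DASetFieldName(*da, 1, "v"); CHKERRQ(ierr);
  ierr = DASetFieldName(*da, 2, "p"); CHKERRQ(ierr);
  ierr = DAGetMatrix(*da, MATMPIAIJ, A); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

ex19.c itself uses a single DA with dof 4 (velocities, vorticity, temperature).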
Hong On Wed, 21 Jan 2009, Etienne Ahusborde wrote: > Hello, > > I am a new user of PETSc. > > My goal is to solve linear systems coming from the Navier-Stokes equations. > Components of the velocity field > are coupled each other, so that, in 2D, the matrix is composed of four blocks > (two diagonal blocks, one for each component, > and two extra-diagonal blocks for the coupling between u and v). > Discretization is made on a staggered and structured grid > (MAC grid, one grid for each component). > > After having read the user manual, I an wondering if there is a way to take > benefit of the structured grid as the Semi-Structured-Grid System interface > of Hypre, or if I have to use the parallel AIJ Sparse matrices. We have > understood that the distributed arrays are > rather adapted to scalar equations (in comparison with coupled vectorial > equations). > > In your opinion, what is the best way to treat our problem? > > Kind regards > > Etienne Ahusborde > > > > From bsmith at mcs.anl.gov Wed Jan 21 11:29:09 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 21 Jan 2009 11:29:09 -0600 Subject: PETSc interface for staggered and structured grid? In-Reply-To: References: <497732AE.4060205@enscpb.fr> Message-ID: The DA object in PETSc is intended as the way to manage computations on structured grids as in ex19.c Hong mentioned. With staggered grids there is one extra complication. The DA is based on having the same number of unknowns at each grid point; this causes trouble for pressure which are cell centered so there are less pressure locations. Thus you have to "fake it" by treating as an identity those extra values. Barry On Jan 21, 2009, at 8:57 AM, Hong Zhang wrote: > > You may take a look at ~petsc/src/snes/examples/tutorials/ex19.c > which solves a Nonlinear driven cavity over structured 2d > grid. > > Hong > > On Wed, 21 Jan 2009, Etienne Ahusborde wrote: > >> Hello, >> >> I am a new user of PETSc. >> >> My goal is to solve linear systems coming from the Navier-Stokes >> equations. Components of the velocity field >> are coupled each other, so that, in 2D, the matrix is composed of >> four blocks (two diagonal blocks, one for each component, >> and two extra-diagonal blocks for the coupling between u and v). >> Discretization is made on a staggered and structured grid >> (MAC grid, one grid for each component). >> >> After having read the user manual, I an wondering if there is a way >> to take benefit of the structured grid as the Semi-Structured-Grid >> System interface of Hypre, or if I have to use the parallel AIJ >> Sparse matrices. We have understood that the distributed arrays are >> rather adapted to scalar equations (in comparison with coupled >> vectorial equations). >> >> In your opinion, what is the best way to treat our problem? >> >> Kind regards >> >> Etienne Ahusborde >> >> >> >> From mafunk at nmsu.edu Wed Jan 21 18:07:50 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Wed, 21 Jan 2009 17:07:50 -0700 Subject: compiling petsc2-3.3-p15 against mvapich-0.9.9 Message-ID: <200901211707.51080.mafunk@nmsu.edu> Hi, i was wondering if there is any issues with compiling petsc based on mvapich-0.9.9. I tells me that it is unable to configure with given options. 
I configure as such: ./config/configure.py --with-mpi-include=/usr/mpi/mvapich-0.9.9/gcc/include --with-mpi-lib=/usr/mpi/mvapich-0.9.9/gcc/lib/libmpich.a --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-superlu=1 --download-superlu_dist=1 --download-spooles=1 --download-hypre=hypre-2.4.0b.tar.gz I can compile ?petsc against the mvapich2 libraries just fine. thanks matt From balay at mcs.anl.gov Wed Jan 21 18:24:37 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 Jan 2009 18:24:37 -0600 (CST) Subject: compiling petsc2-3.3-p15 against mvapich-0.9.9 In-Reply-To: <200901211707.51080.mafunk@nmsu.edu> References: <200901211707.51080.mafunk@nmsu.edu> Message-ID: Not sure what the issue is. I see the configure.log [from the rejected e-mail] - there are some binary chars at the end of the file. Perhaps you can retry and see? [after rm *.log]. If you still have issues - send configure.log to petsc-maint at mcs.anl.gov Satish On Wed, 21 Jan 2009, Matt Funk wrote: > Hi, > > i was wondering if there is any issues with compiling petsc based on > mvapich-0.9.9. I tells me that it is unable to configure with given options. > I configure as such: > ./config/configure.py --with-mpi-include=/usr/mpi/mvapich-0.9.9/gcc/include --with-mpi-lib=/usr/mpi/mvapich-0.9.9/gcc/lib/libmpich.a --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-superlu=1 --download-superlu_dist=1 --download-spooles=1 --download-hypre=hypre-2.4.0b.tar.gz > > I can compile ?petsc against the mvapich2 libraries just fine. > > > thanks > matt > From mafunk at nmsu.edu Wed Jan 21 18:30:20 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Wed, 21 Jan 2009 17:30:20 -0700 Subject: compiling petsc2-3.3-p15 against mvapich-0.9.9 In-Reply-To: References: <200901211707.51080.mafunk@nmsu.edu> Message-ID: <200901211730.20807.mafunk@nmsu.edu> I tried it multiple from a fresh tarball each time. Same thing everytime. Did you mean try with different configure options? If so, which ones did you have in mind? thanks matt On Wednesday 21 January 2009, Satish Balay wrote: > Not sure what the issue is. I see the configure.log [from the rejected > e-mail] - there are some binary chars at the end of the file. > > Perhaps you can retry and see? [after rm *.log]. If you still have > issues - send configure.log to petsc-maint at mcs.anl.gov > > Satish > > On Wed, 21 Jan 2009, Matt Funk wrote: > > Hi, > > > > i was wondering if there is any issues with compiling petsc based on > > mvapich-0.9.9. I tells me that it is unable to configure with given > > options. I configure as such: > > ./config/configure.py > > --with-mpi-include=/usr/mpi/mvapich-0.9.9/gcc/include > > --with-mpi-lib=/usr/mpi/mvapich-0.9.9/gcc/lib/libmpich.a > > --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran > > --download-f-blas-lapack=1 --download-superlu=1 --download-superlu_dist=1 > > --download-spooles=1 --download-hypre=hypre-2.4.0b.tar.gz > > > > I can compile ?petsc against the mvapich2 libraries just fine. > > > > > > thanks > > matt From balay at mcs.anl.gov Wed Jan 21 18:40:44 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 Jan 2009 18:40:44 -0600 (CST) Subject: compiling petsc2-3.3-p15 against mvapich-0.9.9 In-Reply-To: <200901211730.20807.mafunk@nmsu.edu> References: <200901211707.51080.mafunk@nmsu.edu> <200901211730.20807.mafunk@nmsu.edu> Message-ID: Each time - do you see binary chars at the end of configure.log - after the _access test? 
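In code, that renumbering is just a lookup table handed to MatSetValues(); any entry whose row or column index is negative is silently skipped. A small sketch (the helper name, the map below encoding the three-row example above, the element block, and ADD_VALUES are all illustrative):

#include "petscmat.h"

/* Original dofs 0,1,2 where dof 1 carries the boundary condition:
   map it to -1 so nothing is ever inserted in that row or column.  */
PetscErrorCode AssembleReduced(Mat A)
{
  PetscErrorCode ierr;
  PetscInt       map[3] = {0, -1, 1};   /* original dof -> reduced index, or -1 */
  PetscInt       rows[3], cols[3], i;
  PetscScalar    Ke[9]  = {4,-1,-1,  -1,4,-1,  -1,-1,4};  /* 3x3 element block */

  PetscFunctionBegin;
  for (i = 0; i < 3; i++) { rows[i] = map[i]; cols[i] = map[i]; }
  ierr = MatSetValues(A, 3, rows, 3, cols, Ke, ADD_VALUES); CHKERRQ(ierr);
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}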
I don't know why this would happen. Perhaps you can try with the same configure options - and send in the configure.log [petsc-maint at mcs.anl.gov] for us to comapre. Satish On Wed, 21 Jan 2009, Matt Funk wrote: > I tried it multiple from a fresh tarball each time. Same thing everytime. > Did you mean try with different configure options? If so, which ones did you > have in mind? > > thanks > matt > > > > On Wednesday 21 January 2009, Satish Balay wrote: > > Not sure what the issue is. I see the configure.log [from the rejected > > e-mail] - there are some binary chars at the end of the file. > > > > Perhaps you can retry and see? [after rm *.log]. If you still have > > issues - send configure.log to petsc-maint at mcs.anl.gov > > > > Satish > > > > On Wed, 21 Jan 2009, Matt Funk wrote: > > > Hi, > > > > > > i was wondering if there is any issues with compiling petsc based on > > > mvapich-0.9.9. I tells me that it is unable to configure with given > > > options. I configure as such: > > > ./config/configure.py > > > --with-mpi-include=/usr/mpi/mvapich-0.9.9/gcc/include > > > --with-mpi-lib=/usr/mpi/mvapich-0.9.9/gcc/lib/libmpich.a > > > --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran > > > --download-f-blas-lapack=1 --download-superlu=1 --download-superlu_dist=1 > > > --download-spooles=1 --download-hypre=hypre-2.4.0b.tar.gz > > > > > > I can compile ?petsc against the mvapich2 libraries just fine. > > > > > > > > > thanks > > > matt > > > From bhatiamanav at gmail.com Wed Jan 21 20:47:37 2009 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Wed, 21 Jan 2009 21:47:37 -0500 Subject: boundary conditions Message-ID: <73497BD7-5300-4B4B-87D5-4FA1BE65C69F@gmail.com> Hi, I am trying to solve an eigenproblem. I would like to apply boundary conditions by removing the rows and columns associated with these dofs (as opposed to retaining/zeroing them and setting the diagonal elements to unit values). Is there a way for me to tell the matrix data structure which dofs to ignore during the matrix operations? Or should I create a separate matrix by extracting the unconstrained dofs from the global matrix? I would appreciate any inputs. Thanks, Manav From bsmith at mcs.anl.gov Wed Jan 21 21:09:44 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 21 Jan 2009 21:09:44 -0600 Subject: boundary conditions In-Reply-To: <73497BD7-5300-4B4B-87D5-4FA1BE65C69F@gmail.com> References: <73497BD7-5300-4B4B-87D5-4FA1BE65C69F@gmail.com> Message-ID: On Jan 21, 2009, at 8:47 PM, Manav Bhatia wrote: > Hi, > > I am trying to solve an eigenproblem. I would like to apply > boundary conditions by removing the rows and columns associated with > these dofs (as opposed to retaining/zeroing them and setting the > diagonal elements to unit values). > > Is there a way for me to tell the matrix data structure which dofs > to ignore during the matrix operations? This is possible, but tricky. Basically you can call those rows/ columns -1 in your MatSetValues() calls and then they are ignored (nothing is put in the matrix for them). But this means you need to make sure that only the "real" rows and columns have numbers. For example, if your problem has three rows/columns and row 1 (the second one cause PETSc starts at 0) is the dummy row, then real row 0 would correspond to 0 as the row index you pass in, real row 1 would correspond to -1 as the row index you pass in and real row 2 would correspond to 1 as the row index you pass in. 
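A minimal sketch of the renumbering described above, not taken from this thread: the matrix A, the sizes and the array names are invented, and A is assumed to have been created already with the reduced size. MatSetValues() skips any negative row or column index, so constrained dofs can simply be mapped to -1 and are never assembled.

    /* fragment only: 3 global dofs, dof 1 is constrained */
    PetscInt       map[3], next = 0, i;
    PetscTruth     constrained[3] = {PETSC_FALSE, PETSC_TRUE, PETSC_FALSE};
    PetscScalar    elem[9];              /* 3x3 element matrix, filled elsewhere */
    PetscErrorCode ierr;

    for (i = 0; i < 3; i++) map[i] = constrained[i] ? -1 : next++;

    /* row/column indices equal to -1 are ignored by MatSetValues(),
       so only the unconstrained block ends up in A */
    ierr = MatSetValues(A, 3, map, 3, map, elem, ADD_VALUES);CHKERRQ(ierr);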
> Or should I create a separate matrix by extracting the unconstrained > dofs from the global matrix? You can use MatGetSubMatrix() to pull out the part you want. Of course, once you get the eigenvector you have to remember to "put back in the boundary condition entries". For example with the 3 by 3 matrix above if the reduced eigenvalue is [1 2] then putting it back in the original numbering would be [1 x 2] where x is whatever that boundary value was. Barry > > > I would appreciate any inputs. > > Thanks, > Manav > > From griffith at cims.nyu.edu Thu Jan 22 09:08:52 2009 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 22 Jan 2009 10:08:52 -0500 Subject: valgrind question Message-ID: <49788C04.5020203@cims.nyu.edu> Hi, Folks -- I am trying to track down a bug in some of my code using Valgrind 3.4.0. I'm using OpenMPI 1.3, which directly integrates with Valgrind. Unfortunately, I have not completed the process of transitioning my code from PETSc 2.3.3 to 3.0.0, so these runs are using PETSc 2.3.3-p15. Running my code with Valgrind, I get a number of errors like the following: Invalid read of size 8 at 0x6104C8F: UnPack_1 (vpscat.c:421) by 0x61047A4: VecScatterEnd_1 (vpscat.h:150) by 0x6133B2B: VecScatterEnd (vscat.c:1564) by 0xC6A258: IBTK::LDataManager::endDataRedistribution(int, int) (LDataManager.C:1200) by 0xA0AEF2: IBAMR::IBStaggeredHierarchyIntegrator::regridHierarchy() (IBStaggeredHierarchyIntegrator.C:1196) by 0xA0CF53: IBAMR::IBStaggeredHierarchyIntegrator::advanceHierarchy(double) (IBStaggeredHierarchyIntegrator.C:653) by 0x7308C4: main (main.C:798) Address 0xa66bfd0 is 496 bytes inside a block of size 528 alloc'd at 0x4A05FBB: malloc (vg_replace_malloc.c:207) by 0x63DCFAF: PetscMallocAlign (mal.c:40) by 0x63EA8A0: PetscTrMallocDefault (mtr.c:194) by 0x611BDB2: VecScatterCreate_PtoS (vpscat.c:1570) by 0x6121252: VecScatterCreate_StoP (vpscat.c:1946) by 0x6122D2D: VecScatterCreate_PtoP (vpscat.c:2123) by 0x6131B08: VecScatterCreate (vscat.c:1380) by 0xC6960F: IBTK::LDataManager::endDataRedistribution(int, int) (LDataManager.C:1158) by 0xA0AEF2: IBAMR::IBStaggeredHierarchyIntegrator::regridHierarchy() (IBStaggeredHierarchyIntegrator.C:1196) by 0xA0CF53: IBAMR::IBStaggeredHierarchyIntegrator::advanceHierarchy(double) (IBStaggeredHierarchyIntegrator.C:653) by 0x7308C4: main (main.C:798) Has anyone seen this kind of error crop up before? Does it indicate that I am messing up the call to VecScatterCreate? (This VecScatter is being used to scatter data between ghosted Vecs, and I wouldn't be shocked if I have made a dumb mistake.) Could Valgrind be returning a false positive? Again, this is with PETSc 2.3.3-p15. I am working on getting things up and running with PETSc 3.0.0. Thanks! -- Boyce From balay at mcs.anl.gov Thu Jan 22 10:29:05 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 22 Jan 2009 10:29:05 -0600 (CST) Subject: valgrind question In-Reply-To: <49788C04.5020203@cims.nyu.edu> References: <49788C04.5020203@cims.nyu.edu> Message-ID: On Thu, 22 Jan 2009, Boyce Griffith wrote: > Hi, Folks -- > > I am trying to track down a bug in some of my code using Valgrind 3.4.0. I'm > using OpenMPI 1.3, which directly integrates with Valgrind. Unfortunately, I > have not completed the process of transitioning my code from PETSc 2.3.3 to > 3.0.0, so these runs are using PETSc 2.3.3-p15. 
> > Running my code with Valgrind, I get a number of errors like the following: > > Invalid read of size 8 > at 0x6104C8F: UnPack_1 (vpscat.c:421) > by 0x61047A4: VecScatterEnd_1 (vpscat.h:150) > by 0x6133B2B: VecScatterEnd (vscat.c:1564) > by 0xC6A258: IBTK::LDataManager::endDataRedistribution(int, int) > (LDataManager.C:1200) > by 0xA0AEF2: IBAMR::IBStaggeredHierarchyIntegrator::regridHierarchy() > (IBStaggeredHierarchyIntegrator.C:1196) > by 0xA0CF53: > IBAMR::IBStaggeredHierarchyIntegrator::advanceHierarchy(double) > (IBStaggeredHierarchyIntegrator.C:653) > by 0x7308C4: main (main.C:798) > Address 0xa66bfd0 is 496 bytes inside a block of size 528 alloc'd > at 0x4A05FBB: malloc (vg_replace_malloc.c:207) > by 0x63DCFAF: PetscMallocAlign (mal.c:40) > by 0x63EA8A0: PetscTrMallocDefault (mtr.c:194) > by 0x611BDB2: VecScatterCreate_PtoS (vpscat.c:1570) > by 0x6121252: VecScatterCreate_StoP (vpscat.c:1946) > by 0x6122D2D: VecScatterCreate_PtoP (vpscat.c:2123) > by 0x6131B08: VecScatterCreate (vscat.c:1380) > by 0xC6960F: IBTK::LDataManager::endDataRedistribution(int, int) > (LDataManager.C:1158) > by 0xA0AEF2: IBAMR::IBStaggeredHierarchyIntegrator::regridHierarchy() > (IBStaggeredHierarchyIntegrator.C:1196) > by 0xA0CF53: > IBAMR::IBStaggeredHierarchyIntegrator::advanceHierarchy(double) > (IBStaggeredHierarchyIntegrator.C:653) > by 0x7308C4: main (main.C:798) > > Has anyone seen this kind of error crop up before? Does it indicate that I am > messing up the call to VecScatterCreate? (This VecScatter is being used to > scatter data between ghosted Vecs, and I wouldn't be shocked if I have made a > dumb mistake.) Could Valgrind be returning a false positive? Most likely valgrind is correct here. However its a challenge determining where exactly the bug is - that is causing this behavior.. Checking ScatterCreate() and ScatterBegin() - and making sure that same vector [or dupes] are specified to these routines is a start... It is also possible that there was some other memory corruption much earlier than this piece of code that valgrind couldn't detect [so there is some garage already in some arrays]. And this garbage is now causing bad memory access - thus this error. Satish > > Again, this is with PETSc 2.3.3-p15. I am working on getting things up and > running with PETSc 3.0.0. > > Thanks! 
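For comparison, the standard ghosted-vector update pattern looks like the following. This is only an illustrative sketch, not code from this thread: the local size and the ghost indices are invented, and the calls shown are the generic VecGhost ones rather than the hand-built scatter used in LDataManager.

    Vec            gx, lx;
    PetscInt       ghosts[2] = {0, 1};   /* global indices owned by another process (invented) */
    PetscErrorCode ierr;

    ierr = VecCreateGhost(PETSC_COMM_WORLD, 4, PETSC_DECIDE, 2, ghosts, &gx);CHKERRQ(ierr);
    /* ... set the locally owned entries of gx ... */
    ierr = VecGhostUpdateBegin(gx, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecGhostUpdateEnd(gx, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecGhostGetLocalForm(gx, &lx);CHKERRQ(ierr);  /* owned entries followed by the ghosts */
    ierr = VecGhostRestoreLocalForm(gx, &lx);CHKERRQ(ierr);
    ierr = VecDestroy(gx);CHKERRQ(ierr);

Checking that the vectors handed to VecScatterCreate() are the same ones (or duplicates of the ones) later passed to VecScatterBegin()/VecScatterEnd() is the consistency test suggested above.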
> > -- Boyce > From griffith at cims.nyu.edu Thu Jan 22 13:40:06 2009 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 22 Jan 2009 14:40:06 -0500 Subject: valgrind question In-Reply-To: References: <49788C04.5020203@cims.nyu.edu> Message-ID: <4978CB96.5000100@cims.nyu.edu> Satish Balay wrote: > On Thu, 22 Jan 2009, Boyce Griffith wrote: >> Running my code with Valgrind, I get a number of errors like the following: >> >> Invalid read of size 8 >> at 0x6104C8F: UnPack_1 (vpscat.c:421) >> by 0x61047A4: VecScatterEnd_1 (vpscat.h:150) >> by 0x6133B2B: VecScatterEnd (vscat.c:1564) >> by 0xC6A258: IBTK::LDataManager::endDataRedistribution(int, int) >> (LDataManager.C:1200) >> by 0xA0AEF2: IBAMR::IBStaggeredHierarchyIntegrator::regridHierarchy() >> (IBStaggeredHierarchyIntegrator.C:1196) >> by 0xA0CF53: >> IBAMR::IBStaggeredHierarchyIntegrator::advanceHierarchy(double) >> (IBStaggeredHierarchyIntegrator.C:653) >> by 0x7308C4: main (main.C:798) >> Address 0xa66bfd0 is 496 bytes inside a block of size 528 alloc'd >> at 0x4A05FBB: malloc (vg_replace_malloc.c:207) >> by 0x63DCFAF: PetscMallocAlign (mal.c:40) >> by 0x63EA8A0: PetscTrMallocDefault (mtr.c:194) >> by 0x611BDB2: VecScatterCreate_PtoS (vpscat.c:1570) >> by 0x6121252: VecScatterCreate_StoP (vpscat.c:1946) >> by 0x6122D2D: VecScatterCreate_PtoP (vpscat.c:2123) >> by 0x6131B08: VecScatterCreate (vscat.c:1380) >> by 0xC6960F: IBTK::LDataManager::endDataRedistribution(int, int) >> (LDataManager.C:1158) >> by 0xA0AEF2: IBAMR::IBStaggeredHierarchyIntegrator::regridHierarchy() >> (IBStaggeredHierarchyIntegrator.C:1196) >> by 0xA0CF53: >> IBAMR::IBStaggeredHierarchyIntegrator::advanceHierarchy(double) >> (IBStaggeredHierarchyIntegrator.C:653) >> by 0x7308C4: main (main.C:798) >> >> Has anyone seen this kind of error crop up before? Does it indicate that I am >> messing up the call to VecScatterCreate? (This VecScatter is being used to >> scatter data between ghosted Vecs, and I wouldn't be shocked if I have made a >> dumb mistake.) Could Valgrind be returning a false positive? > > Most likely valgrind is correct here. However its a challenge > determining where exactly the bug is - that is causing this behavior.. > > Checking ScatterCreate() and ScatterBegin() - and making sure that > same vector [or dupes] are specified to these routines is a start... > > It is also possible that there was some other memory corruption much > earlier than this piece of code that valgrind couldn't detect [so > there is some garage already in some arrays]. And this garbage is now > causing bad memory access - thus this error. Hi, Satish -- All of these valgrind errors are related to VecGhosts. Can I run into memory corruption problems if there are no local nodes associated with a particular processor, or if there are no ghost nodes associated with a particular processor? Thanks, -- Boyce From bsmith at mcs.anl.gov Thu Jan 22 14:01:27 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Jan 2009 14:01:27 -0600 Subject: valgrind question In-Reply-To: <4978CB96.5000100@cims.nyu.edu> References: <49788C04.5020203@cims.nyu.edu> <4978CB96.5000100@cims.nyu.edu> Message-ID: Boyce, This (corner cases) should all be legal and not cause memory corruption problems. Barry > > > Hi, Satish -- > > All of these valgrind errors are related to VecGhosts. Can I run > into memory corruption problems if there are no local nodes > associated with a particular processor, or if there are no ghost > nodes associated with a particular processor? 
> > Thanks, > > -- Boyce From griffith at cims.nyu.edu Thu Jan 22 15:33:23 2009 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 22 Jan 2009 16:33:23 -0500 Subject: valgrind question In-Reply-To: References: <49788C04.5020203@cims.nyu.edu> <4978CB96.5000100@cims.nyu.edu> Message-ID: <4978E623.3050309@cims.nyu.edu> Thanks, Barry! My guess is that the answer is "no", but should I expect any trouble if there are NO ghost nodes on ANY of the processors? (This happens only at the very beginning of the code.) -- Boyce Barry Smith wrote: > > Boyce, > > This (corner cases) should all be legal and not cause memory > corruption problems. > > Barry > >> >> >> Hi, Satish -- >> >> All of these valgrind errors are related to VecGhosts. Can I run into >> memory corruption problems if there are no local nodes associated with >> a particular processor, or if there are no ghost nodes associated with >> a particular processor? >> >> Thanks, >> >> -- Boyce > From balay at mcs.anl.gov Thu Jan 22 16:10:02 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 22 Jan 2009 16:10:02 -0600 (CST) Subject: compiling petsc2-3.3-p15 against mvapich-0.9.9 In-Reply-To: <200901221455.26152.mafunk@nmsu.edu> References: <200901211707.51080.mafunk@nmsu.edu> <200901211730.20807.mafunk@nmsu.edu> <200901221455.26152.mafunk@nmsu.edu> Message-ID: Looks like this old version of mpi has MPI_Win_create() - but not the associated MPI_Win definition in its include. You can edit petscconf.h and remove the lines #ifndef PETSC_HAVE_MPI_WIN_CREATE #define PETSC_HAVE_MPI_WIN_CREATE 1 #endif and then recompiling petsc. BTW: when you specify --with-mpi-dir=/usr/mpi/mvapich-0.9.9/gcc PETSc configure will use mpicc/mpif77 etc as compilers. So its best not to specify '--with-cc=gcc --with-fc=gfortran' etcc Satish On Thu, 22 Jan 2009, Matt Funk wrote: > Hi Satish, > > it still doesn't work, but something changed. I configure now with: > ./config/configure.py --with-mpi-dir=/usr/mpi/mvapich-0.9.9/gcc --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-superlu=1 --download-superlu_dist=1 --download-spooles=1 --download-hypre=hypre-2.4.0b.tar.gz > rather than specifying the include dir and lib dir specifically. > > Anyway, it passes through the configuration stage. After setting PETSC_DIR and > PETSC_ARCH, i do: make all test. This one fails. 
> > The first couple error are printed to the screen as such: > > In file included from vinv.c:6: > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/include/private/vecimpl.h:236: > error: expected specifier-qualifier-list before 'MPI_Win' > In file included from vscat.c:9: > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/include/private/vecimpl.h:236: > error: expected specifier-qualifier-list before 'MPI_Win' > In file included from vpscat.c:8: > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/include/private/vecimpl.h:236: > error: expected specifier-qualifier-list before 'MPI_Win' > In file included from vpscat.c:1370: > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h: In > function 'VecScatterBegin_1': > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:65: > warning: implicit declaration of function 'MPI_Win_fence' > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:65: > error: 'VecScatter_MPI_General' has no member named 'window' > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:68: > warning: implicit declaration of function 'MPI_Put' > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:68: > error: 'VecScatter_MPI_General' has no member named 'window' > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:70: > error: 'VecScatter_MPI_General' has no member named 'window' > In file included from vpscat.c:1372: > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h: In > function 'VecScatterBegin_2': > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:65: > error: 'VecScatter_MPI_General' has no member named 'window' > > > I attached the config file. > > > thanks > matt > > > > On Wednesday 21 January 2009, you wrote: > > Each time - do you see binary chars at the end of configure.log - > > after the _access test? > > > > I don't know why this would happen. > > > > Perhaps you can try with the same configure options - and send in the > > configure.log [petsc-maint at mcs.anl.gov] for us to comapre. > > > > Satish > > > > On Wed, 21 Jan 2009, Matt Funk wrote: > > > I tried it multiple from a fresh tarball each time. Same thing everytime. > > > Did you mean try with different configure options? If so, which ones did > > > you have in mind? > > > > > > thanks > > > matt > > > > > > On Wednesday 21 January 2009, Satish Balay wrote: > > > > Not sure what the issue is. I see the configure.log [from the rejected > > > > e-mail] - there are some binary chars at the end of the file. > > > > > > > > Perhaps you can retry and see? [after rm *.log]. If you still have > > > > issues - send configure.log to petsc-maint at mcs.anl.gov > > > > > > > > Satish > > > > > > > > On Wed, 21 Jan 2009, Matt Funk wrote: > > > > > Hi, > > > > > > > > > > i was wondering if there is any issues with compiling petsc based on > > > > > mvapich-0.9.9. I tells me that it is unable to configure with given > > > > > options. I configure as such: > > > > > ./config/configure.py > > > > > --with-mpi-include=/usr/mpi/mvapich-0.9.9/gcc/include > > > > > --with-mpi-lib=/usr/mpi/mvapich-0.9.9/gcc/lib/libmpich.a > > > > > --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran > > > > > --download-f-blas-lapack=1 --download-superlu=1 > > > > > --download-superlu_dist=1 --download-spooles=1 > > > > > --download-hypre=hypre-2.4.0b.tar.gz > > > > > > > > > > I can compile ?petsc against the mvapich2 libraries just fine. 
> > > > > > > > > > > > > > > thanks > > > > > matt > > > From mafunk at nmsu.edu Thu Jan 22 16:46:52 2009 From: mafunk at nmsu.edu (Matt Funk) Date: Thu, 22 Jan 2009 15:46:52 -0700 Subject: compiling petsc2-3.3-p15 against mvapich-0.9.9 In-Reply-To: References: <200901211707.51080.mafunk@nmsu.edu> <200901221455.26152.mafunk@nmsu.edu> Message-ID: <200901221546.52613.mafunk@nmsu.edu> Hi Satish, looks like simply leaving out --with-cc=gcc and --with-fc=gfortran did the trick. thanks matt On Thursday 22 January 2009, Satish Balay wrote: > Looks like this old version of mpi has MPI_Win_create() - but not the > associated MPI_Win definition in its include. > > You can edit petscconf.h and remove the lines > > #ifndef PETSC_HAVE_MPI_WIN_CREATE > #define PETSC_HAVE_MPI_WIN_CREATE 1 > #endif > > and then recompiling petsc. > > BTW: when you specify --with-mpi-dir=/usr/mpi/mvapich-0.9.9/gcc > PETSc configure will use mpicc/mpif77 etc as compilers. > > So its best not to specify '--with-cc=gcc --with-fc=gfortran' etcc > > Satish > > On Thu, 22 Jan 2009, Matt Funk wrote: > > Hi Satish, > > > > it still doesn't work, but something changed. I configure now with: > > ./config/configure.py --with-mpi-dir=/usr/mpi/mvapich-0.9.9/gcc > > --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran > > --download-f-blas-lapack=1 --download-superlu=1 --download-superlu_dist=1 > > --download-spooles=1 --download-hypre=hypre-2.4.0b.tar.gz rather than > > specifying the include dir and lib dir specifically. > > > > Anyway, it passes through the configuration stage. After setting > > PETSC_DIR and PETSC_ARCH, i do: make all test. This one fails. > > > > The first couple error are printed to the screen as such: > > > > In file included from vinv.c:6: > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/include/private/vecimpl.h:236 > >: error: expected specifier-qualifier-list before 'MPI_Win' > > In file included from vscat.c:9: > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/include/private/vecimpl.h:236 > >: error: expected specifier-qualifier-list before 'MPI_Win' > > In file included from vpscat.c:8: > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/include/private/vecimpl.h:236 > >: error: expected specifier-qualifier-list before 'MPI_Win' > > In file included from vpscat.c:1370: > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h: > > In function 'VecScatterBegin_1': > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:65 > >: warning: implicit declaration of function 'MPI_Win_fence' > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:65 > >: error: 'VecScatter_MPI_General' has no member named 'window' > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:68 > >: warning: implicit declaration of function 'MPI_Put' > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:68 > >: error: 'VecScatter_MPI_General' has no member named 'window' > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:70 > >: error: 'VecScatter_MPI_General' has no member named 'window' > > In file included from vpscat.c:1372: > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h: > > In function 'VecScatterBegin_2': > > /home/mafunk/Packages/petsc-2.3.3-p15_Conf0/src/vec/vec/utils/vpscat.h:65 > >: error: 'VecScatter_MPI_General' has no member named 'window' > > > > > > I attached the config file. 
> > > > > > thanks > > matt > > > > On Wednesday 21 January 2009, you wrote: > > > Each time - do you see binary chars at the end of configure.log - > > > after the _access test? > > > > > > I don't know why this would happen. > > > > > > Perhaps you can try with the same configure options - and send in the > > > configure.log [petsc-maint at mcs.anl.gov] for us to comapre. > > > > > > Satish > > > > > > On Wed, 21 Jan 2009, Matt Funk wrote: > > > > I tried it multiple from a fresh tarball each time. Same thing > > > > everytime. Did you mean try with different configure options? If so, > > > > which ones did you have in mind? > > > > > > > > thanks > > > > matt > > > > > > > > On Wednesday 21 January 2009, Satish Balay wrote: > > > > > Not sure what the issue is. I see the configure.log [from the > > > > > rejected e-mail] - there are some binary chars at the end of the > > > > > file. > > > > > > > > > > Perhaps you can retry and see? [after rm *.log]. If you still have > > > > > issues - send configure.log to petsc-maint at mcs.anl.gov > > > > > > > > > > Satish > > > > > > > > > > On Wed, 21 Jan 2009, Matt Funk wrote: > > > > > > Hi, > > > > > > > > > > > > i was wondering if there is any issues with compiling petsc based > > > > > > on mvapich-0.9.9. I tells me that it is unable to configure with > > > > > > given options. I configure as such: > > > > > > ./config/configure.py > > > > > > --with-mpi-include=/usr/mpi/mvapich-0.9.9/gcc/include > > > > > > --with-mpi-lib=/usr/mpi/mvapich-0.9.9/gcc/lib/libmpich.a > > > > > > --with-debugging=0 --with-log=0 --with-cc=gcc --with-fc=gfortran > > > > > > --download-f-blas-lapack=1 --download-superlu=1 > > > > > > --download-superlu_dist=1 --download-spooles=1 > > > > > > --download-hypre=hypre-2.4.0b.tar.gz > > > > > > > > > > > > I can compile ?petsc against the mvapich2 libraries just fine. > > > > > > > > > > > > > > > > > > thanks > > > > > > matt From jerome.snho at gmail.com Thu Jan 22 23:59:05 2009 From: jerome.snho at gmail.com (jerome ho) Date: Fri, 23 Jan 2009 13:59:05 +0800 Subject: Increasing convergence rate In-Reply-To: <383ade90901202314r43e1513cg255f5e1b79d6b1ab@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> <383ade90901202314r43e1513cg255f5e1b79d6b1ab@mail.gmail.com> Message-ID: <4316b1710901222159s78ffd0do889425f57b7f0bf3@mail.gmail.com> On Wed, Jan 21, 2009 at 3:14 PM, Jed Brown wrote: > Most preconditioners are not the same in parallel, including these > implementations of AMG. At a minimum, the smoother is using a block > Jacobi version of SOR or ILU. As you add processes beyond 2, the > increase in iteration count is usually very minor. > > If you are using multiple cores, the per-core floating point > performance will also be worse due to the memory bandwidth bottleneck. > That may contribute to the poor parallel performance you are seeing. > Hi I'm getting strange results. In parallel (on 2 processors), the result doesn't to be able to converge further but appears to fluctuate between 1e-9 and 1e-8 (after 100+ iterations), when it solves in 8 iterations on a single machine. I decrease the rtol (from 1e-7) for the parallel simulation because I'm getting a 20% result difference. When I split into more (6) processors, it's reporting divergence. Am I doing something wrong here? Should I be switching to DMMG method instead? The matrix size is about 1mil x 1mil. 
Jerome From knepley at gmail.com Fri Jan 23 08:28:28 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Jan 2009 08:28:28 -0600 Subject: Increasing convergence rate In-Reply-To: <4316b1710901222159s78ffd0do889425f57b7f0bf3@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> <383ade90901202314r43e1513cg255f5e1b79d6b1ab@mail.gmail.com> <4316b1710901222159s78ffd0do889425f57b7f0bf3@mail.gmail.com> Message-ID: On Thu, Jan 22, 2009 at 11:59 PM, jerome ho wrote: > On Wed, Jan 21, 2009 at 3:14 PM, Jed Brown wrote: >> Most preconditioners are not the same in parallel, including these >> implementations of AMG. At a minimum, the smoother is using a block >> Jacobi version of SOR or ILU. As you add processes beyond 2, the >> increase in iteration count is usually very minor. >> >> If you are using multiple cores, the per-core floating point >> performance will also be worse due to the memory bandwidth bottleneck. >> That may contribute to the poor parallel performance you are seeing. >> > > Hi > > I'm getting strange results. In parallel (on 2 processors), the result > doesn't to be able to converge further but appears to fluctuate > between 1e-9 and 1e-8 (after 100+ iterations), when it solves in 8 > iterations on a single machine. I decrease the rtol (from 1e-7) for > the parallel simulation because I'm getting a 20% result difference. The default KSP is GMRES(30). Since the parallel preconditioner is weaker, you might not be able to solve it in 30 iterates. Then this solver can fluctuate every time you restart. You can try increasing the number of Krylov vectors kept. Matt > When I split into more (6) processors, it's reporting divergence. Am I > doing something wrong here? Should I be switching to DMMG method > instead? The matrix size is about 1mil x 1mil. > > Jerome -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Fri Jan 23 10:07:03 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Fri, 23 Jan 2009 17:07:03 +0100 Subject: MatCreateMPIAIJWithSplitArrays In-Reply-To: References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> <20090115164240.uo0edtyqa5ss444s@webmail.ec-nantes.fr> Message-ID: <20090123170703.t1jv9vhqk6ko0o8s@webmail.ec-nantes.fr> I have tried to upgrade to 3.0.0, I could compile the program with MatCreateMPIAIJWithSplitArrays() but somehow cause of the configuration I cannot execute the program. Anyway, does not it work with petsc-2.3.3? Is there an equivalent mechanism for MatCreateMPIAIJWithSplitArrays() (Like MatMPIAIJSetPreallocationCSR() for MatCreateMPIAIJWithArrays()) ? Thank you very much Regards, Jarunan -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE Matthew Knepley a ??crit? : > On Thu, Jan 15, 2009 at 9:42 AM, Panyasantisuk Jarunan < > Jarunan.Panyasantisuk at eleves.ec-nantes.fr> wrote: > >> When I create a matrix with MatCreateMPIAIJWithSplitArrays, as it doesn't >> copy the values so I have to use MatSetValues to set the internal value? > > > 1) You should upgrade to 3.0.0 > > 2) You should not have to call MatSetValues(). It will use the arrays you > provide. 
> > Matt > > >> >> -- >> Jarunan PANYASANTISUK >> MSc. in Computational Mechanics >> Erasmus Mundus Master Program >> Ecole Centrale de Nantes >> 1, rue de la no?, 44321 NANTES, FRANCE >> >> >> >> Barry Smith a ?(c)crit? : >> >> >>> You should be able to use MatCreateMPIAIJWithSplitArrays(), >>> MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() >>> from Fortran. Are you using PETSc 3.0.0? >>> >>> The arguments for MatCreateMPIAIJWithArrays() or >>> MatMPIAIJSetPreallocationCSR() have the same meaning >>> (in fact MatCreateMPIAIJWithArrays() essentially calls >>> MatCreateMPIAIJWithArrays()). >>> >>> Barry >>> >>> On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: >>> >>> Oh, I could not use MatCreateMPIAIJWithArrays either but the mechanism >>>> below works. >>>> >>>> call MatCreate(PETSC_COMM_WORLD,D,ierr) >>>> call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, >>>> $ ierr) >>>> call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix >>>> call MatSetFromOptions(D,ierr) >>>> call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) >>>> >>>> Where pointer is start-row indices a >>>> Column is local column indices >>>> v is value >>>> >>>> Is there the different beteween the start-row indices in >>>> MatMPIAIJSetPreallocationCSR and row indices in >>>> MatCreateMPIAIJWithArrays >>>> ? >>>> >>>> >>>> >>>> Regards, >>>> Jarunan >>>> >>>> >>>> >>>> >>>> Hello, >>>> >>>> To define a matrix with arrays, I cannot use >>>> MatCreateMPIAIJWithSplitArrays in my program which is written in >>>> Fortran: >>>> >>>> call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, >>>> $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, >>>> $ oColumn,ov,D,ierr) >>>> >>>> The error is >>>> F:246: undefined reference to `matcreatempiaijwithsplitarrays_' >>>> >>>> I could use MatCreateMPIAIJWithArrays but the off diagonal values are >>>> missing with this command. >>>> >>>> I would be appreciate for any advice. Thank you before hand. >>>> >>>> Regards, >>>> Jarunan >>>> >>>> >>>> >>>> >>>> -- >>>> Jarunan PANYASANTISUK >>>> MSc. in Computational Mechanics >>>> Erasmus Mundus Master Program >>>> Ecole Centrale de Nantes >>>> 1, rue de la no?, 44321 NANTES, FRANCE >>>> >>>> >>>> >>>> >>>> >>>> >>> >>> >> >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From knepley at gmail.com Fri Jan 23 10:12:53 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Jan 2009 10:12:53 -0600 Subject: MatCreateMPIAIJWithSplitArrays In-Reply-To: <20090123170703.t1jv9vhqk6ko0o8s@webmail.ec-nantes.fr> References: <20090114143834.l7v6bkosp8g00okk@webmail.ec-nantes.fr> <20090115164240.uo0edtyqa5ss444s@webmail.ec-nantes.fr> <20090123170703.t1jv9vhqk6ko0o8s@webmail.ec-nantes.fr> Message-ID: On Fri, Jan 23, 2009 at 10:07 AM, Panyasantisuk Jarunan wrote: > I have tried to upgrade to 3.0.0, I could compile the program with > MatCreateMPIAIJWithSplitArrays() but somehow cause of the configuration I > cannot execute the program. Anyway, does not it work with petsc-2.3.3? > > Is there an equivalent mechanism for MatCreateMPIAIJWithSplitArrays() (Like > MatMPIAIJSetPreallocationCSR() for MatCreateMPIAIJWithArrays()) ? When you report an error, please send the entire error message. In this case, I need to see the compiler error or I have absolutely no idea what is going on. 
Matt > Thank you very much > > Regards, > Jarunan > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > Matthew Knepley a ?(c)crit? : > >> On Thu, Jan 15, 2009 at 9:42 AM, Panyasantisuk Jarunan < >> Jarunan.Panyasantisuk at eleves.ec-nantes.fr> wrote: >> >>> When I create a matrix with MatCreateMPIAIJWithSplitArrays, as it doesn't >>> copy the values so I have to use MatSetValues to set the internal value? >> >> >> 1) You should upgrade to 3.0.0 >> >> 2) You should not have to call MatSetValues(). It will use the arrays you >> provide. >> >> Matt >> >> >>> >>> -- >>> Jarunan PANYASANTISUK >>> MSc. in Computational Mechanics >>> Erasmus Mundus Master Program >>> Ecole Centrale de Nantes >>> 1, rue de la no?, 44321 NANTES, FRANCE >>> >>> >>> >>> Barry Smith a ?(c)crit? : >>> >>> >>>> You should be able to use MatCreateMPIAIJWithSplitArrays(), >>>> MatCreateMPIAIJWithArrays() or MatMPIAIJSetPreallocationCSR() >>>> from Fortran. Are you using PETSc 3.0.0? >>>> >>>> The arguments for MatCreateMPIAIJWithArrays() or >>>> MatMPIAIJSetPreallocationCSR() have the same meaning >>>> (in fact MatCreateMPIAIJWithArrays() essentially calls >>>> MatCreateMPIAIJWithArrays()). >>>> >>>> Barry >>>> >>>> On Jan 14, 2009, at 7:38 AM, Panyasantisuk Jarunan wrote: >>>> >>>> Oh, I could not use MatCreateMPIAIJWithArrays either but the mechanism >>>>> >>>>> below works. >>>>> >>>>> call MatCreate(PETSC_COMM_WORLD,D,ierr) >>>>> call MatSetSizes(D,N,N,PETSC_DETERMINE,PETSC_DETERMINE, >>>>> $ ierr) >>>>> call MatSetType(D,MATMPIAIJ,ierr) ! to set type a parallel matrix >>>>> call MatSetFromOptions(D,ierr) >>>>> call MatMPIAIJSetPreallocationCSR(D,pointer,Column,v,ierr) >>>>> >>>>> Where pointer is start-row indices a >>>>> Column is local column indices >>>>> v is value >>>>> >>>>> Is there the different beteween the start-row indices in >>>>> MatMPIAIJSetPreallocationCSR and row indices in >>>>> MatCreateMPIAIJWithArrays >>>>> ? >>>>> >>>>> >>>>> >>>>> Regards, >>>>> Jarunan >>>>> >>>>> >>>>> >>>>> >>>>> Hello, >>>>> >>>>> To define a matrix with arrays, I cannot use >>>>> MatCreateMPIAIJWithSplitArrays in my program which is written in >>>>> Fortran: >>>>> >>>>> call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,N,N, >>>>> $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, >>>>> $ oColumn,ov,D,ierr) >>>>> >>>>> The error is >>>>> F:246: undefined reference to `matcreatempiaijwithsplitarrays_' >>>>> >>>>> I could use MatCreateMPIAIJWithArrays but the off diagonal values are >>>>> missing with this command. >>>>> >>>>> I would be appreciate for any advice. Thank you before hand. >>>>> >>>>> Regards, >>>>> Jarunan >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Jarunan PANYASANTISUK >>>>> MSc. in Computational Mechanics >>>>> Erasmus Mundus Master Program >>>>> Ecole Centrale de Nantes >>>>> 1, rue de la no?, 44321 NANTES, FRANCE >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From jed at 59A2.org Fri Jan 23 11:53:14 2009 From: jed at 59A2.org (Jed Brown) Date: Fri, 23 Jan 2009 08:53:14 -0900 Subject: Increasing convergence rate In-Reply-To: <4316b1710901222159s78ffd0do889425f57b7f0bf3@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> <383ade90901202314r43e1513cg255f5e1b79d6b1ab@mail.gmail.com> <4316b1710901222159s78ffd0do889425f57b7f0bf3@mail.gmail.com> Message-ID: <383ade90901230953o634500acw4a47d01ca98ff64a@mail.gmail.com> On Thu, Jan 22, 2009 at 20:59, jerome ho wrote: > I'm getting strange results. In parallel (on 2 processors), the result > doesn't to be able to converge further but appears to fluctuate > between 1e-9 and 1e-8 (after 100+ iterations), when it solves in 8 > iterations on a single machine. I decrease the rtol (from 1e-7) for > the parallel simulation because I'm getting a 20% result difference. The 20% result difference makes me very worried that the matrices are actually different. Are you still using BoomerAMG? If your 1Mx1M matrix comes from a 2D problem you might be able to compare with a direct solve (-pc_type lu -pc_factor_mat_solver_package mumps) but if it's 3D, that would take way too much memory. It's a good idea to make the problem as small as possible (like 100x100 or less) when dealing with issues of correctness. It's really hard to make a preconditioner exactly the same in parallel, even parallel ILU (like Euclid with default options) is not exactly the same. It's silly, but if you can't make the problem smaller, can't use a direct solver, and don't have an easy way to determine if the parallel matrix is the same as the serial one, try -pc_type redundant -pc_redundant_type hypre, the results (up to rounding error due to non-associativity) and number of iterations should be the same as in serial but the monitored residuals won't be exactly the same since they are computed differently. > When I split into more (6) processors, it's reporting divergence. Am I > doing something wrong here? Should I be switching to DMMG method > instead? The matrix size is about 1mil x 1mil. If you are using a structured grid then geometric multigrid (DMMG) should reduce setup time compared to AMG, but AMG might be more robust. That's not the issue here so I wouldn't bother until you get correct results in parallel. Jed From billy at dem.uminho.pt Sun Jan 25 07:52:33 2009 From: billy at dem.uminho.pt (=?iso-8859-1?Q?Billy_Ara=FAjo?=) Date: Sun, 25 Jan 2009 13:52:33 -0000 Subject: Compiling and using PETSC with intel MKL Message-ID: <1200D8BEDB3DD54DBA528E210F372BF3D944BE@BEFUNCIONARIOS.uminho.pt> Hi, Before I compiled PETSC version 2.3.3-p6 with lapack/blas supplied with Intel MKL and everythin worked great: ./config/configure.py --with-cc=gcc --with-fc=g77 --download-lam=1 --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014 --with-debugging=no --with-shared=0 Now I have tried PETSC version 3.0.0-p2 with OpenMPI and same Intel MKL version and it compiles but when I run my application I get the following error: MKL FATAL ERROR: libmkl_def.so: cannot open shared object file: No such file or directory ./config/configure.py --with-cc=gcc --with-fc=g77 --download-openmpi=1 --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014 --with-debugging=no --with-shared=0 Regards, Billy. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From w_subber at yahoo.com Sun Jan 25 10:30:26 2009 From: w_subber at yahoo.com (Waad Subber) Date: Sun, 25 Jan 2009 08:30:26 -0800 (PST) Subject: cannot compile using petsc 3.0 Message-ID: <902.61979.qm@web38206.mail.mud.yahoo.com> Hi, I just installed petsc-3.0.0-p2. Everything goes well in the installation using intel compilers and Intel MKL. When I compile my code I get these error messages : mfactor.F(21): #error: can't find include file: include/finclude/petsc.h mfactor.F(22): #error: can't find include file: include/finclude/petscvec.h mfactor.F(23): #error: can't find include file: include/finclude/petscviewer.h mfactor.F(24): #error: can't find include file: include/finclude/petscmat.h : : : My makefile is ------------------------------------------------------------------------------------------- include ${PETSC_DIR}/conf/base INCS=${PETSC_INCLUDE} LIBS=${PETSC_LIB} MAIN = RUN.exe COMP = mpif90 BINDIR = ../bin LIST = mfactor.F msolve.F mveccreate.F m_mdim.f90 m_minput.f90? m_mglobal.F m_moutput.F all : $(LIST) ??? $(COMP) $(INCS) $(LIST) $(LIBS) -o $(BINDIR)/$(MAIN)??? ??? ${RM} -f *.o *.mod --------------------------------------------------------------------------------------- Any idea ? thanks Waad -------------- next part -------------- An HTML attachment was scrubbed... URL: From billy at dem.uminho.pt Sun Jan 25 12:35:03 2009 From: billy at dem.uminho.pt (=?iso-8859-1?Q?Billy_Ara=FAjo?=) Date: Sun, 25 Jan 2009 18:35:03 -0000 Subject: PETSc version 3.0.0-p2 Message-ID: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> I compiled my code using the new version of PETSc but the convergence of the GMRES has suffered greatly, specially in parallel. I am using GMRES with ASM preconditioner. Has there been any change? Billy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sun Jan 25 13:37:53 2009 From: jed at 59A2.org (Jed Brown) Date: Sun, 25 Jan 2009 10:37:53 -0900 Subject: cannot compile using petsc 3.0 In-Reply-To: <902.61979.qm@web38206.mail.mud.yahoo.com> References: <902.61979.qm@web38206.mail.mud.yahoo.com> Message-ID: <20090125193753.GA15428@brakk> On Sun 2009-01-25 08:30, Waad Subber wrote: > I just installed petsc-3.0.0-p2. Everything goes well in the installation using > intel compilers and Intel MKL. When I compile my code I get these error > messages : > > mfactor.F(21): #error: can't find include file: include/finclude/petsc.h > mfactor.F(22): #error: can't find include file: include/finclude/petscvec.h > mfactor.F(23): #error: can't find include file: include/finclude/petscviewer.h > mfactor.F(24): #error: can't find include file: include/finclude/petscmat.h You should just include "finclude/petscXXX.h" Including "include/finclude/petscXXX.h" doesn't work with 3.0.0 because $PETSC_DIR is no longer a default include directory, everything the user needs is in $PETSC_DIR/include. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available URL: From knepley at gmail.com Sun Jan 25 13:38:54 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Jan 2009 13:38:54 -0600 Subject: Compiling and using PETSC with intel MKL In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D944BE@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D944BE@BEFUNCIONARIOS.uminho.pt> Message-ID: With any bug report, please send configure.log and make*.log. 
Otherwise, we really cannot see what is going on. Matt On Sun, Jan 25, 2009 at 7:52 AM, Billy Ara?jo wrote: > > Hi, > > Before I compiled PETSC version 2.3.3-p6 with lapack/blas supplied with > Intel MKL and everythin worked great: > > ./config/configure.py --with-cc=gcc --with-fc=g77 --download-lam=1 > --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014 --with-debugging=no > --with-shared=0 > > Now I have tried PETSC version 3.0.0-p2 with OpenMPI and same Intel MKL > version and it compiles but when I run my application I get the following > error: > > MKL FATAL ERROR: libmkl_def.so: cannot open shared object file: No such file > or directory > > ./config/configure.py --with-cc=gcc --with-fc=g77 --download-openmpi=1 > --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014 --with-debugging=no > --with-shared=0 > > Regards, > > Billy. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Sun Jan 25 13:40:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Jan 2009 13:40:11 -0600 Subject: PETSc version 3.0.0-p2 In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> Message-ID: On Sun, Jan 25, 2009 at 12:35 PM, Billy Ara?jo wrote: > > I compiled my code using the new version of PETSc but the convergence of the > GMRES has suffered greatly, specially in parallel. I am using GMRES with ASM > preconditioner. > Has there been any change? No. Run with -snes_view to see exactly what solver you are using. Matt > Billy. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Sun Jan 25 14:22:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Jan 2009 14:22:01 -0600 Subject: PETSc version 3.0.0-p2 In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> Message-ID: <5901D5AC-DE04-441A-BBAB-88BBB51510F9@mcs.anl.gov> How are you determining "convergence"? Run old and new with - ksp_monitor_true_residual -ksp_truemonitor We did change approach to convergence when running with a nonzero initial guess. Barry On Jan 25, 2009, at 12:35 PM, Billy Ara?jo wrote: > > I compiled my code using the new version of PETSc but the > convergence of the GMRES has suffered greatly, specially in > parallel. I am using GMRES with ASM preconditioner. > Has there been any change? > > Billy. 
> From bsmith at mcs.anl.gov Sun Jan 25 14:26:00 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Jan 2009 14:26:00 -0600 Subject: Compiling and using PETSC with intel MKL In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D944BE@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D944BE@BEFUNCIONARIOS.uminho.pt> Message-ID: <7E7F102C-2F44-47F7-9E69-0E46BD8B3A84@mcs.anl.gov> On Jan 25, 2009, at 7:52 AM, Billy Ara?jo wrote: > > Hi, > > Before I compiled PETSC version 2.3.3-p6 with lapack/blas supplied > with Intel MKL and everythin worked great: > > ./config/configure.py --with-cc=gcc --with-fc=g77 --download-lam=1 -- > with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014 --with-debugging=no -- > with-shared=0 > > Now I have tried PETSC version 3.0.0-p2 with OpenMPI and same Intel > MKL version and it compiles but when I run my application I get the > following error: > > MKL FATAL ERROR: libmkl_def.so: cannot open shared object file: No > such file or directory > It cannot find the location of the shared library at runtime. You may need to set a LD_LIBRARY_PATH or similar environmental variable before running the program. Barry > > > ./config/configure.py --with-cc=gcc --with-fc=g77 --download- > openmpi=1 --with-blas-lapack-dir=/opt/intel/mkl/10.0.1.014 --with- > debugging=no --with-shared=0 > > Regards, > > Billy. From billy at dem.uminho.pt Sun Jan 25 15:50:56 2009 From: billy at dem.uminho.pt (=?iso-8859-1?Q?Billy_Ara=FAjo?=) Date: Sun, 25 Jan 2009 21:50:56 -0000 Subject: PETSc version 3.0.0-p2 References: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> <5901D5AC-DE04-441A-BBAB-88BBB51510F9@mcs.anl.gov> Message-ID: <1200D8BEDB3DD54DBA528E210F372BF3D944C2@BEFUNCIONARIOS.uminho.pt> I am only running my application on a dual-core intel machine (2 processors). I just noted that the GMRES solver is much slower but before I was using LAMMPI so maybe it has to do with OpenMPI. I have to test it further and then I will keep you posted. Regards, Billy. -----Mensagem original----- De: petsc-users-bounces at mcs.anl.gov em nome de Barry Smith Enviada: dom 25-01-2009 20:22 Para: PETSc users list Assunto: Re: PETSc version 3.0.0-p2 How are you determining "convergence"? Run old and new with - ksp_monitor_true_residual -ksp_truemonitor We did change approach to convergence when running with a nonzero initial guess. Barry On Jan 25, 2009, at 12:35 PM, Billy Ara?jo wrote: > > I compiled my code using the new version of PETSc but the > convergence of the GMRES has suffered greatly, specially in > parallel. I am using GMRES with ASM preconditioner. > Has there been any change? > > Billy. > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 3011 bytes Desc: not available URL: From irfan.khan at gatech.edu Sun Jan 25 20:19:38 2009 From: irfan.khan at gatech.edu (Khan, Irfan) Date: Sun, 25 Jan 2009 21:19:38 -0500 (EST) Subject: parallel solvers and matrix structure In-Reply-To: <2125197683.1813191232935148354.JavaMail.root@mail4.gatech.edu> Message-ID: <1980111559.1816051232936378340.JavaMail.root@mail4.gatech.edu> Dear Petsc team Firstly, thanks for developing PETSc. I have been using it to parallelize a linear finite element code to use with a parallel lattice Boltzmann code. It has helped me a lot untill now. I have some questions about the way the parallel solvers handles matrices with varying zeros in the matrix. 
I have found that the more MatSetValue() commands I use in my code the slower it gets. Therefore I initialize the parallel Stiffness matrix to 0.0. I then fill in the values using an "if" conditions to eliminate zero entries into the parallel Stiffness matrix. This reduces the number of times MatSetValue() is called greatly (5-15 times). I also approximate the number of non-zero entries into the parallel matrix, that I create using MatCreateMPIAIJ. I have attached outputs of running my code with "-info"; one with the "if" condition and the other without the conditions (at the end of the email). I have also compared the matrices generated by using the "if" condition and the one generated without the "if" condition. They are the same except for the extra zero entries in the later case. But what I realize is that during solving, the matrix generated with the "if" condition is not converging while the matrix generated without the "if" conditions converges in 17 iterations. I use KSPCG. It would help me a lot if I can get some suggestions on how to use MatSetValue() optimally, the reason for KSPCG failing to converge and if something can be done. Also any suggestions on if I should not use an "if" condition to enter values into the Stiffness matrix to eliminate zero entries. I will be glad to send any other information if needed. Thanks in advance Best Regards Irfan WITHOUT "IF" CONDITION < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 mallocs. < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 306 unneeded,4356 used < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 216 < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using Inode routines < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 465 unneeded,3576 used < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 183 < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using Inode routines < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 243 unneeded,972 used < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 72 < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/(num_localrows 66) < 0.6. Do not use CompressedRow routines. < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 504 unneeded,1116 used < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 99 < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 66) < 0.6. Do not use CompressedRow routines. WITH "IF" CONDITION > [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 118 unneeded,1304 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 > [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. 
Not using Inode routines > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 353 unneeded,1033 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 6 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 > [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using Inode routines > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: 14 unneeded,121 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 > [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/(num_localrows 66) > 0.6. Use CompressedRow routines. > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: 14 unneeded,121 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 > [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/(num_localrows 66) > 0.6. Use CompressedRow routines. From knepley at gmail.com Sun Jan 25 20:42:31 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Jan 2009 20:42:31 -0600 Subject: parallel solvers and matrix structure In-Reply-To: <1980111559.1816051232936378340.JavaMail.root@mail4.gatech.edu> References: <2125197683.1813191232935148354.JavaMail.root@mail4.gatech.edu> <1980111559.1816051232936378340.JavaMail.root@mail4.gatech.edu> Message-ID: On Sun, Jan 25, 2009 at 8:19 PM, Khan, Irfan wrote: > Dear Petsc team > > Firstly, thanks for developing PETSc. I have been using it to parallelize a linear finite element code to use with a parallel lattice Boltzmann code. It has helped me a lot untill now. > > I have some questions about the way the parallel solvers handles matrices with varying zeros in the matrix. > > I have found that the more MatSetValue() commands I use in my code the slower it gets. Therefore I initialize the parallel Stiffness matrix to 0.0. I then fill in the values using an "if" conditions to eliminate zero entries into the parallel Stiffness matrix. This reduces the number of times MatSetValue() is called greatly (5-15 times). I also approximate the number of non-zero entries into the parallel matrix, that I create using MatCreateMPIAIJ. I have attached outputs of running my code with "-info"; one with the "if" condition and the other without the conditions (at the end of the email). 1) The real problem here I think is not the number of times you call MAtSetValues(), but the fact that your matrix preallocation is incorrect if you do not exclude some of the zero values. If you fix this, the speed should be about the same. > I have also compared the matrices generated by using the "if" condition and the one generated without the "if" condition. They are the same except for the extra zero entries in the later case. But what I realize is that during solving, the matrix generated with the "if" condition is not converging while the matrix generated without the "if" conditions converges in 17 iterations. I use KSPCG. 2) I am guessing that you are using ILU. This depends on the nonzero pattern of the matrix, and thus will change between these two cases. Matt > It would help me a lot if I can get some suggestions on how to use MatSetValue() optimally, the reason for KSPCG failing to converge and if something can be done. Also any suggestions on if I should not use an "if" condition to enter values into the Stiffness matrix to eliminate zero entries. 
> > I will be glad to send any other information if needed. > > Thanks in advance > Best Regards > Irfan > > > > > WITHOUT "IF" CONDITION > < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 > < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 mallocs. > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 306 unneeded,4356 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 216 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using Inode routines > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 465 unneeded,3576 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 183 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using Inode routines > > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 243 unneeded,972 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 72 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/(num_localrows 66) < 0.6. Do not use CompressedRow routines. > > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 504 unneeded,1116 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 99 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 66) < 0.6. Do not use CompressedRow routines. > > > WITH "IF" CONDITION >> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 >> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 118 unneeded,1304 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. Not using Inode routines >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: 353 unneeded,1033 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 6 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using Inode routines > >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: 14 unneeded,121 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 >> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/(num_localrows 66) > 0.6. Use CompressedRow routines. > >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: 14 unneeded,121 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 >> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/(num_localrows 66) > 0.6. Use CompressedRow routines. > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From bsmith at mcs.anl.gov Sun Jan 25 20:42:42 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Jan 2009 20:42:42 -0600 Subject: parallel solvers and matrix structure In-Reply-To: <1980111559.1816051232936378340.JavaMail.root@mail4.gatech.edu> References: <1980111559.1816051232936378340.JavaMail.root@mail4.gatech.edu> Message-ID: On Jan 25, 2009, at 8:19 PM, Khan, Irfan wrote: > Dear Petsc team > > Firstly, thanks for developing PETSc. I have been using it to > parallelize a linear finite element code to use with a parallel > lattice Boltzmann code. It has helped me a lot untill now. > > I have some questions about the way the parallel solvers handles > matrices with varying zeros in the matrix. > > I have found that the more MatSetValue() commands I use in my code > the slower it gets. Therefore I initialize the parallel Stiffness > matrix to 0.0. I don't know what you mean by this. You should NOT call MatZeroEntries() on the matrix initially. This will destroy your preallocation information. Call MatZeroEntries() after you have filled it up and used it, when you are ready to start again. > I then fill in the values using an "if" conditions to eliminate zero > entries into the parallel Stiffness matrix. This reduces the number > of times MatSetValue() is called greatly (5-15 times). I also > approximate the number of non-zero entries into the parallel matrix, > that I create using MatCreateMPIAIJ. I have attached outputs of > running my code with "-info"; one with the "if" condition and the > other without the conditions (at the end of the email). If you are using the SeqAIJ or MPIAIJ matrices then you can simply call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) and it will not put in those locations with zero value. > > > I have also compared the matrices generated by using the "if" > condition and the one generated without the "if" condition. They are > the same except for the extra zero entries in the later case. But > what I realize is that during solving, the matrix generated with the > "if" condition is not converging while the matrix generated without > the "if" conditions converges in 17 iterations. I use KSPCG. First run both cases with -pc_type lu and confirm that they both produce the same solution to the linear system (on one process). Also confirm that the matrix is symmetric positive definite so you can use CG. The reason for the difference in converging is because the default solver that PETSc uses is ILU(0). The "quality" of an ILU preconditioner depends on the "nonzero" structure of the matrix, the more "nonzero" locations the more likely there will be convergence and it will be faster. For ILU a zero in a location is different than not having that location at all. Likely you want to use a better preconditioner; what about -pc_type sor? For moderate size problems, say < 100,000 you may want to use Cholesky direct solver. In parallel run config/ configure.py --download-spooles then run the code with -pc_type cholesky -pc_factor_mat_solver_package spooles Barry > > > It would help me a lot if I can get some suggestions on how to use > MatSetValue() optimally, the reason for KSPCG failing to converge > and if something can be done. Also any suggestions on if I should > not use an "if" condition to enter values into the Stiffness matrix > to eliminate zero entries. > > I will be glad to send any other information if needed. 
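Barry's suggestion just above -- compare both assemblies against a direct solve on one process, and confirm symmetry before trusting CG -- can be scripted in a few lines. The following is only an illustrative sketch, not code from this thread: the names A, b and x_it (the iterative solution) and the symmetry tolerance are placeholders, and the calls are written against the PETSc 3.0.0 C interface.

  /* Reference solve with LU on one process; compare it with the iterative answer. */
  KSP        ksp;
  PC         pc;
  Vec        x_lu, diff;
  PetscReal  err;
  PetscTruth sym;

  MatIsSymmetric(A, 1.e-10, &sym);     /* necessary (not sufficient) check before using KSPCG */

  KSPCreate(PETSC_COMM_SELF, &ksp);
  KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);
  KSPSetType(ksp, KSPPREONLY);         /* apply the preconditioner exactly once ...           */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCLU);                 /* ... and make that preconditioner a full LU solve    */
  VecDuplicate(b, &x_lu);
  KSPSolve(ksp, b, x_lu);

  VecDuplicate(b, &diff);
  VecWAXPY(diff, -1.0, x_lu, x_it);    /* diff = x_it - x_lu                                  */
  VecNorm(diff, NORM_2, &err);         /* should be of the order of the solver tolerance      */

This is the programmatic equivalent of rerunning with -ksp_type preonly -pc_type lu and comparing the two solution vectors by hand.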
> > Thanks in advance > Best Regards > Irfan > > > > > WITHOUT "IF" CONDITION > < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 > < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 > mallocs. > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 306 unneeded,4356 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 216 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using > Inode routines > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 465 unneeded,3576 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 183 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using > Inode routines > > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 243 unneeded,972 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 72 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 504 unneeded,1116 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 99 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > > WITH "IF" CONDITION >> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 >> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 118 unneeded,1304 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. Not using >> Inode routines >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 353 unneeded,1033 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 6 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using >> Inode routines > >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: >> 14 unneeded,121 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 >> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: >> 14 unneeded,121 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 >> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. 
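The "unneeded" storage and the nonzero "mallocs during MatSetValues()" counts in the -info output quoted above are the symptom Matt and Barry are pointing at: the preallocation no longer matches what is actually inserted once zeros are filtered out. A minimal sketch of the combination they suggest -- preallocate for the entries that are really kept and let PETSc drop exact zeros -- is shown below. The per-row counts are invented placeholders, not values for this problem, and MatSetOption() is written with the extra PETSC_TRUE argument of PETSc 3.0.0 (the two-argument form in Barry's message is the older calling sequence).

  Mat      A;
  PetscInt m    = 66;                  /* local rows/columns, as in the -info output above    */
  PetscInt d_nz = 30, o_nz = 10;       /* assumed per-row estimates: diagonal / off-diag part */

  MatCreateMPIAIJ(PETSC_COMM_WORLD, m, m, PETSC_DETERMINE, PETSC_DETERMINE,
                  d_nz, PETSC_NULL, o_nz, PETSC_NULL, &A);
  MatSetOption(A, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);

  /* element assembly loop: MatSetValues(A, nr, rows, nc, cols, ke, ADD_VALUES);
     exact zeros are now skipped, so d_nz/o_nz should count only the kept entries */

  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

With the counts right, -info should report zero mallocs during MatSetValues() and little or no unneeded storage.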
> > > > > From irfan.khan at gatech.edu Sun Jan 25 23:04:21 2009 From: irfan.khan at gatech.edu (Khan, Irfan) Date: Mon, 26 Jan 2009 00:04:21 -0500 (EST) Subject: parallel solvers and matrix structure In-Reply-To: <1172248477.1841271232946199356.JavaMail.root@mail4.gatech.edu> Message-ID: <391003168.1841291232946261557.JavaMail.root@mail4.gatech.edu> Thanks for the quick and helpful suggestions. I have used MatSetOptions() using MAT_IGNORE_ZERO_ENTRIES with -pc_type sor. This gives me the same Stiffness matrix in both cases (with and without excluding zero entries). Further the results are the same, including the number of iterations. However, the number of iterations to convergence is 70. Earlier I used to use PCASM with KSPCG. This required 17 iterations. Barry mentioned "spooles" with pc_type cholesky. But I am skeptical about its performs on systems with over a million degrees of freedom. My matrices are positive definite, thus CG may be ideally suited. But I am not sure what kind of preconditioner would best suit in parallel. Any suggestions on preconditioner-solver combinations for very large (1-10 million dof) positive definite systems? Thanks again Best Regards Irfan ----- Original Message ----- From: "Barry Smith" To: "PETSc users list" Sent: Sunday, January 25, 2009 9:42:42 PM GMT -05:00 US/Canada Eastern Subject: Re: parallel solvers and matrix structure On Jan 25, 2009, at 8:19 PM, Khan, Irfan wrote: > Dear Petsc team > > Firstly, thanks for developing PETSc. I have been using it to > parallelize a linear finite element code to use with a parallel > lattice Boltzmann code. It has helped me a lot untill now. > > I have some questions about the way the parallel solvers handles > matrices with varying zeros in the matrix. > > I have found that the more MatSetValue() commands I use in my code > the slower it gets. Therefore I initialize the parallel Stiffness > matrix to 0.0. I don't know what you mean by this. You should NOT call MatZeroEntries() on the matrix initially. This will destroy your preallocation information. Call MatZeroEntries() after you have filled it up and used it, when you are ready to start again. > I then fill in the values using an "if" conditions to eliminate zero > entries into the parallel Stiffness matrix. This reduces the number > of times MatSetValue() is called greatly (5-15 times). I also > approximate the number of non-zero entries into the parallel matrix, > that I create using MatCreateMPIAIJ. I have attached outputs of > running my code with "-info"; one with the "if" condition and the > other without the conditions (at the end of the email). If you are using the SeqAIJ or MPIAIJ matrices then you can simply call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) and it will not put in those locations with zero value. > > > I have also compared the matrices generated by using the "if" > condition and the one generated without the "if" condition. They are > the same except for the extra zero entries in the later case. But > what I realize is that during solving, the matrix generated with the > "if" condition is not converging while the matrix generated without > the "if" conditions converges in 17 iterations. I use KSPCG. First run both cases with -pc_type lu and confirm that they both produce the same solution to the linear system (on one process). Also confirm that the matrix is symmetric positive definite so you can use CG. The reason for the difference in converging is because the default solver that PETSc uses is ILU(0). 
The "quality" of an ILU preconditioner depends on the "nonzero" structure of the matrix, the more "nonzero" locations the more likely there will be convergence and it will be faster. For ILU a zero in a location is different than not having that location at all. Likely you want to use a better preconditioner; what about -pc_type sor? For moderate size problems, say < 100,000 you may want to use Cholesky direct solver. In parallel run config/ configure.py --download-spooles then run the code with -pc_type cholesky -pc_factor_mat_solver_package spooles Barry > > > It would help me a lot if I can get some suggestions on how to use > MatSetValue() optimally, the reason for KSPCG failing to converge > and if something can be done. Also any suggestions on if I should > not use an "if" condition to enter values into the Stiffness matrix > to eliminate zero entries. > > I will be glad to send any other information if needed. > > Thanks in advance > Best Regards > Irfan > > > > > WITHOUT "IF" CONDITION > < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 > < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 > mallocs. > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 306 unneeded,4356 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 216 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using > Inode routines > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 465 unneeded,3576 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 183 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using > Inode routines > > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 243 unneeded,972 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 72 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 504 unneeded,1116 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 99 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > > WITH "IF" CONDITION >> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 >> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 118 unneeded,1304 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. Not using >> Inode routines >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 353 unneeded,1033 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 6 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. 
Not using >> Inode routines > >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: >> 14 unneeded,121 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 >> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: >> 14 unneeded,121 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 >> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > > > > > From w_subber at yahoo.com Mon Jan 26 08:19:43 2009 From: w_subber at yahoo.com (Waad Subber) Date: Mon, 26 Jan 2009 06:19:43 -0800 (PST) Subject: parallel solvers and matrix structure In-Reply-To: <391003168.1841291232946261557.JavaMail.root@mail4.gatech.edu> Message-ID: <867107.32871.qm@web38203.mail.mud.yahoo.com> Are you doing substructuring yourself or? just using petsc parallel solver ? --- On Mon, 1/26/09, Khan, Irfan wrote: From: Khan, Irfan Subject: Re: parallel solvers and matrix structure To: "PETSc users list" Date: Monday, January 26, 2009, 12:04 AM Thanks for the quick and helpful suggestions. I have used MatSetOptions() using MAT_IGNORE_ZERO_ENTRIES with -pc_type sor. This gives me the same Stiffness matrix in both cases (with and without excluding zero entries). Further the results are the same, including the number of iterations. However, the number of iterations to convergence is 70. Earlier I used to use PCASM with KSPCG. This required 17 iterations. Barry mentioned "spooles" with pc_type cholesky. But I am skeptical about its performs on systems with over a million degrees of freedom. My matrices are positive definite, thus CG may be ideally suited. But I am not sure what kind of preconditioner would best suit in parallel. Any suggestions on preconditioner-solver combinations for very large (1-10 million dof) positive definite systems? Thanks again Best Regards Irfan ----- Original Message ----- From: "Barry Smith" To: "PETSc users list" Sent: Sunday, January 25, 2009 9:42:42 PM GMT -05:00 US/Canada Eastern Subject: Re: parallel solvers and matrix structure On Jan 25, 2009, at 8:19 PM, Khan, Irfan wrote: > Dear Petsc team > > Firstly, thanks for developing PETSc. I have been using it to > parallelize a linear finite element code to use with a parallel > lattice Boltzmann code. It has helped me a lot untill now. > > I have some questions about the way the parallel solvers handles > matrices with varying zeros in the matrix. > > I have found that the more MatSetValue() commands I use in my code > the slower it gets. Therefore I initialize the parallel Stiffness > matrix to 0.0. I don't know what you mean by this. You should NOT call MatZeroEntries() on the matrix initially. This will destroy your preallocation information. Call MatZeroEntries() after you have filled it up and used it, when you are ready to start again. > I then fill in the values using an "if" conditions to eliminate zero > entries into the parallel Stiffness matrix. This reduces the number > of times MatSetValue() is called greatly (5-15 times). I also > approximate the number of non-zero entries into the parallel matrix, > that I create using MatCreateMPIAIJ. 
I have attached outputs of > running my code with "-info"; one with the "if" condition and the > other without the conditions (at the end of the email). If you are using the SeqAIJ or MPIAIJ matrices then you can simply call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) and it will not put in those locations with zero value. > > > I have also compared the matrices generated by using the "if" > condition and the one generated without the "if" condition. They are > the same except for the extra zero entries in the later case. But > what I realize is that during solving, the matrix generated with the > "if" condition is not converging while the matrix generated without > the "if" conditions converges in 17 iterations. I use KSPCG. First run both cases with -pc_type lu and confirm that they both produce the same solution to the linear system (on one process). Also confirm that the matrix is symmetric positive definite so you can use CG. The reason for the difference in converging is because the default solver that PETSc uses is ILU(0). The "quality" of an ILU preconditioner depends on the "nonzero" structure of the matrix, the more "nonzero" locations the more likely there will be convergence and it will be faster. For ILU a zero in a location is different than not having that location at all. Likely you want to use a better preconditioner; what about -pc_type sor? For moderate size problems, say < 100,000 you may want to use Cholesky direct solver. In parallel run config/ configure.py --download-spooles then run the code with -pc_type cholesky -pc_factor_mat_solver_package spooles Barry > > > It would help me a lot if I can get some suggestions on how to use > MatSetValue() optimally, the reason for KSPCG failing to converge > and if something can be done. Also any suggestions on if I should > not use an "if" condition to enter values into the Stiffness matrix > to eliminate zero entries. > > I will be glad to send any other information if needed. > > Thanks in advance > Best Regards > Irfan > > > > > WITHOUT "IF" CONDITION > < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 > < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 > mallocs. > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 306 unneeded,4356 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 216 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using > Inode routines > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 465 unneeded,3576 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 183 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using > Inode routines > > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 243 unneeded,972 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 72 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. 
> > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 504 unneeded,1116 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 99 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > > WITH "IF" CONDITION >> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 >> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 118 unneeded,1304 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. Not using >> Inode routines >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 353 unneeded,1033 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 6 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using >> Inode routines > >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: >> 14 unneeded,121 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 >> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: >> 14 unneeded,121 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 >> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From irfan.khan at gatech.edu Mon Jan 26 10:21:37 2009 From: irfan.khan at gatech.edu (Khan, Irfan) Date: Mon, 26 Jan 2009 11:21:37 -0500 (EST) Subject: parallel solvers and matrix structure In-Reply-To: <436963937.1993691232986882728.JavaMail.root@mail4.gatech.edu> Message-ID: <1516096250.1993791232986897340.JavaMail.root@mail4.gatech.edu> Are you doing substructuring yourself or just using petsc parallel solver ? I am not doing any substructuring, however, I am using the "-mat_partitioning_type parmetis" option to divide the geometry. I then assemble the global stiffness matrix and global load vector and use petsc parallel solver. Thanks Irfan --- On Mon, 1/26/09, Khan, Irfan wrote: From: Khan, Irfan Subject: Re: parallel solvers and matrix structure To: "PETSc users list" Date: Monday, January 26, 2009, 12:04 AM Thanks for the quick and helpful suggestions. I have used MatSetOptions() using MAT_IGNORE_ZERO_ENTRIES with -pc_type sor. This gives me the same Stiffness matrix in both cases (with and without excluding zero entries). Further the results are the same, including the number of iterations. However, the number of iterations to convergence is 70. Earlier I used to use PCASM with KSPCG. This required 17 iterations. Barry mentioned "spooles" with pc_type cholesky. But I am skeptical about its performs on systems with over a million degrees of freedom. My matrices are positive definite, thus CG may be ideally suited. But I am not sure what kind of preconditioner would best suit in parallel. 
Any suggestions on preconditioner-solver combinations for very large (1-10 million dof) positive definite systems? Thanks again Best Regards Irfan ----- Original Message ----- From: "Barry Smith" To: "PETSc users list" Sent: Sunday, January 25, 2009 9:42:42 PM GMT -05:00 US/Canada Eastern Subject: Re: parallel solvers and matrix structure On Jan 25, 2009, at 8:19 PM, Khan, Irfan wrote: > Dear Petsc team > > Firstly, thanks for developing PETSc. I have been using it to > parallelize a linear finite element code to use with a parallel > lattice Boltzmann code. It has helped me a lot untill now. > > I have some questions about the way the parallel solvers handles > matrices with varying zeros in the matrix. > > I have found that the more MatSetValue() commands I use in my code > the slower it gets. Therefore I initialize the parallel Stiffness > matrix to 0.0. I don't know what you mean by this. You should NOT call MatZeroEntries() on the matrix initially. This will destroy your preallocation information. Call MatZeroEntries() after you have filled it up and used it, when you are ready to start again. > I then fill in the values using an "if" conditions to eliminate zero > entries into the parallel Stiffness matrix. This reduces the number > of times MatSetValue() is called greatly (5-15 times). I also > approximate the number of non-zero entries into the parallel matrix, > that I create using MatCreateMPIAIJ. I have attached outputs of > running my code with "-info"; one with the "if" condition and the > other without the conditions (at the end of the email). If you are using the SeqAIJ or MPIAIJ matrices then you can simply call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) and it will not put in those locations with zero value. > > > I have also compared the matrices generated by using the "if" > condition and the one generated without the "if" condition. They are > the same except for the extra zero entries in the later case. But > what I realize is that during solving, the matrix generated with the > "if" condition is not converging while the matrix generated without > the "if" conditions converges in 17 iterations. I use KSPCG. First run both cases with -pc_type lu and confirm that they both produce the same solution to the linear system (on one process). Also confirm that the matrix is symmetric positive definite so you can use CG. The reason for the difference in converging is because the default solver that PETSc uses is ILU(0). The "quality" of an ILU preconditioner depends on the "nonzero" structure of the matrix, the more "nonzero" locations the more likely there will be convergence and it will be faster. For ILU a zero in a location is different than not having that location at all. Likely you want to use a better preconditioner; what about -pc_type sor? For moderate size problems, say < 100,000 you may want to use Cholesky direct solver. In parallel run config/ configure.py --download-spooles then run the code with -pc_type cholesky -pc_factor_mat_solver_package spooles Barry > > > It would help me a lot if I can get some suggestions on how to use > MatSetValue() optimally, the reason for KSPCG failing to converge > and if something can be done. Also any suggestions on if I should > not use an "if" condition to enter values into the Stiffness matrix > to eliminate zero entries. > > I will be glad to send any other information if needed. 
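For the preconditioner question above: as long as the code builds its solver through KSPSetFromOptions(), every candidate mentioned in this thread (block Jacobi or additive Schwarz with incomplete-factorization subsolves, SOR, algebraic multigrid via hypre, or a parallel direct Cholesky through an external package) can be tried from the command line without recompiling. A rough sketch of that setup, assuming the matrix A and vectors b, x already exist -- this is an illustration, not code from the thread:

  KSP ksp;
  PC  pc;

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);
  KSPSetType(ksp, KSPCG);              /* CG is only appropriate if A is symmetric positive definite */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCBJACOBI);            /* a default; run-time options below override it              */
  KSPSetFromOptions(ksp);
  KSPSolve(ksp, b, x);

Typical runs would then compare, for example, -pc_type asm -sub_pc_type icc against -pc_type hypre -pc_hypre_type boomeramg on the same matrix.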
> > Thanks in advance > Best Regards > Irfan > > > > > WITHOUT "IF" CONDITION > < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 > < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 > mallocs. > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 306 unneeded,4356 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 216 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using > Inode routines > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 465 unneeded,3576 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 183 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using > Inode routines > > < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 243 unneeded,972 used > < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 72 > < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: > 504 unneeded,1116 used > < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during > MatSetValues() is 99 > < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 > < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/ > (num_localrows 66) < 0.6. Do not use CompressedRow routines. > > > WITH "IF" CONDITION >> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 >> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 118 unneeded,1304 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. Not using >> Inode routines >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 353 unneeded,1033 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 6 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using >> Inode routines > >> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: >> 14 unneeded,121 used >> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 >> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: >> 14 unneeded,121 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 >> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/ >> (num_localrows 66) > 0.6. Use CompressedRow routines. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Mon Jan 26 11:30:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 26 Jan 2009 11:30:39 -0600 Subject: parallel solvers and matrix structure In-Reply-To: <391003168.1841291232946261557.JavaMail.root@mail4.gatech.edu> References: <391003168.1841291232946261557.JavaMail.root@mail4.gatech.edu> Message-ID: You'll want to see why it is running with a different number of iterations with the later PETSc release. As I said before run both (old and new) with -ksp_monitor_true_residual -ksp_truemonitor Barry On Jan 25, 2009, at 11:04 PM, Khan, Irfan wrote: > Thanks for the quick and helpful suggestions. I have used > MatSetOptions() using MAT_IGNORE_ZERO_ENTRIES with -pc_type sor. > This gives me the same Stiffness matrix in both cases (with and > without excluding zero entries). Further the results are the same, > including the number of iterations. > > However, the number of iterations to convergence is 70. Earlier I > used to use PCASM with KSPCG. This required 17 iterations. Barry > mentioned "spooles" with pc_type cholesky. But I am skeptical about > its performs on systems with over a million degrees of freedom. My > matrices are positive definite, thus CG may be ideally suited. But I > am not sure what kind of preconditioner would best suit in parallel. > > Any suggestions on preconditioner-solver combinations for very large > (1-10 million dof) positive definite systems? > > Thanks again > Best Regards > Irfan > > > ----- Original Message ----- > From: "Barry Smith" > To: "PETSc users list" > Sent: Sunday, January 25, 2009 9:42:42 PM GMT -05:00 US/Canada Eastern > Subject: Re: parallel solvers and matrix structure > > > On Jan 25, 2009, at 8:19 PM, Khan, Irfan wrote: > >> Dear Petsc team >> >> Firstly, thanks for developing PETSc. I have been using it to >> parallelize a linear finite element code to use with a parallel >> lattice Boltzmann code. It has helped me a lot untill now. >> >> I have some questions about the way the parallel solvers handles >> matrices with varying zeros in the matrix. >> >> I have found that the more MatSetValue() commands I use in my code >> the slower it gets. Therefore I initialize the parallel Stiffness >> matrix to 0.0. > > I don't know what you mean by this. You should NOT call > MatZeroEntries() on the matrix initially. This will destroy your > preallocation information. > Call MatZeroEntries() after you have filled it up and used it, when > you are ready to start again. > >> I then fill in the values using an "if" conditions to eliminate zero >> entries into the parallel Stiffness matrix. This reduces the number >> of times MatSetValue() is called greatly (5-15 times). I also >> approximate the number of non-zero entries into the parallel matrix, >> that I create using MatCreateMPIAIJ. I have attached outputs of >> running my code with "-info"; one with the "if" condition and the >> other without the conditions (at the end of the email). > > If you are using the SeqAIJ or MPIAIJ matrices then you can simply > call MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) > and it will not put in those locations with zero value. > > >> >> >> I have also compared the matrices generated by using the "if" >> condition and the one generated without the "if" condition. They are >> the same except for the extra zero entries in the later case. But >> what I realize is that during solving, the matrix generated with the >> "if" condition is not converging while the matrix generated without >> the "if" conditions converges in 17 iterations. 
I use KSPCG. > > First run both cases with -pc_type lu and confirm that they both > produce the same solution to the linear system (on one process). > > Also confirm that the matrix is symmetric positive definite so you > can use CG. > > The reason for the difference in converging is because the default > solver that PETSc uses is ILU(0). The "quality" of an ILU > preconditioner depends on the > "nonzero" structure of the matrix, the more "nonzero" locations the > more likely there will be convergence and it will be faster. For ILU a > zero in a location is different > than not having that location at all. Likely you want to use a better > preconditioner; what about -pc_type sor? For moderate size problems, > say < 100,000 you may > want to use Cholesky direct solver. In parallel run config/ > configure.py --download-spooles then run the code with -pc_type > cholesky -pc_factor_mat_solver_package spooles > > > Barry > >> >> >> It would help me a lot if I can get some suggestions on how to use >> MatSetValue() optimally, the reason for KSPCG failing to converge >> and if something can be done. Also any suggestions on if I should >> not use an "if" condition to enter values into the Stiffness matrix >> to eliminate zero entries. >> >> I will be glad to send any other information if needed. >> >> Thanks in advance >> Best Regards >> Irfan >> >> >> >> >> WITHOUT "IF" CONDITION >> < [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 3464 >> < [0] MatAssemblyBegin_MPIAIJ(): Stash has 432 entries, uses 0 >> mallocs. >> < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 306 unneeded,4356 used >> < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 216 >> < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 >> < [0] Mat_CheckInode(): Found 14 nodes of 66. Limit used: 5. Using >> Inode routines >> < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 465 unneeded,3576 used >> < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 183 >> < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 >> < [1] Mat_CheckInode(): Found 23 nodes of 66. Limit used: 5. Using >> Inode routines >> >> < [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 243 unneeded,972 used >> < [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 72 >> < [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 >> < [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 12)/ >> (num_localrows 66) < 0.6. Do not use CompressedRow routines. >> >> < [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >> 504 unneeded,1116 used >> < [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 99 >> < [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 66 >> < [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/ >> (num_localrows 66) < 0.6. Do not use CompressedRow routines. >> >> >> WITH "IF" CONDITION >>> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 632 >>> [0] MatAssemblyBegin_MPIAIJ(): Stash has 78 entries, uses 0 mallocs. >>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >>> 118 unneeded,1304 used >>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >>> MatSetValues() is 0 >>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >>> [0] Mat_CheckInode(): Found 66 nodes out of 66 rows. 
Not using >>> Inode routines >>> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 66; storage space: >>> 353 unneeded,1033 used >>> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >>> MatSetValues() is 6 >>> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 26 >>> [1] Mat_CheckInode(): Found 64 nodes out of 66 rows. Not using >>> Inode routines >> >>> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 24; storage space: >>> 14 unneeded,121 used >>> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >>> MatSetValues() is 0 >>> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 12 >>> [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 48)/ >>> (num_localrows 66) > 0.6. Use CompressedRow routines. >> >>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 66 X 18; storage space: >>> 14 unneeded,121 used >>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >>> MatSetValues() is 0 >>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 11 >>> [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 42)/ >>> (num_localrows 66) > 0.6. Use CompressedRow routines. >> >> >> >> >> > From recrusader at gmail.com Tue Jan 27 11:43:22 2009 From: recrusader at gmail.com (Yujie) Date: Tue, 27 Jan 2009 09:43:22 -0800 Subject: about MatSolve() and KSP Message-ID: <7ff0ee010901270943r38c81555nb7b706d3f3a55a6b@mail.gmail.com> Dear PETSc Developers: I have checked the differerence between 2.3.3 and 3.0.0 in using external package. Now, we don't need to use Matconvert() to convert the matrix to the type suitable for external package, just to use MatGetFactor(). The following codes are the realization of MatSolveXXX() after calling MatGetFactor(). 53: MatFactorInfoInitialize(&info); 54: MatGetFactor(C,MAT_SOLVER_PETSC,MAT_FACTOR_LU,&A); 55: MatLUFactorSymbolic(A,C,row,col,&info); 56: MatLUFactorNumeric(A,C,&info); 57: MatSolveTranspose(A,b,x); I have checked the description to MatSolveXXX() or MatMatSolve(). "Notes Most users should employ the simplified KSP interface for linear solvers instead of working directly with matrix algebra routines such as this. See, e.g., KSPCreate()." You advise to use KSP interface for calling MatSolveXXX(). I am wondering if it is ok to directly call MatSolveXX(), why need KSP interface? thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jan 27 11:51:54 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 27 Jan 2009 11:51:54 -0600 Subject: about MatSolve() and KSP In-Reply-To: <7ff0ee010901270943r38c81555nb7b706d3f3a55a6b@mail.gmail.com> References: <7ff0ee010901270943r38c81555nb7b706d3f3a55a6b@mail.gmail.com> Message-ID: On Jan 27, 2009, at 11:43 AM, Yujie wrote: > Dear PETSc Developers: > > I have checked the differerence between 2.3.3 and 3.0.0 in using > external package. > > Now, we don't need to use Matconvert() to convert the matrix to the > type suitable for external package, just to use MatGetFactor(). > > The following codes are the realization of MatSolveXXX() after > calling MatGetFactor(). > > 53: MatFactorInfoInitialize(&info); > > 54: MatGetFactor(C,MAT_SOLVER_PETSC,MAT_FACTOR_LU,&A); > > 55: MatLUFactorSymbolic(A,C,row,col,&info); > 56: MatLUFactorNumeric(A,C,&info); > 57: MatSolveTranspose(A,b,x); > > I have checked the description to MatSolveXXX() or MatMatSolve(). > > "Notes > Most users should employ the simplified KSP interface for linear > solvers instead of working directly with matrix algebra routines > such as this. 
See, e.g., KSPCreate()." > ^^^^^^^ MOST users of PETSc, 99% at least, need to solve a single linear system at a time. For these we recommend using KSP since it provides a COMMON interface for all linear solvers direct and iterative. By coding directly with MatSolve() one losses the flexibility of switching between many solvers are are more appropriate for PDES. Because block (multiple right hand side) Krylov methods are complicated and delicate (and we are lazy) we have not coded our iterative methods for solving many systems at the same time. Thus KSP does not support directly solving many linear systems at the same time. Due to demand from a small number of users we do provide support for multiple solves using direct solvers via MatMatSolve(). If we were not so damn lazy and coded KSP up for multiple solves then we would not suggest using MatMatSolve() directly, but since we are stupid and lazy when one solves multiple right hand sides with direct solvers we suggest MatMatSolve(). It is really easy to understand. Barry > You advise to use KSP interface for calling MatSolveXXX(). I am > wondering if it is ok to directly call MatSolveXX(), why need KSP > interface? thanks a lot. > > Regards, > > Yujie > > > From griffith at cims.nyu.edu Wed Jan 28 11:55:35 2009 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Wed, 28 Jan 2009 12:55:35 -0500 Subject: VecSetValues and irreproducible results in parallel Message-ID: <49809C17.9040109@cims.nyu.edu> Hi, Folks -- I recently noticed an irreproducibility issue with some parallel code which uses PETSc. Basically, I can run the same parallel simulation on the same number of processors and get different answers. The discrepancies were only apparent to the eye after very long integration times (e.g., approx. 20k timesteps), but in fact the computations start diverging after only a few handfuls (30--40) of timesteps. (Running the same code in serial yields reproducible results, but changing the MPI library or running the code on a different system did not make any difference.) I believe that I have tracked this issue down to VecSetValues/VecSetValuesBlocked. A part of the computation uses VecSetValues to accumulate the values of forces at the nodes of a mesh. At some nodes, force contributions come from multiple processors. It appears that there is some "randomness" in the way that this accumulation is performed in parallel, presumably related to the order in which values are communicated. I have found that I can make modified versions of VecSetValues_MPI(), VecSetValuesBlocked_MPI(), and VecAssemblyEnd_MPI() to ensure reproducible results. I make the following modifications: (1) VecSetValues_MPI() and VecSetValuesBlocked_MPI() place all values into the stash instead of immediately adding the local components. (2) VecAssemblyEnd_MPI() "extracts" all of the values provided by the stash and bstash, placing these values into lists corresponding to each of the local entries in the vector. Next, once all values have been extracted, I sort each of these lists (e.g., by descending or ascending magnitude). Making these changes appears to yield exactly reproducible results, although I am still performing tests to try to shake out any other problems. Another approach which seems to work is to perform the final summation in higher precision (e.g., using "double-double precision" arithmetic). Using double-double precision allows me to skip the sorting step, although since the number of values to sort is relatively small, it may be cheaper to sort. 
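The sorting idea described above can be illustrated outside of PETSc: if all contributions to one vector entry are collected first, ordered by a well-defined key, and only then summed, the floating-point result no longer depends on the order in which messages happened to arrive. The sketch below shows only that idea in plain C; it is not the actual modification to VecAssemblyEnd_MPI() being discussed.

  #include <stdlib.h>

  static int cmp_magnitude(const void *pa, const void *pb)
  {
    double x  = *(const double *)pa, y = *(const double *)pb;
    double ax = x < 0.0 ? -x : x,    ay = y < 0.0 ? -y : y;
    if (ax < ay) return -1;
    if (ax > ay) return  1;
    return (x < y) ? -1 : (x > y ? 1 : 0);   /* break ties so the order is fully determined */
  }

  /* Sum the stashed contributions to one entry in a reproducible order. */
  double deterministic_sum(double *vals, int n)
  {
    double sum = 0.0;
    int    i;
    qsort(vals, n, sizeof(double), cmp_magnitude);
    for (i = 0; i < n; i++) sum += vals[i];
    return sum;
  }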
Using more accurate summation methods (e.g., compensated summation) does not appear to fix the lack or reproducibility problem. I was wondering if anyone has run into similar issues with PETSc and has a better solution. Thanks! -- Boyce PS: These tests were done using PETSc 2.3.3. I have just finished upgrading the code to PETSc 3.0.0 and will re-run them. However, looking at VecSetValues_MPI()/VecSetValuesBlocked_MPI()/etc, it looks like this code is essentially the same in PETSc 2.3.3 and 3.0.0. From bsmith at mcs.anl.gov Wed Jan 28 12:14:13 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 Jan 2009 12:14:13 -0600 Subject: VecSetValues and irreproducible results in parallel In-Reply-To: <49809C17.9040109@cims.nyu.edu> References: <49809C17.9040109@cims.nyu.edu> Message-ID: <04F06BA3-D1AB-487E-A13B-C49FC2A6B388@mcs.anl.gov> Boyce, As you know, any of the runs give equally valid answers (all right or all wrong depending on how you look at it). I assume you want reproducibility to help track down other bugs in the code and to see if changes in some places that are not suppose to change the solution are not wrong and actually changing the solution (which is hard to find if in each run you get a different answer). We would be interested in including your additions with petsc-dev with a config/configure.py option (say --with-reproducibility) that turns on the more precise (but slower) version if you would like to contribute it. Barry We see this issue every once it a while and never got the energy to comprehensively go through the code and support to make the computations much more reproducible. On Jan 28, 2009, at 11:55 AM, Boyce Griffith wrote: > Hi, Folks -- > > I recently noticed an irreproducibility issue with some parallel > code which uses PETSc. Basically, I can run the same parallel > simulation on the same number of processors and get different > answers. The discrepancies were only apparent to the eye after very > long integration times (e.g., approx. 20k timesteps), but in fact > the computations start diverging after only a few handfuls (30--40) > of timesteps. (Running the same code in serial yields reproducible > results, but changing the MPI library or running the code on a > different system did not make any difference.) > > I believe that I have tracked this issue down to VecSetValues/ > VecSetValuesBlocked. A part of the computation uses VecSetValues to > accumulate the values of forces at the nodes of a mesh. At some > nodes, force contributions come from multiple processors. It > appears that there is some "randomness" in the way that this > accumulation is performed in parallel, presumably related to the > order in which values are communicated. > > I have found that I can make modified versions of > VecSetValues_MPI(), VecSetValuesBlocked_MPI(), and > VecAssemblyEnd_MPI() to ensure reproducible results. I make the > following modifications: > > (1) VecSetValues_MPI() and VecSetValuesBlocked_MPI() place all > values into the stash instead of immediately adding the local > components. > > (2) VecAssemblyEnd_MPI() "extracts" all of the values provided by > the stash and bstash, placing these values into lists corresponding > to each of the local entries in the vector. Next, once all values > have been extracted, I sort each of these lists (e.g., by descending > or ascending magnitude). > > Making these changes appears to yield exactly reproducible results, > although I am still performing tests to try to shake out any other > problems. 
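For reference, the "compensated summation" mentioned above means a Kahan-style loop like the sketch below. It shrinks the rounding error of the sum, but the result still depends on the order of the summands, which is consistent with the observation that it does not by itself restore reproducibility when the communication order varies. This is a generic illustration, not PETSc code.

  /* Kahan (compensated) summation: more accurate, but still order dependent. */
  double kahan_sum(const double *vals, int n)
  {
    double sum = 0.0, c = 0.0;
    int    i;
    for (i = 0; i < n; i++) {
      double y = vals[i] - c;
      double t = sum + y;
      c   = (t - sum) - y;   /* low-order bits lost when forming t */
      sum = t;
    }
    return sum;
  }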
Another approach which seems to work is to perform the > final summation in higher precision (e.g., using "double-double > precision" arithmetic). Using double-double precision allows me to > skip the sorting step, although since the number of values to sort > is relatively small, it may be cheaper to sort. > > Using more accurate summation methods (e.g., compensated summation) > does not appear to fix the lack or reproducibility problem. > > I was wondering if anyone has run into similar issues with PETSc and > has a better solution. > > Thanks! > > -- Boyce > > PS: These tests were done using PETSc 2.3.3. I have just finished > upgrading the code to PETSc 3.0.0 and will re-run them. However, > looking at VecSetValues_MPI()/VecSetValuesBlocked_MPI()/etc, it > looks like this code is essentially the same in PETSc 2.3.3 and 3.0.0. From griffith at cims.nyu.edu Wed Jan 28 12:27:50 2009 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Wed, 28 Jan 2009 13:27:50 -0500 Subject: VecSetValues and irreproducible results in parallel In-Reply-To: <04F06BA3-D1AB-487E-A13B-C49FC2A6B388@mcs.anl.gov> References: <49809C17.9040109@cims.nyu.edu> <04F06BA3-D1AB-487E-A13B-C49FC2A6B388@mcs.anl.gov> Message-ID: <4980A3A6.80807@cims.nyu.edu> Hi, Barry -- I suppose that each of the runs could be thought of as a particular "realization" or something like that. In my case, I figure they are all "equally wrong." ;-) I have a hunch that the discrepancies between runs is exacerbated by under-resolution, but I haven't confirmed this to be the case. What got me started on this was I was looking into the effect of changing a convergence threshold: how tight do I need to make my convergence tolerance before I start getting the "same" answer? I was distressed to discover that no matter how tight the tolerance, the results were always different. Then I noticed that they were different even for the same tolerance. This made it a little difficult to choose the tolerance! I'd be happy to supply the code for inclusion in petsc-dev. I use C++ STL vectors and sorting algorithms in VecAssemblyEnd, however. Can I toss you all the code with C++ stuff in it, or would it really only be useful if I turn it into C-only code first? -- Boyce Barry Smith wrote: > > Boyce, > > As you know, any of the runs give equally valid answers (all right > or all wrong depending on how you look at it). I assume you > want reproducibility to help track down other bugs in the code and to > see if changes in some places that are not suppose to change > the solution are not wrong and actually changing the solution (which is > hard to find if in each run you get a different answer). > > We would be interested in including your additions with petsc-dev > with a config/configure.py option (say --with-reproducibility) that turns > on the more precise (but slower) version if you would like to contribute > it. > > Barry > > We see this issue every once it a while and never got the energy to > comprehensively go through the code and support to make the computations > much > more reproducible. > > > On Jan 28, 2009, at 11:55 AM, Boyce Griffith wrote: > >> Hi, Folks -- >> >> I recently noticed an irreproducibility issue with some parallel code >> which uses PETSc. Basically, I can run the same parallel simulation >> on the same number of processors and get different answers. The >> discrepancies were only apparent to the eye after very long >> integration times (e.g., approx. 
20k timesteps), but in fact the >> computations start diverging after only a few handfuls (30--40) of >> timesteps. (Running the same code in serial yields reproducible >> results, but changing the MPI library or running the code on a >> different system did not make any difference.) >> >> I believe that I have tracked this issue down to >> VecSetValues/VecSetValuesBlocked. A part of the computation uses >> VecSetValues to accumulate the values of forces at the nodes of a >> mesh. At some nodes, force contributions come from multiple >> processors. It appears that there is some "randomness" in the way >> that this accumulation is performed in parallel, presumably related to >> the order in which values are communicated. >> >> I have found that I can make modified versions of VecSetValues_MPI(), >> VecSetValuesBlocked_MPI(), and VecAssemblyEnd_MPI() to ensure >> reproducible results. I make the following modifications: >> >> (1) VecSetValues_MPI() and VecSetValuesBlocked_MPI() place all values >> into the stash instead of immediately adding the local components. >> >> (2) VecAssemblyEnd_MPI() "extracts" all of the values provided by the >> stash and bstash, placing these values into lists corresponding to >> each of the local entries in the vector. Next, once all values have >> been extracted, I sort each of these lists (e.g., by descending or >> ascending magnitude). >> >> Making these changes appears to yield exactly reproducible results, >> although I am still performing tests to try to shake out any other >> problems. Another approach which seems to work is to perform the >> final summation in higher precision (e.g., using "double-double >> precision" arithmetic). Using double-double precision allows me to >> skip the sorting step, although since the number of values to sort is >> relatively small, it may be cheaper to sort. >> >> Using more accurate summation methods (e.g., compensated summation) >> does not appear to fix the lack or reproducibility problem. >> >> I was wondering if anyone has run into similar issues with PETSc and >> has a better solution. >> >> Thanks! >> >> -- Boyce >> >> PS: These tests were done using PETSc 2.3.3. I have just finished >> upgrading the code to PETSc 3.0.0 and will re-run them. However, >> looking at VecSetValues_MPI()/VecSetValuesBlocked_MPI()/etc, it looks >> like this code is essentially the same in PETSc 2.3.3 and 3.0.0. > From jerome.snho at gmail.com Wed Jan 28 17:31:40 2009 From: jerome.snho at gmail.com (jerome ho) Date: Thu, 29 Jan 2009 07:31:40 +0800 Subject: Increasing convergence rate In-Reply-To: <383ade90901230953o634500acw4a47d01ca98ff64a@mail.gmail.com> References: <4316b1710901142104ofc101d1l94cea4ce400a7a8a@mail.gmail.com> <383ade90901150129t44bb9160h386e666b84c0b6e9@mail.gmail.com> <4316b1710901202252pae00ea4r90de6e132be0ab4b@mail.gmail.com> <383ade90901202314r43e1513cg255f5e1b79d6b1ab@mail.gmail.com> <4316b1710901222159s78ffd0do889425f57b7f0bf3@mail.gmail.com> <383ade90901230953o634500acw4a47d01ca98ff64a@mail.gmail.com> Message-ID: <4316b1710901281531p32525b67p20bd617b8512725e@mail.gmail.com> On Sat, Jan 24, 2009 at 1:53 AM, Jed Brown wrote: > The 20% result difference makes me very worried that the matrices are > actually different. > Are you still using BoomerAMG? If your 1Mx1M matrix comes from a 2D > problem you might be able to compare with a direct solve (-pc_type lu > -pc_factor_mat_solver_package mumps) but if it's 3D, that would take > way too much memory. 
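One cheap way to act on the worry above -- that the serial and parallel matrices may not actually be the same -- is to dump the assembled operator from both runs and compare offline. A sketch of the dump, with a made-up file name; a small one-process program can then load the two files (MatLoad) and check them with MatEqual(), or with MatAXPY() followed by MatNorm():

  /* Write the assembled matrix in PETSc binary format for offline comparison. */
  PetscViewer viewer;

  PetscViewerBinaryOpen(PETSC_COMM_WORLD, "stiffness.dat", FILE_MODE_WRITE, &viewer);
  MatView(A, viewer);
  PetscViewerDestroy(viewer);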
It's a good idea to make the problem as small as > possible (like 100x100 or less) when dealing with issues of > correctness. It's really hard to make a preconditioner exactly the > same in parallel, even parallel ILU (like Euclid with default options) > is not exactly the same. It's silly, but if you can't make the > problem smaller, can't use a direct solver, and don't have an easy way > to determine if the parallel matrix is the same as the serial one, try > -pc_type redundant -pc_redundant_type hypre, the results (up to > rounding error due to non-associativity) and number of iterations > should be the same as in serial but the monitored residuals won't be > exactly the same since they are computed differently. Thanks for your advice. I finally managed to nail down the problem. Earlier, on smaller test cases, the matrices on both serial and parallel was verified to be the same. I didn't thought it was the matrices. But when I tried the redundant method I still got the 20% difference. So, I recheck the matrix stamping again and there were a few elements that I missed when distributed into more processors, which makes it even harder to converge. Now, both the serial and parallel results correlates and converges within several iterations. Thanks again! Jerome From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Thu Jan 29 09:39:24 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Thu, 29 Jan 2009 16:39:24 +0100 Subject: MatCreateMPIAIJWithSplitArrays() Message-ID: <20090129163924.w7ti7mnu2o000wow@webmail.ec-nantes.fr> Hi, Is anybody using MatCreateMPIAIJWithSplitArrays()? I got the error below, eventhough, the matrix is built but not completed. [0]PETSC ERROR: --------------- Error Message --------------- [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Column entry number 1 (actual colum 0) in row 1 is not sorted! I think there is a problem with column indices but I thought that it has been sorted, please see below. I am using this command call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,2,2, $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, $ ocolumn,ov,D,ierr) for processor 0: pointer = [0 2 4] ! row pointer into column column = [0 1 0 1] ! local column index v = [5.49 -2.74 -2.74 5.49] ! diagonal value off-diagonal processor 0: opointer = [0 1 3] ocolumn = [2 3] ov = [-2.74 -2.74] The matrix: row 0: (0, 5.49395) (1, -2.74697) (2, -2.74697) row 1: (0, 0) (1, 5.49395) (3, -2.74697) I am using fortran with petsc-3.0.0. Thank you, Jarunan -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE From knepley at gmail.com Thu Jan 29 13:51:21 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Jan 2009 13:51:21 -0600 Subject: MatCreateMPIAIJWithSplitArrays() In-Reply-To: <20090129163924.w7ti7mnu2o000wow@webmail.ec-nantes.fr> References: <20090129163924.w7ti7mnu2o000wow@webmail.ec-nantes.fr> Message-ID: Can you send this code to petsc-maint at mcs.anl.gov? I tried in C and this runs fine. Matt On Thu, Jan 29, 2009 at 9:39 AM, Panyasantisuk Jarunan wrote: > Hi, > > Is anybody using MatCreateMPIAIJWithSplitArrays()? > I got the error below, eventhough, the matrix is built but not completed. > > [0]PETSC ERROR: --------------- Error Message --------------- > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Column entry number 1 (actual colum 0) in row 1 is not > sorted! 
> > I think there is a problem with column indices but I thought that it has > been sorted, please see below. > > I am using this command > > call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,2,2, > $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, > $ ocolumn,ov,D,ierr) > > for processor 0: > pointer = [0 2 4] ! row pointer into column column = [0 1 0 1] > ! local column index > v = [5.49 -2.74 -2.74 5.49] ! diagonal value > > off-diagonal processor 0: opointer = [0 1 3] ocolumn = [2 3] > ov = [-2.74 -2.74] > > The matrix: > row 0: (0, 5.49395) (1, -2.74697) (2, -2.74697) > row 1: (0, 0) (1, 5.49395) (3, -2.74697) > > I am using fortran with petsc-3.0.0. > > > Thank you, > Jarunan > > > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la Noë, 44321 NANTES, FRANCE > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
From dalcinl at gmail.com Thu Jan 29 14:01:49 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 29 Jan 2009 17:01:49 -0300 Subject: PETSc version 3.0.0-p2 In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D944C1@BEFUNCIONARIOS.uminho.pt> Message-ID: Sorry for the late response. In 3.0.0, some things have changed in ASM. Now ASM tries to make a "smart" subdomain partitioning, which is likely to improve things in the case of unstructured grid problems or many subdomains per processor. However, if your problem is on a structured grid, and you are using 2 processors and, let's say, 4 subdomains (i.e., 2 subdomains per processor), then in that particular case the new "smart" partitioning could be a bit worse than the one obtained with the old code. But I would not expect a dramatic difference in the GMRES iteration counts... The differences will only show up if you built PETSc with ParMETIS and you are using more than 1 subdomain per processor. Is this your case? On Sun, Jan 25, 2009 at 3:35 PM, Billy Araújo wrote: > > I compiled my code using the new version of PETSc but the convergence of the > GMRES has suffered greatly, specially in parallel. I am using GMRES with ASM > preconditioner. > Has there been any change? > > Billy. -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594
From jerome.snho at gmail.com Thu Jan 29 19:46:12 2009 From: jerome.snho at gmail.com (jerome ho) Date: Fri, 30 Jan 2009 09:46:12 +0800 Subject: Time to execute KSPSetUp Message-ID: <4316b1710901291746s5724cf1agb08cb066348c255a@mail.gmail.com> Hi I wonder if there's any way I can speed up KSPSetUp()? On larger test cases with 1e8 unknowns, it's taking more than 17 hours (and still incomplete) to execute this command across 3 processors with BoomerAMG, whereas the whole setup+solving takes 2 min if the problem size is 1e6. I've verified that the matrices have completed assembly and I'm using the non-debugging PETSc version. Jerome -------------- next part -------------- An HTML attachment was scrubbed...
URL: From knepley at gmail.com Thu Jan 29 20:36:34 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Jan 2009 20:36:34 -0600 Subject: Time to execute KSPSetUp In-Reply-To: <4316b1710901291746s5724cf1agb08cb066348c255a@mail.gmail.com> References: <4316b1710901291746s5724cf1agb08cb066348c255a@mail.gmail.com> Message-ID: On Thu, Jan 29, 2009 at 7:46 PM, jerome ho wrote: > Hi > > I wonder if there's any way I can speed up KSPSetUp()? > On larger test cases with 1e8 unknowns, it's taking more than 17 hours (and > still incomplete) to execute this command across 3 processors with > BoomerAMG, > where as the whole setup+solving takes 2 min if the problem size is 1e6. > > I've verified that the matrices has completed assembly and I'm using the > non-debugging Petsc version. We have no control over the code in Hypre. I suggest mailing the Hypre developers. The setup phase is the most costly part. Matt > Jerome -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Thu Jan 29 19:09:47 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Jan 2009 19:09:47 -0600 Subject: MatCreateMPIAIJWithSplitArrays() In-Reply-To: <20090129163924.w7ti7mnu2o000wow@webmail.ec-nantes.fr> References: <20090129163924.w7ti7mnu2o000wow@webmail.ec-nantes.fr> Message-ID: On Jan 29, 2009, at 9:39 AM, Panyasantisuk Jarunan wrote: > Hi, > > Is anybody using MatCreateMPIAIJWithSplitArrays()? > I got the error below, eventhough, the matrix is built but not > completed. > > [0]PETSC ERROR: --------------- Error Message --------------- > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Column entry number 1 (actual colum 0) in row 1 is > not sorted! > > I think there is a problem with column indices but I thought that it > has been sorted, please see below. > > I am using this command > > call MatCreateMPIAIJWithSplitArrays(PETSC_COMM_WORLD,2,2, > $ PETSC_DETERMINE,PETSC_DETERMINE,pointer,column,v,opointer, > $ ocolumn,ov,D,ierr) > > for processor 0: > pointer = [0 2 4] ! row pointer into column column = [0 1 > 0 1] ! local column index > v = [5.49 -2.74 -2.74 5.49] ! diagonal value > > off-diagonal processor 0: opointer = [0 1 3] ocolumn = [2 3] > ov = [-2.74 -2.74] > This is wrong. The first row has one entry (1-0) but the second row has two entries (3-1) for a total of three entries. Yet it has only 2 column pointers and two numerical values. Barry > The matrix: > row 0: (0, 5.49395) (1, -2.74697) (2, -2.74697) > row 1: (0, 0) (1, 5.49395) (3, -2.74697) > > I am using fortran with petsc-3.0.0. > > > Thank you, > Jarunan > > > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > > > From bsmith at mcs.anl.gov Thu Jan 29 19:13:06 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Jan 2009 19:13:06 -0600 Subject: VecSetValues and irreproducible results in parallel In-Reply-To: <4980A3A6.80807@cims.nyu.edu> References: <49809C17.9040109@cims.nyu.edu> <04F06BA3-D1AB-487E-A13B-C49FC2A6B388@mcs.anl.gov> <4980A3A6.80807@cims.nyu.edu> Message-ID: Since the code would only be compiled in when the special reproducability flag is on we can live with the C++. 
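Just to make the idea concrete for the list: the trick being discussed is simply to fix the order in which the stashed contributions are summed before they are added into the local vector entries. The following is only a toy, standalone C++ illustration of that idea (it is not the actual VecAssemblyEnd_MPI() code, and the numbers are made up):

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
  /* Contributions destined for one vector entry; in parallel they arrive
     in a run-dependent order, so naive accumulation need not be
     bitwise reproducible. */
  std::vector<double> contrib;
  contrib.push_back(1.0e16);
  contrib.push_back(3.14);
  contrib.push_back(-1.0e16);
  contrib.push_back(2.71);

  /* Sorting first pins down the summation order, so every run produces
     exactly the same rounded result regardless of arrival order. */
  std::sort(contrib.begin(), contrib.end());

  double sum = 0.0;
  for (std::size_t i = 0; i < contrib.size(); i++) sum += contrib[i];
  std::printf("sum = %.17g\n", sum);
  return 0;
}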
Barry On Jan 28, 2009, at 12:27 PM, Boyce Griffith wrote: > Hi, Barry -- > > I suppose that each of the runs could be thought of as a particular > "realization" or something like that. In my case, I figure they are > all "equally wrong." ;-) I have a hunch that the discrepancies > between runs is exacerbated by under-resolution, but I haven't > confirmed this to be the case. > > What got me started on this was I was looking into the effect of > changing a convergence threshold: how tight do I need to make my > convergence tolerance before I start getting the "same" answer? I > was distressed to discover that no matter how tight the tolerance, > the results were always different. Then I noticed that they were > different even for the same tolerance. This made it a little > difficult to choose the tolerance! > > I'd be happy to supply the code for inclusion in petsc-dev. I use C+ > + STL vectors and sorting algorithms in VecAssemblyEnd, however. > Can I toss you all the code with C++ stuff in it, or would it really > only be useful if I turn it into C-only code first? > > -- Boyce > > Barry Smith wrote: >> Boyce, >> As you know, any of the runs give equally valid answers (all >> right or all wrong depending on how you look at it). I assume you >> want reproducibility to help track down other bugs in the code and >> to see if changes in some places that are not suppose to change >> the solution are not wrong and actually changing the solution >> (which is hard to find if in each run you get a different answer). >> We would be interested in including your additions with petsc-dev >> with a config/configure.py option (say --with-reproducibility) that >> turns >> on the more precise (but slower) version if you would like to >> contribute it. >> Barry >> We see this issue every once it a while and never got the energy >> to comprehensively go through the code and support to make the >> computations much >> more reproducible. >> On Jan 28, 2009, at 11:55 AM, Boyce Griffith wrote: >>> Hi, Folks -- >>> >>> I recently noticed an irreproducibility issue with some parallel >>> code which uses PETSc. Basically, I can run the same parallel >>> simulation on the same number of processors and get different >>> answers. The discrepancies were only apparent to the eye after >>> very long integration times (e.g., approx. 20k timesteps), but in >>> fact the computations start diverging after only a few handfuls >>> (30--40) of timesteps. (Running the same code in serial yields >>> reproducible results, but changing the MPI library or running the >>> code on a different system did not make any difference.) >>> >>> I believe that I have tracked this issue down to VecSetValues/ >>> VecSetValuesBlocked. A part of the computation uses VecSetValues >>> to accumulate the values of forces at the nodes of a mesh. At >>> some nodes, force contributions come from multiple processors. It >>> appears that there is some "randomness" in the way that this >>> accumulation is performed in parallel, presumably related to the >>> order in which values are communicated. >>> >>> I have found that I can make modified versions of >>> VecSetValues_MPI(), VecSetValuesBlocked_MPI(), and >>> VecAssemblyEnd_MPI() to ensure reproducible results. I make the >>> following modifications: >>> >>> (1) VecSetValues_MPI() and VecSetValuesBlocked_MPI() place all >>> values into the stash instead of immediately adding the local >>> components. 
>>> >>> (2) VecAssemblyEnd_MPI() "extracts" all of the values provided by >>> the stash and bstash, placing these values into lists >>> corresponding to each of the local entries in the vector. Next, >>> once all values have been extracted, I sort each of these lists >>> (e.g., by descending or ascending magnitude). >>> >>> Making these changes appears to yield exactly reproducible >>> results, although I am still performing tests to try to shake out >>> any other problems. Another approach which seems to work is to >>> perform the final summation in higher precision (e.g., using >>> "double-double precision" arithmetic). Using double-double >>> precision allows me to skip the sorting step, although since the >>> number of values to sort is relatively small, it may be cheaper to >>> sort. >>> >>> Using more accurate summation methods (e.g., compensated >>> summation) does not appear to fix the lack or reproducibility >>> problem. >>> >>> I was wondering if anyone has run into similar issues with PETSc >>> and has a better solution. >>> >>> Thanks! >>> >>> -- Boyce >>> >>> PS: These tests were done using PETSc 2.3.3. I have just finished >>> upgrading the code to PETSc 3.0.0 and will re-run them. However, >>> looking at VecSetValues_MPI()/VecSetValuesBlocked_MPI()/etc, it >>> looks like this code is essentially the same in PETSc 2.3.3 and >>> 3.0.0. From schuang at ats.ucla.edu Fri Jan 30 14:06:56 2009 From: schuang at ats.ucla.edu (Shao-Ching Huang) Date: Fri, 30 Jan 2009 12:06:56 -0800 Subject: shared libraries for PETSc 3.0.0-p2 + Hypre Message-ID: <49835DE0.6090001@ats.ucla.edu> Hi, I am trying to build shared library version of PETSC 3.0.0-p2 with Hypre (Fedora 10 Linux on x86_64). With "configure --with-shared=1 ...", it does build shared libraries for PETSC, but the Hypre part is still static. The file externalpackages/hypre-2.4.0b/src/config.log shows that "--enable-shared" (for configuring Hypre) is not there. Is there a way to build shared libraries for Hypre from PETSc's configure command line? Thanks. Shao-Ching My complete PETSc configure command: ./config/configure.py --with-debugging=1 --with-shared=1 \ --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx \ --with-blas-lapack-dir=/usr/lib64 \ --download-hypre=yes The first few lines of externalpackages/hypre-2.4.0b/src/config.log (reformatted): $ ./configure --prefix=/home/schuang/local/petsc-3.0.0-p2-shared/linux-gnu-c-debug CC=mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3 CXX=mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g -fPIC F77=mpif90 -fPIC -Wall -Wno-unused-variable -g --with-MPI-include=/usr/include/openmpi/1.2.4-gcc/64 --with-MPI-lib-dirs= --with-MPI-libs=nsl rt --with-blas-libs= --with-blas-lib-dir= --with-lapack-libs= --with-lapack-lib-dir= --with-blas=yes --with-lapack=yes --without-babel --without-mli --without-fei --without-superlu From balay at mcs.anl.gov Fri Jan 30 14:14:34 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 30 Jan 2009 14:14:34 -0600 (CST) Subject: shared libraries for PETSc 3.0.0-p2 + Hypre In-Reply-To: <49835DE0.6090001@ats.ucla.edu> References: <49835DE0.6090001@ats.ucla.edu> Message-ID: We build external packages minimally only. So the PETSc shared libraries will link with most of the external .a libraries. This works fine for us. 
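If you want to double-check that the hypre objects really were pulled in, something like 'nm -D libpetscksp.so | grep -i hypre' run in the directory containing the PETSc shared libraries should list the hypre symbols [I am writing the library name from memory here; check which of the PETSc .so files in your build actually holds them].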
To build shared hypre library - you can try the following and see if it works: cd $PETSC_DIR make SHLIBS=libHYPRE shared Satish On Fri, 30 Jan 2009, Shao-Ching Huang wrote: > Hi, > > I am trying to build shared library version of PETSC 3.0.0-p2 with > Hypre (Fedora 10 Linux on x86_64). > > With "configure --with-shared=1 ...", it does build shared libraries > for PETSC, but the Hypre part is still static. The file > externalpackages/hypre-2.4.0b/src/config.log shows that > "--enable-shared" (for configuring Hypre) is not there. > > Is there a way to build shared libraries for Hypre from PETSc's > configure command line? > > Thanks. > > Shao-Ching > > My complete PETSc configure command: > > ./config/configure.py --with-debugging=1 --with-shared=1 \ > --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx \ > --with-blas-lapack-dir=/usr/lib64 \ > --download-hypre=yes > > The first few lines of externalpackages/hypre-2.4.0b/src/config.log > (reformatted): > > $ ./configure > --prefix=/home/schuang/local/petsc-3.0.0-p2-shared/linux-gnu-c-debug > CC=mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > CXX=mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g -fPIC > F77=mpif90 -fPIC -Wall -Wno-unused-variable -g > --with-MPI-include=/usr/include/openmpi/1.2.4-gcc/64 > --with-MPI-lib-dirs= --with-MPI-libs=nsl rt --with-blas-libs= > --with-blas-lib-dir= --with-lapack-libs= --with-lapack-lib-dir= > --with-blas=yes --with-lapack=yes --without-babel --without-mli > --without-fei --without-superlu > > > From schuang at ats.ucla.edu Fri Jan 30 14:42:23 2009 From: schuang at ats.ucla.edu (Shao-Ching Huang) Date: Fri, 30 Jan 2009 12:42:23 -0800 Subject: shared libraries for PETSc 3.0.0-p2 + Hypre In-Reply-To: References: <49835DE0.6090001@ats.ucla.edu> Message-ID: <4983662F.2070800@ats.ucla.edu> Satish: Thanks for the suggestion. I will just follow the standard approach then -- PETSc shared libraries with external .a libraries. Shao-Ching Satish Balay wrote: > We build external packages minimally only. So the PETSc shared > libraries will link with most of the external .a libraries. This works > fine for us. > > To build shared hypre library - you can try the following and see if > it works: > > cd $PETSC_DIR > make SHLIBS=libHYPRE shared > > Satish > > On Fri, 30 Jan 2009, Shao-Ching Huang wrote: > >> Hi, >> >> I am trying to build shared library version of PETSC 3.0.0-p2 with >> Hypre (Fedora 10 Linux on x86_64). >> >> With "configure --with-shared=1 ...", it does build shared libraries >> for PETSC, but the Hypre part is still static. The file >> externalpackages/hypre-2.4.0b/src/config.log shows that >> "--enable-shared" (for configuring Hypre) is not there. >> >> Is there a way to build shared libraries for Hypre from PETSc's >> configure command line? >> >> Thanks. 
>> >> Shao-Ching >> >> My complete PETSc configure command: >> >> ./config/configure.py --with-debugging=1 --with-shared=1 \ >> --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx \ >> --with-blas-lapack-dir=/usr/lib64 \ >> --download-hypre=yes >> >> The first few lines of externalpackages/hypre-2.4.0b/src/config.log >> (reformatted): >> >> $ ./configure >> --prefix=/home/schuang/local/petsc-3.0.0-p2-shared/linux-gnu-c-debug >> CC=mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >> CXX=mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g -fPIC >> F77=mpif90 -fPIC -Wall -Wno-unused-variable -g >> --with-MPI-include=/usr/include/openmpi/1.2.4-gcc/64 >> --with-MPI-lib-dirs= --with-MPI-libs=nsl rt --with-blas-libs= >> --with-blas-lib-dir= --with-lapack-libs= --with-lapack-lib-dir= >> --with-blas=yes --with-lapack=yes --without-babel --without-mli >> --without-fei --without-superlu >> >> >> > From balay at mcs.anl.gov Fri Jan 30 14:53:09 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 30 Jan 2009 14:53:09 -0600 (CST) Subject: shared libraries for PETSc 3.0.0-p2 + Hypre In-Reply-To: <4983662F.2070800@ats.ucla.edu> References: <49835DE0.6090001@ats.ucla.edu> <4983662F.2070800@ats.ucla.edu> Message-ID: Note - in our default approach, hypre symbols are pulled into petsc sharedlibraries [and not user binaries - so the user binaries will remain small] Satish On Fri, 30 Jan 2009, Shao-Ching Huang wrote: > Satish: > > Thanks for the suggestion. I will just follow the standard approach > then -- PETSc shared libraries with external .a libraries. > > Shao-Ching > > Satish Balay wrote: > > We build external packages minimally only. So the PETSc shared > > libraries will link with most of the external .a libraries. This works > > fine for us. > > > > To build shared hypre library - you can try the following and see if > > it works: > > > > cd $PETSC_DIR > > make SHLIBS=libHYPRE shared > > > > Satish > > > > On Fri, 30 Jan 2009, Shao-Ching Huang wrote: > > > > > Hi, > > > > > > I am trying to build shared library version of PETSC 3.0.0-p2 with > > > Hypre (Fedora 10 Linux on x86_64). > > > > > > With "configure --with-shared=1 ...", it does build shared libraries > > > for PETSC, but the Hypre part is still static. The file > > > externalpackages/hypre-2.4.0b/src/config.log shows that > > > "--enable-shared" (for configuring Hypre) is not there. > > > > > > Is there a way to build shared libraries for Hypre from PETSc's > > > configure command line? > > > > > > Thanks. 
> > > > > > Shao-Ching > > > > > > My complete PETSc configure command: > > > > > > ./config/configure.py --with-debugging=1 --with-shared=1 \ > > > --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx \ > > > --with-blas-lapack-dir=/usr/lib64 \ > > > --download-hypre=yes > > > > > > The first few lines of externalpackages/hypre-2.4.0b/src/config.log > > > (reformatted): > > > > > > $ ./configure > > > --prefix=/home/schuang/local/petsc-3.0.0-p2-shared/linux-gnu-c-debug > > > CC=mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > > > CXX=mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g -fPIC > > > F77=mpif90 -fPIC -Wall -Wno-unused-variable -g > > > --with-MPI-include=/usr/include/openmpi/1.2.4-gcc/64 > > > --with-MPI-lib-dirs= --with-MPI-libs=nsl rt --with-blas-libs= > > > --with-blas-lib-dir= --with-lapack-libs= --with-lapack-lib-dir= > > > --with-blas=yes --with-lapack=yes --without-babel --without-mli > > > --without-fei --without-superlu > > > > > > > > > > > > From schuang at ats.ucla.edu Fri Jan 30 14:59:42 2009 From: schuang at ats.ucla.edu (Shao-Ching Huang) Date: Fri, 30 Jan 2009 12:59:42 -0800 Subject: shared libraries for PETSc 3.0.0-p2 + Hypre In-Reply-To: References: <49835DE0.6090001@ats.ucla.edu> <4983662F.2070800@ats.ucla.edu> Message-ID: <49836A3E.4090602@ats.ucla.edu> Satish Balay wrote: > Note - in our default approach, hypre symbols are pulled into petsc > sharedlibraries [and not user binaries - so the user binaries will > remain small] This is cool! Thanks. Shao-Ching > > Satish > > On Fri, 30 Jan 2009, Shao-Ching Huang wrote: > >> Satish: >> >> Thanks for the suggestion. I will just follow the standard approach >> then -- PETSc shared libraries with external .a libraries. >> >> Shao-Ching >> >> Satish Balay wrote: >>> We build external packages minimally only. So the PETSc shared >>> libraries will link with most of the external .a libraries. This works >>> fine for us. >>> >>> To build shared hypre library - you can try the following and see if >>> it works: >>> >>> cd $PETSC_DIR >>> make SHLIBS=libHYPRE shared >>> >>> Satish >>> >>> On Fri, 30 Jan 2009, Shao-Ching Huang wrote: >>> >>>> Hi, >>>> >>>> I am trying to build shared library version of PETSC 3.0.0-p2 with >>>> Hypre (Fedora 10 Linux on x86_64). >>>> >>>> With "configure --with-shared=1 ...", it does build shared libraries >>>> for PETSC, but the Hypre part is still static. The file >>>> externalpackages/hypre-2.4.0b/src/config.log shows that >>>> "--enable-shared" (for configuring Hypre) is not there. >>>> >>>> Is there a way to build shared libraries for Hypre from PETSc's >>>> configure command line? >>>> >>>> Thanks. 
>>>> >>>> Shao-Ching >>>> >>>> My complete PETSc configure command: >>>> >>>> ./config/configure.py --with-debugging=1 --with-shared=1 \ >>>> --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx \ >>>> --with-blas-lapack-dir=/usr/lib64 \ >>>> --download-hypre=yes >>>> >>>> The first few lines of externalpackages/hypre-2.4.0b/src/config.log >>>> (reformatted): >>>> >>>> $ ./configure >>>> --prefix=/home/schuang/local/petsc-3.0.0-p2-shared/linux-gnu-c-debug >>>> CC=mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >>>> CXX=mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g -fPIC >>>> F77=mpif90 -fPIC -Wall -Wno-unused-variable -g >>>> --with-MPI-include=/usr/include/openmpi/1.2.4-gcc/64 >>>> --with-MPI-lib-dirs= --with-MPI-libs=nsl rt --with-blas-libs= >>>> --with-blas-lib-dir= --with-lapack-libs= --with-lapack-lib-dir= >>>> --with-blas=yes --with-lapack=yes --without-babel --without-mli >>>> --without-fei --without-superlu >>>> >>>> >>>> > From enjoywm at cs.wm.edu Fri Jan 30 16:10:36 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Fri, 30 Jan 2009 17:10:36 -0500 Subject: detail about ksp in PETSC Message-ID: <49837ADC.5000407@cs.wm.edu> Hi, Where can I find the details about the implementation of KSP solver such as CG in PETSC? Thanks. Yixun From knepley at gmail.com Fri Jan 30 16:25:00 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 Jan 2009 16:25:00 -0600 Subject: detail about ksp in PETSC In-Reply-To: <49837ADC.5000407@cs.wm.edu> References: <49837ADC.5000407@cs.wm.edu> Message-ID: You have to look in the code. This solver is implemented exactly as in Saad's book. Matt On Fri, Jan 30, 2009 at 4:10 PM, Yixun Liu wrote: > Hi, > Where can I find the details about the implementation of KSP solver such > as CG in PETSC? > > Thanks. > > Yixun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Fri Jan 30 16:25:45 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Jan 2009 16:25:45 -0600 Subject: detail about ksp in PETSC In-Reply-To: <49837ADC.5000407@cs.wm.edu> References: <49837ADC.5000407@cs.wm.edu> Message-ID: <0C54DC4D-7C0B-4BF4-9903-923C927414EB@mcs.anl.gov> The source code. They are in for example src/ksp/ksp/impls/cg/cg.c Or all pretty at http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/ksp/ksp/impls/cg/cg.c.html#KSPCG Barry On Jan 30, 2009, at 4:10 PM, Yixun Liu wrote: > Hi, > Where can I find the details about the implementation of KSP solver > such > as CG in PETSC? > > Thanks. > > Yixun From rlmackie862 at gmail.com Fri Jan 30 20:01:33 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 30 Jan 2009 18:01:33 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 Message-ID: <4983B0FD.2050006@gmail.com> I just downloaded and compiled Petsc 3.0.0-p2, and after making some changes in my code to specify the correct location of the include files, finally got everything to compile okay. Now, I'm trying to run my test problem, and it's not converging. When I say not converging, the first line (with ksp_monitor_true_residual) shows that the true and preconditioned residuals are the same as before, but immediately thereafter, the preconditioned residual fails to go below 1e-8 whereas before it quickly went down to 1e-15. 
The options in my command file are: -ksp_type bcgsl -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 3 -sub_pc_factor_fill 6 The only thing I see in the Change notes are that the ILU defaults to shifting so that it's p.d. but I don't see an easy way to turn this off by the command line to see if that's the problem. I tried to do it in my program, but it's unclear if I did that correctly. Any suggestions? Thanks, Randy From bsmith at mcs.anl.gov Fri Jan 30 20:25:14 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Jan 2009 20:25:14 -0600 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983B0FD.2050006@gmail.com> References: <4983B0FD.2050006@gmail.com> Message-ID: Run the old code with -ksp_view_binary this will create a file called binaryoutput; you can then run src/ksp/ksp/examples/tutorials/ex10.c using that input file. Use the ex10 from the old version of PETSc and then the ex10 from the new version. Do they have they same convergence? Now run the new code (that is build your code with petsc-3.0.) with - ksp_view_binary and run that binaryoutput file with the old and new ex10 to see what happens. Basically there are two possible changes with the change in the version: 1) the matrix/right hand side has changed or 2) the solver has changed to behave differently. By running the four cases you can start to get a handle on what has actually changed, this will lead you to what needs to be investigated next. Barry On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: > I just downloaded and compiled Petsc 3.0.0-p2, and after making some > changes > in my code to specify the correct location of the include files, > finally > got everything to compile okay. > > Now, I'm trying to run my test problem, and it's not converging. > When I say > not converging, the first line (with ksp_monitor_true_residual) > shows that > the true and preconditioned residuals are the same as before, but > immediately > thereafter, the preconditioned residual fails to go below 1e-8 > whereas before > it quickly went down to 1e-15. > > The options in my command file are: > > -ksp_type bcgsl > -pc_type bjacobi > -sub_pc_type ilu > -sub_pc_factor_levels 3 > -sub_pc_factor_fill 6 > > > The only thing I see in the Change notes are that the ILU defaults > to shifting > so that it's p.d. but I don't see an easy way to turn this off by > the command > line to see if that's the problem. I tried to do it in my program, > but it's unclear > if I did that correctly. > > Any suggestions? > > Thanks, Randy From rlmackie862 at gmail.com Fri Jan 30 20:37:50 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 30 Jan 2009 18:37:50 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: References: <4983B0FD.2050006@gmail.com> Message-ID: <4983B97E.1060406@gmail.com> Okay, but can you tell me why when I tried to turn off the positive definite shift by adding: -pc_factor_shift_positive_definite PETSC_FALSE so that I'm consistent with the previous version of Petsc, I got the following error: Invalid Argument! Unknown logical value: PETSC_FALSE! Thanks, Randy Barry Smith wrote: > > Run the old code with -ksp_view_binary this will create a file called > binaryoutput; you can then > run src/ksp/ksp/examples/tutorials/ex10.c using that input file. Use the > ex10 from the old version > of PETSc and then the ex10 from the new version. Do they have they same > convergence? > Now run the new code (that is build your code with petsc-3.0.) 
with > -ksp_view_binary and run that > binaryoutput file with the old and new ex10 to see what happens. > > Basically there are two possible changes with the change in the version: > 1) the matrix/right hand side has changed or > 2) the solver has changed to behave differently. > > By running the four cases you can start to get a handle on what has > actually changed, this will > lead you to what needs to be investigated next. > > Barry > > On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: > >> I just downloaded and compiled Petsc 3.0.0-p2, and after making some >> changes >> in my code to specify the correct location of the include files, finally >> got everything to compile okay. >> >> Now, I'm trying to run my test problem, and it's not converging. When >> I say >> not converging, the first line (with ksp_monitor_true_residual) shows >> that >> the true and preconditioned residuals are the same as before, but >> immediately >> thereafter, the preconditioned residual fails to go below 1e-8 whereas >> before >> it quickly went down to 1e-15. >> >> The options in my command file are: >> >> -ksp_type bcgsl >> -pc_type bjacobi >> -sub_pc_type ilu >> -sub_pc_factor_levels 3 >> -sub_pc_factor_fill 6 >> >> >> The only thing I see in the Change notes are that the ILU defaults to >> shifting >> so that it's p.d. but I don't see an easy way to turn this off by the >> command >> line to see if that's the problem. I tried to do it in my program, but >> it's unclear >> if I did that correctly. >> >> Any suggestions? >> >> Thanks, Randy > From bsmith at mcs.anl.gov Fri Jan 30 20:58:36 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Jan 2009 20:58:36 -0600 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983B97E.1060406@gmail.com> References: <4983B0FD.2050006@gmail.com> <4983B97E.1060406@gmail.com> Message-ID: <0CBFB388-31DE-4BDC-8507-19221583197E@mcs.anl.gov> Boy the manual pages are messed up. The possible values are true or false (or yes or no). I don't know why the manual page has PETSC_TRUE PETSC_FALSE our mistake. Also it does NOT default to shift for positive definite, though the manual page says it does. It defaults to shift for nonzero pivot, but this won't affect you because it would generate an error with the old version. Barry On Jan 30, 2009, at 8:37 PM, Randall Mackie wrote: > Okay, but can you tell me why when I tried to turn off the positive > definite shift > by adding: > > -pc_factor_shift_positive_definite PETSC_FALSE > > so that I'm consistent with the previous version of Petsc, I got the > following > error: > > Invalid Argument! > Unknown logical value: PETSC_FALSE! > > Thanks, Randy > > > Barry Smith wrote: >> Run the old code with -ksp_view_binary this will create a file >> called binaryoutput; you can then >> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. >> Use the ex10 from the old version >> of PETSc and then the ex10 from the new version. Do they have they >> same convergence? >> Now run the new code (that is build your code with petsc-3.0.) with >> -ksp_view_binary and run that >> binaryoutput file with the old and new ex10 to see what happens. >> Basically there are two possible changes with the change in the >> version: >> 1) the matrix/right hand side has changed or >> 2) the solver has changed to behave differently. >> By running the four cases you can start to get a handle on what >> has actually changed, this will >> lead you to what needs to be investigated next. 
>> Barry >> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>> I just downloaded and compiled Petsc 3.0.0-p2, and after making >>> some changes >>> in my code to specify the correct location of the include files, >>> finally >>> got everything to compile okay. >>> >>> Now, I'm trying to run my test problem, and it's not converging. >>> When I say >>> not converging, the first line (with ksp_monitor_true_residual) >>> shows that >>> the true and preconditioned residuals are the same as before, but >>> immediately >>> thereafter, the preconditioned residual fails to go below 1e-8 >>> whereas before >>> it quickly went down to 1e-15. >>> >>> The options in my command file are: >>> >>> -ksp_type bcgsl >>> -pc_type bjacobi >>> -sub_pc_type ilu >>> -sub_pc_factor_levels 3 >>> -sub_pc_factor_fill 6 >>> >>> >>> The only thing I see in the Change notes are that the ILU defaults >>> to shifting >>> so that it's p.d. but I don't see an easy way to turn this off by >>> the command >>> line to see if that's the problem. I tried to do it in my program, >>> but it's unclear >>> if I did that correctly. >>> >>> Any suggestions? >>> >>> Thanks, Randy From rlmackie862 at gmail.com Fri Jan 30 21:07:04 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 30 Jan 2009 19:07:04 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <0CBFB388-31DE-4BDC-8507-19221583197E@mcs.anl.gov> References: <4983B0FD.2050006@gmail.com> <4983B97E.1060406@gmail.com> <0CBFB388-31DE-4BDC-8507-19221583197E@mcs.anl.gov> Message-ID: <4983C058.4040508@gmail.com> Barry, The Change Notes for v 3.0.0 state that the ILU preconditioner now defaults to use shift to prevent zero pivot. This is stated on your web page. Randy Barry Smith wrote: > > Boy the manual pages are messed up. The possible values are true or > false (or yes or no). I don't know why the manual > page has PETSC_TRUE PETSC_FALSE our mistake. > > Also it does NOT default to shift for positive definite, though the > manual page says it does. It defaults to shift for nonzero pivot, > but this won't affect you because it would generate an error with the > old version. > > Barry > > On Jan 30, 2009, at 8:37 PM, Randall Mackie wrote: > >> Okay, but can you tell me why when I tried to turn off the positive >> definite shift >> by adding: >> >> -pc_factor_shift_positive_definite PETSC_FALSE >> >> so that I'm consistent with the previous version of Petsc, I got the >> following >> error: >> >> Invalid Argument! >> Unknown logical value: PETSC_FALSE! >> >> Thanks, Randy >> >> >> Barry Smith wrote: >>> Run the old code with -ksp_view_binary this will create a file >>> called binaryoutput; you can then >>> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. Use >>> the ex10 from the old version >>> of PETSc and then the ex10 from the new version. Do they have they >>> same convergence? >>> Now run the new code (that is build your code with petsc-3.0.) with >>> -ksp_view_binary and run that >>> binaryoutput file with the old and new ex10 to see what happens. >>> Basically there are two possible changes with the change in the >>> version: >>> 1) the matrix/right hand side has changed or >>> 2) the solver has changed to behave differently. >>> By running the four cases you can start to get a handle on what has >>> actually changed, this will >>> lead you to what needs to be investigated next. 
>>> Barry >>> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>>> I just downloaded and compiled Petsc 3.0.0-p2, and after making some >>>> changes >>>> in my code to specify the correct location of the include files, >>>> finally >>>> got everything to compile okay. >>>> >>>> Now, I'm trying to run my test problem, and it's not converging. >>>> When I say >>>> not converging, the first line (with ksp_monitor_true_residual) >>>> shows that >>>> the true and preconditioned residuals are the same as before, but >>>> immediately >>>> thereafter, the preconditioned residual fails to go below 1e-8 >>>> whereas before >>>> it quickly went down to 1e-15. >>>> >>>> The options in my command file are: >>>> >>>> -ksp_type bcgsl >>>> -pc_type bjacobi >>>> -sub_pc_type ilu >>>> -sub_pc_factor_levels 3 >>>> -sub_pc_factor_fill 6 >>>> >>>> >>>> The only thing I see in the Change notes are that the ILU defaults >>>> to shifting >>>> so that it's p.d. but I don't see an easy way to turn this off by >>>> the command >>>> line to see if that's the problem. I tried to do it in my program, >>>> but it's unclear >>>> if I did that correctly. >>>> >>>> Any suggestions? >>>> >>>> Thanks, Randy > From bsmith at mcs.anl.gov Fri Jan 30 21:38:30 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Jan 2009 21:38:30 -0600 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983C058.4040508@gmail.com> References: <4983B0FD.2050006@gmail.com> <4983B97E.1060406@gmail.com> <0CBFB388-31DE-4BDC-8507-19221583197E@mcs.anl.gov> <4983C058.4040508@gmail.com> Message-ID: <20A8A0BE-CD0B-4747-BCBE-EB208EF26B0B@mcs.anl.gov> On Jan 30, 2009, at 9:07 PM, Randall Mackie wrote: > Barry, > > The Change Notes for v 3.0.0 state that the ILU preconditioner now > defaults > to use shift to prevent zero pivot. This is stated on your web page. > Yes this is correct. The manual page incorrectly said that it used a shift to make positive definite by default. Note that previously if it found a zero pivot it would stop with an error (hence I submit that this issue should not be the cause of the change in convergence.) Barry > Randy > > > Barry Smith wrote: >> Boy the manual pages are messed up. The possible values are true >> or false (or yes or no). I don't know why the manual >> page has PETSC_TRUE PETSC_FALSE our mistake. >> Also it does NOT default to shift for positive definite, though >> the manual page says it does. It defaults to shift for nonzero pivot, >> but this won't affect you because it would generate an error with >> the old version. >> Barry >> On Jan 30, 2009, at 8:37 PM, Randall Mackie wrote: >>> Okay, but can you tell me why when I tried to turn off the >>> positive definite shift >>> by adding: >>> >>> -pc_factor_shift_positive_definite PETSC_FALSE >>> >>> so that I'm consistent with the previous version of Petsc, I got >>> the following >>> error: >>> >>> Invalid Argument! >>> Unknown logical value: PETSC_FALSE! >>> >>> Thanks, Randy >>> >>> >>> Barry Smith wrote: >>>> Run the old code with -ksp_view_binary this will create a file >>>> called binaryoutput; you can then >>>> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. >>>> Use the ex10 from the old version >>>> of PETSc and then the ex10 from the new version. Do they have >>>> they same convergence? >>>> Now run the new code (that is build your code with petsc-3.0.) >>>> with -ksp_view_binary and run that >>>> binaryoutput file with the old and new ex10 to see what happens. 
>>>> Basically there are two possible changes with the change in the >>>> version: >>>> 1) the matrix/right hand side has changed or >>>> 2) the solver has changed to behave differently. >>>> By running the four cases you can start to get a handle on what >>>> has actually changed, this will >>>> lead you to what needs to be investigated next. >>>> Barry >>>> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>>>> I just downloaded and compiled Petsc 3.0.0-p2, and after making >>>>> some changes >>>>> in my code to specify the correct location of the include files, >>>>> finally >>>>> got everything to compile okay. >>>>> >>>>> Now, I'm trying to run my test problem, and it's not converging. >>>>> When I say >>>>> not converging, the first line (with ksp_monitor_true_residual) >>>>> shows that >>>>> the true and preconditioned residuals are the same as before, >>>>> but immediately >>>>> thereafter, the preconditioned residual fails to go below 1e-8 >>>>> whereas before >>>>> it quickly went down to 1e-15. >>>>> >>>>> The options in my command file are: >>>>> >>>>> -ksp_type bcgsl >>>>> -pc_type bjacobi >>>>> -sub_pc_type ilu >>>>> -sub_pc_factor_levels 3 >>>>> -sub_pc_factor_fill 6 >>>>> >>>>> >>>>> The only thing I see in the Change notes are that the ILU >>>>> defaults to shifting >>>>> so that it's p.d. but I don't see an easy way to turn this off >>>>> by the command >>>>> line to see if that's the problem. I tried to do it in my >>>>> program, but it's unclear >>>>> if I did that correctly. >>>>> >>>>> Any suggestions? >>>>> >>>>> Thanks, Randy From rlmackie862 at gmail.com Fri Jan 30 23:11:38 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 30 Jan 2009 21:11:38 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: References: <4983B0FD.2050006@gmail.com> Message-ID: <4983DD8A.9060701@gmail.com> Barry, I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same binaryoutput using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, sub_pc=ilu) and the convergence is different. Note, this is the binaryoutput from the v 2.3.3-p11. 
Here is the convergence for ex10 from 2.3.3-p11: [rmackie ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/tutorials] ./cmd_test 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 10 KSP preconditioned resid norm 5.141701606491e-09 true resid norm 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 12 KSP preconditioned resid norm 2.591522726108e-09 true resid norm 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 14 KSP preconditioned resid norm 1.452300847705e-09 true resid norm 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 16 KSP preconditioned resid norm 8.978785776773e-10 true resid norm 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 18 KSP preconditioned resid norm 5.743601707920e-10 true resid norm 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 20 KSP preconditioned resid norm 3.678705188041e-10 true resid norm 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 22 KSP preconditioned resid norm 2.687340247327e-10 true resid norm 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 24 KSP preconditioned resid norm 2.142070779181e-10 true resid norm 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 26 KSP preconditioned resid norm 1.927583902818e-10 true resid norm 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 28 KSP preconditioned resid norm 1.330050553611e-10 true resid norm 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 30 KSP preconditioned resid norm 1.125778226987e-10 true resid norm 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 32 KSP preconditioned resid norm 8.786382414490e-11 true resid norm 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 34 KSP preconditioned resid norm 6.396029789854e-11 true resid norm 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 36 KSP preconditioned resid norm 5.462116905743e-11 true resid norm 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 38 KSP preconditioned resid norm 5.410616131680e-11 true resid norm 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 40 KSP preconditioned resid norm 7.946054076400e-11 true resid norm 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 Number of iterations = 42 Here is the convergence for ex10 from 3.0.0-p2: rmackie ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/tutorials] ./cmd_test 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm 3.502938376404e+04 ||Ae||/||Ax|| 1.050620989904e+06 6 KSP preconditioned resid norm 8.015528577103e-08 true resid norm 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 8 KSP preconditioned resid norm 
3.039306325625e-08 true resid norm 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 10 KSP preconditioned resid norm 1.729327127129e-08 true resid norm 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 12 KSP preconditioned resid norm 1.039787850500e-08 true resid norm 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 14 KSP preconditioned resid norm 5.025780191774e-09 true resid norm 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 16 KSP preconditioned resid norm 3.311781967950e-09 true resid norm 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 18 KSP preconditioned resid norm 5.621276662229e-09 true resid norm 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 20 KSP preconditioned resid norm 1.184533040734e-08 true resid norm 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 22 KSP preconditioned resid norm 2.494642590524e-08 true resid norm 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 24 KSP preconditioned resid norm 5.136091311727e-08 true resid norm 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 26 KSP preconditioned resid norm 9.627430082715e-08 true resid norm 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 28 KSP preconditioned resid norm 6.409712928943e-08 true resid norm 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 30 KSP preconditioned resid norm 6.013091685526e-07 true resid norm 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 32 KSP preconditioned resid norm 7.026562454712e-07 true resid norm 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 34 KSP preconditioned resid norm 4.086784421188e-07 true resid norm 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 36 KSP preconditioned resid norm 1.651444280250e-06 true resid norm 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 38 KSP preconditioned resid norm 1.058319572456e-06 true resid norm 1.403784173466e+04 ||Ae||/||Ax|| 4.210308488073e+05 40 KSP preconditioned resid norm 4.341084013969e-05 true resid norm 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07 42 KSP preconditioned resid norm 4.190225826231e-07 true resid norm 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05 44 KSP preconditioned resid norm 1.054511038261e-06 true resid norm 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05 46 KSP preconditioned resid norm 5.351004248086e-07 true resid norm 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05 48 KSP preconditioned resid norm 7.104477128923e-07 true resid norm 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05 50 KSP preconditioned resid norm 1.162050733932e-06 true resid norm 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05 When I output binaryoutput using my code compiled under 3.0.0-p2 and run ex10, I get exactly the same output as above. In other words, the matrix/rhs are exactly the same (as they should be since I didn't change anything), and something about 3.0.0-p2 using BCGSL and ILU is not working correctly. I changed ksptype to gmres, and I get the same convergence for both 2.3.3-p11 and for 3.0.0-p2. Therefore, I conclude that bcgsl (PCBCGSL) is not working correctly under 3.0.0-p2. Randy Barry Smith wrote: > > Run the old code with -ksp_view_binary this will create a file called > binaryoutput; you can then > run src/ksp/ksp/examples/tutorials/ex10.c using that input file. Use the > ex10 from the old version > of PETSc and then the ex10 from the new version. Do they have they same > convergence? > Now run the new code (that is build your code with petsc-3.0.) 
with > -ksp_view_binary and run that > binaryoutput file with the old and new ex10 to see what happens. > > Basically there are two possible changes with the change in the version: > 1) the matrix/right hand side has changed or > 2) the solver has changed to behave differently. > > By running the four cases you can start to get a handle on what has > actually changed, this will > lead you to what needs to be investigated next. > > Barry > > On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: > >> I just downloaded and compiled Petsc 3.0.0-p2, and after making some >> changes >> in my code to specify the correct location of the include files, finally >> got everything to compile okay. >> >> Now, I'm trying to run my test problem, and it's not converging. When >> I say >> not converging, the first line (with ksp_monitor_true_residual) shows >> that >> the true and preconditioned residuals are the same as before, but >> immediately >> thereafter, the preconditioned residual fails to go below 1e-8 whereas >> before >> it quickly went down to 1e-15. >> >> The options in my command file are: >> >> -ksp_type bcgsl >> -pc_type bjacobi >> -sub_pc_type ilu >> -sub_pc_factor_levels 3 >> -sub_pc_factor_fill 6 >> >> >> The only thing I see in the Change notes are that the ILU defaults to >> shifting >> so that it's p.d. but I don't see an easy way to turn this off by the >> command >> line to see if that's the problem. I tried to do it in my program, but >> it's unclear >> if I did that correctly. >> >> Any suggestions? >> >> Thanks, Randy > From rlmackie862 at gmail.com Fri Jan 30 23:19:08 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 30 Jan 2009 21:19:08 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983DD8A.9060701@gmail.com> References: <4983B0FD.2050006@gmail.com> <4983DD8A.9060701@gmail.com> Message-ID: <4983DF4C.4010100@gmail.com> Barry, Some further hints: BCGSL doesn't give the same results BCGS does better, but not exactly the same results BICG gives same results GMRES gives same results IBCGS under 3.0.0-p2 bombs out with a fatal error in MPI_Allreduce. Randy Randall Mackie wrote: > Barry, > > I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same binaryoutput > using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, > sub_pc=ilu) > and the convergence is different. Note, this is the binaryoutput from the > v 2.3.3-p11. 
> > Here is the convergence for ex10 from 2.3.3-p11: > > [rmackie ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/tutorials] > ./cmd_test > 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm > 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm > 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 > 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm > 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 > 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm > 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 > 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm > 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 > 10 KSP preconditioned resid norm 5.141701606491e-09 true resid norm > 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 > 12 KSP preconditioned resid norm 2.591522726108e-09 true resid norm > 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 > 14 KSP preconditioned resid norm 1.452300847705e-09 true resid norm > 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 > 16 KSP preconditioned resid norm 8.978785776773e-10 true resid norm > 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 > 18 KSP preconditioned resid norm 5.743601707920e-10 true resid norm > 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 > 20 KSP preconditioned resid norm 3.678705188041e-10 true resid norm > 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 > 22 KSP preconditioned resid norm 2.687340247327e-10 true resid norm > 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 > 24 KSP preconditioned resid norm 2.142070779181e-10 true resid norm > 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 > 26 KSP preconditioned resid norm 1.927583902818e-10 true resid norm > 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 > 28 KSP preconditioned resid norm 1.330050553611e-10 true resid norm > 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 > 30 KSP preconditioned resid norm 1.125778226987e-10 true resid norm > 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 > 32 KSP preconditioned resid norm 8.786382414490e-11 true resid norm > 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 > 34 KSP preconditioned resid norm 6.396029789854e-11 true resid norm > 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 > 36 KSP preconditioned resid norm 5.462116905743e-11 true resid norm > 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 > 38 KSP preconditioned resid norm 5.410616131680e-11 true resid norm > 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 > 40 KSP preconditioned resid norm 7.946054076400e-11 true resid norm > 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 > 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm > 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 > 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm > 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 > Number of iterations = 42 > > > Here is the convergence for ex10 from 3.0.0-p2: > > rmackie ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/tutorials] > ./cmd_test > 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm > 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm > 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 > 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm > 3.502938376404e+04 ||Ae||/||Ax|| 1.050620989904e+06 > 6 KSP preconditioned 
resid norm 8.015528577103e-08 true resid norm > 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 > 8 KSP preconditioned resid norm 3.039306325625e-08 true resid norm > 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 > 10 KSP preconditioned resid norm 1.729327127129e-08 true resid norm > 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 > 12 KSP preconditioned resid norm 1.039787850500e-08 true resid norm > 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 > 14 KSP preconditioned resid norm 5.025780191774e-09 true resid norm > 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 > 16 KSP preconditioned resid norm 3.311781967950e-09 true resid norm > 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 > 18 KSP preconditioned resid norm 5.621276662229e-09 true resid norm > 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 > 20 KSP preconditioned resid norm 1.184533040734e-08 true resid norm > 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 > 22 KSP preconditioned resid norm 2.494642590524e-08 true resid norm > 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 > 24 KSP preconditioned resid norm 5.136091311727e-08 true resid norm > 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 > 26 KSP preconditioned resid norm 9.627430082715e-08 true resid norm > 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 > 28 KSP preconditioned resid norm 6.409712928943e-08 true resid norm > 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 > 30 KSP preconditioned resid norm 6.013091685526e-07 true resid norm > 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 > 32 KSP preconditioned resid norm 7.026562454712e-07 true resid norm > 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 > 34 KSP preconditioned resid norm 4.086784421188e-07 true resid norm > 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 > 36 KSP preconditioned resid norm 1.651444280250e-06 true resid norm > 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 > 38 KSP preconditioned resid norm 1.058319572456e-06 true resid norm > 1.403784173466e+04 ||Ae||/||Ax|| 4.210308488073e+05 > 40 KSP preconditioned resid norm 4.341084013969e-05 true resid norm > 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07 > 42 KSP preconditioned resid norm 4.190225826231e-07 true resid norm > 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05 > 44 KSP preconditioned resid norm 1.054511038261e-06 true resid norm > 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05 > 46 KSP preconditioned resid norm 5.351004248086e-07 true resid norm > 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05 > 48 KSP preconditioned resid norm 7.104477128923e-07 true resid norm > 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05 > 50 KSP preconditioned resid norm 1.162050733932e-06 true resid norm > 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05 > > > When I output binaryoutput using my code compiled under 3.0.0-p2 and run > ex10, I get exactly > the same output as above. In other words, the matrix/rhs are exactly the > same (as they > should be since I didn't change anything), and something about 3.0.0-p2 > using BCGSL and ILU > is not working correctly. > > I changed ksptype to gmres, and I get the same convergence for both > 2.3.3-p11 and for > 3.0.0-p2. Therefore, I conclude that bcgsl (PCBCGSL) is not working > correctly under 3.0.0-p2. > > Randy > > > Barry Smith wrote: >> >> Run the old code with -ksp_view_binary this will create a file >> called binaryoutput; you can then >> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. 
Use >> the ex10 from the old version >> of PETSc and then the ex10 from the new version. Do they have they >> same convergence? >> Now run the new code (that is build your code with petsc-3.0.) with >> -ksp_view_binary and run that >> binaryoutput file with the old and new ex10 to see what happens. >> >> Basically there are two possible changes with the change in the >> version: >> 1) the matrix/right hand side has changed or >> 2) the solver has changed to behave differently. >> >> By running the four cases you can start to get a handle on what has >> actually changed, this will >> lead you to what needs to be investigated next. >> >> Barry >> >> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >> >>> I just downloaded and compiled Petsc 3.0.0-p2, and after making some >>> changes >>> in my code to specify the correct location of the include files, finally >>> got everything to compile okay. >>> >>> Now, I'm trying to run my test problem, and it's not converging. When >>> I say >>> not converging, the first line (with ksp_monitor_true_residual) shows >>> that >>> the true and preconditioned residuals are the same as before, but >>> immediately >>> thereafter, the preconditioned residual fails to go below 1e-8 >>> whereas before >>> it quickly went down to 1e-15. >>> >>> The options in my command file are: >>> >>> -ksp_type bcgsl >>> -pc_type bjacobi >>> -sub_pc_type ilu >>> -sub_pc_factor_levels 3 >>> -sub_pc_factor_fill 6 >>> >>> >>> The only thing I see in the Change notes are that the ILU defaults to >>> shifting >>> so that it's p.d. but I don't see an easy way to turn this off by the >>> command >>> line to see if that's the problem. I tried to do it in my program, >>> but it's unclear >>> if I did that correctly. >>> >>> Any suggestions? >>> >>> Thanks, Randy >> From bsmith at mcs.anl.gov Fri Jan 30 23:20:52 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Jan 2009 23:20:52 -0600 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983DD8A.9060701@gmail.com> References: <4983B0FD.2050006@gmail.com> <4983DD8A.9060701@gmail.com> Message-ID: <13EAD025-72A6-4435-8098-666DD4DA7ADC@mcs.anl.gov> The convergence of both of those is GARBAGE. The true residual gets hugely worse in the first step and then stays bad in both cases. The fact that the preconditioned residual norm gets small better with the old PETSc then the new is kind of irrelevent. The solution that KSP claims to give in both cases is frankly crap. Send me the output for both with -ksp_monitor_true_residual with - ksp_type gmres with old and new PETSc. Barry On Jan 30, 2009, at 11:11 PM, Randall Mackie wrote: > Barry, > > I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same > binaryoutput > using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, > sub_pc=ilu) > and the convergence is different. Note, this is the binaryoutput > from the > v 2.3.3-p11. 
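For reference, the two runs compared below can be reproduced with ex10 roughly as follows (a sketch only: "cmd_test" above is presumably a small wrapper script, the application executable name here is made up, and the -f0 option for reading the dumped system should be checked against the ex10 source of the PETSc version in use):

   # dump the matrix and right-hand side from the application run
   ./my_app -ksp_view_binary

   # replay the dumped system with each version's ex10, using the same solver options
   ./ex10 -f0 binaryoutput -ksp_type bcgsl -pc_type bjacobi -sub_pc_type ilu \
          -sub_pc_factor_levels 3 -sub_pc_factor_fill 6 -ksp_monitor_true_residual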
> > Here is the convergence for ex10 from 2.3.3-p11: > > [rmackie ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/ > tutorials] ./cmd_test > 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm > 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm > 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 > 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm > 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 > 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm > 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 > 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm > 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 > 10 KSP preconditioned resid norm 5.141701606491e-09 true resid norm > 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 > 12 KSP preconditioned resid norm 2.591522726108e-09 true resid norm > 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 > 14 KSP preconditioned resid norm 1.452300847705e-09 true resid norm > 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 > 16 KSP preconditioned resid norm 8.978785776773e-10 true resid norm > 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 > 18 KSP preconditioned resid norm 5.743601707920e-10 true resid norm > 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 > 20 KSP preconditioned resid norm 3.678705188041e-10 true resid norm > 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 > 22 KSP preconditioned resid norm 2.687340247327e-10 true resid norm > 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 > 24 KSP preconditioned resid norm 2.142070779181e-10 true resid norm > 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 > 26 KSP preconditioned resid norm 1.927583902818e-10 true resid norm > 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 > 28 KSP preconditioned resid norm 1.330050553611e-10 true resid norm > 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 > 30 KSP preconditioned resid norm 1.125778226987e-10 true resid norm > 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 > 32 KSP preconditioned resid norm 8.786382414490e-11 true resid norm > 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 > 34 KSP preconditioned resid norm 6.396029789854e-11 true resid norm > 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 > 36 KSP preconditioned resid norm 5.462116905743e-11 true resid norm > 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 > 38 KSP preconditioned resid norm 5.410616131680e-11 true resid norm > 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 > 40 KSP preconditioned resid norm 7.946054076400e-11 true resid norm > 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 > 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm > 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 > 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm > 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 > Number of iterations = 42 > > > Here is the convergence for ex10 from 3.0.0-p2: > > rmackie ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/ > tutorials] ./cmd_test > 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm > 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm > 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 > 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm > 3.502938376404e+04 ||Ae||/||Ax|| 1.050620989904e+06 > 6 KSP preconditioned 
resid norm 8.015528577103e-08 true resid norm > 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 > 8 KSP preconditioned resid norm 3.039306325625e-08 true resid norm > 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 > 10 KSP preconditioned resid norm 1.729327127129e-08 true resid norm > 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 > 12 KSP preconditioned resid norm 1.039787850500e-08 true resid norm > 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 > 14 KSP preconditioned resid norm 5.025780191774e-09 true resid norm > 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 > 16 KSP preconditioned resid norm 3.311781967950e-09 true resid norm > 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 > 18 KSP preconditioned resid norm 5.621276662229e-09 true resid norm > 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 > 20 KSP preconditioned resid norm 1.184533040734e-08 true resid norm > 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 > 22 KSP preconditioned resid norm 2.494642590524e-08 true resid norm > 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 > 24 KSP preconditioned resid norm 5.136091311727e-08 true resid norm > 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 > 26 KSP preconditioned resid norm 9.627430082715e-08 true resid norm > 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 > 28 KSP preconditioned resid norm 6.409712928943e-08 true resid norm > 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 > 30 KSP preconditioned resid norm 6.013091685526e-07 true resid norm > 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 > 32 KSP preconditioned resid norm 7.026562454712e-07 true resid norm > 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 > 34 KSP preconditioned resid norm 4.086784421188e-07 true resid norm > 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 > 36 KSP preconditioned resid norm 1.651444280250e-06 true resid norm > 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 > 38 KSP preconditioned resid norm 1.058319572456e-06 true resid norm > 1.403784173466e+04 ||Ae||/||Ax|| 4.210308488073e+05 > 40 KSP preconditioned resid norm 4.341084013969e-05 true resid norm > 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07 > 42 KSP preconditioned resid norm 4.190225826231e-07 true resid norm > 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05 > 44 KSP preconditioned resid norm 1.054511038261e-06 true resid norm > 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05 > 46 KSP preconditioned resid norm 5.351004248086e-07 true resid norm > 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05 > 48 KSP preconditioned resid norm 7.104477128923e-07 true resid norm > 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05 > 50 KSP preconditioned resid norm 1.162050733932e-06 true resid norm > 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05 > > > When I output binaryoutput using my code compiled under 3.0.0-p2 and > run ex10, I get exactly > the same output as above. In other words, the matrix/rhs are exactly > the same (as they > should be since I didn't change anything), and something about 3.0.0- > p2 using BCGSL and ILU > is not working correctly. > > I changed ksptype to gmres, and I get the same convergence for both > 2.3.3-p11 and for > 3.0.0-p2. Therefore, I conclude that bcgsl (PCBCGSL) is not working > correctly under 3.0.0-p2. > > Randy > > > Barry Smith wrote: >> Run the old code with -ksp_view_binary this will create a file >> called binaryoutput; you can then >> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. 
>> Use the ex10 from the old version >> of PETSc and then the ex10 from the new version. Do they have they >> same convergence? >> Now run the new code (that is build your code with petsc-3.0.) with >> -ksp_view_binary and run that >> binaryoutput file with the old and new ex10 to see what happens. >> Basically there are two possible changes with the change in the >> version: >> 1) the matrix/right hand side has changed or >> 2) the solver has changed to behave differently. >> By running the four cases you can start to get a handle on what >> has actually changed, this will >> lead you to what needs to be investigated next. >> Barry >> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>> I just downloaded and compiled Petsc 3.0.0-p2, and after making >>> some changes >>> in my code to specify the correct location of the include files, >>> finally >>> got everything to compile okay. >>> >>> Now, I'm trying to run my test problem, and it's not converging. >>> When I say >>> not converging, the first line (with ksp_monitor_true_residual) >>> shows that >>> the true and preconditioned residuals are the same as before, but >>> immediately >>> thereafter, the preconditioned residual fails to go below 1e-8 >>> whereas before >>> it quickly went down to 1e-15. >>> >>> The options in my command file are: >>> >>> -ksp_type bcgsl >>> -pc_type bjacobi >>> -sub_pc_type ilu >>> -sub_pc_factor_levels 3 >>> -sub_pc_factor_fill 6 >>> >>> >>> The only thing I see in the Change notes are that the ILU defaults >>> to shifting >>> so that it's p.d. but I don't see an easy way to turn this off by >>> the command >>> line to see if that's the problem. I tried to do it in my program, >>> but it's unclear >>> if I did that correctly. >>> >>> Any suggestions? >>> >>> Thanks, Randy From bsmith at mcs.anl.gov Fri Jan 30 23:22:14 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Jan 2009 23:22:14 -0600 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983DF4C.4010100@gmail.com> References: <4983B0FD.2050006@gmail.com> <4983DD8A.9060701@gmail.com> <4983DF4C.4010100@gmail.com> Message-ID: <40744A62-2829-4D15-8D23-FED53C0CFA6E@mcs.anl.gov> You can mail the binary file to petsc-maint at mcs.anl.gov and tell me how many processes you run on. Barry On Jan 30, 2009, at 11:19 PM, Randall Mackie wrote: > Barry, > > Some further hints: > > BCGSL doesn't give the same results > BCGS does better, but not exactly the same results > BICG gives same results > GMRES gives same results > IBCGS under 3.0.0-p2 bombs out with a fatal error in MPI_Allreduce. > > Randy > > > Randall Mackie wrote: >> Barry, >> I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same >> binaryoutput >> using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, >> sub_pc=ilu) >> and the convergence is different. Note, this is the binaryoutput >> from the >> v 2.3.3-p11. 
>> Here is the convergence for ex10 from 2.3.3-p11: >> [rmackie ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/ >> tutorials] ./cmd_test >> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >> 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm >> 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 >> 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm >> 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 >> 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm >> 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 >> 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm >> 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 >> 10 KSP preconditioned resid norm 5.141701606491e-09 true resid norm >> 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 >> 12 KSP preconditioned resid norm 2.591522726108e-09 true resid norm >> 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 >> 14 KSP preconditioned resid norm 1.452300847705e-09 true resid norm >> 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 >> 16 KSP preconditioned resid norm 8.978785776773e-10 true resid norm >> 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 >> 18 KSP preconditioned resid norm 5.743601707920e-10 true resid norm >> 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 >> 20 KSP preconditioned resid norm 3.678705188041e-10 true resid norm >> 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 >> 22 KSP preconditioned resid norm 2.687340247327e-10 true resid norm >> 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 >> 24 KSP preconditioned resid norm 2.142070779181e-10 true resid norm >> 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 >> 26 KSP preconditioned resid norm 1.927583902818e-10 true resid norm >> 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 >> 28 KSP preconditioned resid norm 1.330050553611e-10 true resid norm >> 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 >> 30 KSP preconditioned resid norm 1.125778226987e-10 true resid norm >> 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 >> 32 KSP preconditioned resid norm 8.786382414490e-11 true resid norm >> 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 >> 34 KSP preconditioned resid norm 6.396029789854e-11 true resid norm >> 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 >> 36 KSP preconditioned resid norm 5.462116905743e-11 true resid norm >> 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 >> 38 KSP preconditioned resid norm 5.410616131680e-11 true resid norm >> 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 >> 40 KSP preconditioned resid norm 7.946054076400e-11 true resid norm >> 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 >> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm >> 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm >> 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >> Number of iterations = 42 >> Here is the convergence for ex10 from 3.0.0-p2: >> rmackie ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/ >> tutorials] ./cmd_test >> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >> 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm >> 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 >> 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm >> 3.502938376404e+04 
||Ae||/||Ax|| 1.050620989904e+06 >> 6 KSP preconditioned resid norm 8.015528577103e-08 true resid norm >> 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 >> 8 KSP preconditioned resid norm 3.039306325625e-08 true resid norm >> 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 >> 10 KSP preconditioned resid norm 1.729327127129e-08 true resid norm >> 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 >> 12 KSP preconditioned resid norm 1.039787850500e-08 true resid norm >> 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 >> 14 KSP preconditioned resid norm 5.025780191774e-09 true resid norm >> 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 >> 16 KSP preconditioned resid norm 3.311781967950e-09 true resid norm >> 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 >> 18 KSP preconditioned resid norm 5.621276662229e-09 true resid norm >> 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 >> 20 KSP preconditioned resid norm 1.184533040734e-08 true resid norm >> 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 >> 22 KSP preconditioned resid norm 2.494642590524e-08 true resid norm >> 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 >> 24 KSP preconditioned resid norm 5.136091311727e-08 true resid norm >> 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 >> 26 KSP preconditioned resid norm 9.627430082715e-08 true resid norm >> 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 >> 28 KSP preconditioned resid norm 6.409712928943e-08 true resid norm >> 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 >> 30 KSP preconditioned resid norm 6.013091685526e-07 true resid norm >> 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 >> 32 KSP preconditioned resid norm 7.026562454712e-07 true resid norm >> 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 >> 34 KSP preconditioned resid norm 4.086784421188e-07 true resid norm >> 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 >> 36 KSP preconditioned resid norm 1.651444280250e-06 true resid norm >> 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 >> 38 KSP preconditioned resid norm 1.058319572456e-06 true resid norm >> 1.403784173466e+04 ||Ae||/||Ax|| 4.210308488073e+05 >> 40 KSP preconditioned resid norm 4.341084013969e-05 true resid norm >> 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07 >> 42 KSP preconditioned resid norm 4.190225826231e-07 true resid norm >> 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05 >> 44 KSP preconditioned resid norm 1.054511038261e-06 true resid norm >> 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05 >> 46 KSP preconditioned resid norm 5.351004248086e-07 true resid norm >> 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05 >> 48 KSP preconditioned resid norm 7.104477128923e-07 true resid norm >> 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05 >> 50 KSP preconditioned resid norm 1.162050733932e-06 true resid norm >> 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05 >> When I output binaryoutput using my code compiled under 3.0.0-p2 >> and run ex10, I get exactly >> the same output as above. In other words, the matrix/rhs are >> exactly the same (as they >> should be since I didn't change anything), and something about >> 3.0.0-p2 using BCGSL and ILU >> is not working correctly. >> I changed ksptype to gmres, and I get the same convergence for both >> 2.3.3-p11 and for >> 3.0.0-p2. Therefore, I conclude that bcgsl (PCBCGSL) is not working >> correctly under 3.0.0-p2. 
>> Randy >> Barry Smith wrote: >>> >>> Run the old code with -ksp_view_binary this will create a file >>> called binaryoutput; you can then >>> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. >>> Use the ex10 from the old version >>> of PETSc and then the ex10 from the new version. Do they have they >>> same convergence? >>> Now run the new code (that is build your code with petsc-3.0.) >>> with -ksp_view_binary and run that >>> binaryoutput file with the old and new ex10 to see what happens. >>> >>> Basically there are two possible changes with the change in the >>> version: >>> 1) the matrix/right hand side has changed or >>> 2) the solver has changed to behave differently. >>> >>> By running the four cases you can start to get a handle on what >>> has actually changed, this will >>> lead you to what needs to be investigated next. >>> >>> Barry >>> >>> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>> >>>> I just downloaded and compiled Petsc 3.0.0-p2, and after making >>>> some changes >>>> in my code to specify the correct location of the include files, >>>> finally >>>> got everything to compile okay. >>>> >>>> Now, I'm trying to run my test problem, and it's not converging. >>>> When I say >>>> not converging, the first line (with ksp_monitor_true_residual) >>>> shows that >>>> the true and preconditioned residuals are the same as before, but >>>> immediately >>>> thereafter, the preconditioned residual fails to go below 1e-8 >>>> whereas before >>>> it quickly went down to 1e-15. >>>> >>>> The options in my command file are: >>>> >>>> -ksp_type bcgsl >>>> -pc_type bjacobi >>>> -sub_pc_type ilu >>>> -sub_pc_factor_levels 3 >>>> -sub_pc_factor_fill 6 >>>> >>>> >>>> The only thing I see in the Change notes are that the ILU >>>> defaults to shifting >>>> so that it's p.d. but I don't see an easy way to turn this off by >>>> the command >>>> line to see if that's the problem. I tried to do it in my >>>> program, but it's unclear >>>> if I did that correctly. >>>> >>>> Any suggestions? >>>> >>>> Thanks, Randy >>> From rlmackie862 at gmail.com Fri Jan 30 23:34:13 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 30 Jan 2009 21:34:13 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <13EAD025-72A6-4435-8098-666DD4DA7ADC@mcs.anl.gov> References: <4983B0FD.2050006@gmail.com> <4983DD8A.9060701@gmail.com> <13EAD025-72A6-4435-8098-666DD4DA7ADC@mcs.anl.gov> Message-ID: <4983E2D5.3010102@gmail.com> Barry, trust me, these are *NOT* garbage. The reason you're seeing such a jump in true residual is because in the case of ex10, I assume you're starting with zero values. The nature of the physical system I'm solving has air layers with electrical resistivity of 1e10 up against earth layers of resistivity 1. A small change causes a large increase in the true residual. In my program, I start with 1D boundary values which are much closer to the truth, and you see only a slight increase in residual. I know from a numerical perspective you don't like these systems, but that's what we deal with and they have been validated with direct solvers and against 1D and analytic solutions for appropriate models. I will send the binaryoutput to petsc-maint. Regardless of whether or not you like my numerical systems, there is obviously something wrong in 3.0.0-p2 with the BCGSL and BCGS implementations. Randy Barry Smith wrote: > > The convergence of both of those is GARBAGE. 
The true residual gets > hugely worse in the first > step and then stays bad in both cases. The fact that the preconditioned > residual norm gets small > better with the old PETSc then the new is kind of irrelevent. The > solution that KSP claims to give in > both cases is frankly crap. > > Send me the output for both with -ksp_monitor_true_residual with > -ksp_type gmres with old and > new PETSc. > > Barry > > > On Jan 30, 2009, at 11:11 PM, Randall Mackie wrote: > >> Barry, >> >> I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same >> binaryoutput >> using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, >> sub_pc=ilu) >> and the convergence is different. Note, this is the binaryoutput from the >> v 2.3.3-p11. >> >> Here is the convergence for ex10 from 2.3.3-p11: >> >> [rmackie >> ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/tutorials] ./cmd_test >> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >> 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm >> 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 >> 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm >> 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 >> 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm >> 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 >> 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm >> 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 >> 10 KSP preconditioned resid norm 5.141701606491e-09 true resid norm >> 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 >> 12 KSP preconditioned resid norm 2.591522726108e-09 true resid norm >> 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 >> 14 KSP preconditioned resid norm 1.452300847705e-09 true resid norm >> 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 >> 16 KSP preconditioned resid norm 8.978785776773e-10 true resid norm >> 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 >> 18 KSP preconditioned resid norm 5.743601707920e-10 true resid norm >> 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 >> 20 KSP preconditioned resid norm 3.678705188041e-10 true resid norm >> 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 >> 22 KSP preconditioned resid norm 2.687340247327e-10 true resid norm >> 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 >> 24 KSP preconditioned resid norm 2.142070779181e-10 true resid norm >> 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 >> 26 KSP preconditioned resid norm 1.927583902818e-10 true resid norm >> 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 >> 28 KSP preconditioned resid norm 1.330050553611e-10 true resid norm >> 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 >> 30 KSP preconditioned resid norm 1.125778226987e-10 true resid norm >> 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 >> 32 KSP preconditioned resid norm 8.786382414490e-11 true resid norm >> 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 >> 34 KSP preconditioned resid norm 6.396029789854e-11 true resid norm >> 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 >> 36 KSP preconditioned resid norm 5.462116905743e-11 true resid norm >> 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 >> 38 KSP preconditioned resid norm 5.410616131680e-11 true resid norm >> 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 >> 40 KSP preconditioned resid norm 7.946054076400e-11 true resid norm >> 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 >> 42 KSP preconditioned 
resid norm 3.519925335531e-11 true resid norm >> 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm >> 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >> Number of iterations = 42 >> >> >> Here is the convergence for ex10 from 3.0.0-p2: >> >> rmackie ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/tutorials] >> ./cmd_test >> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >> 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm >> 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 >> 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm >> 3.502938376404e+04 ||Ae||/||Ax|| 1.050620989904e+06 >> 6 KSP preconditioned resid norm 8.015528577103e-08 true resid norm >> 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 >> 8 KSP preconditioned resid norm 3.039306325625e-08 true resid norm >> 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 >> 10 KSP preconditioned resid norm 1.729327127129e-08 true resid norm >> 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 >> 12 KSP preconditioned resid norm 1.039787850500e-08 true resid norm >> 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 >> 14 KSP preconditioned resid norm 5.025780191774e-09 true resid norm >> 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 >> 16 KSP preconditioned resid norm 3.311781967950e-09 true resid norm >> 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 >> 18 KSP preconditioned resid norm 5.621276662229e-09 true resid norm >> 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 >> 20 KSP preconditioned resid norm 1.184533040734e-08 true resid norm >> 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 >> 22 KSP preconditioned resid norm 2.494642590524e-08 true resid norm >> 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 >> 24 KSP preconditioned resid norm 5.136091311727e-08 true resid norm >> 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 >> 26 KSP preconditioned resid norm 9.627430082715e-08 true resid norm >> 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 >> 28 KSP preconditioned resid norm 6.409712928943e-08 true resid norm >> 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 >> 30 KSP preconditioned resid norm 6.013091685526e-07 true resid norm >> 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 >> 32 KSP preconditioned resid norm 7.026562454712e-07 true resid norm >> 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 >> 34 KSP preconditioned resid norm 4.086784421188e-07 true resid norm >> 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 >> 36 KSP preconditioned resid norm 1.651444280250e-06 true resid norm >> 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 >> 38 KSP preconditioned resid norm 1.058319572456e-06 true resid norm >> 1.403784173466e+04 ||Ae||/||Ax|| 4.210308488073e+05 >> 40 KSP preconditioned resid norm 4.341084013969e-05 true resid norm >> 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07 >> 42 KSP preconditioned resid norm 4.190225826231e-07 true resid norm >> 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05 >> 44 KSP preconditioned resid norm 1.054511038261e-06 true resid norm >> 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05 >> 46 KSP preconditioned resid norm 5.351004248086e-07 true resid norm >> 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05 >> 48 KSP preconditioned resid norm 7.104477128923e-07 true resid norm >> 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05 >> 50 
KSP preconditioned resid norm 1.162050733932e-06 true resid norm >> 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05 >> >> >> When I output binaryoutput using my code compiled under 3.0.0-p2 and >> run ex10, I get exactly >> the same output as above. In other words, the matrix/rhs are exactly >> the same (as they >> should be since I didn't change anything), and something about >> 3.0.0-p2 using BCGSL and ILU >> is not working correctly. >> >> I changed ksptype to gmres, and I get the same convergence for both >> 2.3.3-p11 and for >> 3.0.0-p2. Therefore, I conclude that bcgsl (PCBCGSL) is not working >> correctly under 3.0.0-p2. >> >> Randy >> >> >> Barry Smith wrote: >>> Run the old code with -ksp_view_binary this will create a file >>> called binaryoutput; you can then >>> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. Use >>> the ex10 from the old version >>> of PETSc and then the ex10 from the new version. Do they have they >>> same convergence? >>> Now run the new code (that is build your code with petsc-3.0.) with >>> -ksp_view_binary and run that >>> binaryoutput file with the old and new ex10 to see what happens. >>> Basically there are two possible changes with the change in the >>> version: >>> 1) the matrix/right hand side has changed or >>> 2) the solver has changed to behave differently. >>> By running the four cases you can start to get a handle on what has >>> actually changed, this will >>> lead you to what needs to be investigated next. >>> Barry >>> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>>> I just downloaded and compiled Petsc 3.0.0-p2, and after making some >>>> changes >>>> in my code to specify the correct location of the include files, >>>> finally >>>> got everything to compile okay. >>>> >>>> Now, I'm trying to run my test problem, and it's not converging. >>>> When I say >>>> not converging, the first line (with ksp_monitor_true_residual) >>>> shows that >>>> the true and preconditioned residuals are the same as before, but >>>> immediately >>>> thereafter, the preconditioned residual fails to go below 1e-8 >>>> whereas before >>>> it quickly went down to 1e-15. >>>> >>>> The options in my command file are: >>>> >>>> -ksp_type bcgsl >>>> -pc_type bjacobi >>>> -sub_pc_type ilu >>>> -sub_pc_factor_levels 3 >>>> -sub_pc_factor_fill 6 >>>> >>>> >>>> The only thing I see in the Change notes are that the ILU defaults >>>> to shifting >>>> so that it's p.d. but I don't see an easy way to turn this off by >>>> the command >>>> line to see if that's the problem. I tried to do it in my program, >>>> but it's unclear >>>> if I did that correctly. >>>> >>>> Any suggestions? >>>> >>>> Thanks, Randy > From recrusader at gmail.com Sat Jan 31 18:33:00 2009 From: recrusader at gmail.com (Yujie) Date: Sat, 31 Jan 2009 16:33:00 -0800 Subject: further about MatGetSubMatrix_MPIDense(), and MatTranspose() and MatGetSubMatrix() In-Reply-To: <7ff0ee010901311625xec3bf83l8d80e2d5f81c58f3@mail.gmail.com> References: <7ff0ee010901311625xec3bf83l8d80e2d5f81c58f3@mail.gmail.com> Message-ID: <7ff0ee010901311633i6fd61741x3673c464b9b918d9@mail.gmail.com> Hi, PETSc Developers I have sent the matrices and codes to you. However, the sizes of the matrices are a little big and reach your limit. If you need it. I will further send them to you. thanks a lot. 
Regards, Yujie ---------- Forwarded message ---------- From: Yujie Date: Sat, Jan 31, 2009 at 4:25 PM Subject: further about MatGetSubMatrix_MPIDense(), and MatTranspose() and MatGetSubMatrix() To: PETSc users list Hi, PETSc Developers: Recently, I tried to find the bug in MatGetSubMatrix_MPIDense(). The problem should be in "av = v + ((Mat_SeqDense *)newmatd->A->data)->lda*icol[i];" in the following code. Since "av" and "v" are used for the parent matrix, I think it should not be "newmatd" but "mat" in the code. I have revised it. The results are correct. 224: for (i=0; iA->data)->lda*icol[i]; 226: for (j=0; j -------------- next part -------------- A non-text attachment was scrubbed... Name: ex113.c Type: application/octet-stream Size: 2865 bytes Desc: not available URL: From recrusader at gmail.com Sat Jan 31 18:44:39 2009 From: recrusader at gmail.com (Yujie) Date: Sat, 31 Jan 2009 16:44:39 -0800 Subject: further about MatGetSubMatrix_MPIDense(), and MatTranspose() and MatGetSubMatrix() In-Reply-To: <7ff0ee010901311633i6fd61741x3673c464b9b918d9@mail.gmail.com> References: <7ff0ee010901311625xec3bf83l8d80e2d5f81c58f3@mail.gmail.com> <7ff0ee010901311633i6fd61741x3673c464b9b918d9@mail.gmail.com> Message-ID: <7ff0ee010901311644g5ed7a1a3gc5fc19a3c52d3796@mail.gmail.com> The PETSc version is 3.0.0-p2. ---------- Forwarded message ---------- From: Yujie Date: Sat, Jan 31, 2009 at 4:33 PM Subject: further about MatGetSubMatrix_MPIDense(), and MatTranspose() and MatGetSubMatrix() To: PETSc users list Hi, PETSc Developers I have sent the matrices and codes to you. However, the sizes of the matrices are a little big and reach your limit. If you need it. I will further send them to you. thanks a lot. Regards, Yujie ---------- Forwarded message ---------- From: Yujie Date: Sat, Jan 31, 2009 at 4:25 PM Subject: further about MatGetSubMatrix_MPIDense(), and MatTranspose() and MatGetSubMatrix() To: PETSc users list Hi, PETSc Developers: Recently, I tried to find the bug in MatGetSubMatrix_MPIDense(). The problem should be in "av = v + ((Mat_SeqDense *)newmatd->A->data)->lda*icol[i];" in the following code. Since "av" and "v" are used for the parent matrix, I think it should not be "newmatd" but "mat" in the code. I have revised it. The results are correct. 224: for (i=0; iA->data)->lda*icol[i]; 226: for (j=0; j -------------- next part -------------- A non-text attachment was scrubbed... Name: ex113.c Type: application/octet-stream Size: 2865 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat Jan 31 21:40:19 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 31 Jan 2009 21:40:19 -0600 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <4983E2D5.3010102@gmail.com> References: <4983B0FD.2050006@gmail.com> <4983DD8A.9060701@gmail.com> <13EAD025-72A6-4435-8098-666DD4DA7ADC@mcs.anl.gov> <4983E2D5.3010102@gmail.com> Message-ID: <39F0FBDF-7AAD-4BFD-AE15-C5786082C951@mcs.anl.gov> I have found the problem. Both BiCG-stab and BiCG-stab(L) had been optimized to use MDots in some places instead of Dot(). Unfortunately the PetscConj() was not handled properly for complex numbers. This will be fixed in the next patch. You can also use the two attached files src/vec/vec/utils/vinv.c and src/ksp/ksp/impls/bcgsl/bcgsl.c (recompile in those two directories). Barry I'd be very nervous about trusting any linear solution results if the residual norms are large. On Jan 30, 2009, at 11:34 PM, Randall Mackie wrote: > Barry, trust me, these are *NOT* garbage.
The reason you're seeing > such > a jump in true residual is because in the case of ex10, I assume > you're starting > with zero values. The nature of the physical system I'm solving has > air layers with > electrical resistivity of 1e10 up against earth layers of > resistivity 1. > A small change causes a large increase in the true residual. In my > program, > I start with 1D boundary values which are much closer to the truth, > and you > see only a slight increase in residual. > > I know from a numerical perspective you don't like these systems, > but that's what we deal with and they have been validated with direct > solvers and against 1D and analytic solutions for appropriate models. > > > I will send the binaryoutput to petsc-maint. Regardless of whether or > not you like my numerical systems, there is obviously something wrong > in 3.0.0-p2 with the BCGSL and BCGS implementations. > > Randy > > > > Barry Smith wrote: >> The convergence of both of those is GARBAGE. The true residual >> gets hugely worse in the first >> step and then stays bad in both cases. The fact that the >> preconditioned residual norm gets small >> better with the old PETSc then the new is kind of irrelevent. The >> solution that KSP claims to give in >> both cases is frankly crap. >> Send me the output for both with -ksp_monitor_true_residual with - >> ksp_type gmres with old and >> new PETSc. >> Barry >> On Jan 30, 2009, at 11:11 PM, Randall Mackie wrote: >>> Barry, >>> >>> I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same >>> binaryoutput >>> using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, >>> sub_pc=ilu) >>> and the convergence is different. Note, this is the binaryoutput >>> from the >>> v 2.3.3-p11. >>> >>> Here is the convergence for ex10 from 2.3.3-p11: >>> >>> [rmackie ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/ >>> tutorials] ./cmd_test >>> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >>> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >>> 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm >>> 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 >>> 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm >>> 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 >>> 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm >>> 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 >>> 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm >>> 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 >>> 10 KSP preconditioned resid norm 5.141701606491e-09 true resid >>> norm 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 >>> 12 KSP preconditioned resid norm 2.591522726108e-09 true resid >>> norm 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 >>> 14 KSP preconditioned resid norm 1.452300847705e-09 true resid >>> norm 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 >>> 16 KSP preconditioned resid norm 8.978785776773e-10 true resid >>> norm 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 >>> 18 KSP preconditioned resid norm 5.743601707920e-10 true resid >>> norm 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 >>> 20 KSP preconditioned resid norm 3.678705188041e-10 true resid >>> norm 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 >>> 22 KSP preconditioned resid norm 2.687340247327e-10 true resid >>> norm 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 >>> 24 KSP preconditioned resid norm 2.142070779181e-10 true resid >>> norm 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 
>>> 26 KSP preconditioned resid norm 1.927583902818e-10 true resid >>> norm 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 >>> 28 KSP preconditioned resid norm 1.330050553611e-10 true resid >>> norm 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 >>> 30 KSP preconditioned resid norm 1.125778226987e-10 true resid >>> norm 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 >>> 32 KSP preconditioned resid norm 8.786382414490e-11 true resid >>> norm 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 >>> 34 KSP preconditioned resid norm 6.396029789854e-11 true resid >>> norm 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 >>> 36 KSP preconditioned resid norm 5.462116905743e-11 true resid >>> norm 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 >>> 38 KSP preconditioned resid norm 5.410616131680e-11 true resid >>> norm 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 >>> 40 KSP preconditioned resid norm 7.946054076400e-11 true resid >>> norm 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 >>> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid >>> norm 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >>> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid >>> norm 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >>> Number of iterations = 42 >>> >>> >>> Here is the convergence for ex10 from 3.0.0-p2: >>> >>> rmackie ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/ >>> tutorials] ./cmd_test >>> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >>> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >>> 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm >>> 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 >>> 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm >>> 3.502938376404e+04 ||Ae||/||Ax|| 1.050620989904e+06 >>> 6 KSP preconditioned resid norm 8.015528577103e-08 true resid norm >>> 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 >>> 8 KSP preconditioned resid norm 3.039306325625e-08 true resid norm >>> 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 >>> 10 KSP preconditioned resid norm 1.729327127129e-08 true resid >>> norm 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 >>> 12 KSP preconditioned resid norm 1.039787850500e-08 true resid >>> norm 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 >>> 14 KSP preconditioned resid norm 5.025780191774e-09 true resid >>> norm 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 >>> 16 KSP preconditioned resid norm 3.311781967950e-09 true resid >>> norm 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 >>> 18 KSP preconditioned resid norm 5.621276662229e-09 true resid >>> norm 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 >>> 20 KSP preconditioned resid norm 1.184533040734e-08 true resid >>> norm 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 >>> 22 KSP preconditioned resid norm 2.494642590524e-08 true resid >>> norm 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 >>> 24 KSP preconditioned resid norm 5.136091311727e-08 true resid >>> norm 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 >>> 26 KSP preconditioned resid norm 9.627430082715e-08 true resid >>> norm 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 >>> 28 KSP preconditioned resid norm 6.409712928943e-08 true resid >>> norm 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 >>> 30 KSP preconditioned resid norm 6.013091685526e-07 true resid >>> norm 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 >>> 32 KSP preconditioned resid norm 
7.026562454712e-07 true resid >>> norm 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 >>> 34 KSP preconditioned resid norm 4.086784421188e-07 true resid >>> norm 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 >>> 36 KSP preconditioned resid norm 1.651444280250e-06 true resid >>> norm 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 >>> 38 KSP preconditioned resid norm 1.058319572456e-06 true resid >>> norm 1.403784173466e+04 ||Ae||/||Ax|| 4.210308488073e+05 >>> 40 KSP preconditioned resid norm 4.341084013969e-05 true resid >>> norm 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07 >>> 42 KSP preconditioned resid norm 4.190225826231e-07 true resid >>> norm 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05 >>> 44 KSP preconditioned resid norm 1.054511038261e-06 true resid >>> norm 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05 >>> 46 KSP preconditioned resid norm 5.351004248086e-07 true resid >>> norm 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05 >>> 48 KSP preconditioned resid norm 7.104477128923e-07 true resid >>> norm 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05 >>> 50 KSP preconditioned resid norm 1.162050733932e-06 true resid >>> norm 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05 >>> >>> >>> When I output binaryoutput using my code compiled under 3.0.0-p2 >>> and run ex10, I get exactly >>> the same output as above. In other words, the matrix/rhs are >>> exactly the same (as they >>> should be since I didn't change anything), and something about >>> 3.0.0-p2 using BCGSL and ILU >>> is not working correctly. >>> >>> I changed ksptype to gmres, and I get the same convergence for >>> both 2.3.3-p11 and for >>> 3.0.0-p2. Therefore, I conclude that bcgsl (PCBCGSL) is not >>> working correctly under 3.0.0-p2. >>> >>> Randy >>> >>> >>> Barry Smith wrote: >>>> Run the old code with -ksp_view_binary this will create a file >>>> called binaryoutput; you can then >>>> run src/ksp/ksp/examples/tutorials/ex10.c using that input file. >>>> Use the ex10 from the old version >>>> of PETSc and then the ex10 from the new version. Do they have >>>> they same convergence? >>>> Now run the new code (that is build your code with petsc-3.0.) >>>> with -ksp_view_binary and run that >>>> binaryoutput file with the old and new ex10 to see what happens. >>>> Basically there are two possible changes with the change in the >>>> version: >>>> 1) the matrix/right hand side has changed or >>>> 2) the solver has changed to behave differently. >>>> By running the four cases you can start to get a handle on what >>>> has actually changed, this will >>>> lead you to what needs to be investigated next. >>>> Barry >>>> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote: >>>>> I just downloaded and compiled Petsc 3.0.0-p2, and after making >>>>> some changes >>>>> in my code to specify the correct location of the include files, >>>>> finally >>>>> got everything to compile okay. >>>>> >>>>> Now, I'm trying to run my test problem, and it's not converging. >>>>> When I say >>>>> not converging, the first line (with ksp_monitor_true_residual) >>>>> shows that >>>>> the true and preconditioned residuals are the same as before, >>>>> but immediately >>>>> thereafter, the preconditioned residual fails to go below 1e-8 >>>>> whereas before >>>>> it quickly went down to 1e-15. 
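A short aside on why the preconditioned and true residual columns in these logs can differ by so many orders of magnitude: for a left-preconditioned solve, as here, the monitor's first number is ||B r_k|| with B the preconditioner application and r_k = b - A x_k, while the second is ||r_k|| itself. Roughly, ||r_k|| = ||B^{-1}(B r_k)|| <= ||B^{-1}|| ||B r_k||, and for an ILU-type B the factor ||B^{-1}|| is on the order of ||A||. With the coefficient contrast of about ten orders of magnitude mentioned above (air resistivity 1e10 against earth resistivity 1), a preconditioned residual of 1e-10 is therefore entirely compatible with a true residual of order 1 or larger, so the two columns have to be read with that amplification in mind.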
>>>>> >>>>> The options in my command file are: >>>>> >>>>> -ksp_type bcgsl >>>>> -pc_type bjacobi >>>>> -sub_pc_type ilu >>>>> -sub_pc_factor_levels 3 >>>>> -sub_pc_factor_fill 6 >>>>> >>>>> >>>>> The only thing I see in the Change notes are that the ILU >>>>> defaults to shifting >>>>> so that it's p.d. but I don't see an easy way to turn this off >>>>> by the command >>>>> line to see if that's the problem. I tried to do it in my >>>>> program, but it's unclear >>>>> if I did that correctly. >>>>> >>>>> Any suggestions? >>>>> >>>>> Thanks, Randy -------------- next part -------------- A non-text attachment was scrubbed... Name: vinv.c Type: application/octet-stream Size: 40678 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bcgsl.c Type: application/octet-stream Size: 17709 bytes Desc: not available URL: From rlmackie862 at gmail.com Sat Jan 31 22:08:27 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Sat, 31 Jan 2009 20:08:27 -0800 Subject: Problem in convergence after upgrade to petsc 3.0.0 In-Reply-To: <39F0FBDF-7AAD-4BFD-AE15-C5786082C951@mcs.anl.gov> References: <4983B0FD.2050006@gmail.com> <4983DD8A.9060701@gmail.com> <13EAD025-72A6-4435-8098-666DD4DA7ADC@mcs.anl.gov> <4983E2D5.3010102@gmail.com> <39F0FBDF-7AAD-4BFD-AE15-C5786082C951@mcs.anl.gov> Message-ID: <4985203B.5090309@gmail.com> Thanks. I know you're nervous, but trust me, it's because I have these air layers with resistivity 1e10. A 1% error in the air causes a huge residual. The residual reduction I'm getting is fantastic for these problems - I could send you lots of papers from the geophysics community dealing with low frequency EM propagation in the earth - a nasty numerical problem I've been dealing with for 20 years. The residual error reduction I'm getting now is fantastic, and the solutions at the earth's surface are practically indistinguishable from direct solution results (for smaller models). Actually, it was *YOU* several years ago who forced me to rethink these solutions. Remember, I was beginning to use DA's, and because my equations were so strongly coupled, the performance was terrible in parallel. I had to go back and rethink the physics, and I wound up adding in another equation, div H = 0, and that stabilized everything and now the convergence is great (even if you don't think so!). Thanks again, I'll give these a try. Randy Barry Smith wrote: > > I have found the problem. Both BiCG-stab and BiCG-stab(L) had been > optimized to > use MDots in some places instead of Dot(). Unfortunately the PetscConj() > was not handled > properly for complex numbers. > > This will be fixed in the next patch. You can also use the two > attached files > src/vec/vec/utils/vinv.c and src/ksp/ksp/impls/bcgsl/bcgsl.c (recompile > in those two > directories). > > Barry > > I'd be very nervous about trusting any linear solution results if the > residual norms are large. > > On Jan 30, 2009, at 11:34 PM, Randall Mackie wrote: > >> Barry, trust me, these are *NOT* garbage.
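To make the diagnosis quoted just above concrete (BiCG-stab and BiCG-stab(L) had been optimized to replace several Dot() calls by a single MDot(), with PetscConj() mishandled in the process): for complex scalars the inner product carries a conjugate, so a grouped multi-dot is only equivalent to the individual dots if the conjugation lands on the same argument. A standalone C99 sketch, not the PETSc source, showing that a dropped conjugate changes the value for complex data while leaving purely real data untouched:

   #include <stdio.h>
   #include <complex.h>

   /* inner product with the conjugate on the second argument, the usual
      VecDot-style convention for complex scalars */
   static double complex dot_conj(const double complex *x, const double complex *y, int n)
   {
     double complex s = 0.0;
     for (int i = 0; i < n; i++) s += x[i] * conj(y[i]);
     return s;
   }

   /* the "optimized" variant with the conjugate forgotten */
   static double complex dot_noconj(const double complex *x, const double complex *y, int n)
   {
     double complex s = 0.0;
     for (int i = 0; i < n; i++) s += x[i] * y[i];
     return s;
   }

   int main(void)
   {
     double complex x[2] = { 1.0 + 2.0*I, 3.0 - 1.0*I };
     double complex y[2] = { 2.0 - 1.0*I, 1.0 + 4.0*I };
     double complex a = dot_conj(x, y, 2), b = dot_noconj(x, y, 2);

     printf("with conj   : %g%+gi\n", creal(a), cimag(a));   /* -1-8i  */
     printf("without conj: %g%+gi\n", creal(b), cimag(b));   /* 11+14i */
     /* for purely real x and y the two agree, which is consistent with the
        regression showing up only for complex-valued systems like this one */
     return 0;
   }

Since bcgs and bcgsl feed these dot products back into their recurrences, a silently wrong value degrades convergence in exactly the way the 3.0.0-p2 runs above show, while gmres and bicg, which were not rewritten this way, remain unchanged.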
>> >> I know from a numerical perspective you don't like these systems, >> but that's what we deal with and they have been validated with direct >> solvers and against 1D and analytic solutions for appropriate models. >> >> >> I will send the binaryoutput to petsc-maint. Regardless of whether or >> not you like my numerical systems, there is obviously something wrong >> in 3.0.0-p2 with the BCGSL and BCGS implementations. >> >> Randy >> >> >> >> Barry Smith wrote: >>> The convergence of both of those is GARBAGE. The true residual gets >>> hugely worse in the first >>> step and then stays bad in both cases. The fact that the >>> preconditioned residual norm gets small >>> better with the old PETSc then the new is kind of irrelevent. The >>> solution that KSP claims to give in >>> both cases is frankly crap. >>> Send me the output for both with -ksp_monitor_true_residual with >>> -ksp_type gmres with old and >>> new PETSc. >>> Barry >>> On Jan 30, 2009, at 11:11 PM, Randall Mackie wrote: >>>> Barry, >>>> >>>> I've run ex10 from 2.3.3-p11 and from 3.0.0-p2 on the same >>>> binaryoutput >>>> using the same parameters input to ex10 (ksp=bcgsl and pc=bjacobi, >>>> sub_pc=ilu) >>>> and the convergence is different. Note, this is the binaryoutput >>>> from the >>>> v 2.3.3-p11. >>>> >>>> Here is the convergence for ex10 from 2.3.3-p11: >>>> >>>> [rmackie >>>> ~/SPARSE/PETsc/petsc-2.3.3-p11/src/ksp/ksp/examples/tutorials] >>>> ./cmd_test >>>> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >>>> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >>>> 2 KSP preconditioned resid norm 4.170258940641e-07 true resid norm >>>> 1.624814448703e+04 ||Ae||/||Ax|| 4.873234927583e+05 >>>> 4 KSP preconditioned resid norm 8.984963207977e-08 true resid norm >>>> 2.848180153976e+04 ||Ae||/||Ax|| 8.542422193181e+05 >>>> 6 KSP preconditioned resid norm 3.526062648105e-08 true resid norm >>>> 1.884591980560e+04 ||Ae||/||Ax|| 5.652374319564e+05 >>>> 8 KSP preconditioned resid norm 1.241153849592e-08 true resid norm >>>> 1.045401827970e+04 ||Ae||/||Ax|| 3.135427990247e+05 >>>> 10 KSP preconditioned resid norm 5.141701606491e-09 true resid norm >>>> 5.371087738119e+03 ||Ae||/||Ax|| 1.610926859088e+05 >>>> 12 KSP preconditioned resid norm 2.591522726108e-09 true resid norm >>>> 3.519633363751e+03 ||Ae||/||Ax|| 1.055628244456e+05 >>>> 14 KSP preconditioned resid norm 1.452300847705e-09 true resid norm >>>> 2.509415501645e+03 ||Ae||/||Ax|| 7.526380184636e+04 >>>> 16 KSP preconditioned resid norm 8.978785776773e-10 true resid norm >>>> 1.872610647707e+03 ||Ae||/||Ax|| 5.616439231845e+04 >>>> 18 KSP preconditioned resid norm 5.743601707920e-10 true resid norm >>>> 1.266421550645e+03 ||Ae||/||Ax|| 3.798322779917e+04 >>>> 20 KSP preconditioned resid norm 3.678705188041e-10 true resid norm >>>> 7.536064149571e+02 ||Ae||/||Ax|| 2.260258767363e+04 >>>> 22 KSP preconditioned resid norm 2.687340247327e-10 true resid norm >>>> 5.533061905955e+02 ||Ae||/||Ax|| 1.659507062981e+04 >>>> 24 KSP preconditioned resid norm 2.142070779181e-10 true resid norm >>>> 4.684485692902e+02 ||Ae||/||Ax|| 1.404997309978e+04 >>>> 26 KSP preconditioned resid norm 1.927583902818e-10 true resid norm >>>> 6.146993326148e+02 ||Ae||/||Ax|| 1.843640829296e+04 >>>> 28 KSP preconditioned resid norm 1.330050553611e-10 true resid norm >>>> 4.217422387701e+02 ||Ae||/||Ax|| 1.264913055181e+04 >>>> 30 KSP preconditioned resid norm 1.125778226987e-10 true resid norm >>>> 3.058702983977e+02 ||Ae||/||Ax|| 9.173834111652e+03 >>>> 32 KSP preconditioned 
resid norm 8.786382414490e-11 true resid norm >>>> 2.001904013306e+02 ||Ae||/||Ax|| 6.004223169665e+03 >>>> 34 KSP preconditioned resid norm 6.396029789854e-11 true resid norm >>>> 1.761414838880e+02 ||Ae||/||Ax|| 5.282934504701e+03 >>>> 36 KSP preconditioned resid norm 5.462116905743e-11 true resid norm >>>> 1.540860953921e+02 ||Ae||/||Ax|| 4.621436881724e+03 >>>> 38 KSP preconditioned resid norm 5.410616131680e-11 true resid norm >>>> 1.910613970561e+02 ||Ae||/||Ax|| 5.730420936308e+03 >>>> 40 KSP preconditioned resid norm 7.946054076400e-11 true resid norm >>>> 2.743761797072e+02 ||Ae||/||Ax|| 8.229244781230e+03 >>>> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm >>>> 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >>>> 42 KSP preconditioned resid norm 3.519925335531e-11 true resid norm >>>> 1.022381658728e+02 ||Ae||/||Ax|| 3.066384603244e+03 >>>> Number of iterations = 42 >>>> >>>> >>>> Here is the convergence for ex10 from 3.0.0-p2: >>>> >>>> rmackie >>>> ~/SPARSE/PETsc/petsc-3.0.0-p2/src/ksp/ksp/examples/tutorials] >>>> ./cmd_test >>>> 0 KSP preconditioned resid norm 2.434324107505e-05 true resid norm >>>> 3.334159901684e-02 ||Ae||/||Ax|| 1.000000000000e+00 >>>> 2 KSP preconditioned resid norm 4.798562862178e-07 true resid norm >>>> 1.642225014976e+04 ||Ae||/||Ax|| 4.925453677690e+05 >>>> 4 KSP preconditioned resid norm 1.587355624039e-07 true resid norm >>>> 3.502938376404e+04 ||Ae||/||Ax|| 1.050620989904e+06 >>>> 6 KSP preconditioned resid norm 8.015528577103e-08 true resid norm >>>> 2.536595627800e+04 ||Ae||/||Ax|| 7.607900348506e+05 >>>> 8 KSP preconditioned resid norm 3.039306325625e-08 true resid norm >>>> 1.253662335500e+04 ||Ae||/||Ax|| 3.760054623856e+05 >>>> 10 KSP preconditioned resid norm 1.729327127129e-08 true resid norm >>>> 8.080934369743e+03 ||Ae||/||Ax|| 2.423679309940e+05 >>>> 12 KSP preconditioned resid norm 1.039787850500e-08 true resid norm >>>> 4.777791224801e+03 ||Ae||/||Ax|| 1.432982030162e+05 >>>> 14 KSP preconditioned resid norm 5.025780191774e-09 true resid norm >>>> 3.147651686153e+03 ||Ae||/||Ax|| 9.440614064618e+04 >>>> 16 KSP preconditioned resid norm 3.311781967950e-09 true resid norm >>>> 2.688053658600e+03 ||Ae||/||Ax|| 8.062161797467e+04 >>>> 18 KSP preconditioned resid norm 5.621276662229e-09 true resid norm >>>> 3.098810918425e+03 ||Ae||/||Ax|| 9.294128085637e+04 >>>> 20 KSP preconditioned resid norm 1.184533040734e-08 true resid norm >>>> 5.469874887175e+03 ||Ae||/||Ax|| 1.640555656737e+05 >>>> 22 KSP preconditioned resid norm 2.494642590524e-08 true resid norm >>>> 1.003335643955e+04 ||Ae||/||Ax|| 3.009260723962e+05 >>>> 24 KSP preconditioned resid norm 5.136091311727e-08 true resid norm >>>> 1.828432513826e+04 ||Ae||/||Ax|| 5.483937686680e+05 >>>> 26 KSP preconditioned resid norm 9.627430082715e-08 true resid norm >>>> 1.175348501769e+04 ||Ae||/||Ax|| 3.525171366783e+05 >>>> 28 KSP preconditioned resid norm 6.409712928943e-08 true resid norm >>>> 5.524687582334e+03 ||Ae||/||Ax|| 1.656995388716e+05 >>>> 30 KSP preconditioned resid norm 6.013091685526e-07 true resid norm >>>> 1.371019320496e+04 ||Ae||/||Ax|| 4.112038297274e+05 >>>> 32 KSP preconditioned resid norm 7.026562454712e-07 true resid norm >>>> 1.053982255306e+04 ||Ae||/||Ax|| 3.161162890759e+05 >>>> 34 KSP preconditioned resid norm 4.086784421188e-07 true resid norm >>>> 5.503180350963e+03 ||Ae||/||Ax|| 1.650544818856e+05 >>>> 36 KSP preconditioned resid norm 1.651444280250e-06 true resid norm >>>> 1.984011183420e+04 ||Ae||/||Ax|| 5.950557987388e+05 >>>> 38 KSP 
>>>> 40 KSP preconditioned resid norm 4.341084013969e-05 true resid norm 3.837773616917e+05 ||Ae||/||Ax|| 1.151046659453e+07
>>>> 42 KSP preconditioned resid norm 4.190225826231e-07 true resid norm 6.768382935039e+03 ||Ae||/||Ax|| 2.030011497535e+05
>>>> 44 KSP preconditioned resid norm 1.054511038261e-06 true resid norm 4.966771429542e+03 ||Ae||/||Ax|| 1.489662036615e+05
>>>> 46 KSP preconditioned resid norm 5.351004248086e-07 true resid norm 5.112611747101e+03 ||Ae||/||Ax|| 1.533403285343e+05
>>>> 48 KSP preconditioned resid norm 7.104477128923e-07 true resid norm 5.478002736962e+03 ||Ae||/||Ax|| 1.642993407183e+05
>>>> 50 KSP preconditioned resid norm 1.162050733932e-06 true resid norm 5.395393687747e+03 ||Ae||/||Ax|| 1.618216836278e+05
>>>>
>>>> When I write out binaryoutput from my code compiled under 3.0.0-p2 and run ex10 on it, I get exactly the same output as above. In other words, the matrix/rhs are exactly the same (as they should be, since I didn't change anything), and something about 3.0.0-p2 using BCGSL and ILU is not working correctly.
>>>>
>>>> I changed -ksp_type to gmres, and I get the same convergence for both 2.3.3-p11 and 3.0.0-p2. Therefore, I conclude that bcgsl (KSPBCGSL) is not working correctly under 3.0.0-p2.
>>>>
>>>> Randy
>>>>
>>>> Barry Smith wrote:
>>>>> Run the old code with -ksp_view_binary; this will create a file called binaryoutput. You can then run src/ksp/ksp/examples/tutorials/ex10.c using that input file. Use the ex10 from the old version of PETSc and then the ex10 from the new version. Do they have the same convergence?
>>>>> Now run the new code (that is, your code built with petsc-3.0.0) with -ksp_view_binary, and run that binaryoutput file with the old and new ex10 to see what happens.
>>>>> Basically, there are two possible changes with the change in version:
>>>>> 1) the matrix/right-hand side has changed, or
>>>>> 2) the solver behaves differently.
>>>>> By running the four cases you can start to get a handle on what has actually changed; that will point you to what needs to be investigated next.
>>>>> Barry
>>>>> On Jan 30, 2009, at 8:01 PM, Randall Mackie wrote:
>>>>>> I just downloaded and compiled PETSc 3.0.0-p2, and after making some changes in my code to specify the correct location of the include files, I finally got everything to compile okay.
>>>>>>
>>>>>> Now I'm trying to run my test problem, and it's not converging. When I say not converging, I mean that the first line (with -ksp_monitor_true_residual) shows the true and preconditioned residuals are the same as before, but immediately thereafter the preconditioned residual fails to go below 1e-8, whereas before it quickly went down to 1e-15.
>>>>>>
>>>>>> The options in my command file are:
>>>>>>
>>>>>> -ksp_type bcgsl
>>>>>> -pc_type bjacobi
>>>>>> -sub_pc_type ilu
>>>>>> -sub_pc_factor_levels 3
>>>>>> -sub_pc_factor_fill 6
>>>>>>
>>>>>> The only thing I see in the change notes is that ILU now defaults to shifting so that the factorization is positive definite, but I don't see an easy way to turn this off from the command line to see if that's the problem.
>>>>>> I tried to do it in my program, but it's unclear if I did that correctly.
>>>>>>
>>>>>> Any suggestions?
>>>>>>
>>>>>> Thanks, Randy
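As a pointer for the shift question Randy raises at the end: in current PETSc releases the factorization shift is controlled by -pc_factor_shift_type (prefixed as -sub_pc_factor_shift_type under block Jacobi) or by PCFactorSetShiftType() from code. The option and function names in the 2.3.3/3.0.0 releases discussed in this thread were spelled differently, so the sketch below illustrates the idea against the current API rather than being a drop-in for those versions.

/* Minimal sketch (current PETSc API, assumed here; 3.0.0-era names differed):
   turn off the factorization shift on an ILU/ICC preconditioner. */
#include <petscksp.h>

PetscErrorCode disable_factor_shift(KSP ksp)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPGetPC(ksp, &pc));
  /* For a plain -pc_type ilu/icc this disables the shift directly. */
  PetscCall(PCFactorSetShiftType(pc, MAT_SHIFT_NONE));
  /* Under -pc_type bjacobi the factorization lives on the sub-PCs, so the
     simplest route is the options database, e.g.
       -sub_pc_factor_shift_type none
     on the command line (or set it with PetscOptionsSetValue from code)
     before KSPSetFromOptions() is called. */
  PetscFunctionReturn(PETSC_SUCCESS);
}

With the command-line route no source change is needed at all; the option only has to be in place before KSPSetFromOptions() runs.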
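For readers who want to repeat Barry's four-way experiment without digging up the old ex10, here is a minimal stand-alone loader in the same spirit. It assumes the current PETSc API (MatLoad, VecLoad and KSPSetOperators have changed signature since the 2.3.3/3.0.0 era) and the default file name binaryoutput written by -ksp_view_binary.

/* Sketch of the ex10-style experiment: read the matrix and RHS dumped by
   -ksp_view_binary and solve with whatever -ksp_type/-pc_type is given on
   the command line. Current PETSc API assumed. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         b, x;
  KSP         ksp;
  PetscViewer viewer;
  PetscInt    its;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* "binaryoutput" is the file name produced by -ksp_view_binary */
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "binaryoutput", FILE_MODE_READ, &viewer));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatLoad(A, viewer));
  PetscCall(VecCreate(PETSC_COMM_WORLD, &b));
  PetscCall(VecLoad(b, viewer));
  PetscCall(PetscViewerDestroy(&viewer));

  PetscCall(VecDuplicate(b, &x));
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* picks up -ksp_type, -pc_type, ... */
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPGetIterationNumber(ksp, &its));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Number of iterations = %" PetscInt_FMT "\n", its));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&b));
  PetscCall(VecDestroy(&x));
  PetscCall(PetscFinalize());
  return 0;
}

Run it with the same option set as the application (e.g. -ksp_type bcgsl -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 3 -sub_pc_factor_fill 6 -ksp_monitor_true_residual) against the binaryoutput file produced by each PETSc version to separate matrix changes from solver changes.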