From khalid_eee at yahoo.com Fri Apr 1 00:53:23 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Thu, 31 Mar 2011 22:53:23 -0700 (PDT) Subject: [petsc-users] FFT with PETSC Message-ID: <932406.30502.qm@web112615.mail.gq1.yahoo.com> I tried to use the FFTW using the PETSC interface. It gives the following error ex142.c(8): catastrophic error: could not open source file "petscmat.h" #include ^ The same error with Do I need to configure petsc with something else ? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Fri Apr 1 01:44:42 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 1 Apr 2011 09:44:42 +0300 Subject: [petsc-users] PETSC with SuperLU In-Reply-To: <4D9536AE.4050301@UManitoba.ca> References: <4D9536AE.4050301@UManitoba.ca> Message-ID: On Fri, Apr 1, 2011 at 05:21, Ormiston, Scott J. wrote: > I rebuilt PETSc with SuperLU and ParMETIS, and then I tried to run ex15f.F > (it was working with a previous build of PETSc). > SuperLU is not the same package as SuperLU_Dist (don't ask me why they organize software that way). You also misspelled the command line option. If you run the example below in serial with -pc_factor_mat_solver_package superlu it should work (using SuperLU instead of PETSc's native direct solver). To use SuperLU_Dist (which works in parallel), configure with --download-superlu_dist and run with -pc_factor_mat_solver_package superlu_dist. You do not have to touch the source code. > > I ran > > % mpiexec -n 4 ex15f -pc_type lu -pc_factor_mat_solver_type superlu_dist > > but this generated the following error messages: > > > ==================================================================================== > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [0]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: No support for this operation for this object type! > [1]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! 
> [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [1]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [1]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [1]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [1]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [1]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: No support for this operation for this object type! > [2]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [2]PETSC ERROR: See docs/changes/index.html for recent updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [2]PETSC ERROR: See docs/index.html for manual pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [2]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [2]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [2]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [2]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: No support for this operation for this object type! > [3]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [3]PETSC ERROR: See docs/changes/index.html for recent updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [3]PETSC ERROR: See docs/index.html for manual pages. 
> [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [3]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [3]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [3]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [3]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [3]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [3]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > Norm of error 2.0050E+02 iterations 0 > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-pc_factor_mat_solver_type value: superlu_dist > > ==================================================================================== > > Does this require changing the code in ex15f.F or is there something else > that I am doing wrong? > > Scott Ormiston > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Fri Apr 1 01:56:37 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 1 Apr 2011 09:56:37 +0300 Subject: [petsc-users] FFT with PETSC In-Reply-To: <932406.30502.qm@web112615.mail.gq1.yahoo.com> References: <932406.30502.qm@web112615.mail.gq1.yahoo.com> Message-ID: On Fri, Apr 1, 2011 at 08:53, khalid ashraf wrote: > ex142.c(8): catastrophic error: could not open source file "petscmat.h" > #include > ^ > What commands are printed when you typed "make ex142"? Are you trying to invoke the compiler yourself (thus missing some include paths)? > The same error with > > Do I need to configure petsc with something else ? > Did you use --download-fftw or --with-fftw-dir=/path/to/your/fftw3 ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Fri Apr 1 02:54:17 2011 From: gdiso at ustc.edu (Gong Ding) Date: Fri, 1 Apr 2011 15:54:17 +0800 (CST) Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? Message-ID: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> Hi, I am stiall dealing with the ill conditioned problem. :-( Yesderday, I installed slepc-3.1-p6 for SVD calculation of my matrix. The SVD solver works well for the largest singualr value calculation. But for the smallest singualr value, all most all the methods fail. Finally, I chosen the most inefficient way. That build the cyclic matrix explicitly with shift-and-invert spectral transformation. And solve the eigen value problem by LU preconditioned GMRES. The preconditioner should be superlu rather than others. I guess the reason is superlu use static pivot. Because solver with partial pivot such as mumps can not work. Anyway, slepc solved my problem. However, the explicit building cyclic matrix takes too long to finish. 
The log info says [0] MatSetUpPreallocation(): Warning not preallocating matrix storage [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 35038 X 35038; storage space: 325302 unneeded,482948 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 42204 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 33 [0] Mat_CheckInode(): Found 26800 nodes of 35038. Limit used: 5. Using Inode routines It seems no preallocation for cyclic matrix. Is it a bug or I forgot something? From jroman at dsic.upv.es Fri Apr 1 03:11:06 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 1 Apr 2011 10:11:06 +0200 Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? In-Reply-To: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> References: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> Message-ID: On 01/04/2011, Gong Ding wrote: > Hi, > I am stiall dealing with the ill conditioned problem. :-( > Yesderday, I installed slepc-3.1-p6 for SVD calculation of my matrix. > > The SVD solver works well for the largest singualr value calculation. > But for the smallest singualr value, all most all the methods fail. > > Finally, I chosen the most inefficient way. > That build the cyclic matrix explicitly with > shift-and-invert spectral transformation. > And solve the eigen value problem by LU preconditioned GMRES. > The preconditioner should be superlu rather than others. > I guess the reason is superlu use static pivot. > Because solver with partial pivot such as mumps can not work. > Anyway, slepc solved my problem. > > However, the explicit building cyclic matrix takes too long to finish. > The log info says > > [0] MatSetUpPreallocation(): Warning not preallocating matrix storage > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 35038 X 35038; storage space: 325302 unneeded,482948 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 42204 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 33 > [0] Mat_CheckInode(): Found 26800 nodes of 35038. Limit used: 5. Using Inode routines > > It seems no preallocation for cyclic matrix. Is it a bug or I forgot something? > Yes, you are right. No preallocation is done in this case within SLEPc. This is a problem also in SLEPc's QEPLINEAR. This is pending, my intention is to get it fixed for the next release. Thanks. Jose From gdiso at ustc.edu Fri Apr 1 03:45:00 2011 From: gdiso at ustc.edu (Gong Ding) Date: Fri, 1 Apr 2011 16:45:00 +0800 (CST) Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? In-Reply-To: References: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> Message-ID: <5531913.32841301647500088.JavaMail.coremail@mail.ustc.edu> > On 01/04/2011, Gong Ding wrote: > > > > > Hi, > > > I am stiall dealing with the ill conditioned problem. :-( > > > Yesderday, I installed slepc-3.1-p6 for SVD calculation of my matrix. > > > > > > The SVD solver works well for the largest singualr value calculation. > > > But for the smallest singualr value, all most all the methods fail. > > > > > > Finally, I chosen the most inefficient way. > > > That build the cyclic matrix explicitly with > > > shift-and-invert spectral transformation. > > > And solve the eigen value problem by LU preconditioned GMRES. > > > The preconditioner should be superlu rather than others. > > > I guess the reason is superlu use static pivot. > > > Because solver with partial pivot such as mumps can not work. > > > Anyway, slepc solved my problem. 
> > > > > > However, the explicit building cyclic matrix takes too long to finish. > > > The log info says > > > > > > [0] MatSetUpPreallocation(): Warning not preallocating matrix storage > > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 35038 X 35038; storage space: 325302 unneeded,482948 used > > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 42204 > > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 33 > > > [0] Mat_CheckInode(): Found 26800 nodes of 35038. Limit used: 5. Using Inode routines > > > > > > It seems no preallocation for cyclic matrix. Is it a bug or I forgot something? > > > > > > > Yes, you are right. No preallocation is done in this case within SLEPc. This is a problem also in SLEPc's QEPLINEAR. This is pending, my intention is to get it fixed for the next release. > Hope the problem can be solved soon. And do you have some comment on how to solve the smallest singular value? I guess i am not on the right way since matlab (with arpack) can calculate smallest singular value. But I never make arpack work. Even for smallest eigen value problem, arpack report no eigen value are found. From jroman at dsic.upv.es Fri Apr 1 07:02:50 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 1 Apr 2011 14:02:50 +0200 Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? In-Reply-To: <5531913.32841301647500088.JavaMail.coremail@mail.ustc.edu> References: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> <5531913.32841301647500088.JavaMail.coremail@mail.ustc.edu> Message-ID: On 01/04/2011, Gong Ding wrote: > Hope the problem can be solved soon. > And do you have some comment on how to solve the smallest singular value? > I guess i am not on the right way since matlab (with arpack) can calculate smallest singular value. > But I never make arpack work. > Even for smallest eigen value problem, arpack report no eigen value are found. > As discussed in section 3.3 of our report http://www.grycap.upv.es/slepc/documentation/reports/str8.pdf this is a difficult case. Probably the best choice is to use harmonic extraction in trlanczos. But this is not implemented in SLEPc, and there is no guarantee it works for difficult problems. Jose From SJ_Ormiston at UManitoba.ca Fri Apr 1 08:43:12 2011 From: SJ_Ormiston at UManitoba.ca (Ormiston, Scott J.) Date: Fri, 01 Apr 2011 08:43:12 -0500 Subject: [petsc-users] Performance of superlu_dist Message-ID: <4D95D670.3050102@UManitoba.ca> I am just starting to try superlu_dist to get a direct solver that runs in parallel with PETSc. My first tests (with ex15f) show that it takes longer and longer as the number of cores increases. For example 4 cores takes 8 times longer than 2 cores and 8 cores takes 25 times longer than 4 cores. Obviously I expected a speed-up; has anyone else seen this behaviour with superlu_dist? If not, what could be going wrong here? Scott Ormiston -------------- next part -------------- A non-text attachment was scrubbed... Name: SJ_Ormiston.vcf Type: text/x-vcard Size: 321 bytes Desc: not available URL: From desire.nuentsa_wakam at inria.fr Fri Apr 1 09:48:45 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Fri, 01 Apr 2011 16:48:45 +0200 Subject: [petsc-users] Performance of superlu_dist In-Reply-To: <4D95D670.3050102@UManitoba.ca> References: <4D95D670.3050102@UManitoba.ca> Message-ID: <4D95E5CD.8040107@inria.fr> On a multicore node, you may not get a very good speedup if the bandwidth is heavily shared between all the cores. 
I guess this is what Petsc people have explained here http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#computers If you have a multi-socket multicore node, my guess would be to keep one MPI process on each socket and then to use a multithreaded BLAS (like Goto) inside each socket to keep the cores busy during BLAS operations. Hope this helps Desire On 04/01/2011 03:43 PM, Ormiston, Scott J. wrote: > I am just starting to try superlu_dist to get a direct solver that > runs in parallel with PETSc. > > My first tests (with ex15f) show that it takes longer and longer as > the number of cores increases. For example 4 cores takes 8 times > longer than 2 cores and 8 cores takes 25 times longer than 4 cores. > Obviously I expected a speed-up; has anyone else seen this behaviour > with superlu_dist? If not, what could be going wrong here? > > Scott Ormiston From SJ_Ormiston at UManitoba.ca Fri Apr 1 10:40:16 2011 From: SJ_Ormiston at UManitoba.ca (Ormiston, Scott J.) Date: Fri, 01 Apr 2011 10:40:16 -0500 Subject: [petsc-users] Performance of superlu_dist In-Reply-To: <4D95E5CD.8040107@inria.fr> References: <4D95D670.3050102@UManitoba.ca> <4D95E5CD.8040107@inria.fr> Message-ID: <4D95F1E0.6020203@UManitoba.ca> Desire NUENTSA WAKAM wrote: > On a multicore node, you may not get a very good speedup if the > bandwidth is heavily shared between all the cores. I guess this is what > Petsc people have explained here > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#computers > If you have a multi-socket multicore node, my guess would be to keep one > MPI process on each socket and then to use a multithreaded BLAS (like > Goto) inside each socket to keep the cores busy during BLAS operations. > Hope this helps It was very helpful. Merci infiniment. Scott Ormiston -------------- next part -------------- A non-text attachment was scrubbed... Name: SJ_Ormiston.vcf Type: text/x-vcard Size: 321 bytes Desc: not available URL: From kenway at utias.utoronto.ca Fri Apr 1 12:13:28 2011 From: kenway at utias.utoronto.ca (Gaetan Kenway) Date: Fri, 1 Apr 2011 13:13:28 -0400 Subject: [petsc-users] Performance of superlu_dist Message-ID: I have seen the same thing with SuperLU_dist as Scott Ormiston has. I've been using to solve (small-ish) 3D solid finite element structural system with rarely more than ~30,000 dof. Basically, if you use more than 2 cores, SuperLU_dist tanks and the factorization time goes through the roof exponentially. However, if you solve the same system with Spooles, its orders of magnitude faster. I'm not overly concerned with speed, since I only do this factorization once in my code and as such I don't have precise timing results. WIth 22,000 dof on an dual socket Xeon X5500 series machine (8 cores per node), with spooles, there's a speed up going from 1-8 procs. I could go up to about 32 procs before it takes longer than the single processor case. I hope this is of some use. Gaetan -------------- next part -------------- An HTML attachment was scrubbed... URL: From SJ_Ormiston at UManitoba.ca Fri Apr 1 12:36:05 2011 From: SJ_Ormiston at UManitoba.ca (Ormiston, Scott J.) Date: Fri, 01 Apr 2011 12:36:05 -0500 Subject: [petsc-users] Performance of superlu_dist In-Reply-To: References: Message-ID: <4D960D05.4030908@UManitoba.ca> Gaetan Kenway wrote: > I have seen the same thing with SuperLU_dist as Scott Ormiston has. I've > been using to solve (small-ish) 3D solid finite element structural > system with rarely more than ~30,000 dof. 
Basically, if you use more > than 2 cores, SuperLU_dist tanks and the factorization time goes through > the roof exponentially. However, if you solve the same system with > Spooles, its orders of magnitude faster. I'm not overly concerned with > speed, since I only do this factorization once in my code and as such I > don't have precise timing results. WIth 22,000 dof on an dual socket > Xeon X5500 series machine (8 cores per node), with spooles, there's a > speed up going from 1-8 procs. I could go up to about 32 procs before it > takes longer than the single processor case. Following the suggestion of Desire Nuentsa Wakam (who pointed me to the FAQ), I have had better performance from superlu_dist using mpiexec --cpus-per-proc 4 --bind-to-core -np 3 executable_name \ -pc_type lu -pc_factor_mat_solver_package superlu_dist on a server that has 4 quad-core CPUS and 64 Gb of RAM. I assume other option settings will be needed on other arrangements of cores and interconnects. I have not done enough tests to see about any speed-up. Thank you for your pointer to Spooles. Scott Ormiston -------------- next part -------------- A non-text attachment was scrubbed... Name: SJ_Ormiston.vcf Type: text/x-vcard Size: 321 bytes Desc: not available URL: From gaurish108 at gmail.com Fri Apr 1 18:02:38 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Fri, 1 Apr 2011 19:02:38 -0400 Subject: [petsc-users] Implementing a new routine in PETSc. Message-ID: Hi, I am planning to implement the LSMR algorithm in PETSc which does least squares and is supposed to have more favourable mathematical properties than LSQR which has already been implemented in PETSc. But I am not really sure how to go about this, since the implementations of the standard KSP methods themselves look quite complicated. The manual does not seeem to say much about how to go about adding routines to the PETSc library. Could you give me guidelines I should follow while implementing a KSP method? Also is it necessary to build the PETSc library again after implementing this routine. If so is it necessary to make any changes to makefiles ? Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Apr 1 18:18:07 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 1 Apr 2011 18:18:07 -0500 Subject: [petsc-users] Implementing a new routine in PETSc. In-Reply-To: References: Message-ID: You must install petsc-dev http://www.mcs.anl.gov/petsc/petsc-as/developers/index.html to develop this new code. The CG implementation has detailed information about what needs to be provided for a new Krylov method. Start by making a new directory src/ksp/ksp/impls/lsmr and copy over to it the files src/ksp/ksp/impls/cg/ makefile cg.c cgimpl.h call them lsmr.c and lsmrimpl.h modify the copied over makefile to list the lsmr stuff instead of the cg In lsmrimpl.h put in the data structure you'll need to store all the vectors and other information needed by lsmr in lsmr.c go through the current code and change it all for the lsmr algorithm. You do not need to recompile all of PETSc to access the new solver, just run make in that new directory. You will also need to edit src/ksp/ksp/interface/itregis.c to register your new method. Join petsc-dev http://www.mcs.anl.gov/petsc/petsc-as/miscellaneous/mailing-lists.html to correspond with the petsc developers if questions/issues come up. 
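A rough sketch of the skeleton Barry describes, with the boilerplate modeled on cg.c from petsc-dev of this era; every LSMR-specific name here (KSP_LSMR, KSPCreate_LSMR, the work-vector count) is a placeholder and the LSMR recurrence itself is only indicated by a comment. The existing LSQR implementation in src/ksp/ksp/impls/lsqr/ is probably the closest model to copy from, since LSMR also needs products with the transpose:

  #include "private/kspimpl.h"                 /* KSP internals, as in cg.c */

  typedef struct {                             /* plays the role of KSP_CG in cgimpl.h */
    Vec u,v,h,hbar;                            /* placeholder LSMR work vectors */
  } KSP_LSMR;

  static PetscErrorCode KSPSetUp_LSMR(KSP ksp)
  {
    PetscErrorCode ierr;
    PetscFunctionBegin;
    ierr = KSPDefaultGetWork(ksp,4);CHKERRQ(ierr);  /* let KSP manage the work vectors */
    PetscFunctionReturn(0);
  }

  static PetscErrorCode KSPSolve_LSMR(KSP ksp)
  {
    PetscFunctionBegin;
    /* The Golub-Kahan bidiagonalization and the LSMR recurrences go here, using
       KSP_MatMult()/KSP_MatMultTranspose() for the products and following the
       structure of KSPSolve_CG (or KSPSolve_LSQR in src/ksp/ksp/impls/lsqr/). */
    PetscFunctionReturn(0);
  }

  EXTERN_C_BEGIN
  PetscErrorCode KSPCreate_LSMR(KSP ksp)
  {
    KSP_LSMR       *lsmr;
    PetscErrorCode ierr;
    PetscFunctionBegin;
    ierr = PetscNewLog(ksp,KSP_LSMR,&lsmr);CHKERRQ(ierr);
    ksp->data                = (void*)lsmr;
    ksp->ops->setup          = KSPSetUp_LSMR;
    ksp->ops->solve          = KSPSolve_LSMR;
    ksp->ops->destroy        = KSPDefaultDestroy;
    ksp->ops->buildsolution  = KSPDefaultBuildSolution;
    ksp->ops->buildresidual  = KSPDefaultBuildResidual;
    ksp->ops->setfromoptions = 0;
    ksp->ops->view           = 0;
    PetscFunctionReturn(0);
  }
  EXTERN_C_END

  /* and in src/ksp/ksp/interface/itregis.c, as Barry says, a registration line
     along the lines of KSPRegisterDynamic("lsmr",path,"KSPCreate_LSMR",KSPCreate_LSMR) */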
Have fun, Barry On Apr 1, 2011, at 6:02 PM, Gaurish Telang wrote: > Hi, > > I am planning to implement the LSMR algorithm in PETSc which does least squares and is supposed to have more favourable mathematical properties than LSQR which has already been implemented in PETSc. > > But I am not really sure how to go about this, since the implementations of the standard KSP methods themselves look quite complicated. The manual does not seeem to say much > about how to go about adding routines to the PETSc library. > > Could you give me guidelines I should follow while implementing a KSP method? > > Also is it necessary to build the PETSc library again after implementing this routine. If so is it necessary to make any changes to makefiles ? > > Regards From gaurish108 at gmail.com Sun Apr 3 19:26:55 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Sun, 3 Apr 2011 20:26:55 -0400 Subject: [petsc-users] meaning of KSP_MatMult Message-ID: Hi What is the difference between KSP_MatMult and MatMult? I am trying to implement a new KSP method and see that all Matrix vector multiplies are done with KSP_MatMult in cg.c which implements the conjugate gradient. Regards, Gaurish -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Apr 3 19:57:49 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Apr 2011 19:57:49 -0500 Subject: [petsc-users] meaning of KSP_MatMult In-Reply-To: References: Message-ID: Usually it is best just to look at the code. #define KSP_MatMult(ksp,A,x,y) (!ksp->transpose_solve) ? MatMult(A,x,y) : MatMultTranspose(A,x,y) It is only there to allow the same code work to solve with A or the transpose system with A'. Of course with CG it doesn't even need to be used since the matrix is symmetric. Barry On Apr 3, 2011, at 7:26 PM, Gaurish Telang wrote: > Hi > > What is the difference between KSP_MatMult and MatMult? I am trying to implement a new KSP method and see that all Matrix vector multiplies are done with > KSP_MatMult in cg.c which implements the conjugate gradient. > > Regards, > > Gaurish From gaurish108 at gmail.com Sun Apr 3 20:45:22 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Sun, 3 Apr 2011 21:45:22 -0400 Subject: [petsc-users] KSP structure and understanding some PETSc functions, . Message-ID: Hi, Where can I find the details of the ksp data structure? Specifically I wish to understand what ksp->converged, ksp->reason, and ksp>cnvP mean. Also I think the functions PetscObjecttakeAccess and PetscObjectGrantAccess are undocumented. What do these functions do? Thanks, Gaurish. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Apr 3 21:51:57 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Apr 2011 21:51:57 -0500 Subject: [petsc-users] KSP structure and understanding some PETSc functions, . In-Reply-To: References: Message-ID: <50D51353-F3B5-437B-A35C-12EF8F2DEAAE@mcs.anl.gov> On Apr 3, 2011, at 8:45 PM, Gaurish Telang wrote: > Hi, > > Where can I find the details of the ksp data structure? Specifically I wish to understand what ksp->converged, ksp->reason, and ksp>cnvP mean. include/private/kspimpl.h > > Also I think the functions PetscObjecttakeAccess and PetscObjectGrantAccess are undocumented. What do these functions do? These are currently unused and can be ignored; there is no reason to put them in the code. Barry > > Thanks, > > Gaurish. 
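As a rough illustration of how the fields asked about above are used inside an implementation such as cg.c, here is a condensed version of the per-iteration pattern; the helper function is not a PETSc routine, just a sketch, and the variable names are illustrative:

  #include "private/kspimpl.h"

  /* Illustrative helper (not part of PETSc): the per-iteration bookkeeping from cg.c */
  static PetscErrorCode MyKSPCheckConvergence(KSP ksp,PetscInt it,PetscReal rnorm)
  {
    PetscErrorCode ierr;
    PetscFunctionBegin;
    ksp->rnorm = rnorm;                 /* current (possibly preconditioned) residual norm */
    KSPLogResidualHistory(ksp,rnorm);   /* append it to the residual history */
    KSPMonitor(ksp,it,rnorm);           /* run any monitors the user registered */
    /* ksp->converged is a function pointer to the active convergence test
       (KSPDefaultConverged unless replaced with KSPSetConvergenceTest), ksp->cnvP is
       the context that test was registered with, and the test writes its verdict
       into ksp->reason (KSP_CONVERGED_RTOL, KSP_DIVERGED_ITS, ...). */
    ierr = (*ksp->converged)(ksp,it,rnorm,&ksp->reason,ksp->cnvP);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }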
> From ckontzialis at lycos.com Mon Apr 4 00:59:08 2011 From: ckontzialis at lycos.com (Kontsantinos Kontzialis) Date: Mon, 04 Apr 2011 08:59:08 +0300 Subject: [petsc-users] help with snes Message-ID: <4D995E2C.8060106@lycos.com> Dear all, I use snes to apply an implicit Runge-Kutta method. I see that the snes and ksp solvers converge very slowly. Here is the runtime options I use: mpiexec -np 4 ./hoac cylinder -snes_mf_operator -pc_type hypre -pc_hypre_type euclid -pc_hypre_euclid_levels 2 -ksp_type gmres -dt 1.0e-1 -n_out 10 -file_out cylinder.txt -mat_inode_limit 5 -snes_monitor -end_time 1.0e+1 -roe_flux -snes_converged_reason -snes_max_fail 50 -u_mom 0.2 -implicit -implicit_type 2 -snes_atol 1.0e-6 -snes_ksp_ew_conv -ksp_gmres_cgs_refinement_type REFINE_IFNEEDED -ksp_gmres_classicalgramschmidt Also, I use coloring to compute the jacobian of the system. Any suggestions? Thank you, Costas From jed at 59A2.org Mon Apr 4 01:21:02 2011 From: jed at 59A2.org (Jed Brown) Date: Mon, 4 Apr 2011 08:21:02 +0200 Subject: [petsc-users] help with snes In-Reply-To: <4D995E2C.8060106@lycos.com> References: <4D995E2C.8060106@lycos.com> Message-ID: On Mon, Apr 4, 2011 at 07:59, Kontsantinos Kontzialis wrote: > I use snes to apply an implicit Runge-Kutta method. I see that the snes and > ksp solvers converge very slowly. Here is the runtime options I use: > > mpiexec -np 4 ./hoac cylinder -snes_mf_operator -pc_type hypre > -pc_hypre_type euclid -pc_hypre_euclid_levels 2 -ksp_type gmres -dt 1.0e-1 > -n_out 10 -file_out cylinder.txt -mat_inode_limit 5 -snes_monitor -end_time > 1.0e+1 -roe_flux -snes_converged_reason -snes_max_fail 50 -u_mom 0.2 > -implicit -implicit_type 2 -snes_atol 1.0e-6 -snes_ksp_ew_conv > -ksp_gmres_cgs_refinement_type REFINE_IFNEEDED > -ksp_gmres_classicalgramschmidt > > Also, I use coloring to compute the jacobian of the system. Any > suggestions? > This is not enough information to do more than guess. What equations are you solving, what methods have you tried, and how do they perform (show convergence history, "very slowly" means very different things to different people)? To speed up Newton, you can (a) more accurate linear solve, (b) better initial guess, e.g. provided by stable extrapolation or grid sequencing, (c) more exotic things like nonlinear Schwarz. For the linear solve, you usually have to improve the preconditioner (unless something is being done "wrong" like failing to acknowledge a low-dimensional null space). -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckontzialis at lycos.com Mon Apr 4 04:21:53 2011 From: ckontzialis at lycos.com (Kontsantinos Kontzialis) Date: Mon, 04 Apr 2011 12:21:53 +0300 Subject: [petsc-users] help with snes In-Reply-To: References: <4D995E2C.8060106@lycos.com> Message-ID: <4D998DB1.7000601@lycos.com> On 04/04/2011 09:21 AM, Jed Brown wrote: > On Mon, Apr 4, 2011 at 07:59, Kontsantinos Kontzialis > > wrote: > > I use snes to apply an implicit Runge-Kutta method. I see that the > snes and ksp solvers converge very slowly. 
Here is the runtime > options I use: > > mpiexec -np 4 ./hoac cylinder -snes_mf_operator -pc_type hypre > -pc_hypre_type euclid -pc_hypre_euclid_levels 2 -ksp_type gmres > -dt 1.0e-1 -n_out 10 -file_out cylinder.txt -mat_inode_limit 5 > -snes_monitor -end_time 1.0e+1 -roe_flux -snes_converged_reason > -snes_max_fail 50 -u_mom 0.2 -implicit -implicit_type 2 -snes_atol > 1.0e-6 -snes_ksp_ew_conv -ksp_gmres_cgs_refinement_type > REFINE_IFNEEDED -ksp_gmres_classicalgramschmidt > > Also, I use coloring to compute the jacobian of the system. Any > suggestions? > > > This is not enough information to do more than guess. What equations > are you solving, what methods have you tried, and how do they perform > (show convergence history, "very slowly" means very different things > to different people)? > > To speed up Newton, you can (a) more accurate linear solve, (b) better > initial guess, e.g. provided by stable extrapolation or grid > sequencing, (c) more exotic things like nonlinear Schwarz. For the > linear solve, you usually have to improve the preconditioner (unless > something is being done "wrong" like failing to acknowledge a > low-dimensional null space). Jed, I am using a Discontinuous Galerkin method for the Euler equations of gas dynamics. I noticed that snes iterations do not drop the function norm and i need to set ksp_rtol to very low values in order to get converged solution. But this takes time. I use a matrix-free method with coloring for computing the jacobian as a preconditioner. For instance Timestep 0: step size = 0.1, time = 0, 2-norm residual = 0, CFL = 2910.23 Stage 1 0 SNES Function norm 1.027079922974e-01 0 KSP Residual norm 6.927237709481e+00 1 KSP Residual norm 6.418924686154e-01 2 KSP Residual norm 3.848527585443e-01 3 KSP Residual norm 3.125092840762e-01 4 KSP Residual norm 1.723224124866e-01 5 KSP Residual norm 1.593049965355e-01 6 KSP Residual norm 1.316596985033e-01 7 KSP Residual norm 1.103893708040e-01 8 KSP Residual norm 7.067294579947e-02 9 KSP Residual norm 6.780500381016e-02 10 KSP Residual norm 5.681755604855e-02 11 KSP Residual norm 5.080542537215e-02 12 KSP Residual norm 4.491096057365e-02 13 KSP Residual norm 3.677544617683e-02 14 KSP Residual norm 3.511275641736e-02 15 KSP Residual norm 2.950583865595e-02 16 KSP Residual norm 2.907564423196e-02 17 KSP Residual norm 2.503394162059e-02 18 KSP Residual norm 2.404208792763e-02 19 KSP Residual norm 2.384167488597e-02 20 KSP Residual norm 2.152623807336e-02 21 KSP Residual norm 2.140123445364e-02 22 KSP Residual norm 1.868992128733e-02 23 KSP Residual norm 1.828641368024e-02 24 KSP Residual norm 1.490016051590e-02 25 KSP Residual norm 1.460258473433e-02 26 KSP Residual norm 1.297026817438e-02 27 KSP Residual norm 1.287620709307e-02 28 KSP Residual norm 1.227961630095e-02 29 KSP Residual norm 1.195861330978e-02 30 KSP Residual norm 1.194779545098e-02 31 KSP Residual norm 1.194557385421e-02 32 KSP Residual norm 1.179010880493e-02 33 KSP Residual norm 1.149738273328e-02 34 KSP Residual norm 1.145048206086e-02 35 KSP Residual norm 1.085469098538e-02 36 KSP Residual norm 1.074679906620e-02 37 KSP Residual norm 9.663878491019e-03 38 KSP Residual norm 9.616289395896e-03 39 KSP Residual norm 8.856279445189e-03 40 KSP Residual norm 8.843663029957e-03 41 KSP Residual norm 8.499163748714e-03 42 KSP Residual norm 8.390473347232e-03 43 KSP Residual norm 8.308302849856e-03 44 KSP Residual norm 7.657464451533e-03 45 KSP Residual norm 7.633733629069e-03 46 KSP Residual norm 6.791549457705e-03 47 KSP Residual norm 
6.761807932440e-03 48 KSP Residual norm 6.310159420349e-03 49 KSP Residual norm 6.308135765891e-03 50 KSP Residual norm 6.178469644420e-03 51 KSP Residual norm 6.092441106798e-03 52 KSP Residual norm 6.060022864473e-03 53 KSP Residual norm 5.690961052139e-03 54 KSP Residual norm 5.643775917490e-03 55 KSP Residual norm 5.203641551515e-03 56 KSP Residual norm 5.146470256961e-03 57 KSP Residual norm 4.903702702525e-03 58 KSP Residual norm 4.898824722874e-03 59 KSP Residual norm 4.813924377178e-03 60 KSP Residual norm 4.723526162768e-03 61 KSP Residual norm 4.711592750967e-03 62 KSP Residual norm 4.631580873260e-03 63 KSP Residual norm 4.616608546291e-03 64 KSP Residual norm 4.600837802917e-03 65 KSP Residual norm 4.474343564228e-03 66 KSP Residual norm 4.443094833339e-03 67 KSP Residual norm 4.224165899917e-03 68 KSP Residual norm 4.200220155408e-03 69 KSP Residual norm 3.982639561746e-03 70 KSP Residual norm 3.982638871843e-03 71 KSP Residual norm 3.922944438857e-03 72 KSP Residual norm 3.874158019578e-03 73 KSP Residual norm 3.846756027386e-03 74 KSP Residual norm 3.602959352657e-03 75 KSP Residual norm 3.555118913445e-03 76 KSP Residual norm 3.177203273748e-03 77 KSP Residual norm 3.156576767081e-03 78 KSP Residual norm 3.001743691226e-03 79 KSP Residual norm 2.993215617549e-03 80 KSP Residual norm 2.958723177572e-03 81 KSP Residual norm 2.756494447464e-03 82 KSP Residual norm 2.748309056765e-03 83 KSP Residual norm 2.334551378904e-03 84 KSP Residual norm 2.330135779280e-03 85 KSP Residual norm 2.158060720380e-03 86 KSP Residual norm 2.142593286155e-03 87 KSP Residual norm 2.117828962844e-03 88 KSP Residual norm 1.994027737245e-03 89 KSP Residual norm 1.992614508152e-03 90 KSP Residual norm 1.846186979481e-03 91 KSP Residual norm 1.845847460736e-03 92 KSP Residual norm 1.750049894784e-03 93 KSP Residual norm 1.749884975326e-03 94 KSP Residual norm 1.718243135806e-03 95 KSP Residual norm 1.692788497273e-03 96 KSP Residual norm 1.690541479527e-03 97 KSP Residual norm 1.606671137740e-03 98 KSP Residual norm 1.604648007071e-03 99 KSP Residual norm 1.491359345392e-03 100 KSP Residual norm 1.487500193779e-03 101 KSP Residual norm 1.431517768459e-03 102 KSP Residual norm 1.413892181036e-03 103 KSP Residual norm 1.411244613047e-03 104 KSP Residual norm 1.358895518150e-03 105 KSP Residual norm 1.352907680814e-03 106 KSP Residual norm 1.290825812795e-03 107 KSP Residual norm 1.277895797362e-03 108 KSP Residual norm 1.239827948290e-03 109 KSP Residual norm 1.231707202118e-03 110 KSP Residual norm 1.226844439539e-03 111 KSP Residual norm 1.207141785145e-03 112 KSP Residual norm 1.205237619380e-03 113 KSP Residual norm 1.174105568044e-03 114 KSP Residual norm 1.161571710906e-03 115 KSP Residual norm 1.127339527466e-03 116 KSP Residual norm 1.118975806161e-03 117 KSP Residual norm 1.108813712733e-03 118 KSP Residual norm 1.097495827387e-03 119 KSP Residual norm 1.095458651348e-03 120 KSP Residual norm 1.099354754200e-03 121 KSP Residual norm 1.097690891747e-03 122 KSP Residual norm 1.080236637641e-03 123 KSP Residual norm 1.075249519522e-03 124 KSP Residual norm 1.072337341208e-03 125 KSP Residual norm 1.061274156366e-03 126 KSP Residual norm 1.061069171322e-03 127 KSP Residual norm 1.034869646780e-03 128 KSP Residual norm 1.033331884671e-03 129 KSP Residual norm 1.008920875186e-03 130 KSP Residual norm 1.004952247105e-03 131 KSP Residual norm 9.995273012222e-04 132 KSP Residual norm 9.912110212877e-04 133 KSP Residual norm 9.911763000203e-04 134 KSP Residual norm 9.706007632685e-04 135 KSP Residual 
norm 9.684837666814e-04 136 KSP Residual norm 9.322763749700e-04 137 KSP Residual norm 9.298333190027e-04 138 KSP Residual norm 9.127255619624e-04 139 KSP Residual norm 9.021675689666e-04 140 KSP Residual norm 9.021675667879e-04 141 KSP Residual norm 8.785893989448e-04 142 KSP Residual norm 8.723618462518e-04 143 KSP Residual norm 8.321111011021e-04 144 KSP Residual norm 8.231111137986e-04 145 KSP Residual norm 7.991951376136e-04 146 KSP Residual norm 7.918718558159e-04 147 KSP Residual norm 7.910248570583e-04 148 KSP Residual norm 7.797518457299e-04 149 KSP Residual norm 7.740621090192e-04 150 KSP Residual norm 7.996865400922e-04 151 KSP Residual norm 7.985981734959e-04 152 KSP Residual norm 7.629754272125e-04 153 KSP Residual norm 7.621465465035e-04 154 KSP Residual norm 7.494704585166e-04 155 KSP Residual norm 7.416807470073e-04 156 KSP Residual norm 7.397837511960e-04 157 KSP Residual norm 7.160474295065e-04 158 KSP Residual norm 7.153230970945e-04 159 KSP Residual norm 6.833108294843e-04 160 KSP Residual norm 6.825301707967e-04 161 KSP Residual norm 6.689873131620e-04 162 KSP Residual norm 6.665860331512e-04 163 KSP Residual norm 6.657084022499e-04 164 KSP Residual norm 6.576302936674e-04 165 KSP Residual norm 6.551257109061e-04 166 KSP Residual norm 6.432495386190e-04 167 KSP Residual norm 6.393490617663e-04 168 KSP Residual norm 6.289563544548e-04 169 KSP Residual norm 6.248514077557e-04 170 KSP Residual norm 6.241333036068e-04 171 KSP Residual norm 6.172289972591e-04 172 KSP Residual norm 6.161332336230e-04 173 KSP Residual norm 6.013433202566e-04 174 KSP Residual norm 5.948389748744e-04 175 KSP Residual norm 5.820459960548e-04 176 KSP Residual norm 5.780888500630e-04 177 KSP Residual norm 5.765550402812e-04 178 KSP Residual norm 5.680617497540e-04 179 KSP Residual norm 5.653573865953e-04 180 KSP Residual norm 5.941451965319e-04 181 KSP Residual norm 5.940284187867e-04 182 KSP Residual norm 5.629093879046e-04 183 KSP Residual norm 5.616029219408e-04 184 KSP Residual norm 5.534251512246e-04 185 KSP Residual norm 5.503051759591e-04 186 KSP Residual norm 5.494615294258e-04 187 KSP Residual norm 5.360903464690e-04 188 KSP Residual norm 5.357596839925e-04 189 KSP Residual norm 5.200595645635e-04 190 KSP Residual norm 5.193081989440e-04 191 KSP Residual norm 5.128888843140e-04 192 KSP Residual norm 5.113184826859e-04 193 KSP Residual norm 5.108224858562e-04 194 KSP Residual norm 5.032655916038e-04 195 KSP Residual norm 5.023522053806e-04 196 KSP Residual norm 4.899188744381e-04 197 KSP Residual norm 4.883553377255e-04 198 KSP Residual norm 4.800289474213e-04 199 KSP Residual norm 4.777279100041e-04 200 KSP Residual norm 4.759559094663e-04 201 KSP Residual norm 4.694818511931e-04 202 KSP Residual norm 4.689973302866e-04 203 KSP Residual norm 4.548454768507e-04 204 KSP Residual norm 4.520367591463e-04 205 KSP Residual norm 4.368451030442e-04 206 KSP Residual norm 4.351974523374e-04 207 KSP Residual norm 4.287065645698e-04 208 KSP Residual norm 4.260330488789e-04 209 KSP Residual norm 4.260155248925e-04 210 KSP Residual norm 4.840531342466e-04 211 KSP Residual norm 4.825477592344e-04 212 KSP Residual norm 4.548435246139e-04 213 KSP Residual norm 4.544995360569e-04 214 KSP Residual norm 4.467888907692e-04 215 KSP Residual norm 4.408942282764e-04 216 KSP Residual norm 4.370393567572e-04 217 KSP Residual norm 4.221207630715e-04 218 KSP Residual norm 4.210509669927e-04 219 KSP Residual norm 4.054854033659e-04 220 KSP Residual norm 4.054421274470e-04 221 KSP Residual norm 3.997032630117e-04 222 
KSP Residual norm 3.993282795883e-04 223 KSP Residual norm 3.986988207685e-04 224 KSP Residual norm 3.925181315143e-04 225 KSP Residual norm 3.921926168323e-04 226 KSP Residual norm 3.821204190940e-04 227 KSP Residual norm 3.814133566930e-04 228 KSP Residual norm 3.766128637251e-04 229 KSP Residual norm 3.757253419248e-04 230 KSP Residual norm 3.744322716702e-04 231 KSP Residual norm 3.701160966981e-04 232 KSP Residual norm 3.701152921823e-04 233 KSP Residual norm 3.631617558377e-04 234 KSP Residual norm 3.625708411480e-04 235 KSP Residual norm 3.553808760610e-04 236 KSP Residual norm 3.546034916825e-04 237 KSP Residual norm 3.540786524554e-04 238 KSP Residual norm 3.511344181564e-04 239 KSP Residual norm 3.505042720512e-04 240 KSP Residual norm 4.132290626555e-04 241 KSP Residual norm 4.082857512512e-04 242 KSP Residual norm 3.711356050665e-04 243 KSP Residual norm 3.704069640625e-04 244 KSP Residual norm 3.547568213329e-04 245 KSP Residual norm 3.546708808789e-04 246 KSP Residual norm 3.500496433773e-04 247 KSP Residual norm 3.470649802709e-04 248 KSP Residual norm 3.468562003147e-04 249 KSP Residual norm 3.373236757208e-04 250 KSP Residual norm 3.372785835705e-04 251 KSP Residual norm 3.289650260262e-04 252 KSP Residual norm 3.286262513859e-04 253 KSP Residual norm 3.258404684367e-04 254 KSP Residual norm 3.233806739275e-04 255 KSP Residual norm 3.232848418577e-04 256 KSP Residual norm 3.160390660748e-04 257 KSP Residual norm 3.160023434928e-04 258 KSP Residual norm 3.076800563449e-04 259 KSP Residual norm 3.075950769044e-04 260 KSP Residual norm 3.017997123649e-04 261 KSP Residual norm 3.010370755465e-04 262 KSP Residual norm 3.007348493909e-04 263 KSP Residual norm 2.968283761627e-04 264 KSP Residual norm 2.965667647944e-04 265 KSP Residual norm 2.875262248594e-04 266 KSP Residual norm 2.864872321977e-04 267 KSP Residual norm 2.796799864284e-04 268 KSP Residual norm 2.779800721903e-04 269 KSP Residual norm 2.767755518062e-04 270 KSP Residual norm 3.701476508480e-04 271 KSP Residual norm 3.550531831491e-04 272 KSP Residual norm 3.237961295410e-04 273 KSP Residual norm 3.212659135171e-04 274 KSP Residual norm 3.107416795714e-04 275 KSP Residual norm 3.107404695290e-04 276 KSP Residual norm 3.079497903315e-04 277 KSP Residual norm 3.007874486462e-04 278 KSP Residual norm 3.002891498945e-04 279 KSP Residual norm 2.879889328174e-04 280 KSP Residual norm 2.879415924827e-04 281 KSP Residual norm 2.815601989921e-04 282 KSP Residual norm 2.806654362436e-04 283 KSP Residual norm 2.792915793473e-04 284 KSP Residual norm 2.751905583679e-04 285 KSP Residual norm 2.751559506503e-04 286 KSP Residual norm 2.665087440439e-04 287 KSP Residual norm 2.664771447418e-04 288 KSP Residual norm 2.617548480595e-04 289 KSP Residual norm 2.617541578606e-04 290 KSP Residual norm 2.605161966159e-04 291 KSP Residual norm 2.594628541708e-04 292 KSP Residual norm 2.594611806903e-04 293 KSP Residual norm 2.551674006816e-04 294 KSP Residual norm 2.549606997571e-04 295 KSP Residual norm 2.496835632981e-04 296 KSP Residual norm 2.494290948044e-04 297 KSP Residual norm 2.466008586027e-04 298 KSP Residual norm 2.449860389047e-04 299 KSP Residual norm 2.448908502370e-04 300 KSP Residual norm 3.252504105699e-04 301 KSP Residual norm 3.128088862424e-04 302 KSP Residual norm 2.825197019556e-04 303 KSP Residual norm 2.797432079325e-04 304 KSP Residual norm 2.629301571560e-04 305 KSP Residual norm 2.624809613098e-04 306 KSP Residual norm 2.574770828907e-04 307 KSP Residual norm 2.561990302939e-04 308 KSP Residual norm 
2.550132639501e-04 309 KSP Residual norm 2.479276320314e-04 310 KSP Residual norm 2.475314075344e-04 311 KSP Residual norm 2.399863564606e-04 312 KSP Residual norm 2.399843817621e-04 313 KSP Residual norm 2.370399439375e-04 314 KSP Residual norm 2.357745430699e-04 315 KSP Residual norm 2.352348621885e-04 316 KSP Residual norm 2.309261579390e-04 317 KSP Residual norm 2.307545694928e-04 318 KSP Residual norm 2.257800220221e-04 319 KSP Residual norm 2.257254923503e-04 320 KSP Residual norm 2.229018664838e-04 321 KSP Residual norm 2.227391409799e-04 322 KSP Residual norm 2.223152932089e-04 323 KSP Residual norm 2.202129550826e-04 324 KSP Residual norm 2.202120254316e-04 325 KSP Residual norm 2.140858308118e-04 326 KSP Residual norm 2.139462657614e-04 327 KSP Residual norm 2.089236399059e-04 328 KSP Residual norm 2.087138619354e-04 329 KSP Residual norm 2.076200058793e-04 330 KSP Residual norm 3.105137354000e-04 331 KSP Residual norm 2.971618102980e-04 332 KSP Residual norm 2.632509491662e-04 333 KSP Residual norm 2.606782931874e-04 334 KSP Residual norm 2.420375352578e-04 335 KSP Residual norm 2.420149875537e-04 336 KSP Residual norm 2.365092764887e-04 337 KSP Residual norm 2.327611491860e-04 338 KSP Residual norm 2.309994170994e-04 339 KSP Residual norm 2.207932064590e-04 340 KSP Residual norm 2.206876586929e-04 341 KSP Residual norm 2.138312153956e-04 342 KSP Residual norm 2.138302439214e-04 343 KSP Residual norm 2.110520083920e-04 344 KSP Residual norm 2.098418044530e-04 345 KSP Residual norm 2.096235899540e-04 346 KSP Residual norm 2.034185757316e-04 347 KSP Residual norm 2.034184620635e-04 348 KSP Residual norm 1.976488405192e-04 349 KSP Residual norm 1.976115434786e-04 350 KSP Residual norm 1.945816030084e-04 351 KSP Residual norm 1.943065435743e-04 352 KSP Residual norm 1.937314638157e-04 353 KSP Residual norm 1.914843655114e-04 354 KSP Residual norm 1.914699026461e-04 355 KSP Residual norm 1.874847304454e-04 356 KSP Residual norm 1.872309478602e-04 357 KSP Residual norm 1.848831376039e-04 358 KSP Residual norm 1.841538952700e-04 359 KSP Residual norm 1.840508058060e-04 360 KSP Residual norm 2.955683800624e-04 361 KSP Residual norm 2.787092713017e-04 362 KSP Residual norm 2.375973954952e-04 363 KSP Residual norm 2.362837002318e-04 364 KSP Residual norm 2.168814309608e-04 365 KSP Residual norm 2.168107429718e-04 366 KSP Residual norm 2.100451614964e-04 367 KSP Residual norm 2.076551935282e-04 368 KSP Residual norm 2.055029282370e-04 369 KSP Residual norm 1.984878777061e-04 370 KSP Residual norm 1.982729424399e-04 371 KSP Residual norm 1.897882681691e-04 372 KSP Residual norm 1.897059804945e-04 373 KSP Residual norm 1.874194039890e-04 374 KSP Residual norm 1.863443914354e-04 375 KSP Residual norm 1.862854861081e-04 376 KSP Residual norm 1.823365401806e-04 377 KSP Residual norm 1.822124001004e-04 378 KSP Residual norm 1.768066383954e-04 379 KSP Residual norm 1.767749820797e-04 380 KSP Residual norm 1.732546065272e-04 381 KSP Residual norm 1.729449336935e-04 382 KSP Residual norm 1.722785014035e-04 383 KSP Residual norm 1.708647196037e-04 384 KSP Residual norm 1.708139978897e-04 385 KSP Residual norm 1.669271053730e-04 386 KSP Residual norm 1.664487742887e-04 387 KSP Residual norm 1.633713872445e-04 388 KSP Residual norm 1.623611873959e-04 389 KSP Residual norm 1.619434626999e-04 390 KSP Residual norm 2.704606714739e-04 391 KSP Residual norm 2.508430425343e-04 392 KSP Residual norm 2.188049406578e-04 393 KSP Residual norm 2.149849069956e-04 394 KSP Residual norm 1.986339539905e-04 395 KSP 
Residual norm 1.982077076504e-04 396 KSP Residual norm 1.920881700711e-04 397 KSP Residual norm 1.895113714247e-04 398 KSP Residual norm 1.880286374771e-04 399 KSP Residual norm 1.800051766499e-04 400 KSP Residual norm 1.791342664207e-04 401 KSP Residual norm 1.723021103007e-04 402 KSP Residual norm 1.722983364536e-04 403 KSP Residual norm 1.699546057741e-04 404 KSP Residual norm 1.688016004024e-04 405 KSP Residual norm 1.684647860428e-04 406 KSP Residual norm 1.642044267398e-04 407 KSP Residual norm 1.641199945107e-04 408 KSP Residual norm 1.600279143621e-04 409 KSP Residual norm 1.600082345451e-04 410 KSP Residual norm 1.571845001808e-04 411 KSP Residual norm 1.568421146682e-04 412 KSP Residual norm 1.566121984341e-04 413 KSP Residual norm 1.541905989228e-04 414 KSP Residual norm 1.541893814351e-04 415 KSP Residual norm 1.503096486796e-04 416 KSP Residual norm 1.502069549475e-04 417 KSP Residual norm 1.474600334036e-04 418 KSP Residual norm 1.470811856273e-04 419 KSP Residual norm 1.468503446550e-04 420 KSP Residual norm 2.763803512664e-04 421 KSP Residual norm 2.602057695388e-04 422 KSP Residual norm 2.183342925626e-04 423 KSP Residual norm 2.173993739737e-04 424 KSP Residual norm 1.911804707591e-04 425 KSP Residual norm 1.909057097612e-04 426 KSP Residual norm 1.799387525085e-04 427 KSP Residual norm 1.776177027182e-04 428 KSP Residual norm 1.742907904950e-04 429 KSP Residual norm 1.663678496699e-04 430 KSP Residual norm 1.649483630841e-04 431 KSP Residual norm 1.574852643960e-04 432 KSP Residual norm 1.572807748854e-04 433 KSP Residual norm 1.548887544042e-04 434 KSP Residual norm 1.546798467860e-04 435 KSP Residual norm 1.544697005440e-04 436 KSP Residual norm 1.519993025554e-04 437 KSP Residual norm 1.519316487948e-04 438 KSP Residual norm 1.478918963793e-04 439 KSP Residual norm 1.478889661490e-04 440 KSP Residual norm 1.445882234540e-04 441 KSP Residual norm 1.444716701122e-04 442 KSP Residual norm 1.437624306066e-04 443 KSP Residual norm 1.422790872780e-04 444 KSP Residual norm 1.422764539369e-04 445 KSP Residual norm 1.393786944559e-04 446 KSP Residual norm 1.393457143473e-04 447 KSP Residual norm 1.365525969412e-04 448 KSP Residual norm 1.363095456469e-04 449 KSP Residual norm 1.354028782839e-04 450 KSP Residual norm 2.605601751211e-04 451 KSP Residual norm 2.519694861974e-04 452 KSP Residual norm 2.090311698383e-04 453 KSP Residual norm 2.085715879165e-04 454 KSP Residual norm 1.838015375310e-04 455 KSP Residual norm 1.837739523119e-04 456 KSP Residual norm 1.753463622085e-04 457 KSP Residual norm 1.712295128609e-04 458 KSP Residual norm 1.706509763269e-04 459 KSP Residual norm 1.603179612581e-04 460 KSP Residual norm 1.602910110614e-04 461 KSP Residual norm 1.515043047705e-04 462 KSP Residual norm 1.514059184851e-04 463 KSP Residual norm 1.487375893344e-04 464 KSP Residual norm 1.472465439488e-04 465 KSP Residual norm 1.471786946789e-04 466 KSP Residual norm 1.426656793695e-04 467 KSP Residual norm 1.426493614178e-04 468 KSP Residual norm 1.368739431348e-04 469 KSP Residual norm 1.368507658167e-04 470 KSP Residual norm 1.334891298516e-04 471 KSP Residual norm 1.332407426119e-04 472 KSP Residual norm 1.327738486878e-04 473 KSP Residual norm 1.307942849615e-04 474 KSP Residual norm 1.307351952401e-04 475 KSP Residual norm 1.272337855340e-04 476 KSP Residual norm 1.270264025130e-04 477 KSP Residual norm 1.242721163254e-04 478 KSP Residual norm 1.239396247568e-04 479 KSP Residual norm 1.231277348178e-04 480 KSP Residual norm 2.661551792497e-04 481 KSP Residual norm 
2.535113528327e-04 482 KSP Residual norm 2.047238652433e-04 483 KSP Residual norm 2.030655099375e-04 484 KSP Residual norm 1.747312421388e-04 485 KSP Residual norm 1.745992877616e-04 486 KSP Residual norm 1.656341028631e-04 487 KSP Residual norm 1.637510460120e-04 488 KSP Residual norm 1.614472737567e-04 489 KSP Residual norm 1.528805694879e-04 490 KSP Residual norm 1.528597718238e-04 491 KSP Residual norm 1.433703303374e-04 492 KSP Residual norm 1.433634532650e-04 493 KSP Residual norm 1.402208125363e-04 494 KSP Residual norm 1.396024825606e-04 495 KSP Residual norm 1.395185211705e-04 496 KSP Residual norm 1.368721239565e-04 497 KSP Residual norm 1.368721047917e-04 498 KSP Residual norm 1.313099701242e-04 499 KSP Residual norm 1.312558057896e-04 500 KSP Residual norm 1.278276826286e-04 501 KSP Residual norm 1.276445215817e-04 502 KSP Residual norm 1.269142176155e-04 503 KSP Residual norm 1.257668921397e-04 504 KSP Residual norm 1.257526015128e-04 505 KSP Residual norm 1.223245237264e-04 506 KSP Residual norm 1.221042366836e-04 507 KSP Residual norm 1.191294573606e-04 508 KSP Residual norm 1.186779609848e-04 509 KSP Residual norm 1.179260379388e-04 510 KSP Residual norm 2.510412421241e-04 511 KSP Residual norm 2.296629012501e-04 512 KSP Residual norm 1.891291364587e-04 513 KSP Residual norm 1.852147919892e-04 514 KSP Residual norm 1.677771665780e-04 515 KSP Residual norm 1.672450311159e-04 516 KSP Residual norm 1.591298996024e-04 517 KSP Residual norm 1.573813490451e-04 518 KSP Residual norm 1.537945683810e-04 519 KSP Residual norm 1.453012826230e-04 520 KSP Residual norm 1.446639040447e-04 521 KSP Residual norm 1.370962330437e-04 522 KSP Residual norm 1.370622043459e-04 523 KSP Residual norm 1.350810905802e-04 524 KSP Residual norm 1.344406903453e-04 525 KSP Residual norm 1.340256203640e-04 526 KSP Residual norm 1.295916728961e-04 527 KSP Residual norm 1.295055652264e-04 528 KSP Residual norm 1.239274488400e-04 529 KSP Residual norm 1.238972505288e-04 530 KSP Residual norm 1.201152286403e-04 531 KSP Residual norm 1.200306481147e-04 532 KSP Residual norm 1.191679467888e-04 533 KSP Residual norm 1.177359365688e-04 534 KSP Residual norm 1.176440079687e-04 535 KSP Residual norm 1.141371623941e-04 536 KSP Residual norm 1.141353394103e-04 537 KSP Residual norm 1.111902304574e-04 538 KSP Residual norm 1.111435026607e-04 539 KSP Residual norm 1.104373518780e-04 540 KSP Residual norm 2.532065970323e-04 541 KSP Residual norm 2.328558435699e-04 542 KSP Residual norm 1.856863400432e-04 543 KSP Residual norm 1.824052503129e-04 544 KSP Residual norm 1.632727866436e-04 545 KSP Residual norm 1.623079398737e-04 546 KSP Residual norm 1.526844888068e-04 547 KSP Residual norm 1.511265397186e-04 548 KSP Residual norm 1.477189110182e-04 549 KSP Residual norm 1.383359375652e-04 550 KSP Residual norm 1.375702414606e-04 551 KSP Residual norm 1.291115277686e-04 552 KSP Residual norm 1.289805911758e-04 553 KSP Residual norm 1.264565638309e-04 554 KSP Residual norm 1.261764820334e-04 555 KSP Residual norm 1.259971989377e-04 556 KSP Residual norm 1.224124087888e-04 557 KSP Residual norm 1.224106346986e-04 558 KSP Residual norm 1.171149088916e-04 559 KSP Residual norm 1.171056991973e-04 560 KSP Residual norm 1.135465444557e-04 561 KSP Residual norm 1.133273524148e-04 562 KSP Residual norm 1.126341467351e-04 563 KSP Residual norm 1.110537515701e-04 564 KSP Residual norm 1.110037552185e-04 565 KSP Residual norm 1.078890765170e-04 566 KSP Residual norm 1.078787122262e-04 567 KSP Residual norm 1.055011245599e-04 568 KSP 
Residual norm 1.054883845546e-04 569 KSP Residual norm 1.049715929970e-04 570 KSP Residual norm 2.593584803792e-04 571 KSP Residual norm 2.455814515878e-04 572 KSP Residual norm 1.938992453788e-04 573 KSP Residual norm 1.922723159534e-04 574 KSP Residual norm 1.638561052430e-04 575 KSP Residual norm 1.638003809937e-04 576 KSP Residual norm 1.524110974008e-04 577 KSP Residual norm 1.506343440379e-04 578 KSP Residual norm 1.478423845281e-04 579 KSP Residual norm 1.385076232286e-04 580 KSP Residual norm 1.382879555696e-04 581 KSP Residual norm 1.288719967621e-04 582 KSP Residual norm 1.288691781733e-04 583 KSP Residual norm 1.255148505764e-04 584 KSP Residual norm 1.252964998236e-04 585 KSP Residual norm 1.249334077999e-04 586 KSP Residual norm 1.226044863792e-04 587 KSP Residual norm 1.226037028980e-04 588 KSP Residual norm 1.178394232303e-04 589 KSP Residual norm 1.178390359455e-04 590 KSP Residual norm 1.140645774874e-04 591 KSP Residual norm 1.139549138636e-04 592 KSP Residual norm 1.131882577208e-04 593 KSP Residual norm 1.119290884708e-04 594 KSP Residual norm 1.119288058877e-04 595 KSP Residual norm 1.090328205509e-04 596 KSP Residual norm 1.087834210599e-04 597 KSP Residual norm 1.059285536635e-04 598 KSP Residual norm 1.056123757512e-04 and it is still running. I have set for this run ksp_rtol to 1.0e-7. Any ideas? Costas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Apr 4 06:26:17 2011 From: jed at 59A2.org (Jed Brown) Date: Mon, 4 Apr 2011 13:26:17 +0200 Subject: [petsc-users] help with snes In-Reply-To: <4D998DB1.7000601@lycos.com> References: <4D995E2C.8060106@lycos.com> <4D998DB1.7000601@lycos.com> Message-ID: On Mon, Apr 4, 2011 at 11:21, Kontsantinos Kontzialis wrote: > I am using a Discontinuous Galerkin method for the Euler equations of gas > dynamics. I noticed that snes iterations do not drop the function norm and i > need to set ksp_rtol to very low values in order to get converged solution. > But this takes time. I use a matrix-free method with coloring for computing > the jacobian as a preconditioner. I don't see SNES convergence here because it's still in the linear solve. The restart is apparently too small for use with this preconditioner because you are losing a lot of ground in each restart. For reference, how does -pc_type asm -sub_pc_type lu work? For globalization at moderate to high Mach numbers, I would recommend grid sequencing if possible, otherwise you may be forced to take smaller time steps. For the linear solve, especially at low Mach number, you can precondition using the Schur complement of momentum applied in the pressure space (this contains the fast acoustic waves, see http://epubs.siam.org/sisc/resource/1/sjoce3/v32/i6/p3394_s1 for several examples defining the operator for use with semi-implicit integration). An alternative is to build a custom multigrid, see e.g. http://aero-comlab.stanford.edu/Papers/jameson.aiaa.01-2673.pdf. Both of these options require some work on your part. You should recognize that scalable solvers for implicit Euler at large time steps is still a hard enough problem that "black box" approaches will not give the best performance. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ckontzialis at lycos.com Mon Apr 4 07:15:28 2011 From: ckontzialis at lycos.com (Kontsantinos Kontzialis) Date: Mon, 04 Apr 2011 15:15:28 +0300 Subject: [petsc-users] help with snes In-Reply-To: References: <4D995E2C.8060106@lycos.com> <4D998DB1.7000601@lycos.com> Message-ID: <4D99B660.6000407@lycos.com> On 04/04/2011 02:26 PM, Jed Brown wrote: > -pc_type asm -sub_pc_type lu Jed, I do a run with -pc_type asm -sub_pc_type lu and higher rate of gmres restart. I work on unstructured meshes. I'll let you know as soon as possible. Costas From gaurish108 at gmail.com Mon Apr 4 13:41:51 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Mon, 4 Apr 2011 14:41:51 -0400 Subject: [petsc-users] compile errors with petsc-dev Message-ID: Hi, I have installed two versions of PETSc on my computer, the debug version petsc-3.1.p7 and petsc-dev. However, when I try to compile my codes with petsc-dev(which were successfully compiled and executed under the first version) I get lots of compile errors. Why could this be happening? I have pasted the error message below. gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR bash: /home/gaurish108/software/petsc-dev: is a directory gaurish108 at telang:~/Desktop/LSQR_progress$ make main /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -o main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -I/home/gaurish108/software/petsc-dev/include -I/home/gaurish108/software/petsc-dev/linux-gnu-c/include -D__INSDIR__= main.c main.c: In function ?main?: main.c:17: error: ?PetscTruth? undeclared (first use in this function) main.c:17: error: (Each undeclared identifier is reported only once main.c:17: error: for each function it appears in.) main.c:17: error: expected ?;? before ?flg_b? main.c:26: error: ?flg_b? undeclared (first use in this function) main.c:27: error: macro "SETERRQ" requires 3 arguments, but only 2 given main.c:27: error: ?SETERRQ? undeclared (first use in this function) main.c:30: warning: passing argument 1 of ?VecLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected ?Vec? but argument is of type ?PetscViewer? main.c:30: warning: passing argument 2 of ?VecLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected ?PetscViewer? but argument is of type ?const char *? main.c:30: error: too many arguments to function ?VecLoad? main.c:37: error: ?flg_A? undeclared (first use in this function) main.c:38: error: macro "SETERRQ" requires 3 arguments, but only 2 given main.c:41: warning: passing argument 1 of ?MatLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected ?Mat? but argument is of type ?PetscViewer? main.c:41: warning: passing argument 2 of ?MatLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected ?PetscViewer? but argument is of type ?const char *? main.c:41: error: too many arguments to function ?MatLoad? 
make: [main.o] Error 1 (ignored) /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -o main main.o -L/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lpetsc -lX11 -Wl,-rpath,/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lflapack -lfblas -lm -L/usr/lib/gcc/i686-linux-gnu/4.4.5 -L/usr/lib/i686-linux-gnu -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -lmpichf90 -lgfortran -lm -lm -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl gcc: main.o: No such file or directory make: [main] Error 1 (ignored) /bin/rm -f main.o gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR bash: /home/gaurish108/software/petsc-dev: is a directory gaurish108 at telang:~/Desktop/LSQR_progress$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Apr 4 13:46:08 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Apr 2011 13:46:08 -0500 Subject: [petsc-users] compile errors with petsc-dev In-Reply-To: References: Message-ID: On Mon, Apr 4, 2011 at 1:41 PM, Gaurish Telang wrote: > Hi, > > I have installed two versions of PETSc on my computer, the debug version > petsc-3.1.p7 and petsc-dev. > > However, when I try to compile my codes with petsc-dev(which were > successfully compiled and executed under the first version) > I get lots of compile errors. > > Why could this be happening? I have pasted the error message below. > http://www.mcs.anl.gov/petsc/petsc-as/documentation/changes/dev.html The first item. Matt > gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR > bash: /home/gaurish108/software/petsc-dev: is a directory > > gaurish108 at telang:~/Desktop/LSQR_progress$ make main > /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -o main.o -c > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 > -I/home/gaurish108/software/petsc-dev/include > -I/home/gaurish108/software/petsc-dev/linux-gnu-c/include -D__INSDIR__= > main.c > main.c: In function ?main?: > main.c:17: error: ?PetscTruth? undeclared (first use in this function) > main.c:17: error: (Each undeclared identifier is reported only once > main.c:17: error: for each function it appears in.) > main.c:17: error: expected ?;? before ?flg_b? > main.c:26: error: ?flg_b? undeclared (first use in this function) > main.c:27: error: macro "SETERRQ" requires 3 arguments, but only 2 given > main.c:27: error: ?SETERRQ? undeclared (first use in this function) > main.c:30: warning: passing argument 1 of ?VecLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected > ?Vec? but argument is of type ?PetscViewer? > main.c:30: warning: passing argument 2 of ?VecLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected > ?PetscViewer? but argument is of type ?const char *? > main.c:30: error: too many arguments to function ?VecLoad? > main.c:37: error: ?flg_A? undeclared (first use in this function) > main.c:38: error: macro "SETERRQ" requires 3 arguments, but only 2 given > main.c:41: warning: passing argument 1 of ?MatLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected > ?Mat? but argument is of type ?PetscViewer? > main.c:41: warning: passing argument 2 of ?MatLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected > ?PetscViewer? 
but argument is of type ?const char *? > main.c:41: error: too many arguments to function ?MatLoad? > make: [main.o] Error 1 (ignored) > /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -o main > main.o -L/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lpetsc -lX11 > -Wl,-rpath,/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lflapack > -lfblas -lm -L/usr/lib/gcc/i686-linux-gnu/4.4.5 -L/usr/lib/i686-linux-gnu > -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -lmpichf90 -lgfortran -lm > -lm -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > gcc: main.o: No such file or directory > make: [main] Error 1 (ignored) > /bin/rm -f main.o > gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR > bash: /home/gaurish108/software/petsc-dev: is a directory > gaurish108 at telang:~/Desktop/LSQR_progress$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at mcs.anl.gov Mon Apr 4 13:46:43 2011 From: sean at mcs.anl.gov (Sean Farley) Date: Mon, 4 Apr 2011 13:46:43 -0500 Subject: [petsc-users] compile errors with petsc-dev In-Reply-To: References: Message-ID: > > gaurish108 at telang:~/Desktop/LSQR_progress$ make main > /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -o main.o -c > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 > -I/home/gaurish108/software/petsc-dev/include > -I/home/gaurish108/software/petsc-dev/linux-gnu-c/include -D__INSDIR__= > main.c > main.c: In function ?main?: > main.c:17: error: ?PetscTruth? undeclared (first use in this function) > main.c:17: error: (Each undeclared identifier is reported only once > main.c:17: error: for each function it appears in.) > main.c:17: error: expected ?;? before ?flg_b? > main.c:26: error: ?flg_b? undeclared (first use in this function) > This error is because in petsc-dev PetscTruth changed to PetscBool: http://www.mcs.anl.gov/petsc/petsc-as/documentation/changes/dev.html "Changed PetscTruth to PetscBool, PETSC_TRUTH to PETSC_BOOL, PetscOptionsTruth to PetscOptionsBool, etc." main.c:27: error: macro "SETERRQ" requires 3 arguments, but only 2 given > main.c:27: error: ?SETERRQ? undeclared (first use in this function) Also, SETERRQX changed: "PetscError() and SETERRQX() now take a MPI_Comm as the first argument to indicate where the error is known. If you don't know what communicator use then pass in PETSC_COMM_SELF" Hope that helps, Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaurish108 at gmail.com Mon Apr 4 16:30:27 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Mon, 4 Apr 2011 17:30:27 -0400 Subject: [petsc-users] square root function In-Reply-To: References: Message-ID: Hi, Is there a squareroot function implemented in PETSc which can be applied to a PetscScalar type? I am not sure if the sqrt function of the C math library will work on this datatype. Regards, Gaurish -------------- next part -------------- An HTML attachment was scrubbed... 
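A minimal sketch of the macro Barry points to in the next message; it is available once any PETSc header has been included:

    PetscScalar a = 2.0, r;
    r = PetscSqrtScalar(a);   /* expands to the sqrt variant matching the configured scalar type */

Unlike sqrt() from math.h, PetscSqrtScalar() also does the right thing when PETSc is configured with complex scalars.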
URL: From bsmith at mcs.anl.gov Mon Apr 4 16:42:57 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 16:42:57 -0500 Subject: [petsc-users] square root function In-Reply-To: References: Message-ID: <098AE37E-C0B6-4F66-81D9-C292BFF0AEC7@mcs.anl.gov> PetscSqrtScalar() macro automatically becomes the correct thing. Barry On Apr 4, 2011, at 4:30 PM, Gaurish Telang wrote: > Hi, > > Is there a squareroot function implemented in PETSc which can be applied to a PetscScalar type? I am not sure if the sqrt function of the C math library will work on this datatype. > > Regards, > > Gaurish > From gaurish108 at gmail.com Mon Apr 4 21:54:42 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Mon, 4 Apr 2011 22:54:42 -0400 Subject: [petsc-users] Modification of a routine Message-ID: Hi, I have tried to implement a recent least squares algorithm called LSMR, by making modifications in the file src/ksp/ksp/impls/lsqr/lsqr.c Is it necessary to make any changes in other PETSc files or build the PETSc library again? I solved a simple least squares problem by supplying the flags -ksp_type lsqr -pc_type none and the problem seems to get solved correctly. However, I had placed a couple of PetscPrintf statements inside the main do-while loop of the algorithm in lsqr.c but those statements are not getting printed to standard output. Thanks, Gaurish. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Mon Apr 4 22:07:32 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 11:07:32 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior Message-ID: <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Hi, I use slepc eigen value solver to evaluate the eigen values of jacobian matrix on each nonlinear iteration. First, I create the EPS structure and set the operator to jacobian matrix. And then call the EPSSolve in the SNES build jacobian matrix functuion. This method create EPS once, call EPSSolve many times. However, it seems EPSSolve only work at first time, and result eigen value never changes (however, relative error becomes larger and larger) in the following solve procedure. Then I create EPS each time in the SNES build jacobian matrix functuion, do EPSSolve and delete EPS at the end of function. This method gives eigen value for the jacobian matrix with small relative error ~1e-8. Of course, create and destroy the EPS solver each time is not efficient. Does something get wrong in the first method? 
The code I used is attached here // create the EPS solver for smallest and largest eigen value EPS eps_s; EPS eps_l; FVM_NonlinearSolver & nonlinear_solver = dynamic_cast(_solver); Mat & J = nonlinear_solver.jacobian_matrix(); // create eigen value problem solver EPSCreate(PETSC_COMM_WORLD, &eps_s); EPSCreate(PETSC_COMM_WORLD, &eps_l); // Set operator EPSSetOperators(eps_s, J, PETSC_NULL); EPSSetOperators(eps_l, J, PETSC_NULL); // calculate smallest and largest eigen value EPSSetWhichEigenpairs(eps_s, EPS_SMALLEST_MAGNITUDE); EPSSetWhichEigenpairs(eps_l, EPS_LARGEST_MAGNITUDE); // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero ST st_s; EPSGetST(eps_s, &st_s); STSetType(st_s, STSINVERT); // Set solver parameters at runtime EPSSetFromOptions(eps_s); EPSSetFromOptions(eps_l); /////////////////////////////////////////////////////////////////////////////// // this part is called after jacobian matrix assemly PetscScalar kr_s, ki_s; PetscScalar kr_l, ki_l; PetscReal error_s; PetscReal error_l; PetscInt nconv_s; PetscInt nconv_l; // get the smallest eigen value EPSSolve( eps_s ); EPSGetConverged( eps_s, &nconv_s ); if( nconv_s > 0 ) { EPSGetEigenvalue( eps_s, 0, &kr_s, &ki_s ); EPSComputeRelativeError( eps_s, 0, &error_s ); } // get the largest eigen value EPSSolve( eps_l ); EPSGetConverged( eps_l, &nconv_l ); if( nconv_l > 0 ) { EPSGetEigenvalue( eps_l, 0, &kr_l, &ki_l ); EPSComputeRelativeError( eps_l, 0, &error_l ); } From gdiso at ustc.edu Mon Apr 4 22:16:16 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 11:16:16 +0800 (CST) Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix Message-ID: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> Hi, I'd like to scaling the jacobian matrix as if the condition number can be improved. That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. Does SNES already exist some interface to do this? From bsmith at mcs.anl.gov Mon Apr 4 22:23:23 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 22:23:23 -0500 Subject: [petsc-users] Modification of a routine In-Reply-To: References: Message-ID: <39419F71-C854-4023-A93F-D9F60B7E0A5E@mcs.anl.gov> Use the debugger and step through the code to verify it is getting to those lines. Barry On Apr 4, 2011, at 9:54 PM, Gaurish Telang wrote: > Hi, > > I have tried to implement a recent least squares algorithm called LSMR, by making modifications in the file src/ksp/ksp/impls/lsqr/lsqr.c > > Is it necessary to make any changes in other PETSc files or build the PETSc library again? > > I solved a simple least squares problem by supplying the flags -ksp_type lsqr -pc_type none and the problem seems to get solved correctly. > > However, I had placed a couple of PetscPrintf statements inside the main do-while loop of the algorithm in lsqr.c but those statements are not getting printed to standard output. > > Thanks, > > Gaurish. From bsmith at mcs.anl.gov Mon Apr 4 22:25:48 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 22:25:48 -0500 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> Message-ID: On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > Hi, > I'd like to scaling the jacobian matrix as if the condition number can be improved. > That is scaling J by Dl*J*Dr. 
The scaling diagonal matrix will be changed in each nonlinear iteration. > Would this be in addition to building a preconditioner from the resulting scaled matrix? Or do you just want to use a symmetric Jacobi preconditioner? Barry > Does SNES already exist some interface to do this? > > > > > From bsmith at mcs.anl.gov Mon Apr 4 22:52:47 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 22:52:47 -0500 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> Message-ID: <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> If you are looking for something like this: When solving F(x) = 0, I would like to be able to scale both the solution vector x and the residual function vector F, simply by specifying scaling vectors, sx and sf, say. (These vectors would be the diagonal entries of scaling matrices Dx and Df.) I realize this can be achieved, at least in part, within the user residual function. This is what I had been doing, until I looked at Denis and Schnabel (sp?), Brown and Saad, and the KINSOL user guide. It seems one has to take the scaling matrices into account when computing various norms, when applying the preconditioner, and when computing the step size, \sigma. No doubt there are other things I have missed that also need to be done. http://www.mcs.anl.gov/petsc/petsc-as/developers/projects.html we don't have support for this (nor do I understand it). Anyways it has been on the "projects to do list" for a very long time; suspect it would require a good amount of futzing around in the source code to add. Barry On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > Hi, > I'd like to scaling the jacobian matrix as if the condition number can be improved. > That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. > > Does SNES already exist some interface to do this? > > > > > From bsmith at mcs.anl.gov Mon Apr 4 23:10:20 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 23:10:20 -0500 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> Message-ID: The literature is unclear to me, but I don't think these scalings are done in this way to improve the conditioning of the matrix. They are done to change the relative importance of different entries in the vector to determine stopping conditions and search directions in Newton's method. For example, if you consider getting the first vector entry in the residual/error small more important than the other entries you would use the scaling vector like [bignumber 1 1 1 1 ....]. In some way the scaling vectors reflect working with a different norm to measure the residual. Since PETSc does not support providing these scaling vectors you can get the same effect if you define your a new function (and hence also new Jacobian) that weights the various entries the way you want based on their importance. In other words newF(x) = diagonalscaling1* oldF( diagonalscaling2 * y) then if x* is the solution to the new problem, y* = inv(diagonalscaling2*x*) is the solution to the original problem. 
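A rough sketch of this rescaling in code, assuming the two diagonal scalings are stored as Vecs; the names ScaleCtx, Dl, Dr, xwork, and oldF are hypothetical, not part of PETSc:

    #include <petscsnes.h>

    /* hypothetical context: left/right diagonal scalings, a work vector,
       and the user's original residual routine */
    typedef struct {
      Vec  Dl, Dr, xwork;
      PetscErrorCode (*oldF)(SNES,Vec,Vec,void*);
      void *userctx;
    } ScaleCtx;

    static PetscErrorCode NewF(SNES snes, Vec y, Vec f, void *ctx)
    {
      ScaleCtx       *s = (ScaleCtx*)ctx;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      ierr = VecPointwiseMult(s->xwork, s->Dr, y);CHKERRQ(ierr);        /* x = Dr*y          */
      ierr = (*s->oldF)(snes, s->xwork, f, s->userctx);CHKERRQ(ierr);   /* f = oldF(x)       */
      ierr = VecPointwiseMult(f, s->Dl, f);CHKERRQ(ierr);               /* f = Dl*oldF(Dr*y) */
      PetscFunctionReturn(0);
    }

Registering NewF with SNESSetFunction() makes SNES solve the scaled problem; the unknowns of the original problem are then recovered by multiplying the converged y pointwise by Dr.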
In some cases this transformation can correspond to working in "dimensionless coordinates" but all that language is over my head. If you just want to scale the matrix to have ones on the diagonal before forming the preconditioner (on the theory that it is better to solve problems with a "well-scaled" matrix) you can use the run time options -ksp_diagonal_scale -ksp_diagonal_scale_fix or in the code with KSPSetDiagonalScale() KSPSetDiagonalScaleFix(). Barry On Apr 4, 2011, at 10:52 PM, Barry Smith wrote: > > If you are looking for something like this: > > When solving F(x) = 0, I would like to be able to scale both the solution > vector x and the residual function vector F, simply by specifying scaling > vectors, sx and sf, say. (These vectors would be the diagonal entries of > scaling matrices Dx and Df.) > I realize this can be achieved, at least in part, within the user residual > function. > This is what I had been doing, until I looked at Denis and Schnabel (sp?), > Brown and Saad, and the KINSOL user guide. It seems one has to take the > scaling matrices into account when computing various norms, when applying the > preconditioner, and when computing the step size, \sigma. No doubt there > are other things I have missed that also need to be done. > > http://www.mcs.anl.gov/petsc/petsc-as/developers/projects.html > > we don't have support for this (nor do I understand it). Anyways it has been on the "projects to do list" for a very long time; suspect it would require a good amount of futzing around in the source code to add. > > Barry > > > On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > >> Hi, >> I'd like to scaling the jacobian matrix as if the condition number can be improved. >> That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. >> >> Does SNES already exist some interface to do this? >> >> >> >> >> > From jed at 59A2.org Tue Apr 5 01:24:37 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 5 Apr 2011 08:24:37 +0200 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> Message-ID: On Tue, Apr 5, 2011 at 06:10, Barry Smith wrote: > They are done to change the relative importance of different entries in the > vector to determine stopping conditions and search directions in Newton's > method. Note that weighting fields differently is equivalent to choosing units for the different fields. I think it is generally a good idea to make the units a runtime option anyway since it lets you check that the code is dimensionally correct. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Apr 5 04:53:17 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 5 Apr 2011 11:53:17 +0200 Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> References: <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Message-ID: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> El 05/04/2011, a las 05:07, Gong Ding escribi?: > Hi, > I use slepc eigen value solver to evaluate the eigen values of jacobian matrix on each nonlinear iteration. > > First, I create the EPS structure and set the operator to jacobian matrix. And then call the EPSSolve in the SNES build > jacobian matrix functuion. 
This method create EPS once, call EPSSolve many times. However, it seems EPSSolve only work at > first time, and result eigen value never changes (however, relative error becomes larger and larger) in the following solve procedure. > > Then I create EPS each time in the SNES build jacobian matrix functuion, do EPSSolve and delete EPS at the end of function. > This method gives eigen value for the jacobian matrix with small relative error ~1e-8. Of course, create and destroy the EPS solver each time > is not efficient. > > Does something get wrong in the first method? You will always solve the same eigenproblem, unless you call EPSSetOperators every time the matrix changes. When EPSSetOperators is called, the EPS object is reset and therefore EPSSetUp will be called in the next EPSSolve. So basically the first approach will have the same cost as the second approach. By the way, you are not using STSINVERT correctly. You should set EPS_TARGET_MAGNITUDE (instead of EPS_SMALLEST_MAGNITUDE) together with target=0.0 with EPSSetTarget. Jose > > The code I used is attached here > > // create the EPS solver for smallest and largest eigen value > EPS eps_s; > EPS eps_l; > > FVM_NonlinearSolver & nonlinear_solver = dynamic_cast(_solver); > Mat & J = nonlinear_solver.jacobian_matrix(); > > // create eigen value problem solver > EPSCreate(PETSC_COMM_WORLD, &eps_s); > EPSCreate(PETSC_COMM_WORLD, &eps_l); > // Set operator > EPSSetOperators(eps_s, J, PETSC_NULL); > EPSSetOperators(eps_l, J, PETSC_NULL); > > // calculate smallest and largest eigen value > EPSSetWhichEigenpairs(eps_s, EPS_SMALLEST_MAGNITUDE); > EPSSetWhichEigenpairs(eps_l, EPS_LARGEST_MAGNITUDE); > > // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero > ST st_s; > EPSGetST(eps_s, &st_s); > STSetType(st_s, STSINVERT); > > > // Set solver parameters at runtime > EPSSetFromOptions(eps_s); > EPSSetFromOptions(eps_l); > > > /////////////////////////////////////////////////////////////////////////////// > > // this part is called after jacobian matrix assemly > > PetscScalar kr_s, ki_s; > PetscScalar kr_l, ki_l; > PetscReal error_s; > PetscReal error_l; > PetscInt nconv_s; > PetscInt nconv_l; > > // get the smallest eigen value > EPSSolve( eps_s ); > EPSGetConverged( eps_s, &nconv_s ); > if( nconv_s > 0 ) > { > EPSGetEigenvalue( eps_s, 0, &kr_s, &ki_s ); > EPSComputeRelativeError( eps_s, 0, &error_s ); > } > > // get the largest eigen value > EPSSolve( eps_l ); > EPSGetConverged( eps_l, &nconv_l ); > if( nconv_l > 0 ) > { > EPSGetEigenvalue( eps_l, 0, &kr_l, &ki_l ); > EPSComputeRelativeError( eps_l, 0, &error_l ); > } > > From gdiso at ustc.edu Tue Apr 5 10:44:03 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 23:44:03 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Message-ID: <33536591.37741302018243123.JavaMail.coremail@mail.ustc.edu> Thank you very much. > El 05/04/2011, a las 05:07, Gong Ding escribi?: > > > > > Hi, > > > I use slepc eigen value solver to evaluate the eigen values of jacobian matrix on each nonlinear iteration. > > > > > > First, I create the EPS structure and set the operator to jacobian matrix. And then call the EPSSolve in the SNES build > > > jacobian matrix functuion. This method create EPS once, call EPSSolve many times. 
However, it seems EPSSolve only work at > > > first time, and result eigen value never changes (however, relative error becomes larger and larger) in the following solve procedure. > > > > > > Then I create EPS each time in the SNES build jacobian matrix functuion, do EPSSolve and delete EPS at the end of function. > > > This method gives eigen value for the jacobian matrix with small relative error ~1e-8. Of course, create and destroy the EPS solver each time > > > is not efficient. > > > > > > Does something get wrong in the first method? > > > > You will always solve the same eigenproblem, unless you call EPSSetOperators every time the matrix changes. > > When EPSSetOperators is called, the EPS object is reset and therefore EPSSetUp will be called in the next EPSSolve. So basically the first approach will have the same cost as the second approach. > > > > By the way, you are not using STSINVERT correctly. You should set EPS_TARGET_MAGNITUDE (instead of EPS_SMALLEST_MAGNITUDE) together with target=0.0 with EPSSetTarget. > > > > Jose > > > > > > > > The code I used is attached here > > > > > > // create the EPS solver for smallest and largest eigen value > > > EPS eps_s; > > > EPS eps_l; > > > > > > FVM_NonlinearSolver & nonlinear_solver = dynamic_cast(_solver); > > > Mat & J = nonlinear_solver.jacobian_matrix(); > > > > > > // create eigen value problem solver > > > EPSCreate(PETSC_COMM_WORLD, &eps_s); > > > EPSCreate(PETSC_COMM_WORLD, &eps_l); > > > // Set operator > > > EPSSetOperators(eps_s, J, PETSC_NULL); > > > EPSSetOperators(eps_l, J, PETSC_NULL); > > > > > > // calculate smallest and largest eigen value > > > EPSSetWhichEigenpairs(eps_s, EPS_SMALLEST_MAGNITUDE); > > > EPSSetWhichEigenpairs(eps_l, EPS_LARGEST_MAGNITUDE); > > > > > > // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero > > > ST st_s; > > > EPSGetST(eps_s, &st_s); > > > STSetType(st_s, STSINVERT); > > > > > > > > > // Set solver parameters at runtime > > > EPSSetFromOptions(eps_s); > > > EPSSetFromOptions(eps_l); > > > > > > > > > /////////////////////////////////////////////////////////////////////////////// > > > > > > // this part is called after jacobian matrix assemly > > > > > > PetscScalar kr_s, ki_s; > > > PetscScalar kr_l, ki_l; > > > PetscReal error_s; > > > PetscReal error_l; > > > PetscInt nconv_s; > > > PetscInt nconv_l; > > > > > > // get the smallest eigen value > > > EPSSolve( eps_s ); > > > EPSGetConverged( eps_s, &nconv_s ); > > > if( nconv_s > 0 ) > > > { > > > EPSGetEigenvalue( eps_s, 0, &kr_s, &ki_s ); > > > EPSComputeRelativeError( eps_s, 0, &error_s ); > > > } > > > > > > // get the largest eigen value > > > EPSSolve( eps_l ); > > > EPSGetConverged( eps_l, &nconv_l ); > > > if( nconv_l > 0 ) > > > { > > > EPSGetEigenvalue( eps_l, 0, &kr_l, &ki_l ); > > > EPSComputeRelativeError( eps_l, 0, &error_l ); > > > } > > > > > > > > > > From gdiso at ustc.edu Tue Apr 5 10:58:29 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 23:58:29 +0800 (CST) Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> Message-ID: <13664069.37751302019109451.JavaMail.coremail@mail.ustc.edu> I known the diagonal scaling. And I will try it tomorrow. Thank to slepc, I can monitor the eigen values as an approximation of condition number. 
The original problem as condition number about 1e20, which defeat any iterative solver. I hope I can reduce it as much as possible. Further more, can I use MC64, which permute and scale a sparse unsymmetric matrix to put large entries on the diagonal? > The literature is unclear to me, but I don't think these scalings are done in this way to improve the conditioning of the matrix. They are done to change the relative importance of different entries in the vector to determine stopping conditions and search directions in Newton's method. For example, if you consider getting the first vector entry in the residual/error small more important than the other entries you would use the scaling vector like [bignumber 1 1 1 1 ....]. In some way the scaling vectors reflect working with a different norm to measure the residual. Since PETSc does not support providing these scaling vectors you can get the same effect if you define your a new function (and hence also new Jacobian) that weights the various entries the way you want based on their importance. In other words newF(x) = diagonalscaling1* oldF( diagonalscaling2 * y) then if x* is the solution to the new problem, y* = inv(diagonalscaling2*x*) is the solution to the original problem. In some cases this transformation can correspond to working in "dimensionless coordinates" but all that language is over my head. > > > > > > If you just want to scale the matrix to have ones on the diagonal before forming the preconditioner (on the theory that it is better to solve problems with a "well-scaled" matrix) you can use the run time options -ksp_diagonal_scale -ksp_diagonal_scale_fix or in the code with KSPSetDiagonalScale() KSPSetDiagonalScaleFix(). > > > > Barry > > > > On Apr 4, 2011, at 10:52 PM, Barry Smith wrote: > > > > > > > > If you are looking for something like this: > > > > > > When solving F(x) = 0, I would like to be able to scale both the solution > > > vector x and the residual function vector F, simply by specifying scaling > > > vectors, sx and sf, say. (These vectors would be the diagonal entries of > > > scaling matrices Dx and Df.) > > > I realize this can be achieved, at least in part, within the user residual > > > function. > > > This is what I had been doing, until I looked at Denis and Schnabel (sp?), > > > Brown and Saad, and the KINSOL user guide. It seems one has to take the > > > scaling matrices into account when computing various norms, when applying the > > > preconditioner, and when computing the step size, \sigma. No doubt there > > > are other things I have missed that also need to be done. > > > > > > http://www.mcs.anl.gov/petsc/petsc-as/developers/projects.html > > > > > > we don't have support for this (nor do I understand it). Anyways it has been on the "projects to do list" for a very long time; suspect it would require a good amount of futzing around in the source code to add. > > > > > > Barry > > > > > > > > > On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > > > > > >> Hi, > > >> I'd like to scaling the jacobian matrix as if the condition number can be improved. > > >> That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. > > >> > > >> Does SNES already exist some interface to do this? 
> > >> > > >> > > >> > > >> > > >> > > > > > > > From u.tabak at tudelft.nl Tue Apr 5 11:14:02 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 05 Apr 2011 18:14:02 +0200 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <13664069.37751302019109451.JavaMail.coremail@mail.ustc.edu> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> <13664069.37751302019109451.JavaMail.coremail@mail.ustc.edu> Message-ID: <4D9B3FCA.1020600@tudelft.nl> On 04/05/2011 05:58 PM, Gong Ding wrote: > I known the diagonal scaling. And I will try it tomorrow. > Thank to slepc, I can monitor the eigen values as an approximation of condition number. > The original problem as condition number about 1e20, which defeat any iterative solver. > From personal experience, condition estimates larger than 1e+10 means practically singular and it is almost impossible to really decrease that to a reasonable number unless you use some specific system information which is really really difficult... And again from personal experience, diagonal scaling is the most naive scaling out there(however the first to try if you do now know sth better), and it does not bring much on these kinds of ill-conditioned systems. Trying to reformulate the problem seems like a better option to me. Experts will comment on the above propositions ;) Good luck. U. -- If I have a thousand ideas and only one turns out to be good, I am satisfied. Alfred Nobel From zhaonanavril at gmail.com Tue Apr 5 11:40:36 2011 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Tue, 5 Apr 2011 10:40:36 -0600 Subject: [petsc-users] sequential version of petsc (no mpi) Message-ID: Dear all, I need to build a no mpi version of petsc for some reason. I use the option --with-mpi=0. The build seems to be successful. But when I compile my code with petsc, then it has some errors related to undefined reference to MPI_ABORT, MPI_NUITMP... I just tired to use KSP solver. Is anyone have some suggestions? Thanks, Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Apr 5 11:45:12 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 5 Apr 2011 11:45:12 -0500 (CDT) Subject: [petsc-users] sequential version of petsc (no mpi) In-Reply-To: References: Message-ID: send all build logs [configure.log,make.log,test.log] to petsc-maint at mcs.anl.gov satish On Tue, 5 Apr 2011, NAN ZHAO wrote: > Dear all, > > I need to build a no mpi version of petsc for some reason. I use the option > --with-mpi=0. The build seems to be successful. But when I compile my code > with petsc, then it has some errors related to undefined reference to > MPI_ABORT, MPI_NUITMP... > I just tired to use KSP solver. Is anyone have some suggestions? > > Thanks, > Nan > From panourg at mech.upatras.gr Tue Apr 5 15:17:40 2011 From: panourg at mech.upatras.gr (panourg at mech.upatras.gr) Date: Tue, 5 Apr 2011 23:17:40 +0300 (EEST) Subject: [petsc-users] sequential version of petsc (no mpi) In-Reply-To: References: Message-ID: <2324.94.64.236.148.1302034660.squirrel@mail.mech.upatras.gr> You can run petsc in one process regardeless of mpi setup in your pc. I believe that some routines or other packages of petsc need mpi and therefore you take these errors. Do setup of mpi and make your code as before. K.P > Dear all, > > I need to build a no mpi version of petsc for some reason. I use the > option > --with-mpi=0. 
The build seems to be successful. But when I compile my code > with petsc, then it has some errors related to undefined reference to > MPI_ABORT, MPI_NUITMP... > I just tired to use KSP solver. Is anyone have some suggestions? > > Thanks, > Nan > From vyan2000 at gmail.com Tue Apr 5 21:52:53 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 22:52:53 -0400 Subject: [petsc-users] about pclu Message-ID: Hi, I am wondering is there a way of checking the residual of a direct solver. It should be one shot and very small. I tried -ksp_monitor_true_residual, but no thing shows up. I guess a piece of code $Ax-b$ will do the trick? Thanks, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Apr 5 21:59:23 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Apr 2011 21:59:23 -0500 Subject: [petsc-users] about pclu In-Reply-To: References: Message-ID: On Tue, Apr 5, 2011 at 9:52 PM, Ryan Yan wrote: > Hi, > I am wondering is there a way of checking the residual of a direct solver. > It should > be one shot and very small. I tried -ksp_monitor_true_residual, but no > thing shows up. I guess a piece of code > $Ax-b$ will do the trick? > If you use it through KSPSolve, then -ksp_monitor will give you the residual. Matt > Thanks, > > Yan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Apr 5 22:06:41 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Apr 2011 22:06:41 -0500 Subject: [petsc-users] about pclu In-Reply-To: References: Message-ID: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: > Hi, > I am wondering is there a way of checking the residual of a direct solver. It should > be one shot and very small. I tried -ksp_monitor_true_residual, but no thing shows up. I guess a piece of code > $Ax-b$ will do the trick? The reason that the monitor doesn't display anything is not the direct solver but because you are using LU with KSPType of KSPPREONLY if you run with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual will print the residual as you want. Barry > > Thanks, > > Yan From vyan2000 at gmail.com Tue Apr 5 22:16:05 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 23:16:05 -0400 Subject: [petsc-users] about pclu In-Reply-To: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: Dear Barry and Matt, Thanks for the help, Indeed, the monitor starts to work with "richardson". Yan On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: > > On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: > > > Hi, > > I am wondering is there a way of checking the residual of a direct > solver. It should > > be one shot and very small. I tried -ksp_monitor_true_residual, but no > thing shows up. I guess a piece of code > > $Ax-b$ will do the trick? > > The reason that the monitor doesn't display anything is not the direct > solver but because you are using LU with KSPType of KSPPREONLY if you run > with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual > will print the residual as you want. > > Barry > > > > > Thanks, > > > > Yan > > -------------- next part -------------- An HTML attachment was scrubbed... 
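For completeness, a small sketch of the explicit check Yan mentions above (forming b - A*x after the solve); ksp, A, b, and x are assumed to exist already:

    Vec       r;
    PetscReal rnorm;

    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
    ierr = MatMult(A, x, r);CHKERRQ(ierr);       /* r = A*x     */
    ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);    /* r = b - A*x */
    ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "||b - A x|| = %g\n", (double)rnorm);CHKERRQ(ierr);
    ierr = VecDestroy(r);CHKERRQ(ierr);          /* petsc-3.1 calling sequence */

For an LU solve this norm should be tiny (roughly machine precision amplified by the conditioning of A), which is the same number -ksp_monitor_true_residual reports once richardson or gmres is used as the wrapper.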
URL: From vyan2000 at gmail.com Tue Apr 5 22:20:26 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 23:20:26 -0400 Subject: [petsc-users] about pclu In-Reply-To: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: One more ask, :-) Which one is more efficient, richardson, preonly or no difference, if I am going to use direct solver for many times steps. Thanks, Yan On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: > > On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: > > > Hi, > > I am wondering is there a way of checking the residual of a direct > solver. It should > > be one shot and very small. I tried -ksp_monitor_true_residual, but no > thing shows up. I guess a piece of code > > $Ax-b$ will do the trick? > > The reason that the monitor doesn't display anything is not the direct > solver but because you are using LU with KSPType of KSPPREONLY if you run > with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual > will print the residual as you want. > > Barry > > > > > Thanks, > > > > Yan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Tue Apr 5 22:21:21 2011 From: gdiso at ustc.edu (Gong Ding) Date: Wed, 6 Apr 2011 11:21:21 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Message-ID: <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Dear Jose, will you please also take a look at the SVD code for smallest singular value? It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. However, I wonder if there exist some more clever method. 
// SVD solver for smallest singular value SVD svd_s; EPS eps_s; ST st_s; KSP ksp_s; PC pc_s; PetscErrorCode ierr; // Create singular value solver context ierr = SVDCreate(PETSC_COMM_WORLD, &svd_s); // Set operator ierr = SVDSetOperator(svd_s, J); // small singular value use eigen value solver on Cyclic Matrix ierr = SVDSetWhichSingularTriplets(svd_s, SVD_SMALLEST); ierr = SVDSetType(svd_s, SVDCYCLIC); ierr = SVDCyclicSetExplicitMatrix(svd_s, PETSC_TRUE); // <-----time consuming // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero ierr = SVDCyclicGetEPS(svd_s, &eps_s); ierr = EPSSetType(eps_s, EPSKRYLOVSCHUR); ierr = EPSGetST(eps_s, &st_s); ierr = STSetType(st_s, STSINVERT); ierr = STGetKSP(st_s, &ksp_s); ierr = KSPGetPC(ksp_s, &pc_s); // since we have to deal with bad conditioned problem, we choose direct solver whenever possible // direct solver as preconditioner ierr = KSPSetType (ksp_s, (char*) KSPGMRES); assert(!ierr); // superlu which use static pivot seems very stable ierr = PCSetType (pc_s, (char*) PCLU); assert(!ierr); ierr = PCFactorSetMatSolverPackage (pc_s, "superlu"); assert(!ierr); // Set solver parameters at runtime ierr = SVDSetFromOptions(svd_s); assert(!ierr); ierr = SVDSetUp(svd_s); assert(!ierr); PetscReal sigma_large=1, sigma_small=1; PetscInt nconv; PetscReal error; // find the smallest singular value SVDSolve(svd_s); From knepley at gmail.com Tue Apr 5 22:27:03 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Apr 2011 22:27:03 -0500 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Tue, Apr 5, 2011 at 10:20 PM, Ryan Yan wrote: > One more ask, :-) > > Which one is more efficient, richardson, preonly or no difference, if I am > going to use direct solver for many times steps. > There should be no difference since direct solves take so long. Matt > Thanks, > > Yan > > On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: > >> >> On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: >> >> > Hi, >> > I am wondering is there a way of checking the residual of a direct >> solver. It should >> > be one shot and very small. I tried -ksp_monitor_true_residual, but no >> thing shows up. I guess a piece of code >> > $Ax-b$ will do the trick? >> >> The reason that the monitor doesn't display anything is not the direct >> solver but because you are using LU with KSPType of KSPPREONLY if you run >> with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual >> will print the residual as you want. >> >> Barry >> >> > >> > Thanks, >> > >> > Yan >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Tue Apr 5 22:34:05 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 23:34:05 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: I see. Thanks, Matt. Yan On Tue, Apr 5, 2011 at 11:27 PM, Matthew Knepley wrote: > On Tue, Apr 5, 2011 at 10:20 PM, Ryan Yan wrote: > >> One more ask, :-) >> >> Which one is more efficient, richardson, preonly or no difference, if I am >> going to use direct solver for many times steps. >> > > There should be no difference since direct solves take so long. 
> > Matt > > >> Thanks, >> >> Yan >> >> On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: >> >>> >>> On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: >>> >>> > Hi, >>> > I am wondering is there a way of checking the residual of a direct >>> solver. It should >>> > be one shot and very small. I tried -ksp_monitor_true_residual, but no >>> thing shows up. I guess a piece of code >>> > $Ax-b$ will do the trick? >>> >>> The reason that the monitor doesn't display anything is not the direct >>> solver but because you are using LU with KSPType of KSPPREONLY if you run >>> with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual >>> will print the residual as you want. >>> >>> Barry >>> >>> > >>> > Thanks, >>> > >>> > Yan >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 6 00:43:51 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 6 Apr 2011 07:43:51 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Wed, Apr 6, 2011 at 05:27, Matthew Knepley wrote: > There should be no difference since direct solves take so long. That is, solves are very fast compared to factorization. If you just want to check the residual, Richardson is cheaper than GMRES because it will require one fewer preconditioner/matrix applications. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Wed Apr 6 09:24:50 2011 From: gdiso at ustc.edu (Gong Ding) Date: Wed, 6 Apr 2011 22:24:50 +0800 (CST) Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? Message-ID: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> Hi, Can some one gives me advise on how to solve the ill conditioned problem efficiently with iterative method (since the problem size is big). I calculated the smallest eigen values as well as the largest eigen values. There exist one extremely small eigen value, which made the system ill conditioned. I guess method such as Tikhonov regularization may work? Or there are some cheaper method works, if I can endure some inaccuracy in the solution. Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 Smallest 1 eigen value: -2.480170e-04 with error 6.150216e-04 Smallest 2 eigen value: -2.787193e-04 with error 2.808614e-04 Smallest 3 eigen value: -2.825241e-04 with error 6.620491e-04 Smallest 4 eigen value: -2.825241e-04 with error 6.620491e-04 Smallest 5 eigen value: -2.833565e-04 with error 2.990142e-04 Smallest 6 eigen value: -3.020135e-04 with error 6.313397e-04 Smallest 7 eigen value: -3.020149e-04 with error 4.939515e-04 Smallest 8 eigen value: -3.083228e-04 with error 1.114806e-03 Largest 0 eigen value: -4.076308e+03 with error 2.403326e-08 Largest 1 eigen value: -3.894209e+03 with error 6.314489e-08 Largest 2 eigen value: -3.893185e+03 with error 3.924167e-08 Largest 3 eigen value: -3.855228e+03 with error 3.504644e-09 Largest 4 eigen value: -3.739288e+03 with error 1.689236e-08 Thanks. From jed at 59A2.org Wed Apr 6 09:32:34 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 06 Apr 2011 17:32:34 +0300 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? 
In-Reply-To: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> Message-ID: <87k4f730xp.fsf@59A2.org> On Wed, 6 Apr 2011 22:24:50 +0800 (CST), "Gong Ding" wrote: > Hi, > Can some one gives me advise on how to solve the ill conditioned problem > efficiently with iterative method (since the problem size is big). > > I calculated the smallest eigen values as well as the largest eigen values. > There exist one extremely small eigen value, which made the system ill conditioned. > I guess method such as Tikhonov regularization may work? > Or there are some cheaper method works, if I can endure some inaccuracy in the solution. > > > Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 Your problem has a null space of dimension 1. Determine the eigenvector associated with this eigenvalue. That is the null space, it might just be a constant. Create a MatNullSpace and use KSPSetNullSpace(). (If it is the constant, you can just use -ksp_constant_null_space.) See the section in the users manual on solving singular systems. From u.tabak at tudelft.nl Wed Apr 6 09:37:26 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Wed, 06 Apr 2011 16:37:26 +0200 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <87k4f730xp.fsf@59A2.org> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> Message-ID: <4D9C7AA6.301@tudelft.nl> On 04/06/2011 04:32 PM, Jed Brown wrote: > On Wed, 6 Apr 2011 22:24:50 +0800 (CST), "Gong Ding" wrote: > >> Hi, >> Can some one gives me advise on how to solve the ill conditioned problem >> efficiently with iterative method (since the problem size is big). >> >> I calculated the smallest eigen values as well as the largest eigen values. >> There exist one extremely small eigen value, which made the system ill conditioned. >> I guess method such as Tikhonov regularization may work? >> Or there are some cheaper method works, if I can endure some inaccuracy in the solution. >> >> >> Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 >> > Your problem has a null space of dimension 1. Determine the eigenvector associated with this eigenvalue. That is the null space, it might just be a constant. Create a MatNullSpace and use KSPSetNullSpace(). (If it is the constant, you can just use -ksp_constant_null_space.) See the section in the users manual on solving singular systems. > Just curious, are not the other negative eigenvalues problematic as well? From bsmith at mcs.anl.gov Wed Apr 6 09:48:01 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Apr 2011 09:48:01 -0500 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <4D9C7AA6.301@tudelft.nl> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> Message-ID: On Apr 6, 2011, at 9:37 AM, Umut Tabak wrote: > On 04/06/2011 04:32 PM, Jed Brown wrote: >> On Wed, 6 Apr 2011 22:24:50 +0800 (CST), "Gong Ding" wrote: >> >>> Hi, >>> Can some one gives me advise on how to solve the ill conditioned problem >>> efficiently with iterative method (since the problem size is big). >>> >>> I calculated the smallest eigen values as well as the largest eigen values. >>> There exist one extremely small eigen value, which made the system ill conditioned. >>> I guess method such as Tikhonov regularization may work? 
>>> Or there are some cheaper method works, if I can endure some inaccuracy in the solution. >>> >>> >>> Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 >>> >> Your problem has a null space of dimension 1. Determine the eigenvector associated with this eigenvalue. That is the null space, it might just be a constant. Create a MatNullSpace and use KSPSetNullSpace(). (If it is the constant, you can just use -ksp_constant_null_space.) See the section in the users manual on solving singular systems. >> > Just curious, are not the other negative eigenvalues problematic as well? They are not nice that they are not necessarily the end of the world (like the functionally zero one). After removing the functionally zero eigenvalue the condition number of the matrix is around 10^7 which is very large but within the realm of solvable. With that functionally zero one the problem is simply not solvable. Barry > From jed at 59A2.org Wed Apr 6 09:50:17 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 06 Apr 2011 17:50:17 +0300 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <4D9C7AA6.301@tudelft.nl> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> Message-ID: <87fwpv3046.fsf@59A2.org> On Wed, 06 Apr 2011 16:37:26 +0200, Umut Tabak wrote: > Just curious, are not the other negative eigenvalues problematic > as well? Negative eigenvalues do not pose any particular problem to Krylov methods like GMRES. Conjugate gradients does require that the matrix be SPD, but petsc-dev detects when a matrix is negative definite and still does the right thing. With petsc-3.1, you could simply change the sign of everything. (I prefer to build to formulate my equations with positive matrices when possible, but those other negative eigenvalues are not the problem here.) From u.tabak at tudelft.nl Wed Apr 6 10:24:23 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Wed, 06 Apr 2011 17:24:23 +0200 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <87fwpv3046.fsf@59A2.org> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> <87fwpv3046.fsf@59A2.org> Message-ID: <4D9C85A7.1030405@tudelft.nl> On 04/06/2011 04:50 PM, Jed Brown wrote: > On Wed, 06 Apr 2011 16:37:26 +0200, Umut Tabak > wrote: >> Just curious, are not the other negative eigenvalues problematic as >> well? > > Negative eigenvalues do not pose any particular problem to Krylov > methods like GMRES. Conjugate gradients does require that the matrix > be SPD, but petsc-dev detects when a matrix is negative definite and > still does the right thing. Also with cg type methods? if yes, how? Because I am dealing with a similar problem in a projection sense which makes some factors that are already available very good preconditioners, completely problem specific, then cg converges incredibly fast, sth like 4 to 8 iterations. However, projection is the key and at every step, in cg, I should make sure that the search directions in cg are orthogonal to the previous ones by cgs/mgs, otherwise I bump into the well know orthogonality issues of Lanczos type methods... why I am digging is to see some better options if there are any. 
Greetz, Umut From jed at 59A2.org Wed Apr 6 10:30:20 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 06 Apr 2011 18:30:20 +0300 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <4D9C8180.5030901@tudelft.nl> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> <87fwpv3046.fsf@59A2.org> <4D9C8180.5030901@tudelft.nl> Message-ID: <87d3kz2y9f.fsf@59A2.org> On Wed, 06 Apr 2011 17:06:40 +0200, Umut Tabak wrote: > On 04/06/2011 04:50 PM, Jed Brown wrote: > > On Wed, 06 Apr 2011 16:37:26 +0200, Umut Tabak > > wrote: > >> Just curious, are not the other negative eigenvalues problematic as > >> well? > > > > Negative eigenvalues do not pose any particular problem to Krylov > > methods like GMRES. Conjugate gradients does require that the matrix > > be SPD, but petsc-dev detects when a matrix is negative definite and > > still does the right thing. > > Also with cg type methods? if yes, how? http://petsc.cs.iit.edu/petsc/petsc-dev/rev/cae94ca39fcb From jroman at dsic.upv.es Wed Apr 6 13:33:27 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 6 Apr 2011 20:33:27 +0200 Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Message-ID: El 06/04/2011, a las 05:21, Gong Ding escribi?: > Dear Jose, > will you please also take a look at the SVD code for smallest singular value? > It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. > However, I wonder if there exist some more clever method. I think it is correct. 
Jose > > // SVD solver for smallest singular value > SVD svd_s; > EPS eps_s; > ST st_s; > KSP ksp_s; > PC pc_s; > > PetscErrorCode ierr; > > // Create singular value solver context > ierr = SVDCreate(PETSC_COMM_WORLD, &svd_s); > > // Set operator > ierr = SVDSetOperator(svd_s, J); > > > // small singular value use eigen value solver on Cyclic Matrix > ierr = SVDSetWhichSingularTriplets(svd_s, SVD_SMALLEST); > ierr = SVDSetType(svd_s, SVDCYCLIC); > ierr = SVDCyclicSetExplicitMatrix(svd_s, PETSC_TRUE); // <-----time consuming > // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero > ierr = SVDCyclicGetEPS(svd_s, &eps_s); > ierr = EPSSetType(eps_s, EPSKRYLOVSCHUR); > ierr = EPSGetST(eps_s, &st_s); > ierr = STSetType(st_s, STSINVERT); > > ierr = STGetKSP(st_s, &ksp_s); > ierr = KSPGetPC(ksp_s, &pc_s); > // since we have to deal with bad conditioned problem, we choose direct solver whenever possible > > // direct solver as preconditioner > ierr = KSPSetType (ksp_s, (char*) KSPGMRES); assert(!ierr); > // superlu which use static pivot seems very stable > ierr = PCSetType (pc_s, (char*) PCLU); assert(!ierr); > ierr = PCFactorSetMatSolverPackage (pc_s, "superlu"); assert(!ierr); > > // Set solver parameters at runtime > ierr = SVDSetFromOptions(svd_s); assert(!ierr); > > ierr = SVDSetUp(svd_s); assert(!ierr); > > PetscReal sigma_large=1, sigma_small=1; > PetscInt nconv; > PetscReal error; > > // find the smallest singular value > SVDSolve(svd_s); From jed at 59A2.org Wed Apr 6 14:01:37 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 6 Apr 2011 21:01:37 +0200 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <32A929AB3D7C460BA6F517A9A29EE0D6@cogendaeda> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <32A929AB3D7C460BA6F517A9A29EE0D6@cogendaeda> Message-ID: 2011/4/6 Gong Ding > Ok, I will investigate matrix null space problem. > The matrix comes from nonlinear problem, I wonder if I need to calculate > the eigenvector each time. > Possibly, but it is more likely that the null space is something simple like a constant. > > Several months ago, some one committed DGMRES implementation, which also > dropped smallest eigen value. > It it possible to use (slightly modified) DGMRES as flexable tool for > sigular problem? > I'm not familiar with DGMRES. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Apr 6 14:25:21 2011 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Apr 2011 14:25:21 -0500 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <32A929AB3D7C460BA6F517A9A29EE0D6@cogendaeda> Message-ID: On Wed, Apr 6, 2011 at 2:01 PM, Jed Brown wrote: > 2011/4/6 Gong Ding > >> Ok, I will investigate matrix null space problem. >> The matrix comes from nonlinear problem, I wonder if I need to calculate >> the eigenvector each time. >> > > Possibly, but it is more likely that the null space is something simple > like a constant. > > >> >> Several months ago, some one committed DGMRES implementation, which also >> dropped smallest eigen value. >> It it possible to use (slightly modified) DGMRES as flexable tool for >> sigular problem? >> > > I'm not familiar with DGMRES. > Deflated GMRES will not help here. 
This is just the power method, and thus gets the large eigenvalues first. You will not get the null space vector. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmareddy84 at gmail.com Wed Apr 6 15:11:37 2011 From: dharmareddy84 at gmail.com (Dharmendar Reddy) Date: Wed, 6 Apr 2011 15:11:37 -0500 Subject: [petsc-users] PETSc Mesh and Fortran Message-ID: Hello, Are there any examples of PETSc mesh usage in a Fortran code. The Examples link on PETSc mesh man page ( http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mesh/index.html) redirects to page not found. I would like to learn usage of PETSc mesh utilities. I have an exodus file for the mesh. Thanks Reddy -- ----------------------------------------------------- Dharmendar Reddy Palle Graduate Student Microelectronics Research center, University of Texas at Austin, 10100 Burnet Road, Bldg. 160 MER 2.608F, TX 78758-4445 e-mail: dharmareddy84 at gmail.com Phone: +1-512-350-9082 United States of America. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaonanavril at gmail.com Wed Apr 6 18:53:20 2011 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Wed, 6 Apr 2011 17:53:20 -0600 Subject: [petsc-users] help on ksp Message-ID: Dear all, I tried to use ksp to solve some problem. I got some Segmentation Violation error. And I got result from the solver as below. I am wondering if the ksp matrix has some error, cause I got the nonzeros allocated wrong. Can anyone dig out some valuable information from the ksp output? Thanks. -------------------------------------------------------------------------------------------------- total: nonzeros=38449, allocated nonzeros=52103 reason code = 2, its = 2546 KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=5000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ilu ILU: 0 levels of fill ILU: factor fill ratio allocated 1 ILU: tolerance for zero pivot 1e-12 out-of-place factorization matrix ordering: natural ILU: factor fill ratio needed 0 Factored matrix follows Matrix Object: type=seqbaij, rows=2903, cols=2903 total: nonzeros=38449, allocated nonzeros=52103 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqbaij, rows=2903, cols=2903 total: nonzeros=38449, allocated nonzeros=52103 block size is 1 Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Apr 6 19:04:26 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Apr 2011 19:04:26 -0500 Subject: [petsc-users] help on ksp In-Reply-To: References: Message-ID: <1C5BEBEA-3789-4BAF-B84D-7BE612BD64C8@mcs.anl.gov> Incorrect preallocation should never cause a crash (just possibly slower code). You need to run valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind Barry On Apr 6, 2011, at 6:53 PM, NAN ZHAO wrote: > Dear all, > > I tried to use ksp to solve some problem. I got some Segmentation Violation error. And I got result from the solver as below. I am wondering if the ksp matrix has some error, cause I got the nonzeros allocated wrong. 
Can anyone dig out some valuable information from the ksp output? Thanks. > -------------------------------------------------------------------------------------------------- > total: nonzeros=38449, allocated nonzeros=52103 > reason code = 2, its = 2546 > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=5000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ilu > ILU: 0 levels of fill > ILU: factor fill ratio allocated 1 > ILU: tolerance for zero pivot 1e-12 > out-of-place factorization > matrix ordering: natural > ILU: factor fill ratio needed 0 > Factored matrix follows > Matrix Object: > type=seqbaij, rows=2903, cols=2903 > total: nonzeros=38449, allocated nonzeros=52103 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqbaij, rows=2903, cols=2903 > total: nonzeros=38449, allocated nonzeros=52103 > block size is 1 > > Nan From bartlomiej.wach at yahoo.pl Thu Apr 7 08:04:43 2011 From: bartlomiej.wach at yahoo.pl (=?utf-8?B?QmFydMWCb21pZWogVw==?=) Date: Thu, 7 Apr 2011 14:04:43 +0100 (BST) Subject: [petsc-users] Sparse Matrix preallocation and performance In-Reply-To: <87k4f730xp.fsf@59A2.org> Message-ID: <81155.5596.qm@web28304.mail.ukl.yahoo.com> Hello, Wheather I use ? ierr = MatCreate(PETSC_COMM_WORLD,&L);CHKERRQ(ierr); ? ierr = MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr); ???????????? MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); ? ierr = MatSetFromOptions(L);CHKERRQ(ierr); or ? ? ierr = MatCreateSeqAIJ(PETSC_COMM_WORLD,n,n,PETSC_DEFAULT,nnz,&L);CHKERRQ(ierr); ? ierr = MatSetFromOptions(L);CHKERRQ(ierr); ? Gives me ?? Number of mallocs during MatSetValues() is? X On matrix assembly, where X is positive Is this indicating the preallocation or should it be zero and I'm missing something? Moreover, using MatCreateSeqAIJ lowers the performance of MatSetValues Is my code improper? Regards Bart?omiej Wach -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Apr 7 08:15:40 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 7 Apr 2011 08:15:40 -0500 Subject: [petsc-users] Sparse Matrix preallocation and performance In-Reply-To: <81155.5596.qm@web28304.mail.ukl.yahoo.com> References: <87k4f730xp.fsf@59A2.org> <81155.5596.qm@web28304.mail.ukl.yahoo.com> Message-ID: On Thu, Apr 7, 2011 at 8:04 AM, Bart?omiej W wrote: > Hello, > > Wheather I use > > ierr = MatCreate(PETSC_COMM_WORLD,&L);CHKERRQ(ierr); > ierr = MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr); > MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); > ierr = MatSetFromOptions(L);CHKERRQ(ierr); > Move SetPreallocation() after SetFromOptions(). Here is what happens: SetFromOptions() will give the matrix a type, since it does not have one already Since the matrix had no type before SetPreallocation(), the call was ignored Matt > or > > > ierr = > MatCreateSeqAIJ(PETSC_COMM_WORLD,n,n,PETSC_DEFAULT,nnz,&L);CHKERRQ(ierr); > ierr = MatSetFromOptions(L);CHKERRQ(ierr); > > Gives me > > Number of mallocs during MatSetValues() is X > > On matrix assembly, where X is positive > Is this indicating the preallocation or should it be zero and I'm missing > something? > > Moreover, using MatCreateSeqAIJ lowers the performance of MatSetValues > > Is my code improper? 
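For reference, a sketch of the ordering Matt describes, with n and nnz standing for the same matrix size and per-row counts as in the original snippet:

Mat            L;
PetscErrorCode ierr;

ierr = MatCreate(PETSC_COMM_WORLD,&L);CHKERRQ(ierr);
ierr = MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
ierr = MatSetFromOptions(L);CHKERRQ(ierr);                /* the matrix gets its type here (aij by default) */
ierr = MatSeqAIJSetPreallocation(L,0,nnz);CHKERRQ(ierr);  /* now honored; the scalar nz is ignored when nnz[] is given */

With the preallocation taking effect, the "Number of mallocs during MatSetValues()" reported at assembly should drop to 0; for a parallel aij matrix the corresponding call is MatMPIAIJSetPreallocation().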
> > Regards > Bart?omiej Wach > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Thu Apr 7 08:49:35 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 09:49:35 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: Dear Jed, Sorry for late reply. This email box gets piled up... :-) I agree that solves are faster than factorization and GMRES use more MatVecMult, even with exact preconditioner. So I guess PREONLY is meaning direct solve without forming any sub-space and just one forward and backward substitutions? Thanks, Yan On Wed, Apr 6, 2011 at 1:43 AM, Jed Brown wrote: > On Wed, Apr 6, 2011 at 05:27, Matthew Knepley wrote: > >> There should be no difference since direct solves take so long. > > > That is, solves are very fast compared to factorization. > > If you just want to check the residual, Richardson is cheaper than GMRES > because it will require one fewer preconditioner/matrix applications. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From PRaeth at hpti.com Thu Apr 7 08:52:44 2011 From: PRaeth at hpti.com (Raeth, Peter) Date: Thu, 7 Apr 2011 13:52:44 +0000 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> , Message-ID: <3474F869C1954540B771FD9CAEBCB65722B80D98@CORTINA.HPTI.COM> "This email box gets piled up..." As a point of encouragement and appreciation, this is a sign of a much-used product that people are applying to increasingly-sophisticated research. Best, Peter. Peter G. Raeth, Ph.D. Senior Staff Scientist Signal and Image Processing High Performance Technologies, Inc 937-904-5147 praeth at hpti.com ________________________________ From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Ryan Yan [vyan2000 at gmail.com] Sent: Thursday, April 07, 2011 9:49 AM To: PETSc users list Subject: Re: [petsc-users] about pclu Dear Jed, Sorry for late reply. This email box gets piled up... :-) I agree that solves are faster than factorization and GMRES use more MatVecMult, even with exact preconditioner. So I guess PREONLY is meaning direct solve without forming any sub-space and just one forward and backward substitutions? Thanks, Yan On Wed, Apr 6, 2011 at 1:43 AM, Jed Brown > wrote: On Wed, Apr 6, 2011 at 05:27, Matthew Knepley > wrote: There should be no difference since direct solves take so long. That is, solves are very fast compared to factorization. If you just want to check the residual, Richardson is cheaper than GMRES because it will require one fewer preconditioner/matrix applications. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 7 08:55:41 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 7 Apr 2011 15:55:41 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Thu, Apr 7, 2011 at 15:49, Ryan Yan wrote: > I agree that solves are faster than factorization and GMRES > use more MatVecMult, even with exact preconditioner. So I guess > PREONLY is meaning direct solve without forming any sub-space and just > one forward and backward substitutions? > Yes. 
Because of the way GMRES works with zero initial guess, there will be two MatSolve and one MatMult even when using a direct solver. PREONLY does one MatSolve and zero MatMult. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Thu Apr 7 09:03:11 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 10:03:11 -0400 Subject: [petsc-users] about pclu In-Reply-To: <3474F869C1954540B771FD9CAEBCB65722B80D98@CORTINA.HPTI.COM> References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> <3474F869C1954540B771FD9CAEBCB65722B80D98@CORTINA.HPTI.COM> Message-ID: Nice Observation. :-) Y > As a point of encouragement and appreciation, this is a sign of a much-used > product that people are applying to increasingly-sophisticated research. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 7 09:08:28 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 7 Apr 2011 16:08:28 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Thu, Apr 7, 2011 at 16:06, Ryan Yan wrote: > Hi Jed, > How is going? :-) > > PREONLY maybe also use two MatSolve? Or is there any magic I did not see. > :-) > Why do you say that? $ ./ex2 -pc_type lu -ksp_type preonly -log_summary |g '^MatSolve' MatSolve 1 1.0 1.1921e-05 1.0 1.22e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 22 0 0 0 0 22 0 0 0 102 -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Thu Apr 7 09:21:13 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 10:21:13 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: Wait, so is it for PREONLY? And what Matsolve does, solving a triangular system with both L and U available? The reason that I previously guess the count be two is that I think there is a back-substituiton and a forward-substitution involved in solving a linear system using factorization. If a pair of back-substitution and forward-substitution counts 1 MatSolve. Then I think we mean the same thing. Thanks, Yan On Thu, Apr 7, 2011 at 10:08 AM, Jed Brown wrote: > On Thu, Apr 7, 2011 at 16:06, Ryan Yan wrote: > >> Hi Jed, >> How is going? :-) >> >> PREONLY maybe also use two MatSolve? Or is there any magic I did not see. >> :-) >> > > Why do you say that? > > $ ./ex2 -pc_type lu -ksp_type preonly -log_summary |g '^MatSolve' > MatSolve 1 1.0 1.1921e-05 1.0 1.22e+03 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 22 0 0 0 0 22 0 0 0 102 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 7 09:23:41 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 7 Apr 2011 16:23:41 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Thu, Apr 7, 2011 at 16:21, Ryan Yan wrote: > Wait, so is it for PREONLY? And what Matsolve does, solving a triangular > system with both L and U available? > Yup, forward- and back-solves are both done in one "MatSolve". > The reason that I previously guess the count be two is that I think there > is a back-substituiton > and a forward-substitution involved in solving a linear system using > factorization. If a pair of > back-substitution and forward-substitution counts 1 MatSolve. Then I think > we mean the same thing. > -------------- next part -------------- An HTML attachment was scrubbed... 
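For readers following along, the configuration being discussed looks roughly like this in code (ksp, b and x are assumed to be set up as in the standard KSP examples); it is the same as running with -ksp_type preonly -pc_type lu:

PC             pc;
PetscErrorCode ierr;

ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);  /* no Krylov iterations: apply the preconditioner exactly once */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);          /* the "preconditioner" is a complete LU factorization */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);           /* one MatSolve: forward and back substitution together */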
URL: From vyan2000 at gmail.com Thu Apr 7 09:25:40 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 10:25:40 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: It is wonderful to reach a agreement. :-) Cheers, Yan On Thu, Apr 7, 2011 at 10:23 AM, Jed Brown wrote: > On Thu, Apr 7, 2011 at 16:21, Ryan Yan wrote: > >> Wait, so is it for PREONLY? And what Matsolve does, solving a triangular >> system with both L and U available? >> > > Yup, forward- and back-solves are both done in one "MatSolve". > > >> The reason that I previously guess the count be two is that I think there >> is a back-substituiton >> and a forward-substitution involved in solving a linear system using >> factorization. If a pair of >> back-substitution and forward-substitution counts 1 MatSolve. Then I think >> we mean the same thing. >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram at ibrae.ac.ru Thu Apr 7 18:17:32 2011 From: ram at ibrae.ac.ru (=?KOI8-R?B?4czFy9PFyiDy0drBzs/X?=) Date: Fri, 8 Apr 2011 03:17:32 +0400 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? Message-ID: Hello. When I create vectors using VecCreate(PETSC_COMM_WORLD,&u); VecSetSizes(u,PETSC_DECIDE, VecSize); VecSetFromOptions(u); VecDuplicate(u,&b); and matrix using MatCreate(PETSC_COMM_WORLD,&A); MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,VecSize,VecSize); MatSetFromOptions(A); PETSc distributes their elements in a proper identical way among processors, so I can use procedures like MatMult(A,u,b); and KSPSolve(ksp,b,x); Ofcourse after matrix assembling and initialization of KSP and PC KSPCreate(PETSC_COMM_WORLD,&ksp); KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); And thats great and works amazingly! But now I've created DA vectors "u" and "b" and assembled them through the natural grid indexing. And I need to solve the same SLE Au=b, where A is a Laplacian. How should I create and assemble the A matrix according to my DA vector to use the same functionality? Thank you! Alexey Ryazanov ______________________________________ Nuclear Safety Institute of Russian Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Apr 7 18:32:47 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 7 Apr 2011 18:32:47 -0500 (CDT) Subject: [petsc-users] How to create and assemble matrices for DA vectors?? In-Reply-To: References: Message-ID: On Fri, 8 Apr 2011, ??????? ??????? wrote: > Hello. > > When I create vectors using > > VecCreate(PETSC_COMM_WORLD,&u); > VecSetSizes(u,PETSC_DECIDE, VecSize); > VecSetFromOptions(u); > VecDuplicate(u,&b); > > and matrix using > > MatCreate(PETSC_COMM_WORLD,&A); > MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,VecSize,VecSize); > MatSetFromOptions(A); > > PETSc distributes their elements in a proper identical way among processors, > so I can use procedures like > > MatMult(A,u,b); > > and > KSPSolve(ksp,b,x); > Ofcourse after matrix assembling and initialization of KSP and PC > > KSPCreate(PETSC_COMM_WORLD,&ksp); > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); > > And thats great and works amazingly! > > > > > But now I've created DA vectors "u" and "b" and assembled them through the > natural grid indexing. > > And I need to solve the same SLE Au=b, where A is a Laplacian. > > How should I create and assemble the A matrix according to my DA vector to > use the same functionality? 
Create u,b with DAGetGlobalVector() and A with DAGetMatrix() and they will match the DA. For eg: check: src/snes/examples/tutorials/ex5.c [or some of the examples in src/dm/da/examples] Satish > > Thank you! > > Alexey Ryazanov > ______________________________________ > Nuclear Safety Institute of Russian Academy of Sciences > > From domenico.borzacchiello at univ-st-etienne.fr Fri Apr 8 02:59:52 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Fri, 8 Apr 2011 09:59:52 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver Message-ID: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> Hi, I'm trying to implement a 3d stokes solver on a simple cartesian staggered grid using the a MAC FV Discretisation. I'm following the example given in /snes/examples/tutorials/ex30.c. As the stokes solver is ready I'll add some more complex constitutive laws. For the moment I'm testing the solver with just 1-level MG and I'd like a few clarification to know if what I'm doing is correct. - The solver solves for u v w p. I'm using a single DA with 4 DOFs and due to the MAC arrangement I have some "extraboundary" nodes for u ( in the y and z directions) v (x & z dir) w ( x & y dir) and p (x,y & z dir). If what I'm getting from ex30.c is right I have to write a simple identity for each of these nodes (i.e. p_extra = anyvalue) as they are not coupled with the rest of the system. I'm doing the same for Dirichlet BCs nodes (i.e. u = Ubound). Is this correct? - How does Petsc deal with the pressure-velocity coupling? Is it correct to try to solve the whole coupled system with DMMG as in ex30.c? At present time I'm getting no convergence by running dmmgsolve (snes) with all the default options on a very small system. 0 SNES Function norm 7.128085632250e+00 Number of Newton iterations = 0 Number of Linear iterations = 0 Average Linear its / Newton = -nan Converged Reason = -3 If I run the same case with a direct solver (pc_type lu) I'm basically getting the same error: RINFO(1) (local estimated flops for the elimination after analysis): [0] 5.42609e+08 RINFO(2) (local estimated flops for the assembly after factorization): [0] 3.83582e+06 RINFO(3) (local estimated flops for the elimination after factorization): [0] 5.44162e+08 INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): [0] 17 INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): [0] 17 INFO(23) (num of pivots eliminated on this processor after factorization): [0] 1372 Number of Newton iterations = 7 Number of Linear iterations = 88 Average Linear its / Newton = 1.257143e+01 Converged Reason = -3 Would you suggest anything to fix the problem? I'm double-checking the user provided function in DMMGSetSNESLocal to see if I made any mistake there. Thank you in advance, Domenico From jed at 59A2.org Fri Apr 8 03:26:07 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 8 Apr 2011 10:26:07 +0200 Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Fri, Apr 8, 2011 at 09:59, wrote: > Hi, > I'm trying to implement a 3d stokes solver on a simple cartesian staggered > grid using the a MAC FV Discretisation. I'm following the example given in > /snes/examples/tutorials/ex30.c. As the stokes solver is ready I'll add > some more complex constitutive laws. 
For the moment I'm testing the solver > with just 1-level MG and I'd like a few clarification to know if what I'm > doing is correct. > > - The solver solves for u v w p. I'm using a single DA with 4 DOFs and due > to the MAC arrangement I have some "extraboundary" nodes for u ( in the y > and z directions) v (x & z dir) w ( x & y dir) and p (x,y & z dir). If > what I'm getting from ex30.c is right I have to write a simple identity > for each of these nodes (i.e. p_extra = anyvalue) as they are not coupled > with the rest of the system. I'm doing the same for Dirichlet BCs nodes > (i.e. u = Ubound). Is this correct? > yes > > - How does Petsc deal with the pressure-velocity coupling? Is it correct > to try to solve the whole coupled system with DMMG as in ex30.c? At > present time I'm getting no convergence by running dmmgsolve (snes) with > all the default options on a very small system. > > 0 SNES Function norm 7.128085632250e+00 > Number of Newton iterations = 0 > Number of Linear iterations = 0 > Average Linear its / Newton = -nan > Converged Reason = -3 > You may as well check why the linear solve failed by running with -ksp_converged_reason. There are two challenges for solving Stokes in this way. First, the interpolation operators that the DA gives you are probably not what you want (pressure and velocity should be interpolated differently) and second, the standard smoother of ILU is not expected to work with interlaced velocity and pressure (you would want to either use a "Vanka smoother" that solves small problems associated with each pressure cell and all adjacent velocities, use field-split as a smoother, or (tricky and not guaranteed to work) order the pressures last in each subdomain with ILU as a smoother). Vanka smoothers are very problem-dependent so you would need to write that part yourself. It would be nice to have an example for Stokes. An alternative to coupled multigrid is to use PCFieldSplit at the top level and multigrid for the viscous part separately (see e.g. Elman's many papers on this approach). We don't currently have an example doing PCFieldSplit with geometric multigrid inside the splits, but it should work with petsc-dev if you follow the approach in src/ksp/ksp/examples/tutorials/ex45.c (which does not use DMMG, we are working to fold DMMG's functionality into the existing solver objects). -------------- next part -------------- An HTML attachment was scrubbed... 
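As a rough illustration of the PCFieldSplit route mentioned above: for a single interlaced DA with 4 dof (u,v,w,p), a run could be configured along the following lines. The application name is hypothetical and the option spellings are those of petsc-dev from that period, so treat this as a sketch to be checked against -help rather than a recipe:

mpiexec -n 4 ./stokes3d -ksp_type fgmres -pc_type fieldsplit \
    -pc_fieldsplit_block_size 4 \
    -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields 3 \
    -pc_fieldsplit_type schur \
    -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ml \
    -fieldsplit_1_ksp_type gmres -fieldsplit_1_pc_type none \
    -ksp_converged_reason

Here ml is only a stand-in for whatever multigrid is available for the viscous block (geometric multigrid inside the splits is what the ex45.c-style setup enables). A Vanka-type smoother, by contrast, has to be written by hand, e.g. as a PCSHELL that loops over the pressure cells and solves the small local velocity-pressure problems.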
URL: From domenico.borzacchiello at univ-st-etienne.fr Fri Apr 8 04:40:13 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Fri, 8 Apr 2011 11:40:13 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> Message-ID: <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> Hi Jed Thank you for the very quick reply, running with -ksp_converged_reason and lu preconditioner gives out Linear solve converged due to CONVERGED_RTOL iterations 14 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve converged due to CONVERGED_RTOL iterations 13 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve converged due to CONVERGED_RTOL iterations 13 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve did not converge due to DIVERGED_DTOL iterations 1080 RINFO(1) (local estimated flops for the elimination after analysis): [0] 5.42609e+08 RINFO(2) (local estimated flops for the assembly after factorization): [0] 3.83582e+06 RINFO(3) (local estimated flops for the elimination after factorization): [0] 5.44162e+08 INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): [0] 17 INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): [0] 17 INFO(23) (num of pivots eliminated on this processor after factorization): [0] 1372 Number of Newton iterations = 7 Number of Linear iterations = 88 Average Linear its / Newton = 1.257143e+01 Converged Reason = -3 I can't get why it takes more than one Newton iteration even if my system is linear and the linear solver is direct. As for the Vanka smoother would you suggest to implement it with PCSHELL or by defining a new PC type? thanks, Domenico. > On Fri, Apr 8, 2011 at 09:59, > wrote: > >> Hi, >> I'm trying to implement a 3d stokes solver on a simple cartesian >> staggered >> grid using the a MAC FV Discretisation. I'm following the example given >> in >> /snes/examples/tutorials/ex30.c. As the stokes solver is ready I'll add >> some more complex constitutive laws. For the moment I'm testing the >> solver >> with just 1-level MG and I'd like a few clarification to know if what >> I'm >> doing is correct. >> >> - The solver solves for u v w p. I'm using a single DA with 4 DOFs and >> due >> to the MAC arrangement I have some "extraboundary" nodes for u ( in the >> y >> and z directions) v (x & z dir) w ( x & y dir) and p (x,y & z dir). If >> what I'm getting from ex30.c is right I have to write a simple identity >> for each of these nodes (i.e. p_extra = anyvalue) as they are not >> coupled >> with the rest of the system. I'm doing the same for Dirichlet BCs nodes >> (i.e. u = Ubound). Is this correct? >> > > yes > > >> >> - How does Petsc deal with the pressure-velocity coupling? Is it correct >> to try to solve the whole coupled system with DMMG as in ex30.c? At >> present time I'm getting no convergence by running dmmgsolve (snes) with >> all the default options on a very small system. >> >> 0 SNES Function norm 7.128085632250e+00 >> Number of Newton iterations = 0 >> Number of Linear iterations = 0 >> Average Linear its / Newton = -nan >> Converged Reason = -3 >> > > You may as well check why the linear solve failed by running with > -ksp_converged_reason. > > There are two challenges for solving Stokes in this way. 
First, the > interpolation operators that the DA gives you are probably not what you > want > (pressure and velocity should be interpolated differently) and second, the > standard smoother of ILU is not expected to work with interlaced velocity > and pressure (you would want to either use a "Vanka smoother" that solves > small problems associated with each pressure cell and all adjacent > velocities, use field-split as a smoother, or (tricky and not guaranteed > to > work) order the pressures last in each subdomain with ILU as a smoother). > > Vanka smoothers are very problem-dependent so you would need to write that > part yourself. It would be nice to have an example for Stokes. An > alternative to coupled multigrid is to use PCFieldSplit at the top level > and > multigrid for the viscous part separately (see e.g. Elman's many papers on > this approach). We don't currently have an example doing PCFieldSplit with > geometric multigrid inside the splits, but it should work with petsc-dev > if > you follow the approach in src/ksp/ksp/examples/tutorials/ex45.c (which > does > not use DMMG, we are working to fold DMMG's functionality into the > existing > solver objects). > From jed at 59A2.org Fri Apr 8 04:47:35 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 8 Apr 2011 11:47:35 +0200 Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Fri, Apr 8, 2011 at 11:40, wrote: > I can't get why it takes more than one Newton iteration even if my system > is linear and the linear solver is direct. > The direct solver should also converge in one iteration. Are you only assembling an approximation of the Jacobian (e.g. using -snes_mf_operator)? If using MFFD, is the system poorly scaled such that the step size is very low accuracy (maybe try -mat_mffd_type ds)? Are the equations singular? Is both the Jacobian and residual evaluation correct? > > As for the Vanka smoother would you suggest to implement it with PCSHELL > or by defining a new PC type? > That is up to you. Defining a new PC type makes it more reusable, but PCShell is a bit quicker to develop. You can start with PCShell and convert it later. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Apr 8 05:36:13 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 8 Apr 2011 12:36:13 +0200 Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Message-ID: <35A4F802-C36B-48AB-B6A6-2C7D806F68AD@dsic.upv.es> El 06/04/2011, a las 05:21, Gong Ding escribi?: > Dear Jose, > will you please also take a look at the SVD code for smallest singular value? > It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. Now in slepc-dev this routine does preallocation, so it should be efficient. Let us know if problems arise. Jose From ram at ibrae.ac.ru Fri Apr 8 06:10:29 2011 From: ram at ibrae.ac.ru (=?KOI8-R?B?4czFy9PFyiDy0drBzs/X?=) Date: Fri, 8 Apr 2011 15:10:29 +0400 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? 
In-Reply-To: References: Message-ID: Thank you very much, Satish! Ill try it Alexey 8 ?????? 2011 ?. 3:32 ???????????? Satish Balay ???????: > On Fri, 8 Apr 2011, ??????? ??????? wrote: > > > Hello. > > > > When I create vectors using > > > > VecCreate(PETSC_COMM_WORLD,&u); > > VecSetSizes(u,PETSC_DECIDE, VecSize); > > VecSetFromOptions(u); > > VecDuplicate(u,&b); > > > > and matrix using > > > > MatCreate(PETSC_COMM_WORLD,&A); > > MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,VecSize,VecSize); > > MatSetFromOptions(A); > > > > PETSc distributes their elements in a proper identical way among > processors, > > so I can use procedures like > > > > MatMult(A,u,b); > > > > and > > KSPSolve(ksp,b,x); > > Ofcourse after matrix assembling and initialization of KSP and PC > > > > KSPCreate(PETSC_COMM_WORLD,&ksp); > > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); > > > > And thats great and works amazingly! > > > > > > > > > > But now I've created DA vectors "u" and "b" and assembled them through > the > > natural grid indexing. > > > > And I need to solve the same SLE Au=b, where A is a Laplacian. > > > > How should I create and assemble the A matrix according to my DA vector > to > > use the same functionality? > > Create u,b with DAGetGlobalVector() and A with DAGetMatrix() and they > will match the DA. For eg: check: src/snes/examples/tutorials/ex5.c > [or some of the examples in src/dm/da/examples] > > Satish > > > > > Thank you! > > > > Alexey Ryazanov > > ______________________________________ > > Nuclear Safety Institute of Russian Academy of Sciences > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Fri Apr 8 07:28:50 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Fri, 8 Apr 2011 14:28:50 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> Message-ID: <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> > The direct solver should also converge in one iteration. Are you only > assembling an approximation of the Jacobian (e.g. using > -snes_mf_operator)? > If using MFFD, is the system poorly scaled such that the step size is very > low accuracy (maybe try -mat_mffd_type ds)? Are the equations singular? Is > both the Jacobian and residual evaluation correct? Apparently the equations were singular. I modified the equations describing the Outflow BC by explicitly writing the open boundary condition -mu*(dw/dz)+p=0 (instead of including it in the momentum equation as I was doing before) and the linear solver is now converging within 1 iteration, snes is still diverging though. 0 SNES Function norm 7.128085632250e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.068552744365e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 7.068535930605e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 7.068535930605e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 . . (some mumps output here) . 
Number of Newton iterations = 3 Number of Linear iterations = 4 Average Linear its / Newton = 1.333333e+00 Converged Reason = -6 if I run it with -snes_type tr instead I get 0 SNES Function norm 7.128085632250e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 1 SNES Function norm 7.081751494639e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 2 SNES Function norm 7.068482944794e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 3 SNES Function norm 7.067980457052e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 4 SNES Function norm 7.067979237888e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 5 SNES Function norm 7.067979237888e+00 . . . Number of Newton iterations = 4 Number of Linear iterations = 5 Average Linear its / Newton = 1.250000e+00 Converged Reason = 4 I don't define the Jacobian myself I'm just calling DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,0,0) I Assumed that the FD evaluation of Jacobian would be exact since the the Function is linear. From jed at 59A2.org Fri Apr 8 12:28:11 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 8 Apr 2011 19:28:11 +0200 Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Fri, Apr 8, 2011 at 14:28, wrote: > I Assumed that the FD evaluation of Jacobian would be exact since the the > Function is linear. > Sort of, it's still a problem if your function looks like f(x) = huge + epsilon * x. I suspect either a memory bug (try valgrind) or that your FormFunctionLocal is using internal state or otherwise nonlinear. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Sat Apr 9 23:17:39 2011 From: gdiso at ustc.edu (Gong Ding) Date: Sun, 10 Apr 2011 12:17:39 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <35A4F802-C36B-48AB-B6A6-2C7D806F68AD@dsic.upv.es> References: <35A4F802-C36B-48AB-B6A6-2C7D806F68AD@dsic.upv.es> <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Message-ID: <20302444.45221302409059408.JavaMail.coremail@mail.ustc.edu> > El 06/04/2011, a las 05:21, Gong Ding escribi?: > > > > > Dear Jose, > > > will you please also take a look at the SVD code for smallest singular value? > > > It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. > > > > Now in slepc-dev this routine does preallocation, so it should be efficient. Let us know if problems arise. > > Jose > It works well. Thanks! From gdiso at ustc.edu Sat Apr 9 23:53:04 2011 From: gdiso at ustc.edu (Gong Ding) Date: Sun, 10 Apr 2011 12:53:04 +0800 (CST) Subject: [petsc-users] Finally, pseudo time method eliminated singular problem Message-ID: <33088327.45281302411184479.JavaMail.coremail@mail.ustc.edu> Hi all, In the past several weeks, I am dealing with the nearly singular problem. The structure has a metal connected two semiconductor devices. when two devices are both shutdown with high resistance, the metal connector is floating. 
This singular problem finally be shifted to well conditioned by simple pseudo time method -- just introducing pseudo time to the nearly floating domain (as a capacity to ground). Now iterative method works well. Here, I must give thanks to Jose Roman, the slepc package gives quick evaluation of eigen value of the jacobian matrix. I can easliy target where the singular arising. And it helps to determine the pseudo time step. And to Matt and Jed, thank you for the idea of null sapce. Pseudo time method is not as efficient as null space dropping. I guess the algorithm to nonlinear singular problem should 1) drop null space within krylov iteration 2) SNES should know the null vector and do a search in the direction of null vector to find the root. Do you think I am in the right way? Gong Ding From gdiso at ustc.edu Sun Apr 10 00:01:16 2011 From: gdiso at ustc.edu (Gong Ding) Date: Sun, 10 Apr 2011 13:01:16 +0800 (CST) Subject: [petsc-users] Patch to release extra memoty to aij matrix Message-ID: <26209988.45291302411676528.JavaMail.coremail@mail.ustc.edu> Hi, This is the patch file to aij.c, which will release excessively preallocated memory at MatAssemblyEnd. It had been tested in the past several months, both for serial and parallel. Hope this patch can be accepted. Gong Ding -------------- next part -------------- A non-text attachment was scrubbed... Name: aij.diff Type: application/octet-stream Size: 1424 bytes Desc: not available URL: From zonexo at gmail.com Sun Apr 10 02:13:17 2011 From: zonexo at gmail.com (TAY wee-beng) Date: Sun, 10 Apr 2011 09:13:17 +0200 Subject: [petsc-users] Minimum size of sparse or dense matrix to use PETSc Message-ID: <4DA1588D.5010406@gmail.com> Hi, I am already using PETSc to solve my momentum and poisson equations. However in some parts of my code, I need to solve a dense (usually) or sparse matrix, which arises from the radial basis function interpolation. Depending on the problem, it can be a big or small matrix. I am thinking whether to use PETSc or just a simple solver. Can you recommend the minimum size of sparse or dense matrix to use PETSc? Thank you. -- Yours sincerely, TAY wee-beng From knepley at gmail.com Sun Apr 10 06:56:50 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Apr 2011 06:56:50 -0500 Subject: [petsc-users] Minimum size of sparse or dense matrix to use PETSc In-Reply-To: <4DA1588D.5010406@gmail.com> References: <4DA1588D.5010406@gmail.com> Message-ID: On Sun, Apr 10, 2011 at 2:13 AM, TAY wee-beng wrote: > Hi, > > I am already using PETSc to solve my momentum and poisson equations. > However in some parts of my code, I need to solve a dense (usually) or > sparse matrix, which arises from the radial basis function interpolation. > Depending on the problem, it can be a big or small matrix. > > I am thinking whether to use PETSc or just a simple solver. > > Can you recommend the minimum size of sparse or dense matrix to use PETSc? > For a dense matrix, we just call LAPACK in serial. And you can can change from dense to sparse if it is sparse enough by changing the matrix type. The only regime to worry about is very large, dense matrices, but I do not think you have those. Matt > Thank you. > > -- > Yours sincerely, > > TAY wee-beng > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
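A small sketch of the dense path Matt describes, with n, b and x standing for the RBF system size and vectors from the original question; PCLU on a seqdense matrix ends up in LAPACK, and switching the creation call (or using -mat_type aij together with MatSetFromOptions()) moves the same solve to a sparse factorization:

Mat            A;
KSP            ksp;
PC             pc;
PetscErrorCode ierr;

ierr = MatCreateSeqDense(PETSC_COMM_SELF,n,n,PETSC_NULL,&A);CHKERRQ(ierr);
/* ... fill with MatSetValues(), then MatAssemblyBegin()/MatAssemblyEnd() ... */
ierr = KSPCreate(PETSC_COMM_SELF,&ksp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);   /* LU of a seqdense matrix is done by LAPACK */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);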
URL: From knepley at gmail.com Sun Apr 10 07:28:00 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Apr 2011 07:28:00 -0500 Subject: [petsc-users] Finally, pseudo time method eliminated singular problem In-Reply-To: <33088327.45281302411184479.JavaMail.coremail@mail.ustc.edu> References: <33088327.45281302411184479.JavaMail.coremail@mail.ustc.edu> Message-ID: 2011/4/9 Gong Ding > Hi all, > In the past several weeks, I am dealing with the nearly singular problem. > The structure has a metal connected two semiconductor devices. when two > devices are both shutdown with high resistance, > the metal connector is floating. > This singular problem finally be shifted to well conditioned by simple > pseudo time method -- > just introducing pseudo time to the nearly floating domain (as a capacity > to ground). > Now iterative method works well. > > Here, I must give thanks to Jose Roman, the slepc package gives quick > evaluation > of eigen value of the jacobian matrix. I can easliy target where the > singular arising. > And it helps to determine the pseudo time step. > > And to Matt and Jed, thank you for the idea of null sapce. > Pseudo time method is not as efficient as null space dropping. > I guess the algorithm to nonlinear singular problem should > 1) drop null space within krylov iteration > Yes, you can do this using KSPSetNullSpace() http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPSetNullSpace.html > 2) SNES should know the null vector and do a search in the direction of > null vector to find the root. > I am not sure why this is necessary. Since KSP will project out the nullspace in each Newton solve, it should not appear in the update. Unless it is a component of the solution (which would be strange since the Jacobian gives no information about it), in which case you can add that as the initial guess. Matt > Do you think I am in the right way? > > Gong Ding > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Mon Apr 11 06:43:01 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Mon, 11 Apr 2011 13:43:01 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> <04494be3dc8dc0af4b98e63eff5661f4.squirrel@arcon.univ-st-etienne.fr> Message-ID: Ok, I'll keep that in mind. thank you very much for the explanations. Domenico. > On Mon, Apr 11, 2011 at 13:32, > wrote: > >> Still it's unclear to me why this happens and whether or not defining >> the >> Jacobian instead of computing it by FD may possibly fix it. I'd like to >> stick with FD approximation of Jacobian because I'll add complex >> rheology >> models for which computing the Jacobian analitically won't be an easy >> task. >> > > To use FD, you have to make sure that your equations are well scaled. You > should be able to volume-scale the residual (most PETSc examples do this) > and choose units such that the system is well-scaled independent of grid > resolution. 
You should do this regardless of whether you use FD, but with > FD, you have half the number of digits to work with before running into > rounding error problems. > From fd.kong at siat.ac.cn Tue Apr 12 01:22:52 2011 From: fd.kong at siat.ac.cn (=?ISO-8859-1?B?ZmRrb25n?=) Date: Tue, 12 Apr 2011 14:22:52 +0800 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner Message-ID: Hi every one I uses multigrid preconditioner for my application. Running the code with "Options Database Keys" -pc_mg_log, but can not get time spent on each level of the solver. I want to know time spent on each level respectively. VecMDot 30 1.0 2.8007e-03 2.5 1.61e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 4 0 0 8 0 4 0 0 9 217 VecNorm 48 1.0 2.3482e-03 2.1 1.07e+05 1.1 0.0e+00 0.0e+00 4.8e+01 0 3 0 0 12 0 3 0 0 15 173 VecScale 39 1.0 3.2115e-04 1.2 4.36e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 513 VecCopy 17 1.0 1.6999e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 125 1.0 5.7936e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 17 1.0 2.6035e-04 1.6 3.80e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 552 VecAYPX 4 1.0 8.7976e-05 1.3 4.47e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 192 VecMAXPY 38 1.0 7.0500e-04 1.1 2.28e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 1223 VecAssemblyBegin 3 1.0 4.5705e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 2 0 0 0 0 3 0 VecAssemblyEnd 3 1.0 4.1962e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 147 1.0 1.8594e-03 1.1 0.00e+00 0.0 1.0e+03 3.8e+02 0.0e+00 0 0 55 19 0 0 0 55 19 0 0 VecScatterEnd 147 1.0 1.4102e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 34 1.0 1.9138e-03 1.5 1.14e+05 1.1 0.0e+00 0.0e+00 3.4e+01 0 3 0 0 9 0 3 0 0 11 225 MatMult 47 1.0 2.6152e-02 1.1 1.34e+06 1.1 4.7e+02 4.0e+02 0.0e+00 0 35 25 9 0 0 35 25 9 0 191 MatMultAdd 4 1.0 2.1584e-03 1.1 5.67e+04 1.2 4.0e+01 2.2e+02 0.0e+00 0 1 2 0 0 0 1 2 0 0 96 MatMultTranspose 8 1.0 4.4453e-03 1.0 1.13e+05 1.2 8.0e+01 2.2e+02 1.6e+01 0 3 4 1 4 0 3 4 1 5 94 MatSolve 50 1.0 3.1454e-02 1.0 1.41e+06 1.2 0.0e+00 0.0e+00 0.0e+00 0 36 0 0 0 0 36 0 0 0 164 MatLUFactorSym 1 1.0 7.4482e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 4.4755e-02 1.0 1.84e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 15 MatILUFactorSym 1 1.0 1.3239e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatAssemblyBegin 4 1.0 2.4263e-0243.8 0.00e+00 0.0 4.5e+01 3.1e+03 6.0e+00 0 0 2 7 2 0 0 2 7 2 0 MatAssemblyEnd 4 1.0 7.4661e-03 1.1 0.00e+00 0.0 6.0e+01 7.7e+01 2.8e+01 0 0 3 0 7 0 0 3 0 9 0 MatGetRowIJ 1 1.0 3.0994e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 3.2248e-03 1.1 0.00e+00 0.0 5.0e+01 1.9e+03 5.0e+00 0 0 3 5 1 0 0 3 5 2 0 MatGetOrdering 1 1.0 1.2500e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatIncreaseOvrlp 1 1.0 1.0622e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatZeroEntries 2 1.0 2.8491e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 6 1.0 1.5023e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MeshView 6 1.0 1.0528e+00 1.0 0.00e+00 0.0 9.0e+01 2.9e+03 0.0e+00 9 0 5 12 0 9 0 5 12 0 0 MeshGetGlobalScatter 3 1.0 1.6958e-02 1.0 0.00e+00 0.0 3.0e+01 8.8e+01 1.8e+01 0 0 2 0 5 0 0 2 0 6 0 MeshAssembleMatrix 1572 1.0 3.6974e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MeshUpdateOperator 2131 1.0 8.2520e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 2 1 0 0 0 2 0 SectionRealView 2 1.0 6.4932e-0216.4 0.00e+00 0.0 1.2e+01 4.1e+03 0.0e+00 0 0 1 2 0 0 0 1 2 0 0 PCSetUp 3 1.0 5.6296e-02 1.0 1.84e+05 1.2 7.0e+01 1.4e+03 3.0e+01 0 5 4 5 8 0 5 4 5 9 12 PCSetUpOnBlocks 8 1.0 6.5680e-03 1.1 1.84e+05 1.2 0.0e+00 0.0e+00 7.0e+00 0 5 0 0 2 0 5 0 0 2 102 PCApply 4 1.0 1.0816e-01 1.0 3.45e+06 1.2 9.6e+02 3.8e+02 2.0e+02 1 89 52 17 51 1 89 52 17 61 118 KSPGMRESOrthog 30 1.0 3.6988e-03 1.7 3.22e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 8 0 0 8 0 8 0 0 9 329 KSPSetup 4 1.0 1.2448e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 1.1271e-01 1.0 3.65e+06 1.2 1.0e+03 3.8e+02 2.1e+02 1 94 54 18 54 1 94 54 18 65 120 MeshDestroy 5 1.0 3.2269e-0236.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DistributeMesh 1 1.0 2.0238e-01 1.1 0.00e+00 0.0 2.4e+01 2.3e+03 0.0e+00 2 0 1 3 0 2 0 1 3 0 0 PartitionCreate 2 1.0 4.0964e-0234.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PartitionClosure 2 1.0 8.7453e-024366.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DistributeCoords 2 1.0 4.6407e-02 2.4 0.00e+00 0.0 2.4e+01 3.0e+03 0.0e+00 0 0 1 3 0 0 0 1 3 0 0 DistributeLabels 2 1.0 8.7246e-02 3.1 0.00e+00 0.0 1.8e+01 7.6e+02 0.0e+00 0 0 1 1 0 0 0 1 1 0 0 CreateOverlap 2 1.0 2.5038e-02 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 3 0 0 0 0 3 0 0 DistributeMeshByFineMesh 1 1.0 2.0225e+00 1.0 3.18e+05 0.0 2.4e+01 9.5e+03 0.0e+00 17 2 1 11 0 17 2 1 11 0 0 PartitionByFineMesh 1 1.0 1.2465e+0036561.8 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 CreatCoarseCellToFineCell 1 1.0 1.1892e+0099754.2 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 ConstructInterpolation 1 1.0 1.7860e-01 1.0 7.53e+04 1.2 3.5e+01 6.3e+02 1.8e+01 2 2 2 1 5 2 2 2 1 6 2 creatMapFromFinePointToCoarseCell 1 1.0 8.4537e-02 1.1 6.63e+04 1.2 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 3 MGSetup Level 1 2 1.0 4.3158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 0 MGSmooth Level 1 16 1.0 9.6284e-02 1.0 3.11e+06 1.2 7.6e+02 4.1e+02 1.7e+02 1 80 41 15 45 1 80 41 15 54 120 MGResid Level 1 4 1.0 2.4343e-03 1.1 1.24e+05 1.1 4.0e+01 4.1e+02 0.0e+00 0 3 2 1 0 0 3 2 1 0 191 MGInterp Level 1 16 1.0 9.3703e-03 1.0 2.22e+05 1.2 1.6e+02 2.2e+02 1.6e+01 0 6 9 2 4 0 6 9 2 5 87 ------------------ Fande Kong ShenZhen Institutes of Advanced Technology Chinese Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59a2.org Tue Apr 12 01:43:50 2011 From: jed at 59a2.org (Jed Brown) Date: Tue, 12 Apr 2011 08:43:50 +0200 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner In-Reply-To: References: Message-ID: I think -pc_mg_log does what you want in petsc-dev. $ cd petsc/src/snes/examples/tutorials $ ./ex48 -thi_nlevels 3 -log_summary -pc_mg_log [...] 
MGSetup Level 0 7 1.0 1.2443e-02 1.0 9.84e+04 1.0 0.0e+00 0.0e+00 5.0e+00 4 0 0 0 2 4 0 0 0 4 8 MGSmooth Level 0 78 1.0 9.3937e-04 1.0 1.99e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 212 MGSetup Level 1 7 1.0 4.4374e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 1.5e+01 1 1 0 0 6 1 1 0 0 11 211 MGSmooth Level 1 104 1.0 1.1142e-02 1.0 8.80e+06 1.0 0.0e+00 0.0e+00 1.0e+00 3 14 0 0 0 3 14 0 0 1 789 MGResid Level 1 52 1.0 9.5296e-04 1.0 9.49e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 996 MGInterp Level 1 156 1.0 6.0565e-03 1.0 1.97e+05 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 32 MGSetup Level 2 7 1.0 5.4920e-03 1.0 7.14e+06 1.0 0.0e+00 0.0e+00 1.5e+01 2 11 0 0 6 2 11 0 0 11 1299 MGSmooth Level 2 52 1.0 2.6678e-02 1.0 3.61e+07 1.0 0.0e+00 0.0e+00 1.0e+00 8 56 0 0 0 8 56 0 0 1 1353 MGResid Level 2 26 1.0 2.5156e-03 1.0 3.52e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 1 6 0 0 0 1401 MGInterp Level 2 104 1.0 1.4493e-03 1.0 9.06e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 625 -------------- next part -------------- An HTML attachment was scrubbed... URL: From juhaj at iki.fi Tue Apr 12 05:02:47 2011 From: juhaj at iki.fi (Juha =?iso-8859-1?q?J=E4ykk=E4?=) Date: Tue, 12 Apr 2011 10:02:47 +0000 Subject: [petsc-users] TS problem: runge-kutta gives 0 step-length Message-ID: <201104121102.47347.juhaj@iki.fi> Hi list! I have a small problem with running a TS program with -ts_type runge-kutta. It keeps telling me Very small steps: 0.000000 from the very beginning and never gets anywhere. The programs works fine for other TS types (well, at least euler, beuler, cn and gl). I am out of ideas as to why this happens. I even checked the RK source code. Any ideas? Cheers, Juha From knepley at gmail.com Tue Apr 12 05:29:56 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Apr 2011 05:29:56 -0500 Subject: [petsc-users] TS problem: runge-kutta gives 0 step-length In-Reply-To: <201104121102.47347.juhaj@iki.fi> References: <201104121102.47347.juhaj@iki.fi> Message-ID: On Tue, Apr 12, 2011 at 5:02 AM, Juha J?ykk? wrote: > Hi list! > > I have a small problem with running a TS program with -ts_type runge-kutta. > It > keeps telling me > > Very small steps: 0.000000 > > from the very beginning and never gets anywhere. The programs works fine > for > other TS types (well, at least euler, beuler, cn and gl). > > I am out of ideas as to why this happens. I even checked the RK source > code. > Any ideas? > Yes, the debugger to look at what happens when it chooses the new timestep. This is dependent on parameters you pass in (rk->maxerror, rk->p, ts->max_time). Matt > Cheers, > Juha > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
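A hedged sketch of the settings that feed into that step-size choice, written against the 3.1-era TS interface on an already created ts (ierr declared as usual); TSRKSetTolerance() is the RK-specific call of that era that sets the error tolerance the adaptive step selection uses (rk->maxerror), but treat the exact names and values as assumptions to be checked against your PETSc version:

ierr = TSSetInitialTimeStep(ts,0.0,1.0e-3);CHKERRQ(ierr);  /* start time and first step size */
ierr = TSSetDuration(ts,10000,1.0);CHKERRQ(ierr);          /* maximum steps and final time (ts->max_time) */
ierr = TSRKSetTolerance(ts,1.0e-4);CHKERRQ(ierr);          /* error tolerance used by the adaptive RK */
ierr = TSSetFromOptions(ts);CHKERRQ(ierr);                 /* picks up -ts_type runge-kutta and friends */

Running with -start_in_debugger and breaking in the RK step routine, as Matt suggests, shows which of these quantities is driving the computed step to zero.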
URL: From bsmith at mcs.anl.gov Tue Apr 12 08:40:33 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Apr 2011 08:40:33 -0500 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner In-Reply-To: References: Message-ID: <74C217CD-9FCE-46A5-9F41-86D27AF96D45@mcs.anl.gov> It is right there: MGSetup Level 1 2 1.0 4.3158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 0 MGSmooth Level 1 16 1.0 9.6284e-02 1.0 3.11e+06 1.2 7.6e+02 4.1e+02 1.7e+02 1 80 41 15 45 1 80 41 15 54 120 MGResid Level 1 4 1.0 2.4343e-03 1.1 1.24e+05 1.1 4.0e+01 4.1e+02 0.0e+00 0 3 2 1 0 0 3 2 1 0 191 MGInterp Level 1 16 1.0 9.3703e-03 1.0 2.22e+05 1.2 1.6e+02 2.2e+02 1.6e+01 0 6 9 2 4 0 6 9 2 5 87 perhaps you are only running with one level and hence only getting one level or information. Or perhaps there is a bug/issue and we don't report for the coarsest level. If it is missing a level please send a bug report to petsc-maint at mcs.anl.gov using a PETSc example for example src/ksp/ksp/examples/tutorials/ex22.c Barry On Apr 12, 2011, at 1:22 AM, fdkong wrote: > Hi every one > I uses multigrid preconditioner for my application. Running the code with "Options Database Keys" -pc_mg_log, but can not get time spent on each level of the solver. I want to know time spent on each level respectively. > > > VecMDot 30 1.0 2.8007e-03 2.5 1.61e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 4 0 0 8 0 4 0 0 9 217 > VecNorm 48 1.0 2.3482e-03 2.1 1.07e+05 1.1 0.0e+00 0.0e+00 4.8e+01 0 3 0 0 12 0 3 0 0 15 173 > VecScale 39 1.0 3.2115e-04 1.2 4.36e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 513 > VecCopy 17 1.0 1.6999e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 125 1.0 5.7936e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 17 1.0 2.6035e-04 1.6 3.80e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 552 > VecAYPX 4 1.0 8.7976e-05 1.3 4.47e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 192 > VecMAXPY 38 1.0 7.0500e-04 1.1 2.28e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 1223 > VecAssemblyBegin 3 1.0 4.5705e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 2 0 0 0 0 3 0 > VecAssemblyEnd 3 1.0 4.1962e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 147 1.0 1.8594e-03 1.1 0.00e+00 0.0 1.0e+03 3.8e+02 0.0e+00 0 0 55 19 0 0 0 55 19 0 0 > VecScatterEnd 147 1.0 1.4102e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 34 1.0 1.9138e-03 1.5 1.14e+05 1.1 0.0e+00 0.0e+00 3.4e+01 0 3 0 0 9 0 3 0 0 11 225 > MatMult 47 1.0 2.6152e-02 1.1 1.34e+06 1.1 4.7e+02 4.0e+02 0.0e+00 0 35 25 9 0 0 35 25 9 0 191 > MatMultAdd 4 1.0 2.1584e-03 1.1 5.67e+04 1.2 4.0e+01 2.2e+02 0.0e+00 0 1 2 0 0 0 1 2 0 0 96 > MatMultTranspose 8 1.0 4.4453e-03 1.0 1.13e+05 1.2 8.0e+01 2.2e+02 1.6e+01 0 3 4 1 4 0 3 4 1 5 94 > MatSolve 50 1.0 3.1454e-02 1.0 1.41e+06 1.2 0.0e+00 0.0e+00 0.0e+00 0 36 0 0 0 0 36 0 0 0 164 > MatLUFactorSym 1 1.0 7.4482e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 2 1.0 4.4755e-02 1.0 1.84e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 15 > MatILUFactorSym 1 1.0 1.3239e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MatAssemblyBegin 4 1.0 2.4263e-0243.8 0.00e+00 0.0 4.5e+01 3.1e+03 6.0e+00 0 0 2 7 2 0 0 2 7 2 0 > MatAssemblyEnd 4 1.0 7.4661e-03 1.1 0.00e+00 0.0 6.0e+01 7.7e+01 2.8e+01 0 0 3 0 7 0 0 3 0 9 0 > MatGetRowIJ 1 1.0 3.0994e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 
0 0 > MatGetSubMatrice 1 1.0 3.2248e-03 1.1 0.00e+00 0.0 5.0e+01 1.9e+03 5.0e+00 0 0 3 5 1 0 0 3 5 2 0 > MatGetOrdering 1 1.0 1.2500e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MatIncreaseOvrlp 1 1.0 1.0622e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MatZeroEntries 2 1.0 2.8491e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatView 6 1.0 1.5023e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MeshView 6 1.0 1.0528e+00 1.0 0.00e+00 0.0 9.0e+01 2.9e+03 0.0e+00 9 0 5 12 0 9 0 5 12 0 0 > MeshGetGlobalScatter 3 1.0 1.6958e-02 1.0 0.00e+00 0.0 3.0e+01 8.8e+01 1.8e+01 0 0 2 0 5 0 0 2 0 6 0 > MeshAssembleMatrix 1572 1.0 3.6974e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MeshUpdateOperator 2131 1.0 8.2520e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 2 1 0 0 0 2 0 > SectionRealView 2 1.0 6.4932e-0216.4 0.00e+00 0.0 1.2e+01 4.1e+03 0.0e+00 0 0 1 2 0 0 0 1 2 0 0 > PCSetUp 3 1.0 5.6296e-02 1.0 1.84e+05 1.2 7.0e+01 1.4e+03 3.0e+01 0 5 4 5 8 0 5 4 5 9 12 > PCSetUpOnBlocks 8 1.0 6.5680e-03 1.1 1.84e+05 1.2 0.0e+00 0.0e+00 7.0e+00 0 5 0 0 2 0 5 0 0 2 102 > PCApply 4 1.0 1.0816e-01 1.0 3.45e+06 1.2 9.6e+02 3.8e+02 2.0e+02 1 89 52 17 51 1 89 52 17 61 118 > KSPGMRESOrthog 30 1.0 3.6988e-03 1.7 3.22e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 8 0 0 8 0 8 0 0 9 329 > KSPSetup 4 1.0 1.2448e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 1.1271e-01 1.0 3.65e+06 1.2 1.0e+03 3.8e+02 2.1e+02 1 94 54 18 54 1 94 54 18 65 120 > MeshDestroy 5 1.0 3.2269e-0236.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DistributeMesh 1 1.0 2.0238e-01 1.1 0.00e+00 0.0 2.4e+01 2.3e+03 0.0e+00 2 0 1 3 0 2 0 1 3 0 0 > PartitionCreate 2 1.0 4.0964e-0234.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > PartitionClosure 2 1.0 8.7453e-024366.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DistributeCoords 2 1.0 4.6407e-02 2.4 0.00e+00 0.0 2.4e+01 3.0e+03 0.0e+00 0 0 1 3 0 0 0 1 3 0 0 > DistributeLabels 2 1.0 8.7246e-02 3.1 0.00e+00 0.0 1.8e+01 7.6e+02 0.0e+00 0 0 1 1 0 0 0 1 1 0 0 > CreateOverlap 2 1.0 2.5038e-02 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 3 0 0 0 0 3 0 0 > DistributeMeshByFineMesh 1 1.0 2.0225e+00 1.0 3.18e+05 0.0 2.4e+01 9.5e+03 0.0e+00 17 2 1 11 0 17 2 1 11 0 0 > PartitionByFineMesh 1 1.0 1.2465e+0036561.8 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 > CreatCoarseCellToFineCell 1 1.0 1.1892e+0099754.2 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 > ConstructInterpolation 1 1.0 1.7860e-01 1.0 7.53e+04 1.2 3.5e+01 6.3e+02 1.8e+01 2 2 2 1 5 2 2 2 1 6 2 > creatMapFromFinePointToCoarseCell 1 1.0 8.4537e-02 1.1 6.63e+04 1.2 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 3 > MGSetup Level 1 2 1.0 4.3158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 0 > MGSmooth Level 1 16 1.0 9.6284e-02 1.0 3.11e+06 1.2 7.6e+02 4.1e+02 1.7e+02 1 80 41 15 45 1 80 41 15 54 120 > MGResid Level 1 4 1.0 2.4343e-03 1.1 1.24e+05 1.1 4.0e+01 4.1e+02 0.0e+00 0 3 2 1 0 0 3 2 1 0 191 > MGInterp Level 1 16 1.0 9.3703e-03 1.0 2.22e+05 1.2 1.6e+02 2.2e+02 1.6e+01 0 6 9 2 4 0 6 9 2 5 87 > > ------------------ > Fande Kong > ShenZhen Institutes of Advanced Technology > Chinese Academy of Sciences > From jed at 59A2.org Tue Apr 12 08:45:23 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 12 Apr 2011 15:45:23 +0200 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner In-Reply-To: 
<74C217CD-9FCE-46A5-9F41-86D27AF96D45@mcs.anl.gov> References: <74C217CD-9FCE-46A5-9F41-86D27AF96D45@mcs.anl.gov> Message-ID: On Tue, Apr 12, 2011 at 15:40, Barry Smith wrote: > perhaps you are only running with one level and hence only getting one > level or information. Or perhaps there is a bug/issue and we don't report > for the coarsest level. If it is missing a level please send a bug report to > petsc-maint at mcs.anl.gov using a PETSc example > -pc_mg_log was totally broken in 3.1, I fixed it here changeset: 17148:1ab456826813 user: Jed Brown date: Fri Sep 24 11:39:46 2010 +0200 files: src/ksp/pc/impls/mg/fmg.c src/ksp/pc/impls/mg/mg.c src/ksp/pc/impls/mg/mgimpl.h src/ksp/pc/impls/mg/smg.c description: Make PC_MG logging (-pc_mg_log) log each level separately. The old code clearly intended to do this, but the events were in PC_MG, not PC_MG_Levels so all but the finest was leaked (and time from all levels was attributed to the finest level). http://petsc.cs.iit.edu/petsc/petsc-dev/rev/1ab456826813 -------------- next part -------------- An HTML attachment was scrubbed... URL: From khalid_eee at yahoo.com Tue Apr 12 20:33:23 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Tue, 12 Apr 2011 18:33:23 -0700 (PDT) Subject: [petsc-users] DMMG KSP Solve time with random initialization Message-ID: <831692.28168.qm@web112615.mail.gq1.yahoo.com> Hi, I use the DMMG to solve the Ax=b. At the initialization part, I either assign a predetermined value to the vectors or a random value as shown in the code below. With the same system size, and no of processors, the random initialization takes significantly more time than the predetermined value. I am attaching the laog summary in both cases. Could you please suggest why the time requirement is so huge (specially KSP Solve) in the random initialization and how I can improve it ? Thanks in advance. 
###Code without random value assignment to vectors: u_localptr[k][j][i] = 0.7e-0; v_localptr[k][j][i] = 0.81e-0; w_localptr[k][j][i] = -54e-1; ###Code with random value assignment to vectors: /* PetscRandomCreate(PETSC_COMM_WORLD,&pRandom); PetscRandomSetFromOptions(pRandom); PetscRandomSetType(pRandom,PETSCRAND); PetscRandomSetInterval(pRandom,0.1e-8,1.0e-8); VecSetRandom(u,pRandom); PetscRandomSetInterval(pRandom,-1.e-8,-0.1e-8); VecSetRandom(v,pRandom); //VecSetRandom(w,pRandom); PetscRandomDestroy(pRandom);*/ ###log_summary without random value assignment to vectors: Max Max/Min Avg Total Time (sec): 6.210e-01 1.00071 6.208e-01 Objects: 1.060e+02 1.00000 1.060e+02 Flops: 5.325e+04 1.00000 5.325e+04 1.065e+05 Flops/sec: 8.581e+04 1.00071 8.578e+04 1.716e+05 Memory: 1.412e+06 1.00582 2.815e+06 MPI Messages: 7.600e+01 1.00000 7.600e+01 1.520e+02 MPI Message Lengths: 3.078e+05 1.00000 4.051e+03 6.157e+05 MPI Reductions: 1.250e+02 1.00000 VecView 16 1.0 1.9195e-01 1.0 0.00e+00 0.0 3.6e+01 8.2e+03 7.0e+00 30 0 24 48 6 30 0 24 48 8 0 VecNorm 4 1.0 5.9933e-05 1.4 1.64e+04 1.0 0.0e+00 0.0e+00 4.0e+00 0 31 0 0 3 0 31 0 0 4 547 VecScale 9 1.0 4.0106e-06 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecCopy 12 1.0 5.3243e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 7 1.0 3.0005e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 1.3322e-04 1.3 3.69e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 69 0 0 0 0 69 0 0 0 553 VecScatterBegin 53 1.0 1.1030e-03 1.1 0.00e+00 0.0 7.4e+01 4.1e+03 0.0e+00 0 0 49 49 0 0 0 49 49 0 0 VecScatterEnd 53 1.0 3.6766e-0310.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 2.6521e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 3 0 0 0 0 4 0 MatAssemblyEnd 2 1.0 1.0279e-03 1.0 0.00e+00 0.0 4.0e+00 1.0e+03 1.1e+01 0 0 3 1 9 0 0 3 1 12 0 KSPSetup 2 1.0 5.8801e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 4 1.0 4.1139e-03 1.0 1.64e+04 1.0 0.0e+00 0.0e+00 1.0e+01 1 31 0 0 8 1 31 0 0 11 8 PCSetUp 1 1.0 3.5669e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 1 0 0 0 2 0 ----------------------------------------------------------------------------------------------------------------------- ###log_summary with random initialization to vectors: Time (sec): 4.456e+01 1.00002 4.456e+01 Objects: 1.690e+02 1.00000 1.690e+02 Flops: 1.086e+10 1.00000 1.086e+10 2.172e+10 Flops/sec: 2.437e+08 1.00002 2.437e+08 4.875e+08 Memory: 2.709e+06 1.00302 5.410e+06 MPI Messages: 8.141e+04 1.00000 8.141e+04 1.628e+05 MPI Message Lengths: 3.335e+08 1.00000 4.096e+03 6.669e+08 MPI Reductions: 4.028e+05 1.00000 VecView 16 1.0 2.0461e-01 1.0 0.00e+00 0.0 3.6e+01 8.2e+03 7.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 80000 1.0 2.7737e+00 1.0 2.70e+09 1.0 0.0e+00 0.0e+00 8.0e+04 6 25 0 0 20 6 25 0 0 20 1948 VecNorm 121336 1.0 3.8669e+00 1.0 4.97e+08 1.0 0.0e+00 0.0e+00 1.2e+05 9 5 0 0 30 9 5 0 0 30 257 VecScale 121345 1.0 8.6525e-01 1.0 2.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 574 VecCopy 40012 1.0 9.5324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 201343 1.0 3.3050e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 41345 1.0 4.0391e-01 1.0 1.69e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 839 VecWAXPY 1336 1.0 1.5288e-02 1.0 2.74e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 358 VecMAXPY 121336 1.0 4.6148e+00 1.0 3.03e+09 1.0 0.0e+00 0.0e+00 0.0e+00 10 28 0 0 0 10 28 0 0 0 1313 VecScatterBegin 
81389 1.0 6.2763e-01 1.0 0.00e+00 0.0 1.6e+05 4.1e+03 0.0e+00 1 0100100 0 1 0100100 0 0 VecScatterEnd 81389 1.0 7.0998e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSetRandom 2 1.0 2.7497e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 80000 1.0 3.2196e+00 1.0 4.92e+08 1.0 0.0e+00 0.0e+00 8.0e+04 7 5 0 0 20 7 5 0 0 20 305 MatMult 81336 1.0 1.8218e+01 1.0 2.17e+09 1.0 1.6e+05 4.1e+03 0.0e+00 41 20100100 0 41 20100100 0 238 MatSolve 80000 1.0 9.1123e+00 1.0 2.05e+09 1.0 0.0e+00 0.0e+00 0.0e+00 20 19 0 0 0 20 19 0 0 0 450 MatLUFactorNum 1 1.0 7.6804e-04 1.0 3.74e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 97 MatILUFactorSym 1 1.0 7.0408e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 5.7212e-04 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 1.1452e-03 1.0 0.00e+00 0.0 4.0e+00 1.0e+03 1.1e+01 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 1.0453e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 5.4405e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 80000 1.0 6.8689e+00 1.0 5.40e+09 1.0 0.0e+00 0.0e+00 8.0e+04 15 50 0 0 20 15 50 0 0 20 1573 KSPSetup 3 1.0 5.9501e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 4 1.0 4.3836e+01 1.0 1.09e+10 1.0 1.6e+05 4.1e+03 4.0e+05 98100100100100 98100100100100 496 PCSetUp 2 1.0 5.8231e-03 1.0 3.74e+04 1.0 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 0 0 0 0 0 0 13 PCSetUpOnBlocks 40000 1.0 2.4762e-02 1.0 3.74e+04 1.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 0 3 PCApply 40000 1.0 2.6536e+01 1.0 4.26e+09 1.0 8.0e+04 4.1e+03 3.2e+05 60 39 49 49 79 60 39 49 49 79 321 ------------------------------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 13 03:43:26 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 10:43:26 +0200 Subject: [petsc-users] DMMG KSP Solve time with random initialization In-Reply-To: <831692.28168.qm@web112615.mail.gq1.yahoo.com> References: <831692.28168.qm@web112615.mail.gq1.yahoo.com> Message-ID: On Wed, Apr 13, 2011 at 03:33, khalid ashraf wrote: > I use the DMMG to solve the Ax=b. At the initialization part, I either > assign a predetermined value to the vectors or a random value as shown in > the code below. With the same system size, and no of processors, the random > initialization takes significantly more time than the predetermined value. I > am attaching the laog summary in both cases. Could you please suggest why > the time requirement is so huge (specially KSP Solve) in the random > initialization and how I can improve it ? The solve with constant initial state does zero iterations, the solve with random initial state does not converge. You haven't explained what you are solving or what calling sequence you are using, so I'm just going to take a wild guess what's happening. Your matrix is singular, probably because you didn't include boundary conditions, and the right hand side vector is zero. The constant solution is in the null space, therefore the residual is zero to begin with so no iterations are ever done. The null space never gets projected out of the random vector, therefore nothing ever converges and it takes a long time. -------------- next part -------------- An HTML attachment was scrubbed... 
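In code, attaching the null space Jed describes takes only a few lines. A minimal sketch, assuming the operator really is singular with only the constant vector in its null space; ksp is a placeholder for whatever KSP object performs the solve (with DMMG the same thing can be done through DMMGSetNullSpace()), and the calls are written with the petsc-3.1 signatures:

   MatNullSpace nullsp;
   ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&nullsp);CHKERRQ(ierr);
   ierr = KSPSetNullSpace(ksp,nullsp);CHKERRQ(ierr);   /* later releases attach the null space to the Mat instead */
   ierr = MatNullSpaceDestroy(nullsp);CHKERRQ(ierr);   /* later releases take &nullsp */

With the constant component projected out at every iteration, the solve from a random initial guess can actually converge instead of iterating indefinitely.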
URL: From domenico.borzacchiello at univ-st-etienne.fr Wed Apr 13 04:08:25 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Wed, 13 Apr 2011 11:08:25 +0200 (CEST) Subject: [petsc-users] Matrix Sparsity Message-ID: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Hi, I'm using the DMMG interface for my code so I'm not directly calling the matrix assembly routines. I tried to retrieve the matrices through the commands + . DMMGGetSNES(dmmg); . SNESGetKSP(snes,&ksp); . KSPGetOperators(ksp,&Amat,&Pmat,&flag); + ,saved them (in Matlab format) and I noticed that a large number of zeros entries were also saved. If I run with -info i get [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 500 How can I make sure that the correct sparsity patter is formed while still using the DMMG interface? The same thing happens either if I use the FD approximation of the jacobian or a specific FormJacobianFunction routine. Thank you, Domenico. From jed at 59A2.org Wed Apr 13 04:23:22 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 11:23:22 +0200 Subject: [petsc-users] Matrix Sparsity In-Reply-To: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> References: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Wed, Apr 13, 2011 at 11:08, wrote: > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 500 > What discretization are you using? > > How can I make sure that the correct sparsity patter is formed while still > using the DMMG interface? The same thing happens either if I use the FD > approximation of the jacobian or a specific FormJacobianFunction routine. > Did you the stencil width and shape (box or star) correctly? -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Wed Apr 13 05:02:31 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Wed, 13 Apr 2011 12:02:31 +0200 (CEST) Subject: [petsc-users] Matrix Sparsity In-Reply-To: References: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Message-ID: > On Wed, Apr 13, 2011 at 11:08, > wrote: > >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 500 >> > > What discretization are you using? > > I'm still testing the code over small grids so I'm using a 9x9x9 grid (1 lev) at present time. >> >> How can I make sure that the correct sparsity patter is formed while >> still >> using the DMMG interface? The same thing happens either if I use the FD >> approximation of the jacobian or a specific FormJacobianFunction >> routine. >> > > Did you the stencil width and shape (box or star) correctly? > I checked it and noticed that I was using an unnecessary stencil width of 2 (Box) and having 4 DOFs gave me a 4x5^3 = 500 non zero entries per row. But even if I set it to 1 and Star it'll result in 4x7 = 28 nnz while I need 17 at most. This means my matrices will always be double the needed size (roughly). How can I control this? 
From jed at 59A2.org Wed Apr 13 05:11:39 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 12:11:39 +0200 Subject: [petsc-users] Matrix Sparsity In-Reply-To: References: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Wed, Apr 13, 2011 at 12:02, wrote: > I checked it and noticed that I was using an unnecessary stencil width of > 2 (Box) and having 4 DOFs gave me a 4x5^3 = 500 non zero entries per row. > But even if I set it to 1 and Star it'll result in 4x7 = 28 nnz while I > need 17 at most. This means my matrices will always be double the needed > size (roughly). How can I control this? > You can use http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/DM/DMDASetBlockFills.html (named DASetBlockFills in petsc-3.1) with the AIJ matrix format. But the storage costs are quite similar if you store those extra few nonzeros and use the BAIJ format, and then you benefit from faster sparse matrix kernels so the actual run time could be less than using the less regular nonzero structure in AIJ. Also, BAIJ smooths all the components together which makes the smoother stronger, thus you may converge in fewer iterations. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaurish108 at gmail.com Wed Apr 13 10:32:41 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Wed, 13 Apr 2011 11:32:41 -0400 Subject: [petsc-users] format specifiers Message-ID: What are the format specifiers for data types PetscScalar and PetscScalar? Gaurish -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 13 10:40:17 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 17:40:17 +0200 Subject: [petsc-users] format specifiers In-Reply-To: References: Message-ID: On Wed, Apr 13, 2011 at 17:32, Gaurish Telang wrote: > What are the format specifiers for data types PetscScalar and PetscScalar? > Unfortunately there is no specifier. You can use PetscPrintf(comm,"%G + %Gi\n",PetscRealPart(v),PetscImaginaryPart(v)); -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaurish108 at gmail.com Wed Apr 13 12:24:10 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Wed, 13 Apr 2011 13:24:10 -0400 Subject: [petsc-users] format specifiers In-Reply-To: References: Message-ID: Hmm, %f seems to be working fine for these data types. Is there any harm in using it though? On Wed, Apr 13, 2011 at 11:32 AM, Gaurish Telang wrote: > What are the format specifiers for data types PetscScalar and PetscScalar? > > > Gaurish > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 13 12:59:13 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 19:59:13 +0200 Subject: [petsc-users] format specifiers In-Reply-To: References: Message-ID: On Wed, Apr 13, 2011 at 19:24, Gaurish Telang wrote: > Hmm, %f seems to be working fine for these data types. Is there any harm in > using it though? If you are referring to just using plain "%f" (or any single % specifier) to show a PetscScalar, that's a matter of whether you have configured so that PetscScalar is real or compex valued. It will not work when you use complex. If you meant using %f instead of %G in my example above, that is a matter of precision: PetscPrintf converts %G (but not currently variants like %12.5G) to a representation that works for any choice of precision. 
For example, '%g' for double and float [1], %Lg for long double, %Qe for __float128. Similarly, %[1-9]D is converted to %d or %lld depending the use of 64-bit indices. If you only ever use double and native ints (usually 32-bit), then you don't have to worry about these conversions and you can use whatever you want. [1] The standard specifies that float is promoted to double when calling a function with no prototype or a variadic function. The same rule promotes char and short int to int. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Wed Apr 13 22:48:23 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Wed, 13 Apr 2011 22:48:23 -0500 Subject: [petsc-users] sparse matrix addition Message-ID: Hi, I have two matrices A, B of different nonzero-pattern. Their size is about 60k*60k. I notices that MATAXPY() is extremely slow. However, In matlab, addition of the same two matrices is done in no time. Why is so? Any strategies to speed up the sparse matrices additions? Thanks Shiyuan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 14 03:08:45 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 10:08:45 +0200 Subject: [petsc-users] sparse matrix addition In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 05:48, Shiyuan wrote: > I have two matrices A, B of different nonzero-pattern. Their size is about > 60k*60k. I notices that MATAXPY() is extremely slow. What matrix format and what are you passing for MatStructure? With DIFFERENT_NONZERO_STRUCTURE, petsc-3.1 was not doing preallocation. This is fixed in petsc-dev. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Apr 14 07:46:20 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Apr 2011 07:46:20 -0500 Subject: [petsc-users] sparse matrix addition In-Reply-To: References: Message-ID: On Apr 14, 2011, at 3:08 AM, Jed Brown wrote: > On Thu, Apr 14, 2011 at 05:48, Shiyuan wrote: > I have two matrices A, B of different nonzero-pattern. Their size is about 60k*60k. I notices that MATAXPY() is extremely slow. > > What matrix format and what are you passing for MatStructure? With DIFFERENT_NONZERO_STRUCTURE, petsc-3.1 was not doing preallocation. This is fixed in petsc-dev. In other words, switch to the development version of PETSc http://www.mcs.anl.gov/petsc/petsc-as/developers/index.html and it should be much faster. If it is not much faster than please send mail to petsc-maint at mcs.anl.gov with details of the matrix type AIJ? and ideally sample code and we'll see why it is so slow. Barry From thomas.witkowski at tu-dresden.de Thu Apr 14 08:18:56 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 14 Apr 2011 15:18:56 +0200 Subject: [petsc-users] FETI-DP Message-ID: <4DA6F440.4000204@tu-dresden.de> Has anybody of you implemented the FETI-DP method in PETSc? I think about to do this for my FEM code, but first I want to evaluate the effort of the implementation. So if some of you could give some comments on it or if there is some code I could reuse, I would be thankful for a short answer! 
Thomas From jed at 59A2.org Thu Apr 14 09:19:14 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 16:19:14 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: <4DA6F440.4000204@tu-dresden.de> References: <4DA6F440.4000204@tu-dresden.de> Message-ID: On Thu, Apr 14, 2011 at 15:18, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Has anybody of you implemented the FETI-DP method in PETSc? I think about > to do this for my FEM code, but first I want to evaluate the effort of the > implementation. There are a few implementations out there. Probably most notable is Axel Klawonn and Oliver Rheinbach's implementation which has been scaled up to very large problems and computers. My understanding is that Xuemin Tu did some work on BDDC (equivalent to FETI-DP) using PETSc. I am not aware of anyone releasing a working FETI-DP implementation using PETSc, but of course you're welcome to ask these people if they would share code with you. What sort of problems do you want it for (physics and mesh)? How are you currently assembling your systems? A fully general FETI-DP implementation is a lot of work. For a specific class of problems and variant of FETI-DP, it will still take some effort, but should not be too much. There was a start to a FETI-DP implementation in PETSc quite a while ago, but it died due to bitrot and different ideas of how we would like to implement. You can get that code from mercurial: http://petsc.cs.iit.edu/petsc/petsc-dev/rev/021f379b5eea The fundamental ingredient of these methods is a "partially assembled" matrix. For a library implementation, the challenges are 1. How does the user provide the information necessary to decide what the coarse space looks like? (It's different for scalar problems, compressible elasticity, and Stokes, and tricky to do with no geometric information from the user.) The coefficient structure in the problem matters a lot when deciding which coarse basis functions to use, see http://dx.doi.org/10.1016/j.cma.2006.03.023 2. How do you handle primal basis functions with large support (e.g. rigid body modes of a face)? Two choices here: http://www.cs.nyu.edu/cs/faculty/widlund/FETI-DP-elasticity_TR.pdf . 3. How do you make it easy for the user to provide the required matrix? Ideally, the user would just use plain MatSetValuesLocal() and run with -mat_type partially-assembled -pc_type fetidp instead of, say -mat_type baij -pc_type asm. It should work for multiple subdomains per process and subdomains spanning multiple processes. This can now be done by implementing MatGetLocalSubMatrix(). The local blocks of the partially assembled system should be able to use different formats (e.g. SBAIJ). 4. How do you handle more than two levels? This is very important to use more than about 1000 subdomains in 3D because the coarse problem just gets too big (unless the coarse problem happens to be well-conditioned enough that you can use algebraic multigrid). I've wanted to implement FETI-DP in PETSc for almost two years, but it's never been a high priority. I think I now know how to get enough flexibility to make it worthwhile to me. I'd be happy to discuss implementation issues with you. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From desire.nuentsa_wakam at inria.fr Thu Apr 14 09:39:43 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Thu, 14 Apr 2011 16:39:43 +0200 Subject: [petsc-users] -info filename Message-ID: <4DA7072F.6020102@inria.fr> Hi, help on -info says : *-info : print informative messages about the calculations* but the optional filename expected is actually a logical value. Is this a known behaviour ?? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 14 09:51:09 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 16:51:09 +0200 Subject: [petsc-users] -info filename In-Reply-To: <4DA7072F.6020102@inria.fr> References: <4DA7072F.6020102@inria.fr> Message-ID: On Thu, Apr 14, 2011 at 16:39, Desire NUENTSA WAKAM < desire.nuentsa_wakam at inria.fr> wrote: > *-info : print informative messages about the > calculations* > but the optional filename expected is actually a logical value. > Is this a known behaviour ?? $ cd petsc-3.1/src/ksp/ksp/examples/tutorials/ $ make ex2 $ ./ex2 -info info.log $ wc info.log.0 49 355 3240 info.log.0 What do you expect? -------------- next part -------------- An HTML attachment was scrubbed... URL: From desire.nuentsa_wakam at inria.fr Thu Apr 14 10:40:33 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Thu, 14 Apr 2011 17:40:33 +0200 Subject: [petsc-users] -info filename In-Reply-To: References: <4DA7072F.6020102@inria.fr> Message-ID: <4DA71571.8070102@inria.fr> Sorry, it has surely been corrected in current releases. I have this in 3.1.p5 %./ex2 -info info.log [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Unknown logical value: info.log! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 5 On 04/14/2011 04:51 PM, Jed Brown wrote: > On Thu, Apr 14, 2011 at 16:39, Desire NUENTSA WAKAM > > > wrote: > > *-info : print informative messages about the > calculations* > but the optional filename expected is actually a logical value. > Is this a known behaviour ?? > > > $ cd petsc-3.1/src/ksp/ksp/examples/tutorials/ > $ make ex2 > $ ./ex2 -info info.log > $ wc info.log.0 > 49 355 3240 info.log.0 > > What do you expect? -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Apr 14 10:56:00 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 14 Apr 2011 10:56:00 -0500 (CDT) Subject: [petsc-users] -info filename In-Reply-To: <4DA71571.8070102@inria.fr> References: <4DA7072F.6020102@inria.fr> <4DA71571.8070102@inria.fr> Message-ID: yes - this is fixed in one of the post- 3.1.p5 patches [so upgrading to 3.1.p8 should get rid of this problem] satish On Thu, 14 Apr 2011, Desire NUENTSA WAKAM wrote: > Sorry, it has surely been corrected in current releases. > I have this in 3.1.p5 > %./ex2 -info info.log > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Unknown logical value: info.log! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 5 > > > On 04/14/2011 04:51 PM, Jed Brown wrote: > > On Thu, Apr 14, 2011 at 16:39, Desire NUENTSA WAKAM > > > > > wrote: > > > > *-info : print informative messages about the > > calculations* > > but the optional filename expected is actually a logical value. > > Is this a known behaviour ?? > > > > > > $ cd petsc-3.1/src/ksp/ksp/examples/tutorials/ > > $ make ex2 > > $ ./ex2 -info info.log > > $ wc info.log.0 > > 49 355 3240 info.log.0 > > > > What do you expect? > From f.denner09 at imperial.ac.uk Thu Apr 14 12:20:48 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 18:20:48 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: Hi, I have a question concerning pre-conditioner/solver pairs for CFD. I'm using the BiCGStab solver with a Jacobi pre-conditioner at present to perform parallel simulations of fluid flow on unstructured grids. It works, however, for large meshes (>100k elements) the solver doesn't scale very well in terms of necessary iterations to reach a certain tolerance. Does anybody have experience on which pre-conditioner works best for parallel CFD simulations using the BiCGStab solver? How is the convergence and stability of the multigrid solver compared to BiCGStab? Best regards, Fabian From jed at 59A2.org Thu Apr 14 12:28:54 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 19:28:54 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 19:20, Denner, Fabian wrote: > I have a question concerning pre-conditioner/solver pairs for CFD. I'm > using the BiCGStab solver with a Jacobi pre-conditioner at present to > perform parallel simulations of fluid flow on unstructured grids. > There are lots of methods for CFD. Maybe you could be more specific about what you're solving (laminar, RANS, LES, DNS; compressible?; fully implicit, v-p split implicit, explicit v/implicit p). The mesh quality is also relevant. Do you have aspect ratio 10^6 elements as for wall-resolved LES? > It works, however, for large meshes (>100k elements) the solver doesn't > scale very well in terms of necessary iterations to reach a certain > tolerance. > Does anybody have experience on which pre-conditioner works best for > parallel CFD simulations using the BiCGStab solver? How is the convergence > and stability of the multigrid solver compared to BiCGStab? > 1/(\Delta x) iterations versus 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.denner09 at imperial.ac.uk Thu Apr 14 13:05:21 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 19:05:21 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: , Message-ID: I solve fully implicit, incompressible DNS of low/moderate Reynolds number (Re = 1 - 1000) flows. The hexahedral meshes to test the code have an aspect ration of 1, the tetrahedral meshes are Voronoi-based, do not have high aspect ratios and a good quality. 
________________________________________ From: five9a2 at gmail.com [five9a2 at gmail.com] On Behalf Of Jed Brown [jed at 59A2.org] Sent: 14 April 2011 18:28 To: PETSc users list Cc: Denner, Fabian Subject: Re: [petsc-users] Pre-conditioner for parallel CFD simulation On Thu, Apr 14, 2011 at 19:20, Denner, Fabian > wrote: I have a question concerning pre-conditioner/solver pairs for CFD. I'm using the BiCGStab solver with a Jacobi pre-conditioner at present to perform parallel simulations of fluid flow on unstructured grids. There are lots of methods for CFD. Maybe you could be more specific about what you're solving (laminar, RANS, LES, DNS; compressible?; fully implicit, v-p split implicit, explicit v/implicit p). The mesh quality is also relevant. Do you have aspect ratio 10^6 elements as for wall-resolved LES? It works, however, for large meshes (>100k elements) the solver doesn't scale very well in terms of necessary iterations to reach a certain tolerance. Does anybody have experience on which pre-conditioner works best for parallel CFD simulations using the BiCGStab solver? How is the convergence and stability of the multigrid solver compared to BiCGStab? 1/(\Delta x) iterations versus 1 From jed at 59A2.org Thu Apr 14 13:16:19 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 20:16:19 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 20:05, Denner, Fabian wrote: > I solve fully implicit, incompressible DNS of low/moderate Reynolds number > (Re = 1 - 1000) flows. The hexahedral meshes to test the code have an aspect > ration of 1, the tetrahedral meshes are Voronoi-based, do not have high > aspect ratios and a good quality. Finite element? Inf-sup stable or stabilized? Continuous or discontinuous pressure? This email from last week is relevant http://lists.mcs.anl.gov/pipermail/petsc-users/2011-April/008475.html Coupled multigrid takes some work, you can do algebraic multigrid with PCFieldSplit with relatively little effort. What CFL are you running at? -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.denner09 at imperial.ac.uk Thu Apr 14 13:22:36 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 19:22:36 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: , Message-ID: It's Finite Volume, co-located grid arrangement, stabilized, with a continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). Which pre-conditioner would you recommend with CG solvers (BiCGStab) for that sort of problem? ________________________________________ From: five9a2 at gmail.com [five9a2 at gmail.com] On Behalf Of Jed Brown [jed at 59A2.org] Sent: 14 April 2011 19:16 To: Denner, Fabian Cc: PETSc users list Subject: Re: [petsc-users] Pre-conditioner for parallel CFD simulation On Thu, Apr 14, 2011 at 20:05, Denner, Fabian > wrote: I solve fully implicit, incompressible DNS of low/moderate Reynolds number (Re = 1 - 1000) flows. The hexahedral meshes to test the code have an aspect ration of 1, the tetrahedral meshes are Voronoi-based, do not have high aspect ratios and a good quality. Finite element? Inf-sup stable or stabilized? Continuous or discontinuous pressure? This email from last week is relevant http://lists.mcs.anl.gov/pipermail/petsc-users/2011-April/008475.html Coupled multigrid takes some work, you can do algebraic multigrid with PCFieldSplit with relatively little effort. 
What CFL are you running at? From jed at 59A2.org Thu Apr 14 13:49:18 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 20:49:18 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 20:22, Denner, Fabian wrote: > It's Finite Volume, co-located grid arrangement, stabilized, with a > continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). > Much easier, it is likely that a relatively standard coupled multigrid will work. If you order unknowns so they are interlaced (u0,v0,w0,p0,u1,v1,...) and MatSetBlockSize(A,4) and/or use the BAIJ format, you stand a good chance with -pc_type ml and reasonable smoothers. Or you may have access to a geometric hierarchy? Preconditioning with SIMPLE, or (stronger) using SIMPLE as a smoother on multigrid levels should work well. With the CFL number so low (and even lower on coarse levels), you can also skip the SIMPLE procedure and just use the "pressure Poisson" operator from the usual semi-implicit method as a preconditioner (or as a smoother for coupled multigrid, with Jacobi applied to the velocity part). Any of these variants should converge in a small number of iterations independent of resolution. The following does not do any coupled multigrid (which should converge faster, but is more expensive per V-cycle), but should give you a good methods intro. All these algorithms are straightforward to implement using PCFieldSplit. http://dx.doi.org/10.1016/S0021-9991(03)00121-9 -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.denner09 at imperial.ac.uk Thu Apr 14 14:01:34 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 20:01:34 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: , Message-ID: Thanks Jed, I'll have a look on it and see if it works. Best regards, Fabian. ________________________________________ From: five9a2 at gmail.com [five9a2 at gmail.com] On Behalf Of Jed Brown [jed at 59A2.org] Sent: 14 April 2011 19:49 To: Denner, Fabian Cc: PETSc users list Subject: Re: [petsc-users] Pre-conditioner for parallel CFD simulation On Thu, Apr 14, 2011 at 20:22, Denner, Fabian > wrote: It's Finite Volume, co-located grid arrangement, stabilized, with a continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). Much easier, it is likely that a relatively standard coupled multigrid will work. If you order unknowns so they are interlaced (u0,v0,w0,p0,u1,v1,...) and MatSetBlockSize(A,4) and/or use the BAIJ format, you stand a good chance with -pc_type ml and reasonable smoothers. Or you may have access to a geometric hierarchy? Preconditioning with SIMPLE, or (stronger) using SIMPLE as a smoother on multigrid levels should work well. With the CFL number so low (and even lower on coarse levels), you can also skip the SIMPLE procedure and just use the "pressure Poisson" operator from the usual semi-implicit method as a preconditioner (or as a smoother for coupled multigrid, with Jacobi applied to the velocity part). Any of these variants should converge in a small number of iterations independent of resolution. The following does not do any coupled multigrid (which should converge faster, but is more expensive per V-cycle), but should give you a good methods intro. All these algorithms are straightforward to implement using PCFieldSplit. 
http://dx.doi.org/10.1016/S0021-9991(03)00121-9 From khalid_eee at yahoo.com Fri Apr 15 04:28:10 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Fri, 15 Apr 2011 02:28:10 -0700 (PDT) Subject: [petsc-users] DMMG with PBC Message-ID: <442878.22477.qm@web112607.mail.gq1.yahoo.com> Hi, I am running src/ksp/ksp/examples/tutorials/ex22.c I matched the output of single processor and multiprocessor results and it works fine. But I want to use a periodic boundary condition. I make the following changes in the main function and this works fine with this change as well: ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); ierr = DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,2,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); However, when I comment out these following lines since I am using a PBC, then the result of 1 proc and multi-proc are not the same. They vary within 5 decimal points and the difference increases with increasing number of processors. /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); ierr = MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); } else */ Could you please tell me what is going wrong here. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Apr 15 05:06:38 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:06:38 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: <4DA818AE.80900@tu-dresden.de> Jed Brown wrote: > On Thu, Apr 14, 2011 at 20:22, Denner, Fabian > > wrote: > > It's Finite Volume, co-located grid arrangement, stabilized, with > a continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). > > > Much easier, it is likely that a relatively standard coupled multigrid > will work. If you order unknowns so they are interlaced > (u0,v0,w0,p0,u1,v1,...) and MatSetBlockSize(A,4) and/or use the BAIJ > format, you stand a good chance with -pc_type ml and reasonable > smoothers. Or you may have access to a geometric hierarchy? Which package must be installed to make use of "ml" (algebraic multigrid?) as a preconditioner? Thomas From jed at 59A2.org Fri Apr 15 05:08:09 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 15 Apr 2011 12:08:09 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: <4DA818AE.80900@tu-dresden.de> References: <4DA818AE.80900@tu-dresden.de> Message-ID: On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Which package must be installed to make use of "ml" (algebraic multigrid?) > as a preconditioner? ML, --download-ml also -pc_type hypre (BoomerAMG is default) from, you guessed it, Hypre, --download-hypre -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Apr 15 05:38:56 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:38:56 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: <4DA818AE.80900@tu-dresden.de> Message-ID: <4DA82040.3010100@tu-dresden.de> Jed Brown wrote: > On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski > > wrote: > > Which package must be installed to make use of "ml" (algebraic > multigrid?) as a preconditioner? 
> > > ML, --download-ml > also -pc_type hypre (BoomerAMG is default) from, you guessed it, > Hypre, --download-hypre I tried it, but when I make use of BAIJ matrix format, as you have proposed, I get the following error: Invalid matrix type for ML. ML can only handle AIJ matrices.! So can I make use of algebraic multigrid on block size matrices with this or one of the other packages? Thomas From thomas.witkowski at tu-dresden.de Fri Apr 15 05:40:37 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:40:37 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: References: <4DA6F440.4000204@tu-dresden.de> Message-ID: <4DA820A5.8020705@tu-dresden.de> Jed Brown wrote: > On Thu, Apr 14, 2011 at 15:18, Thomas Witkowski > > wrote: > > Has anybody of you implemented the FETI-DP method in PETSc? I > think about to do this for my FEM code, but first I want to > evaluate the effort of the implementation. > > > There are a few implementations out there. Probably most notable is > Axel Klawonn and Oliver Rheinbach's implementation which has been > scaled up to very large problems and computers. My understanding is > that Xuemin Tu did some work on BDDC (equivalent to FETI-DP) using > PETSc. I am not aware of anyone releasing a working FETI-DP > implementation using PETSc, but of course you're welcome to ask these > people if they would share code with you. I know the works of Klawonn and Rheinbach, but was not aware that they have implemented their algorithms with PETSc. > > What sort of problems do you want it for (physics and mesh)? How are > you currently assembling your systems? A fully general FETI-DP > implementation is a lot of work. For a specific class of problems and > variant of FETI-DP, it will still take some effort, but should not be > too much. My work is on a very general finite element toolbox (AMDiS) that solves a broad class of PDEs. The code is already parallelized, i.e., we have real distributed 2D (triangles) and 3D (tetrahedrons) adaptive meshes, mesh partitioning for load balancing with ParMETiS and Zoltan and a PETSc interface. For PETSc, there are two different modes at the moment. Either a so called global matrix solver or a Schur complement approach. The first one assembles one global parallel matrix, which we make most use of for using MUMPs or SuperLU on small and mid size problems. I would like to implement a broad class of different domain decomposition approaches into AMDiS, so that the user can make use of the method that is most appropriate for the problem. > > There was a start to a FETI-DP implementation in PETSc quite a while > ago, but it died due to bitrot and different ideas of how we would > like to implement. You can get that code from mercurial: > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/021f379b5eea Okay, good to know! I will have a look on it, may be I can extract some ideas for my own implementation. > > > The fundamental ingredient of these methods is a "partially assembled" > matrix. For a library implementation, the challenges are What do you mean by "partially assembled"? Do you mean that only a subset of the subdomain nodes must be assembled to a parallel distributed matrix and the most one can be put to a local matrix? > > 1. How does the user provide the information necessary to decide what > the coarse space looks like? (It's different for scalar problems, > compressible elasticity, and Stokes, and tricky to do with no > geometric information from the user.) 
The coefficient structure in the > problem matters a lot when deciding which coarse basis functions to > use, see http://dx.doi.org/10.1016/j.cma.2006.03.023 Do you think that this is really possible without providing at least some geometric information? At least in my code I can provide arbitrary geometrical information about the nodes to other libraries on very low computation costs. > > 2. How do you handle primal basis functions with large support (e.g. > rigid body modes of a face)? Two choices here: > http://www.cs.nyu.edu/cs/faculty/widlund/FETI-DP-elasticity_TR.pdf . > > 3. How do you make it easy for the user to provide the required > matrix? Ideally, the user would just use plain MatSetValuesLocal() and > run with -mat_type partially-assembled -pc_type fetidp instead of, say > -mat_type baij -pc_type asm. It should work for multiple subdomains > per process and subdomains spanning multiple processes. This can now > be done by implementing MatGetLocalSubMatrix(). The local blocks of > the partially assembled system should be able to use different formats > (e.g. SBAIJ). I like this idea, but it's somehow the same as with PCFieldSplit. To make use of it, I have to provide at least the splits, before I can run this preconditioner. This will be same for FETI-DP. Somehow the user will need to specify the coarse space. To make this in a general way is a very challenging task, from my point of view. > > 4. How do you handle more than two levels? This is very important to > use more than about 1000 subdomains in 3D because the coarse problem > just gets too big (unless the coarse problem happens to be > well-conditioned enough that you can use algebraic multigrid). Good question. Eventually, me code should run on definitely more then 1000 nodes in 3D. We have some PDE's which we would like to run on O(10^5) nodes (phase field crystal equation, which is a 6th order nonlinear parabolic PDE). > > I've wanted to implement FETI-DP in PETSc for almost two years, but > it's never been a high priority. I think I now know how to get enough > flexibility to make it worthwhile to me. I'd be happy to discuss > implementation issues with you. To implement FETI-DP in PETSc in a general way is very challenging but would be a feature of interest for most people how want to run their codes on real large number of nodes. If there are already some guys who have implemented it in PETSc, it would be the best to contact them to discuss these things. Thomas From thomas.witkowski at tu-dresden.de Fri Apr 15 05:51:24 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:51:24 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: <4DA818AE.80900@tu-dresden.de> Message-ID: <4DA8232C.6070007@tu-dresden.de> Jed Brown wrote: > On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski > > wrote: > > Which package must be installed to make use of "ml" (algebraic > multigrid?) as a preconditioner? > > > ML, --download-ml > also -pc_type hypre (BoomerAMG is default) from, you guessed it, > Hypre, --download-hypre But it works with MatAIJ with MatSetBlockSize(x). What is the difference between using MatBAIJ and MatAIJ with setting the block size directly? 
From knepley at gmail.com Fri Apr 15 06:34:18 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Apr 2011 06:34:18 -0500 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: <4DA8232C.6070007@tu-dresden.de> References: <4DA818AE.80900@tu-dresden.de> <4DA8232C.6070007@tu-dresden.de> Message-ID: On Fri, Apr 15, 2011 at 5:51 AM, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Jed Brown wrote: > > On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski < >> thomas.witkowski at tu-dresden.de > >> wrote: >> >> Which package must be installed to make use of "ml" (algebraic >> multigrid?) as a preconditioner? >> >> >> ML, --download-ml >> also -pc_type hypre (BoomerAMG is default) from, you guessed it, Hypre, >> --download-hypre >> > But it works with MatAIJ with MatSetBlockSize(x). What is the difference > between using MatBAIJ and MatAIJ with setting the block size directly? > BAIJ changes the internal storage format to make things run faster. ML does not handle this. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Apr 15 06:38:25 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Apr 2011 06:38:25 -0500 Subject: [petsc-users] DMMG with PBC In-Reply-To: <442878.22477.qm@web112607.mail.gq1.yahoo.com> References: <442878.22477.qm@web112607.mail.gq1.yahoo.com> Message-ID: On Fri, Apr 15, 2011 at 4:28 AM, khalid ashraf wrote: > Hi, > I am running src/ksp/ksp/examples/tutorials/ex22.c > I matched the output of single processor and multiprocessor results and it > works fine. > But I want to use a periodic boundary condition. I make the following > changes in the main function and this works fine with this change as well: > > ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); > ierr = > DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,2,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); > > > However, when I comment out these following lines since I am using a PBC, > then the result of 1 proc and multi-proc are not the same. They vary within > 5 decimal points and the difference increases with increasing number of > processors. > The periodic operator has a null space. You must put that in the solver DMMGSetNullSpace(), so that it is projected out at each step. Matt > /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = > MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else */ > > Could you please tell me what is going wrong here. > > Thanks. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
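A minimal sketch of Matt's suggestion, following the DMMG setup from the original post; the argument list shown is the petsc-3.1 one, so check the DMMGSetNullSpace() manual page for your version:

   /* the periodic operator is singular: the constant vector lies in its
      null space, so ask DMMG to project it out during the solve */
   ierr = DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL);CHKERRQ(ierr);

Once the null space is projected out, the serial and parallel runs should agree to within the solver tolerance rather than drifting further apart as more processes are added.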
URL: From Debao.Shao at brion.com Fri Apr 15 02:16:04 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Fri, 15 Apr 2011 00:16:04 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> Dear Petsc: I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. My libpetsc.a is built as follows: 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 2, make all; It's very appreciated to get your reply. Thanks a lot, Debao ________________________________ -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Apr 15 08:24:48 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Apr 2011 08:24:48 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> Message-ID: <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> Debao, Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. Barry On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > Dear Petsc: > > I?m trying on Petsc iterative solver(KSPCG & PCJACOBI), but it?s strange that the two functions ?MatCopy? and ?MatSetValue? consume most of runtime, and the functions were not called frequently, just several times. > > My libpetsc.a is built as follows: > 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 > 2, make all; > > It?s very appreciated to get your reply. > > Thanks a lot, > Debao > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
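The FAQ entry Barry points to comes down to preallocating the matrix before the MatSetValue() loop. A minimal sketch for a sequential AIJ matrix; n, nz_per_row, A and B are placeholders, and the parallel analogue is MatMPIAIJSetPreallocation():

   Mat A;
   /* reserve room for the expected nonzeros per row up front, so that
      MatSetValue()/MatSetValues() never has to reallocate and copy */
   ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,nz_per_row,PETSC_NULL,&A);CHKERRQ(ierr);
   /* ... MatSetValue()/MatSetValues() loop ... */
   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

In a build where -info has not been disabled, running with -info reports the number of mallocs performed during MatSetValues(); it should be zero when the preallocation is right. If MatCopy() also dominates the profile, MatDuplicate(A,MAT_COPY_VALUES,&B) is usually the cheaper way to obtain a copy, since the duplicate inherits the nonzero pattern and preallocation of A.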
From J.Jaykka at leeds.ac.uk Fri Apr 15 09:17:44 2011 From: J.Jaykka at leeds.ac.uk (Juha =?iso-8859-1?q?J=E4ykk=E4?=) Date: Fri, 15 Apr 2011 14:17:44 +0000 Subject: [petsc-users] strange error Message-ID: <201104151517.44294.J.Jaykka@leeds.ac.uk> Hi list! I keep getting strange errors when running a PETSc code: [6]PETSC ERROR: --------------------- Error Message ------------------------------------ [6]PETSC ERROR: Object is in wrong state! [6]PETSC ERROR: Matrix must be set first! [6]PETSC ERROR: ------------------------------------------------------------------------ This happens on one machine only, but it has OpenMPI just as the others, where it works correctly. More specifically, the error comes from [6]PETSC ERROR: PCSetUp() line 775 in src/ksp/pc/interface/precon.c [6]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c [6]PETSC ERROR: KSPSolve_PREONLY() line 29 in src/ksp/ksp/impls/preonly/preonly.c [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c [6]PETSC ERROR: PCApply_BJacobi_Singleblock() line 777 in src/ksp/pc/impls/bjacobi/bjacobi.c [6]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c [6]PETSC ERROR: KSPInitialResidual() line 65 in src/ksp/ksp/interface/itres.c [6]PETSC ERROR: KSPSolve_GMRES() line 240 in src/ksp/ksp/impls/gmres/gmres.c [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c [6]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c [6]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [6]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c [6]PETSC ERROR: TSStep_BEuler_Nonlinear() line 176 in src/ts/impls/implicit/beuler/beuler.c [6]PETSC ERROR: TSStep() line 1693 in src/ts/interface/ts.c [6]PETSC ERROR: TSSolve() line 1731 in src/ts/interface/ts.c Now, I do realize that the Jacobian and preconditioner matrices must be properly created before calling TSSolve, but, to the best of my knowledge, they are. If they were not, the code should never work. And this always comes from just one MPI rank, as if somehow its memory is corrupt or something. Funny thing is, it only happens under the batch queue system: interactively, on the same machine, it works fine. Any help is appreciated... -Juha From knepley at gmail.com Fri Apr 15 10:13:31 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Apr 2011 10:13:31 -0500 Subject: [petsc-users] strange error In-Reply-To: <201104151517.44294.J.Jaykka@leeds.ac.uk> References: <201104151517.44294.J.Jaykka@leeds.ac.uk> Message-ID: On Fri, Apr 15, 2011 at 9:17 AM, Juha J?ykk? wrote: > Hi list! > > I keep getting strange errors when running a PETSc code: > > [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [6]PETSC ERROR: Object is in wrong state! > [6]PETSC ERROR: Matrix must be set first! > [6]PETSC ERROR: > ------------------------------------------------------------------------ > > This happens on one machine only, but it has OpenMPI just as the others, > where > it works correctly. 
> > More specifically, the error comes from > > [6]PETSC ERROR: PCSetUp() line 775 in src/ksp/pc/interface/precon.c > [6]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c > [6]PETSC ERROR: KSPSolve_PREONLY() line 29 in > src/ksp/ksp/impls/preonly/preonly.c > [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c > [6]PETSC ERROR: PCApply_BJacobi_Singleblock() line 777 in > src/ksp/pc/impls/bjacobi/bjacobi.c > [6]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c > [6]PETSC ERROR: KSPInitialResidual() line 65 in > src/ksp/ksp/interface/itres.c > [6]PETSC ERROR: KSPSolve_GMRES() line 240 in > src/ksp/ksp/impls/gmres/gmres.c > [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c > [6]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c > [6]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [6]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c > [6]PETSC ERROR: TSStep_BEuler_Nonlinear() line 176 in > src/ts/impls/implicit/beuler/beuler.c > [6]PETSC ERROR: TSStep() line 1693 in src/ts/interface/ts.c > [6]PETSC ERROR: TSSolve() line 1731 in src/ts/interface/ts.c > > Now, I do realize that the Jacobian and preconditioner matrices must be > properly created before calling TSSolve, but, to the best of my knowledge, > they are. If they were not, the code should never work. > > And this always comes from just one MPI rank, as if somehow its memory is > corrupt or something. Funny thing is, it only happens under the batch queue > system: interactively, on the same machine, it works fine. > I would try valgrind on it first. Matt > Any help is appreciated... > > -Juha > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.tabak at tudelft.nl Fri Apr 15 12:33:49 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Fri, 15 Apr 2011 19:33:49 +0200 Subject: [petsc-users] difference on MATLAB backslash and PETSc/external solvers Message-ID: <4DA8817D.8040702@tudelft.nl> Dear all, I have been testing the factors of a symmetric matrix of size 4225 by 4225 as a preconditioner for a pcg type iteration in MATLAB(Since I have these factors from an eigenvalue extraction process). I extract the main matrix from a commercial code and the sparsity pattern is not that optimum for the moment. Using the built in profiler, I tried to check the performance of the implementation, most of my time was spent in the forward-backward substitutions resulting from the preconditioner usage, namely, the p = M^{-1} r operation in the pcg algorithm, which was expected. Moreover, I conducted a series of simple tests with the same operator matrix and right hand side in PETSc and with the external direct solver interfaces, I ended up some differences in the solution phases. I timed the process with PetscGetTime function. The related part of the code is given as std::cout << "First solve ... " << std::endl; ierr = PetscGetTime(&t1);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = PetscGetTime(&t2);CHKERRQ(ierr); std::cout << "Code took " << t2-t1 << " seconds .. " << std::endl; std::cout << "Second solve ... " << std::endl; ierr = PetscGetTime(&t3);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,y);CHKERRQ(ierr); ierr = PetscGetTime(&t4);CHKERRQ(ierr); std::cout << "Code took " << t4-t3 << " seconds .. 
" << std::endl; std::cout << "Third solve ... " << std::endl; ierr = PetscGetTime(&t5);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,z);CHKERRQ(ierr); ierr = PetscGetTime(&t6);CHKERRQ(ierr); std::cout << "Code took " << t6-t5 << " seconds .. " << std::endl; Basically, I do a first solve where the factorization is done( just to be sure, KSPSolve is the step where the factorization is done, right?) Then, the results for different factorization and forward backward substitutions are given below, when I use the ksp object the second time, it basically does a forward-backward solve, right? Then I compare the results with the MATLAB backslash which is also using UMFPACK as far as I can read from the documentation. PETSc is built in the default mode, which is the debug mode. Can this be the reason of this? Any explanations are welcome on this. Actually the difference on the umfpack result was interesting to me. Here are the results: umfpack: --------------------------------------------------- First solve ... Code took 1.94039 seconds .. Second solve ... Code took 0.0264909 seconds .. Third solve ... Code took 0.0264909 seconds .. mumps: --------------------------------------------------- First solve ... Code took 1.40669 seconds .. Second solve ... Code took 0.0235441 seconds .. Third solve ... Code took 0.023541 seconds .. superlu: --------------------------------------------------- First solve ... Code took 3.20602 seconds .. Second solve ... Code took 0.0487978 seconds .. Third solve ... Code took 0.048856 seconds .. spooles: ---------------------------------------------------- First solve ... Code took 1.43427 seconds .. Second solve ... Code took 0.0536189 seconds .. Third solve ... Code took 0.053726 seconds .. PETSc ---------------------------------------------------- First solve ... Code took 1.3292 seconds .. Second solve ... Code took 0.0116079 seconds .. Third solve ... Code took 0.011915 seconds .. MATLAB by cputime function ----------------------------------------------------- A \ b (native backslash) 3.800000000000006e-01 (with a bit fluctuation ) and by using the Factorize package also written by Timothy Davis; t = cputime; factorOpA =factorize(OpA); factorOpA \ rhsA; cputime-t 6.300000000000026e-01 and a forward backward substitution using factorOpA t = cputime; factorOpA \ rhsA; cputime-t 9.999999999998010e-03 Best, Umut -- If I have a thousand ideas and only one turns out to be good, I am satisfied. Alfred Nobel From bsmith at mcs.anl.gov Fri Apr 15 13:00:46 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Apr 2011 13:00:46 -0500 Subject: [petsc-users] difference on MATLAB backslash and PETSc/external solvers In-Reply-To: <4DA8817D.8040702@tudelft.nl> References: <4DA8817D.8040702@tudelft.nl> Message-ID: > Then I compare the results with the MATLAB backslash which is also using UMFPACK as far as I can read from the documentation. PETSc is built in the default mode, which is the debug mode. What do you mean in the debug mode? To do any fair comparisons of times you should have ./configure PETSc with --with-debugging=0 > Can this be the reason of this? Reason of what? Different solvers will give different solution times, this is perfectly normal. In fact sometimes for different matrices different solvers will actually be best for different matrices? Are you asking why UMFPack in Matlab does much faster (it seems) than the one used by PETSc? 
The likely answer is that the Matlab folks tweak the hell out of it to get good performance and when Tim Davis says "UMFPACK is in Matlab" actually means he gave something to Matlab and they improved it. It is unlikely that Matlab just uses the downloadable open source UMFPACK that PETSc uses. If this does not answer your question please rephrase. Barry On Apr 15, 2011, at 12:33 PM, Umut Tabak wrote: > Dear all, > > I have been testing the factors of a symmetric matrix of size 4225 by 4225 as a preconditioner for a pcg type iteration in MATLAB(Since I have these factors from an eigenvalue extraction process). I extract the main matrix from a commercial code and the sparsity pattern is not that optimum for the moment. > > Using the built in profiler, I tried to check the performance of the implementation, most of my time was spent in the forward-backward substitutions resulting from the preconditioner usage, namely, the p = M^{-1} r operation in the pcg algorithm, which was expected. > > Moreover, I conducted a series of simple tests with the same operator matrix and right hand side in PETSc and with the external direct solver interfaces, I ended up some differences in the solution phases. I timed the process with PetscGetTime function. The related part of the code is given as > > std::cout << "First solve ... " << std::endl; > ierr = PetscGetTime(&t1);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > ierr = PetscGetTime(&t2);CHKERRQ(ierr); > std::cout << "Code took " << t2-t1 << " seconds .. " << std::endl; > std::cout << "Second solve ... " << std::endl; > ierr = PetscGetTime(&t3);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,y);CHKERRQ(ierr); > ierr = PetscGetTime(&t4);CHKERRQ(ierr); > std::cout << "Code took " << t4-t3 << " seconds .. " << std::endl; > std::cout << "Third solve ... " << std::endl; > ierr = PetscGetTime(&t5);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,z);CHKERRQ(ierr); > ierr = PetscGetTime(&t6);CHKERRQ(ierr); > std::cout << "Code took " << t6-t5 << " seconds .. " << std::endl; > > Basically, I do a first solve where the factorization is done( just to be sure, KSPSolve is the step where the factorization is done, right?) Then, the results for different factorization and forward backward substitutions are given below, when I use the ksp object the second time, it basically does a forward-backward solve, right? Then I compare the results with the MATLAB backslash which is also using UMFPACK as far as I can read from the documentation. PETSc is built in the default mode, which is the debug mode. Can this be the reason of this? Any explanations are welcome on this. > Actually the difference on the umfpack result was interesting to me. > > Here are the results: > > umfpack: > --------------------------------------------------- > First solve ... > Code took 1.94039 seconds .. > Second solve ... > Code took 0.0264909 seconds .. > Third solve ... > Code took 0.0264909 seconds .. > > mumps: > --------------------------------------------------- > First solve ... > Code took 1.40669 seconds .. > Second solve ... > Code took 0.0235441 seconds .. > Third solve ... > Code took 0.023541 seconds .. > > superlu: > --------------------------------------------------- > First solve ... > Code took 3.20602 seconds .. > Second solve ... > Code took 0.0487978 seconds .. > Third solve ... > Code took 0.048856 seconds .. > > spooles: > ---------------------------------------------------- > First solve ... > Code took 1.43427 seconds .. > Second solve ... > Code took 0.0536189 seconds .. 
> Third solve ... > Code took 0.053726 seconds .. > > PETSc > ---------------------------------------------------- > First solve ... > Code took 1.3292 seconds .. > Second solve ... > Code took 0.0116079 seconds .. > Third solve ... > Code took 0.011915 seconds .. > > MATLAB by cputime function > ----------------------------------------------------- > A \ b (native backslash) > 3.800000000000006e-01 (with a bit fluctuation ) > > and by using the Factorize package also written by Timothy Davis; > > t = cputime; factorOpA =factorize(OpA); factorOpA \ rhsA; cputime-t > 6.300000000000026e-01 > > and a forward backward substitution using factorOpA > > t = cputime; factorOpA \ rhsA; cputime-t > 9.999999999998010e-03 > > Best, > Umut > > -- > If I have a thousand ideas and only one turns out to be good, > I am satisfied. > Alfred Nobel > From u.tabak at tudelft.nl Fri Apr 15 13:02:56 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Fri, 15 Apr 2011 20:02:56 +0200 Subject: [petsc-users] difference on MATLAB backslash and PETSc/external solvers In-Reply-To: References: <4DA8817D.8040702@tudelft.nl> Message-ID: <4DA88850.5000900@tudelft.nl> On 04/15/2011 08:00 PM, Barry Smith wrote: > What do you mean in the debug mode? To do any fair comparisons of times you should have ./configure PETSc with --with-debugging=0 > > Dear Barry, Thx. Ok, this is what I meant. > Reason of what? > > The difference... > Different solvers will give different solution times, this is perfectly normal. In fact sometimes for different matrices different solvers will actually be best for different matrices? > > Are you asking why UMFPack in Matlab does much faster (it seems) than the one used by PETSc? The likely answer is that the Matlab folks tweak the hell out of it to get good performance and when Tim Davis says "UMFPACK is in Matlab" actually means he gave something to Matlab and they improved it. ok, fine, this answers my question. > It is unlikely that Matlab just uses the downloadable open source UMFPACK that PETSc uses. > From ram at ibrae.ac.ru Sat Apr 16 15:26:47 2011 From: ram at ibrae.ac.ru (=?KOI8-R?B?4czFy9PFyiDy0drBzs/X?=) Date: Sun, 17 Apr 2011 00:26:47 +0400 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? In-Reply-To: References: Message-ID: > > >> Create u,b with DAGetGlobalVector() and A with DAGetMatrix() and they >> will match the DA. For eg: check: src/snes/examples/tutorials/ex5.c >> [or some of the examples in src/dm/da/examples] >> >> Satish >> >> Hello again! 1. Please tell me, what's the principal difference between procedures DAGetGlobalVector and DACreateGlobalVector? I cant catch it from man pages. 2. As I can read from DAGetMatrix man page, this procedure: Creates a matrix with the correct parallel layout and nonzero structure required for computing the Jacobian on a function defined using the stencil set in the DA Notes: This properly preallocates the number of nonzeros in the sparse matrix so you do not need to do it yourself. By default it also sets the nonzero structure and puts in the zero entries. To prevent setting the nonzero pattern call DASetMatPreallocateOnly<../DA/DASetMatPreallocateOnly.html#DASetMatPreallocateOnly> () So I use DASetMatPreallocateOnly. But I dont need a Jacobian. I need a matrix of my linear system with its original number of nonzeros per row and its original nonzero pattern. So I use MatSetValues and MatAsseblyBegin/End to assemble it. And -info key on runtime tells me that there were additional mallocs during runtime. 
As it said in manual, this is very expensive to allocate memory dynamically. MatMPIAIJSetPreallocation doesnt help me. How should I preallocate memory for DAMatrix? Thank you! Alexey Ryazanov ______________________________________ Nuclear Safety Institute of Russian Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sat Apr 16 15:47:19 2011 From: jed at 59A2.org (Jed Brown) Date: Sat, 16 Apr 2011 22:47:19 +0200 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? In-Reply-To: References: Message-ID: On Sat, Apr 16, 2011 at 22:26, ??????? ??????? wrote: > 1. Please tell me, what's the principal difference between procedures > DAGetGlobalVector and DACreateGlobalVector? I cant catch it from man pages. > Use DACreateGlobalVector() if you ownership of the vector, you call VecDestroy() when you are done with it. Use DAGetGlobalVector() to let the DA manage the lifetime of the vector, call DARestoreGlobalVector() when you are done with it. The DA does not actually destroy the vector, it keeps it around and just gives it back next time you call DAGetGlobalVector(). This is usually what you want for "work" vectors. > > 2. As I can read from DAGetMatrix man page, this procedure: > > Creates a matrix with the correct parallel layout and nonzero structure > required for computing the Jacobian on a function defined using the stencil > set in the > DA > > Notes: This properly preallocates the number of nonzeros in the sparse > matrix so you do not need to do it yourself. > > By default it also sets the nonzero structure and puts in the zero entries. > To prevent setting the nonzero pattern call DASetMatPreallocateOnly > () > > So I use DASetMatPreallocateOnly. > Why would you want to do that? > But I dont need a Jacobian. I need a matrix of my linear system > But that _is_ the Jacobian of the residual function (f(x) = A*x - b for linear problems). This language is used frequently in PETSc. > with its original number of nonzeros per row and its original nonzero > pattern. > What do you mean by "original"? You are setting values that have not been preallocated, perhaps because the stencil you defined for the DA is different from the one you are using during assembly. -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Sun Apr 17 16:16:38 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Sun, 17 Apr 2011 23:16:38 +0200 (CEST) Subject: [petsc-users] Setting intepolation and restriction operators in DMMG Message-ID: <9a75fdd120436247a0adcf63f0fe3e08.squirrel@arcon.univ-st-etienne.fr> Hi, I'm still in the process of coding my Stokes solver (FV+MAC Discretisation) with DMMG. So far the I've been testing it with direct solvers to check that everything was right for the functional and jacobian creation and it runs fine both in parallel and sequential mode. I now need to implement an iterative solver that will most likely be multigrid, and since I am using fully staggered arrangement I need to define my own grid transfer operators because the interpolation/restriction will be different for u v w p. How can I do this? Would you suggest another solution strategy instead? thank you Domenico. 
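For reference, a minimal sketch of the two vector-lifetime patterns from the DAGetGlobalVector/DACreateGlobalVector reply above, plus DAGetMatrix(), assuming the PETSc 3.1 DA interface used in this thread (da is an already-created DA; error handling abbreviated):

    Vec            g, work;
    Mat            A;
    PetscErrorCode ierr;

    /* owned: the caller creates it and must destroy it */
    ierr = DACreateGlobalVector(da,&g);CHKERRQ(ierr);
    /* ... g lives as long as the application needs it ... */
    ierr = VecDestroy(g);CHKERRQ(ierr);

    /* borrowed: the DA caches the vector and hands the same one back next time */
    ierr = DAGetGlobalVector(da,&work);CHKERRQ(ierr);
    /* ... temporary work-vector use ... */
    ierr = DARestoreGlobalVector(da,&work);CHKERRQ(ierr);

    /* matrix already preallocated for the stencil the DA was created with;
       entries set inside that stencil (e.g. via MatSetValuesStencil) need no new mallocs */
    ierr = DAGetMatrix(da,MATAIJ,&A);CHKERRQ(ierr);

If -info still reports mallocs during assembly, the entries being set fall outside the stencil type or width the DA was created with, which is the mismatch described in the DAGetMatrix reply above.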
From ilyascfd at gmail.com Mon Apr 18 08:34:16 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Mon, 18 Apr 2011 16:34:16 +0300 Subject: [petsc-users] local row calculation in 3D Message-ID: Hi, In ex14f.F in KSP, "row" variable is calculated either 349: do 30 j=ys,ys+ym-1 350: ... 351: do 40 i=xs,xs+xm-1 352: row = i - gxs + (j - gys)*gxm + 1 or 442: do 50 j=ys,ys+ym-1 443: ... 444: row = (j - gys)*gxm + xs - gxs 445: do 60 i=xs,xs+xm-1 446: row = row + 1 How can I calculate "row" in 3D ? I tried this; do k=zs,zs+zm-1 do j=ys,ys+ym-1 do i=xs,xs+xm-1 row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 It does not work for certain number of processors. Thanks, Ilyas -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Apr 18 08:40:17 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Apr 2011 08:40:17 -0500 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: > Hi, > > In ex14f.F in KSP, "row" variable is calculated either > These are very old. I suggest you use the FormFunctionLocal() approach in ex5f.F which does not calculate global row numbers when using a DA. Matt > 349: do 30 j=ys,ys+ym-1 > 350: ... > 351: do 40 i=xs,xs+xm-1 > 352: row = i - gxs + (j - gys)*gxm + 1 > > or > > 442: do 50 j=ys,ys+ym-1 > 443: ... > 444: row = (j - gys)*gxm + xs - gxs > 445: do 60 i=xs,xs+xm-1 > 446: row = row + 1 > > How can I calculate "row" in 3D ? > > I tried this; > > do k=zs,zs+zm-1 > do j=ys,ys+ym-1 > do i=xs,xs+xm-1 > > row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 > > It does not work for certain number of processors. > > > Thanks, > > Ilyas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Mon Apr 18 08:54:19 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Mon, 18 Apr 2011 16:54:19 +0300 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Hi, Thank you for your suggestion. I will take it into account. Since changing this structure in my "massive" code may take too much time, I would like to know that how "row" is calculated in 3D, independently from processor numbers. Regards, Ilyas 2011/4/18 Matthew Knepley > On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: > >> Hi, >> >> In ex14f.F in KSP, "row" variable is calculated either >> > > These are very old. I suggest you use the FormFunctionLocal() approach in > ex5f.F which > does not calculate global row numbers when using a DA. > > Matt > > >> 349: do 30 j=ys,ys+ym-1 >> 350: ... >> 351: do 40 i=xs,xs+xm-1 >> 352: row = i - gxs + (j - gys)*gxm + 1 >> >> or >> >> 442: do 50 j=ys,ys+ym-1 >> 443: ... >> 444: row = (j - gys)*gxm + xs - gxs >> 445: do 60 i=xs,xs+xm-1 >> 446: row = row + 1 >> >> How can I calculate "row" in 3D ? >> >> I tried this; >> >> do k=zs,zs+zm-1 >> do j=ys,ys+ym-1 >> do i=xs,xs+xm-1 >> >> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >> >> It does not work for certain number of processors. >> >> >> Thanks, >> >> Ilyas >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Mon Apr 18 09:25:19 2011 From: rlmackie862 at gmail.com (Randall Mackie) Date: Mon, 18 Apr 2011 07:25:19 -0700 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Here's how I do it: do kk=zs,zs+zm-1 do jj=ys,ys+ym-1 do ii=xs,xs+xm-1 row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym Good luck, Randy M. On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: > Hi, > Thank you for your suggestion. I will take it into account. > Since changing this structure in my "massive" code may take too much time, > I would like to know that how "row" is calculated in 3D, independently from > processor numbers. > > Regards, > Ilyas > > 2011/4/18 Matthew Knepley > >> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >> >>> Hi, >>> >>> In ex14f.F in KSP, "row" variable is calculated either >>> >> >> These are very old. I suggest you use the FormFunctionLocal() approach in >> ex5f.F which >> does not calculate global row numbers when using a DA. >> >> Matt >> >> >>> 349: do 30 j=ys,ys+ym-1 >>> 350: ... >>> 351: do 40 i=xs,xs+xm-1 >>> 352: row = i - gxs + (j - gys)*gxm + 1 >>> >>> or >>> >>> 442: do 50 j=ys,ys+ym-1 >>> 443: ... >>> 444: row = (j - gys)*gxm + xs - gxs >>> 445: do 60 i=xs,xs+xm-1 >>> 446: row = row + 1 >>> >>> How can I calculate "row" in 3D ? >>> >>> I tried this; >>> >>> do k=zs,zs+zm-1 >>> do j=ys,ys+ym-1 >>> do i=xs,xs+xm-1 >>> >>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>> >>> It does not work for certain number of processors. >>> >>> >>> Thanks, >>> >>> Ilyas >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Mon Apr 18 21:42:48 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Mon, 18 Apr 2011 20:42:48 -0600 Subject: [petsc-users] Problem in PCShellSetApply() Message-ID: Hi, I faced a problem when I used the function PCShellSetApply to set a composite PC in Petsc 3.1. There was no problem when I compiled my code but I got the following message when I run the code. ------------------------------------------------------------------------------------------------------------ [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: No support for this operation for this object type! [1]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct solver! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 ............................... 
[7]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c [7]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c [7]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: PCApply_Composite_Additive() line 102 in src/ksp/pc/impls/composite/composite.c [7]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: PCApplyBAorAB() line 582 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: GMREScycle() line 161 in src/ksp/ksp/impls/gmres/gmres.c [7]PETSC ERROR: KSPSolve_GMRES() line 241 in src/ksp/ksp/impls/gmres/gmres.c [7]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c [1]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [1]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c .............................. -------------------------------------------------------------------------------------------------------------- I found that the function PCShellSetApply did not call the user defined function CoarseSolvePCApply (I called PCShellSetApply like this: ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); ). Can anyone tell me what may be the problem? The related code attached. -----------------------The related code----------------------------------- PetscErrorCode SetupPreconditioner(JoabCtx* ctx, SNES snes) { JoabParameters *parameters = &ctx->parameters; JoabView *view = &ctx->view; JoabGrid *grid = &ctx->grid; JoabGrid *coarsegrid = &ctx->coarsegrid; JoabAlgebra *algebra = &ctx->algebra; JoabAlgebra *coarsealgebra = &ctx->coarsealgebra; KSP fineksp; PC finepc, coarsepc; PC asmpc; PC coarsesolve; Vec fineones; PetscInt veclength; PetscScalar *values; int i; PetscErrorCode ierr; PetscFunctionBegin; ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr); ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr); ierr = KSPCreate(PETSC_COMM_WORLD,&ctx->coarseksp);CHKERRQ(ierr); ierr = SetMyKSPDefaults(ctx->coarseksp);CHKERRQ(ierr); ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr); ierr = KSPSetOperators(ctx->coarseksp, coarsealgebra->H, coarsealgebra->H, SAME_NONZERO_PATTERN);CHKERRQ(ierr); if (parameters->geometric_asm) { ierr = KSPGetPC(ctx->coarseksp, &coarsepc);CHKERRQ(ierr); ierr = PCASMSetOverlap(coarsepc,0);CHKERRQ(ierr); ierr = PCASMSetLocalSubdomains(coarsepc,1,&coarsegrid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); } ierr = PCSetType(finepc,PCCOMPOSITE);CHKERRQ(ierr); ierr = PCCompositeAddPC(finepc,PCSHELL);CHKERRQ(ierr); ierr = PCCompositeAddPC(finepc,PCASM);CHKERRQ(ierr); /* set up asm (fine) part of two-level preconditioner */ ierr = PCCompositeGetPC(finepc,1,&asmpc);CHKERRQ(ierr); if (parameters->geometric_asm) { ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr); ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr); ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); } ierr = SetMyPCDefaults(asmpc);CHKERRQ(ierr); ierr = PCSetFromOptions(asmpc);CHKERRQ(ierr); /* set up coarse solve part of two-level preconditioner */ ierr = PCCompositeGetPC(finepc,0,&coarsesolve);CHKERRQ(ierr); ierr = PCShellSetContext(coarsesolve,ctx);CHKERRQ(ierr); ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); PetscFunctionReturn(0); } PetscErrorCode CoarseSolvePCApply(PC pc, Vec xin, Vec xout) { JoabCtx *ctx; PetscErrorCode ierr; JoabParameters *parameters; JoabView *view; JoabGrid *finegrid; JoabGrid 
*coarsegrid; JoabAlgebra *finealgebra; JoabAlgebra *coarsealgebra; PetscInt its; PetscLogDouble t1,t2,v1,v2; KSPConvergedReason reason; PetscFunctionBegin; ierr = PetscPrintf(PETSC_COMM_WORLD,"Setup coarse level preconditioner.....\n");CHKERRQ(ierr); ierr = PCShellGetContext(pc,(void**)&ctx);CHKERRQ(ierr); parameters = &ctx->parameters; view = &ctx->view; finegrid = &ctx->grid; coarsegrid = &ctx->coarsegrid; finealgebra = &ctx->algebra; coarsealgebra = &ctx->coarsealgebra; ierr = PetscGetTime(&t1);CHKERRQ(ierr); parameters->whichlevel = COARSE_GRID; /* restrict fine grid to coarse grid */ ierr = PetscGetTime(&v1);CHKERRQ(ierr); ierr = VecSet(coarsealgebra->predictedShape_opt,0.0);CHKERRQ(ierr);CHKERRQ(ierr); ierr = ApplyRestriction(ctx,xin,coarsealgebra->predictedShape_opt);CHKERRQ(ierr); ierr = VecSet(coarsealgebra->solutionShape_opt,0.0);CHKERRQ(ierr); ierr = KSPSetTolerances(ctx->coarseksp,1e-6,1e-14,PETSC_DEFAULT,1000);CHKERRQ(ierr); ierr = KSPSolve(ctx->coarseksp, coarsealgebra->predictedShape_opt, coarsealgebra->solutionShape_opt);CHKERRQ(ierr); /* interpolate coarse grid to fine grid */ ierr = MatInterpolate(ctx->Interp,coarsealgebra->solutionShape_opt,xout);CHKERRQ(ierr); PetscFunctionReturn(0); } Regards, Rongliang -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Apr 18 22:03:54 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 18 Apr 2011 22:03:54 -0500 Subject: [petsc-users] Problem in PCShellSetApply() In-Reply-To: References: Message-ID: On Apr 18, 2011, at 9:42 PM, Rongliang Chen wrote: > Hi, > > I faced a problem when I used the function PCShellSetApply to set a composite PC in Petsc 3.1. There was no problem when I compiled my code but I got the following message when I run the code. > ------------------------------------------------------------------------------------------------------------ > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message ------------------------------------ > [1]PETSC ERROR: No support for this operation for this object type! > [1]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct solver! > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 > ............................... > [7]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [7]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [7]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: PCApply_Composite_Additive() line 102 in src/ksp/pc/impls/composite/composite.c > [7]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: PCApplyBAorAB() line 582 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: GMREScycle() line 161 in src/ksp/ksp/impls/gmres/gmres.c > [7]PETSC ERROR: KSPSolve_GMRES() line 241 in src/ksp/ksp/impls/gmres/gmres.c > [7]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c > [1]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [1]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c > .............................. 
> -------------------------------------------------------------------------------------------------------------- > > I found that the function PCShellSetApply did not call the user defined function CoarseSolvePCApply (I called PCShellSetApply like this: ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); ). Can anyone tell me what may be the problem? The related code attached. > > -----------------------The related code----------------------------------- > PetscErrorCode SetupPreconditioner(JoabCtx* ctx, SNES snes) > { > JoabParameters *parameters = &ctx->parameters; > JoabView *view = &ctx->view; > JoabGrid *grid = &ctx->grid; > JoabGrid *coarsegrid = &ctx->coarsegrid; > JoabAlgebra *algebra = &ctx->algebra; > JoabAlgebra *coarsealgebra = &ctx->coarsealgebra; > > KSP fineksp; > PC finepc, coarsepc; > PC asmpc; > PC coarsesolve; > Vec fineones; > PetscInt veclength; > PetscScalar *values; > int i; > > PetscErrorCode ierr; > > PetscFunctionBegin; > ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr); > ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr); > ierr = KSPCreate(PETSC_COMM_WORLD,&ctx->coarseksp);CHKERRQ(ierr); > > ierr = SetMyKSPDefaults(ctx->coarseksp);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr); > ierr = KSPSetOperators(ctx->coarseksp, coarsealgebra->H, coarsealgebra->H, > SAME_NONZERO_PATTERN);CHKERRQ(ierr); > if (parameters->geometric_asm) { > ierr = KSPGetPC(ctx->coarseksp, &coarsepc);CHKERRQ(ierr); > ierr = PCASMSetOverlap(coarsepc,0);CHKERRQ(ierr); > ierr = PCASMSetLocalSubdomains(coarsepc,1,&coarsegrid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); > } > > ierr = PCSetType(finepc,PCCOMPOSITE);CHKERRQ(ierr); > ierr = PCCompositeAddPC(finepc,PCSHELL);CHKERRQ(ierr); > ierr = PCCompositeAddPC(finepc,PCASM);CHKERRQ(ierr); > > /* set up asm (fine) part of two-level preconditioner */ > ierr = PCCompositeGetPC(finepc,1,&asmpc);CHKERRQ(ierr); > if (parameters->geometric_asm) { Is this flag set so it actually sets the type to PCASM > ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr); > ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr); > ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); > } > ierr = SetMyPCDefaults(asmpc);CHKERRQ(ierr); What is this setting the solver to? One of the two composite PC's is being set to LU and it is parallel so it fails. My guess is that SetMyPCDefaults() is setting the the PCType to LU. If not you'll need to track through your code, perhaps put a break point in PCSetType() and see where the type is being set to LU. 
Barry > ierr = PCSetFromOptions(asmpc);CHKERRQ(ierr); > > /* set up coarse solve part of two-level preconditioner */ > ierr = PCCompositeGetPC(finepc,0,&coarsesolve);CHKERRQ(ierr); > ierr = PCShellSetContext(coarsesolve,ctx);CHKERRQ(ierr); > ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); > > PetscFunctionReturn(0); > } > > PetscErrorCode CoarseSolvePCApply(PC pc, Vec xin, Vec xout) > { > JoabCtx *ctx; > PetscErrorCode ierr; > JoabParameters *parameters; > JoabView *view; > JoabGrid *finegrid; > JoabGrid *coarsegrid; > JoabAlgebra *finealgebra; > JoabAlgebra *coarsealgebra; > PetscInt its; > PetscLogDouble t1,t2,v1,v2; > KSPConvergedReason reason; > > PetscFunctionBegin; > ierr = PetscPrintf(PETSC_COMM_WORLD,"Setup coarse level preconditioner.....\n");CHKERRQ(ierr); > ierr = PCShellGetContext(pc,(void**)&ctx);CHKERRQ(ierr); > parameters = &ctx->parameters; > view = &ctx->view; > finegrid = &ctx->grid; > coarsegrid = &ctx->coarsegrid; > finealgebra = &ctx->algebra; > coarsealgebra = &ctx->coarsealgebra; > > ierr = PetscGetTime(&t1);CHKERRQ(ierr); > > parameters->whichlevel = COARSE_GRID; > > /* restrict fine grid to coarse grid */ > ierr = PetscGetTime(&v1);CHKERRQ(ierr); > ierr = VecSet(coarsealgebra->predictedShape_opt,0.0);CHKERRQ(ierr);CHKERRQ(ierr); > ierr = ApplyRestriction(ctx,xin,coarsealgebra->predictedShape_opt);CHKERRQ(ierr); > > ierr = VecSet(coarsealgebra->solutionShape_opt,0.0);CHKERRQ(ierr); > ierr = KSPSetTolerances(ctx->coarseksp,1e-6,1e-14,PETSC_DEFAULT,1000);CHKERRQ(ierr); > ierr = KSPSolve(ctx->coarseksp, coarsealgebra->predictedShape_opt, coarsealgebra->solutionShape_opt);CHKERRQ(ierr); > > /* interpolate coarse grid to fine grid */ > ierr = MatInterpolate(ctx->Interp,coarsealgebra->solutionShape_opt,xout);CHKERRQ(ierr); > > PetscFunctionReturn(0); > } > > > Regards, > Rongliang > > From khalid_eee at yahoo.com Tue Apr 19 01:34:13 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Mon, 18 Apr 2011 23:34:13 -0700 (PDT) Subject: [petsc-users] DMMG with PBC In-Reply-To: References: <442878.22477.qm@web112607.mail.gq1.yahoo.com> Message-ID: <846336.71604.qm@web112609.mail.gq1.yahoo.com> Hi Matt, I wrote the following line after DMMGSetKSP() in ex22.c ierr =DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL); Still I get the difference in the values calculated by single processor and the multiple processors. The two input values for b that I used are 1 and 10 for all the elements in the vector. I am using this code in one of my programs where I assign a random number to b. I get the discrepancy between single and multiple processors there as well. Thanks. Khalid ________________________________ From: Matthew Knepley To: PETSc users list Cc: khalid ashraf Sent: Fri, April 15, 2011 4:38:25 AM Subject: Re: [petsc-users] DMMG with PBC On Fri, Apr 15, 2011 at 4:28 AM, khalid ashraf wrote: Hi, >I am running src/ksp/ksp/examples/tutorials/ex22.c >I matched the output of single processor and multiprocessor results and it works >fine. >But I want to use a periodic boundary condition. I make the following changes in >the main function and this works fine with this change as well: > > > ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); > ierr = >DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,2,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); > > > > > >However, when I comment out these following lines since I am using a PBC, then >the result of 1 proc and multi-proc are not the same. 
They vary within 5 decimal >points and the difference increases with increasing number of processors. The periodic operator has a null space. You must put that in the solver DMMGSetNullSpace(), so that it is projected out at each step. Matt /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = >MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else */ > > >Could you please tell me what is going wrong here. > > >Thanks. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Tue Apr 19 02:00:14 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Tue, 19 Apr 2011 10:00:14 +0300 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Hi Randy, Thank you for your answer. I have already done it. You can see it in my first e-mail. It does not work properly for all number of processors. For certain number of processors, it works correctly, not for all number of processors. For example, for 1,2,or 3 processors, it's ok. For 4 processors, it gives wrong location, so on. "Problem" occurs in 3rd dimension ( (kk-gzs)*gxm*gym ) Here is another suggestion (I have not tried yet) ; do kk=zs,zs+zm-1 do jj=ys,ys+ym-1 do ii=xs,xs+xm-1 row=ii-gxs + (jj-gys)*MX + (kk-gzs)*MX*MY MX,MY,MZ are global dimensions.This is also what I do serially Do you think that it is correct or any other suggestions? Regards, Ilyas. 2011/4/18 Randall Mackie > Here's how I do it: > > do kk=zs,zs+zm-1 > do jj=ys,ys+ym-1 > do ii=xs,xs+xm-1 > > row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym > > > Good luck, > > Randy M. > > > > On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: > >> Hi, >> Thank you for your suggestion. I will take it into account. >> Since changing this structure in my "massive" code may take too much >> time, >> I would like to know that how "row" is calculated in 3D, independently >> from processor numbers. >> >> Regards, >> Ilyas >> >> 2011/4/18 Matthew Knepley >> >>> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >>> >>>> Hi, >>>> >>>> In ex14f.F in KSP, "row" variable is calculated either >>>> >>> >>> These are very old. I suggest you use the FormFunctionLocal() approach in >>> ex5f.F which >>> does not calculate global row numbers when using a DA. >>> >>> Matt >>> >>> >>>> 349: do 30 j=ys,ys+ym-1 >>>> 350: ... >>>> 351: do 40 i=xs,xs+xm-1 >>>> 352: row = i - gxs + (j - gys)*gxm + 1 >>>> >>>> or >>>> >>>> 442: do 50 j=ys,ys+ym-1 >>>> 443: ... >>>> 444: row = (j - gys)*gxm + xs - gxs >>>> 445: do 60 i=xs,xs+xm-1 >>>> 446: row = row + 1 >>>> >>>> How can I calculate "row" in 3D ? >>>> >>>> I tried this; >>>> >>>> do k=zs,zs+zm-1 >>>> do j=ys,ys+ym-1 >>>> do i=xs,xs+xm-1 >>>> >>>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>>> >>>> It does not work for certain number of processors. >>>> >>>> >>>> Thanks, >>>> >>>> Ilyas >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at 59A2.org Tue Apr 19 04:40:18 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 19 Apr 2011 11:40:18 +0200 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: On Tue, Apr 19, 2011 at 09:00, ilyas ilyas wrote: > It does not work properly for all number of processors. The "row" in ex14f.F is a local row, not a global row. The ComputeJacobian in that file manually translates local rows to global rows using the map returned by DAGetGlobalIndices(). Now you should just call MatSetValuesLocal() with the local indices, or even easier, MatSetValuesStencil(). There is no easy way to determine the global index on your own (it is hundreds of lines of code). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Apr 19 04:42:40 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 19 Apr 2011 11:42:40 +0200 Subject: [petsc-users] DMMG with PBC In-Reply-To: <846336.71604.qm@web112609.mail.gq1.yahoo.com> References: <442878.22477.qm@web112607.mail.gq1.yahoo.com> <846336.71604.qm@web112609.mail.gq1.yahoo.com> Message-ID: On Tue, Apr 19, 2011 at 08:34, khalid ashraf wrote: > Still I get the difference in the values calculated by single processor and > the multiple processors. How much of a difference? http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/faq.html#different -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Tue Apr 19 09:09:52 2011 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 19 Apr 2011 07:09:52 -0700 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: You are right! I just didn't read all the way to the end of your email. Sorry about that. So here is a little more code that does it correctly: PetscInt, pointer :: ltog(:) call DAGetGlobalIndicesF90(da,nloc,ltog,ierr); CHKERRQ(ierr) do kk=zs,zs+zm-1 do jj=ys,ys+ym-1 do ii=xs,xs+xm-1 row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym grow=ltog(3*row + 1) [all your code here] call MatSetValues(A,i1,grow,ic,col,v,INSERT_VALUES, . ierr); CHKERRQ(ierr) [more code here] call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) Hope this is a little more helpful. As Jed points out, there are other ways to do the same thing (and probably more efficiently than what I've outlined here). Randy M. On Tue, Apr 19, 2011 at 12:00 AM, ilyas ilyas wrote: > Hi Randy, > > Thank you for your answer. > > I have already done it. You can see it in my first e-mail. > > It does not work properly for all number of processors. > For certain number of processors, it works correctly, > not for all number of processors. > For example, for 1,2,or 3 processors, it's ok. > For 4 processors, it gives wrong location, so on. > "Problem" occurs in 3rd dimension ( (kk-gzs)*gxm*gym ) > > Here is another suggestion (I have not tried yet) ; > > do kk=zs,zs+zm-1 > do jj=ys,ys+ym-1 > do ii=xs,xs+xm-1 > > row=ii-gxs + (jj-gys)*MX + (kk-gzs)*MX*MY > > MX,MY,MZ are global dimensions.This is also what I do serially > > Do you think that it is correct or any other suggestions? > > Regards, > Ilyas. > > 2011/4/18 Randall Mackie > >> Here's how I do it: >> >> do kk=zs,zs+zm-1 >> do jj=ys,ys+ym-1 >> do ii=xs,xs+xm-1 >> >> row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym >> >> >> Good luck, >> >> Randy M. >> >> >> >> On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: >> >>> Hi, >>> Thank you for your suggestion. 
I will take it into account. >>> Since changing this structure in my "massive" code may take too much >>> time, >>> I would like to know that how "row" is calculated in 3D, independently >>> from processor numbers. >>> >>> Regards, >>> Ilyas >>> >>> 2011/4/18 Matthew Knepley >>> >>>> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >>>> >>>>> Hi, >>>>> >>>>> In ex14f.F in KSP, "row" variable is calculated either >>>>> >>>> >>>> These are very old. I suggest you use the FormFunctionLocal() approach >>>> in ex5f.F which >>>> does not calculate global row numbers when using a DA. >>>> >>>> Matt >>>> >>>> >>>>> 349: do 30 j=ys,ys+ym-1 >>>>> 350: ... >>>>> 351: do 40 i=xs,xs+xm-1 >>>>> 352: row = i - gxs + (j - gys)*gxm + 1 >>>>> >>>>> or >>>>> >>>>> 442: do 50 j=ys,ys+ym-1 >>>>> 443: ... >>>>> 444: row = (j - gys)*gxm + xs - gxs >>>>> 445: do 60 i=xs,xs+xm-1 >>>>> 446: row = row + 1 >>>>> >>>>> How can I calculate "row" in 3D ? >>>>> >>>>> I tried this; >>>>> >>>>> do k=zs,zs+zm-1 >>>>> do j=ys,ys+ym-1 >>>>> do i=xs,xs+xm-1 >>>>> >>>>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>>>> >>>>> It does not work for certain number of processors. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Ilyas >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From khalid_eee at yahoo.com Tue Apr 19 18:26:00 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Tue, 19 Apr 2011 16:26:00 -0700 (PDT) Subject: [petsc-users] DMMG with PBC In-Reply-To: References: Message-ID: <369201.93548.qm@web112604.mail.gq1.yahoo.com> >>How much of a difference? With the applied XYZPeriodic, if I keep the follwoing lines if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); ierr = MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); } else Then the error between 1proc and 4 procs is only after 5th decimal point. However, if I comment out the above lines then the results are completely different for 1 and 4 procs. I am attaching the output of some last data points of a 10X10X8 grid. 1 proc output: -4.40214 -4.39202 -4.38693 -4.38547 -4.38687 -4.39047 4 proc output: 0.000188031 0.000169784 0.000157229 0.000178713 0.000179637 0.000188031 0.000169784 0.000157229 0.000178713 0.000179637 I am attaching the faulty code here for your review. Thanks. 
Khalid static char help[] = "Solves 3D Laplacian using multigrid.\n\n"; #include "petscda.h" #include "petscksp.h" #include "petscdmmg.h" #include "myHeaderfile.h" extern PetscErrorCode ComputeMatrix(DMMG,Mat,Mat); extern PetscErrorCode ComputeRHS(DMMG,Vec); #undef __FUNCT__ #define __FUNCT__ "main" int main(int argc,char **argv) { PetscErrorCode ierr; DMMG *dmmg; PetscReal norm; DA da; PetscInitialize(&argc,&argv,(char *)0,help); ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); ierr = DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,8,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); ierr = DMMGSetDM(dmmg,(DM)da);CHKERRQ(ierr); // ierr = DADestroy(da);CHKERRQ(ierr); ierr = DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix);CHKERRQ(ierr); ierr =DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL); ierr = DMMGSetUp(dmmg);CHKERRQ(ierr); ierr = DMMGSolve(dmmg);CHKERRQ(ierr); ierr = MatMult(DMMGGetJ(dmmg),DMMGGetx(dmmg),DMMGGetr(dmmg));CHKERRQ(ierr); ierr = VecAXPY(DMMGGetr(dmmg),-1.0,DMMGGetRHS(dmmg));CHKERRQ(ierr); ierr = VecNorm(DMMGGetr(dmmg),NORM_2,&norm);CHKERRQ(ierr); /* ierr = PetscPrintf(PETSC_COMM_WORLD,"Residual norm %G\n",norm);CHKERRQ(ierr); */ ierr=VecView_VTK(DMMGGetx(dmmg),"X",&appctx); ierr = DMMGDestroy(dmmg);CHKERRQ(ierr); ierr = PetscFinalize();CHKERRQ(ierr); return 0; } #undef __FUNCT__ #define __FUNCT__ "ComputeRHS" PetscErrorCode ComputeRHS(DMMG dmmg,Vec b) { PetscErrorCode ierr; PetscInt mx,my,mz; PetscScalar h; PetscFunctionBegin; ierr = DAGetInfo((DA)dmmg->dm,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); h = 10.0/((mx-1)*(my-1)*(mz-1)); ierr = VecSet(b,h);CHKERRQ(ierr); PetscFunctionReturn(0); } #undef __FUNCT__ #define __FUNCT__ "ComputeMatrix" PetscErrorCode ComputeMatrix(DMMG dmmg,Mat jac,Mat B) { DA da = (DA)dmmg->dm; PetscErrorCode ierr; PetscInt i,j,k,mx,my,mz,xm,ym,zm,xs,ys,zs; PetscScalar v[7],Hx,Hy,Hz,HxHydHz,HyHzdHx,HxHzdHy; MatStencil row,col[7]; ierr = DAGetInfo(da,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); Hx = 1.0 / (PetscReal)(mx-1); Hy = 1.0 / (PetscReal)(my-1); Hz = 1.0 / (PetscReal)(mz-1); HxHydHz = Hx*Hy/Hz; HxHzdHy = Hx*Hz/Hy; HyHzdHx = Hy*Hz/Hx; ierr = DAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr); for (k=zs; k From knepley at gmail.com Tue Apr 19 18:31:17 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Apr 2011 18:31:17 -0500 Subject: [petsc-users] DMMG with PBC In-Reply-To: <369201.93548.qm@web112604.mail.gq1.yahoo.com> References: <369201.93548.qm@web112604.mail.gq1.yahoo.com> Message-ID: On Tue, Apr 19, 2011 at 6:26 PM, khalid ashraf wrote: > >>How much of a difference? > With the applied XYZPeriodic, if I keep the follwoing lines > > if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = > MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else > > Then the error between 1proc and 4 procs is only after 5th decimal point. > However, if I comment out the > above lines then the results are completely different for 1 and 4 procs. > Yes, without this the problem is rank deficient. I suspect that your 5th decimal place difference comes from either a) different partitions b) different convergence (stop on a different iterate) or c) parallel reordering (but it seems big). Matt > I am attaching the output of some last data points of a 10X10X8 grid. 
> 1 proc output: > -4.40214 > -4.39202 > -4.38693 > -4.38547 > -4.38687 > -4.39047 > > 4 proc output: > 0.000188031 > 0.000169784 > 0.000157229 > 0.000178713 > 0.000179637 > 0.000188031 > 0.000169784 > 0.000157229 > 0.000178713 > 0.000179637 > > I am attaching the faulty code here for your review. > > Thanks. > > Khalid > > static char help[] = "Solves 3D Laplacian using multigrid.\n\n"; > > #include "petscda.h" > #include "petscksp.h" > #include "petscdmmg.h" > #include "myHeaderfile.h" > > extern PetscErrorCode ComputeMatrix(DMMG,Mat,Mat); > extern PetscErrorCode ComputeRHS(DMMG,Vec); > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc,char **argv) > { > PetscErrorCode ierr; > DMMG *dmmg; > PetscReal norm; > DA da; > > PetscInitialize(&argc,&argv,(char *)0,help); > > ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); > ierr = > DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,8,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg,(DM)da);CHKERRQ(ierr); > // ierr = DADestroy(da);CHKERRQ(ierr); > > ierr = DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix);CHKERRQ(ierr); > ierr =DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL); > > ierr = DMMGSetUp(dmmg);CHKERRQ(ierr); > ierr = DMMGSolve(dmmg);CHKERRQ(ierr); > > ierr = > MatMult(DMMGGetJ(dmmg),DMMGGetx(dmmg),DMMGGetr(dmmg));CHKERRQ(ierr); > ierr = VecAXPY(DMMGGetr(dmmg),-1.0,DMMGGetRHS(dmmg));CHKERRQ(ierr); > ierr = VecNorm(DMMGGetr(dmmg),NORM_2,&norm);CHKERRQ(ierr); > /* ierr = PetscPrintf(PETSC_COMM_WORLD,"Residual norm > %G\n",norm);CHKERRQ(ierr); */ > ierr=VecView_VTK(DMMGGetx(dmmg),"X",&appctx); > > ierr = DMMGDestroy(dmmg);CHKERRQ(ierr); > ierr = PetscFinalize();CHKERRQ(ierr); > > return 0; > } > > #undef __FUNCT__ > #define __FUNCT__ "ComputeRHS" > PetscErrorCode ComputeRHS(DMMG dmmg,Vec b) > { > PetscErrorCode ierr; > PetscInt mx,my,mz; > PetscScalar h; > > PetscFunctionBegin; > ierr = DAGetInfo((DA)dmmg->dm,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); > h = 10.0/((mx-1)*(my-1)*(mz-1)); > ierr = VecSet(b,h);CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > #undef __FUNCT__ > #define __FUNCT__ "ComputeMatrix" > PetscErrorCode ComputeMatrix(DMMG dmmg,Mat jac,Mat B) > { > DA da = (DA)dmmg->dm; > PetscErrorCode ierr; > PetscInt i,j,k,mx,my,mz,xm,ym,zm,xs,ys,zs; > PetscScalar v[7],Hx,Hy,Hz,HxHydHz,HyHzdHx,HxHzdHy; > MatStencil row,col[7]; > > ierr = DAGetInfo(da,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); > Hx = 1.0 / (PetscReal)(mx-1); Hy = 1.0 / (PetscReal)(my-1); Hz = 1.0 / > (PetscReal)(mz-1); > HxHydHz = Hx*Hy/Hz; HxHzdHy = Hx*Hz/Hy; HyHzdHx = Hy*Hz/Hx; > ierr = DAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr); > > for (k=zs; k for (j=ys; j for(i=xs; i row.i = i; row.j = j; row.k = k; > /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = > MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else */ > { > v[0] = -HxHydHz;col[0].i = i; col[0].j = j; col[0].k = k-1; > v[1] = -HxHzdHy;col[1].i = i; col[1].j = j-1; col[1].k = k; > v[2] = -HyHzdHx;col[2].i = i-1; col[2].j = j; col[2].k = k; > v[3] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx);col[3].i = row.i; > col[3].j = row.j; col[3].k = row.k; > v[4] = -HyHzdHx;col[4].i = i+1; col[4].j = j; col[4].k = k; > v[5] = -HxHzdHy;col[5].i = i; col[5].j = j+1; col[5].k = k; > v[6] = -HxHydHz;col[6].i = i; col[6].j = j; col[6].k = k+1; > ierr = > MatSetValuesStencil(B,1,&row,7,col,v,INSERT_VALUES);CHKERRQ(ierr); > } > } > } > } > ierr = 
MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > return 0; > } > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Debao.Shao at brion.com Tue Apr 19 20:31:11 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 18:31:11 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> Hi, Barry: Thanks for the reply. I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Any suggestions? Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, April 15, 2011 9:25 PM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime Debao, Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. Barry On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > Dear Petsc: > > I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. > > My libpetsc.a is built as follows: > 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 > 2, make all; > > It's very appreciated to get your reply. > > Thanks a lot, > Debao > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From bsmith at mcs.anl.gov Tue Apr 19 20:40:05 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Apr 2011 20:40:05 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> Message-ID: <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > Hi, Barry: > > Thanks for the reply. > > I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). Barry PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > > Any suggestions? > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, April 15, 2011 9:25 PM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > Debao, > > Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. > > Barry > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > >> Dear Petsc: >> >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >> >> My libpetsc.a is built as follows: >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >> 2, make all; >> >> It's very appreciated to get your reply. >> >> Thanks a lot, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From Debao.Shao at brion.com Tue Apr 19 20:50:14 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 18:50:14 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Here is my sample code: ierr = MatZeroEntries( M ); assert( ierr == 0); ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 9:40 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > Hi, Barry: > > Thanks for the reply. > > I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). Barry PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > > Any suggestions? 
> > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, April 15, 2011 9:25 PM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > Debao, > > Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. > > Barry > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > >> Dear Petsc: >> >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >> >> My libpetsc.a is built as follows: >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >> 2, make all; >> >> It's very appreciated to get your reply. >> >> Thanks a lot, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
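A minimal sketch of the assembly pattern Barry describes above, written against the PETSc 3.1-era C API used in this thread. It preallocates, inserts an explicit zero into every location that may later hold a value, and only then calls MatAssemblyBegin/End, so the assembled nonzero pattern is retained and later MatSetValues() calls into those locations do not trigger reallocation; MAT_KEEP_NONZERO_PATTERN is set in case MatZeroRows() is used afterwards. The matrix size, the tridiagonal pattern, and all variable names are invented for illustration and are not taken from Debao's code.

/* Sketch: preallocate, then fill every potentially nonzero location with
   an explicit zero before the first MatAssemblyEnd(), so the pattern is
   kept and subsequent MatSetValues() calls do not reallocate. */
#include "petscmat.h"

int main(int argc,char **argv)
{
  Mat            A;
  PetscInt       i,ncols,cols[3],n = 10;
  PetscScalar    zeros[3] = {0.0,0.0,0.0};
  PetscErrorCode ierr;

  PetscInitialize(&argc,&argv,(char *)0,0);
  /* preallocate at most 3 nonzeros per row */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,3,PETSC_NULL,&A);CHKERRQ(ierr);
  /* keep the assembled pattern if MatZeroRows() is called later */
  ierr = MatSetOption(A,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE);CHKERRQ(ierr);
  for (i=0; i<n; i++) {
    ncols = 0;
    if (i > 0)   cols[ncols++] = i-1;
    cols[ncols++] = i;
    if (i < n-1) cols[ncols++] = i+1;
    /* explicit zeros lock these locations into the nonzero pattern */
    ierr = MatSetValues(A,1,&i,ncols,cols,zeros,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* from here on, writing into (i,i-1), (i,i), (i,i+1) should be malloc-free */
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

With this in place, repeated assembly passes that write into the same locations should no longer show MatSeqXAIJReallocateAIJ in the profile.
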
From Debao.Shao at brion.com Tue Apr 19 21:08:49 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 19:08:49 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E69C8@EX03> Dear Barry: If I add "MatSetOption(C,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE)" before " MatZeroEntries", need I reset it back when doing MatCopy? I'm a freshman to PETSC, your reply is very appreciated. Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Debao Shao Sent: Wednesday, April 20, 2011 9:50 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime Here is my sample code: ierr = MatZeroEntries( M ); assert( ierr == 0); ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 9:40 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > Hi, Barry: > > Thanks for the reply. > > I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). Barry PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > > Any suggestions? > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, April 15, 2011 9:25 PM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > Debao, > > Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. 
> > Barry > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > >> Dear Petsc: >> >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >> >> My libpetsc.a is built as follows: >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >> 2, make all; >> >> It's very appreciated to get your reply. >> >> Thanks a lot, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. 
To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From knepley at gmail.com Tue Apr 19 21:10:29 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Apr 2011 21:10:29 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69C8@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <384FF55F15E3E447802DC8CCA85696980AAB3E69C8@EX03> Message-ID: On Tue, Apr 19, 2011 at 9:08 PM, Debao Shao wrote: > Dear Barry: > > If I add "MatSetOption(C,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE)" before " > MatZeroEntries", need I reset it back when doing MatCopy? > No. Matt > I'm a freshman to PETSC, your reply is very appreciated. > > Thanks, > Debao > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Debao Shao > Sent: Wednesday, April 20, 2011 9:50 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 > percentage of runtime > > Here is my sample code: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > > I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called > MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 9:40 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 > percentage of runtime > > > On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > > > Hi, Barry: > > > > Thanks for the reply. > > > > I preallocated enough space for the sparse matrix, but I found > mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax > less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is > called frequently when doing MatSetValues again to the matrix. > > Are you using MatZeroRows()? If so call MatSetOption(mat, > MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that > structure. > > If you are not using MatZeroRows() then apparently the first time you > set values in there and call MatAssemblyEnd() you have left many locations > that later will be filled unfilled and so they are eliminated at MatAssembly > time. You must make sure that all potentially nonzero locations get a value > put in initially (put zero for the locations that you don't yet have a > value for) before you first call MatAssemblyEnd(). > > Barry > > PETSc matrices have no way of retaining extra locations you preallocated > for unless you put something (like 0) in there. > > > > > Any suggestions? 
> > > > Thanks, > > Debao > > -----Original Message----- > > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > > Sent: Friday, April 15, 2011 9:25 PM > > To: PETSc users list > > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 > percentage of runtime > > > > > > Debao, > > > > Please see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assemblyIt should resolve the difficulties. > > > > Barry > > > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > > > >> Dear Petsc: > >> > >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange > that the two functions "MatCopy" and "MatSetValue" consume most of runtime, > and the functions were not called frequently, just several times. > >> > >> My libpetsc.a is built as follows: > >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 > -with-info=0 > >> 2, make all; > >> > >> It's very appreciated to get your reply. > >> > >> Thanks a lot, > >> Debao > >> > >> -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. Unless explicitly stated otherwise in the body > of this communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > > > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. 
ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Apr 19 21:58:30 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Apr 2011 21:58:30 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Message-ID: <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > Here is my sample code: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > > I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. Barry I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 9:40 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > >> Hi, Barry: >> >> Thanks for the reply. 
>> >> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. > > Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. > > If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). > > Barry > > PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > >> >> Any suggestions? >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Friday, April 15, 2011 9:25 PM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> Debao, >> >> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >> >> Barry >> >> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >> >>> Dear Petsc: >>> >>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>> >>> My libpetsc.a is built as follows: >>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>> 2, make all; >>> >>> It's very appreciated to get your reply. >>> >>> Thanks a lot, >>> Debao >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From vijay.m at gmail.com Tue Apr 19 22:08:46 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Tue, 19 Apr 2011 22:08:46 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: > ? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. Barry, just to confirm, are you saying that MatZeroEntries would nullify the preallocation completely if called before AssmeblyBegin/End ? I have been doing this quite often before a linear system assembly and have not noticed extra mallocs during the process. Is there something that I am misunderstanding in the above statement ? I would much appreciate if you can clarify. Thanks, Vijay On Tue, Apr 19, 2011 at 9:58 PM, Barry Smith wrote: > > On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > >> Here is my sample code: >> ?ierr = MatZeroEntries( M ); assert( ierr ?== 0); >> ?ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >> ?ierr = MatCopy( ms->M, mStorage->M, ?DIFFERENT_NONZERO_PATTERN ); >> >> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > ? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > ? 
?Barry > > I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. ?Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, April 20, 2011 9:40 AM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >> >>> Hi, Barry: >>> >>> Thanks for the reply. >>> >>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >> >> ? Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >> >> ? ?If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. ?You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have ?a value for) before you first call MatAssemblyEnd(). >> >> ? Barry >> >> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >> >>> >>> Any suggestions? >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Friday, April 15, 2011 9:25 PM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> ?Debao, >>> >>> ? ? Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >>> >>> ? Barry >>> >>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>> >>>> Dear Petsc: >>>> >>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>> >>>> My libpetsc.a is built as follows: >>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>> 2, make all; >>>> >>>> It's very appreciated to get your reply. >>>> >>>> Thanks a lot, >>>> Debao >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > From Debao.Shao at brion.com Tue Apr 19 22:21:34 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 20:21:34 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> Hi, Barry: I'm confused, 1), if we can't use "MatZeroEntries" before MatAssembly, then, how do we do initialization for M? 2), if we can't use "MatCopy" before MatAssembly, then, how to fill up M from another matrix? Can you give a sample code for the right usage? Thanks very much. 
Regards, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 10:59 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > Here is my sample code: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > > I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. Barry I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 9:40 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > >> Hi, Barry: >> >> Thanks for the reply. >> >> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. > > Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. > > If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). > > Barry > > PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > >> >> Any suggestions? >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Friday, April 15, 2011 9:25 PM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> Debao, >> >> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. 
>> >> Barry >> >> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >> >>> Dear Petsc: >>> >>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>> >>> My libpetsc.a is built as follows: >>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>> 2, make all; >>> >>> It's very appreciated to get your reply. >>> >>> Thanks a lot, >>> Debao >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. 
To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From jed at 59A2.org Wed Apr 20 05:05:54 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 12:05:54 +0200 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: On Wed, Apr 20, 2011 at 05:08, Vijay S. Mahadevan wrote: > Barry, just to confirm, are you saying that MatZeroEntries would > nullify the preallocation completely if called before > AssmeblyBegin/End ? > I can't think of a way that would happen. It may zero more entries than necessary, but it shouldn't forget the preallocation. Note that the preallocation in DMGetMatrix() and many other libraries actually insert explicit zeros in the locations it has preallocated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 20 05:07:15 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 12:07:15 +0200 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Message-ID: On Wed, Apr 20, 2011 at 03:50, Debao Shao wrote: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > Is M somehow related to ms->M or mStorage->M? What do you actually want to do? -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Apr 20 06:46:35 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 20 Apr 2011 06:46:35 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: Thanks for the clarification Jed. But zeroing more entries than necessary would still trigger malloc calls. Would it not ? Vijay On Apr 20, 2011 5:05 AM, "Jed Brown" wrote: > On Wed, Apr 20, 2011 at 05:08, Vijay S. Mahadevan wrote: > >> Barry, just to confirm, are you saying that MatZeroEntries would >> nullify the preallocation completely if called before >> AssmeblyBegin/End ? >> > > I can't think of a way that would happen. It may zero more entries than > necessary, but it shouldn't forget the preallocation. 
Note that the > preallocation in DMGetMatrix() and many other libraries actually insert > explicit zeros in the locations it has preallocated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Wed Apr 20 06:55:25 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Wed, 20 Apr 2011 13:55:25 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: References: <4DA6F440.4000204@tu-dresden.de> Message-ID: <4DAEC9AD.2010509@tu-dresden.de> There one small thing on the implementation details of the FETI-DP, I cannot figure out. Maybe some of you could help me to understand it, though it is not directly related to PETSc. Non of the publications says something about how to distribute the Lagrange multipliers over the processors. Is there any good way to do it or can it done arbitrarily? And should be the jump operators B^i be directly assembled or should they be implemented in a matrix-free way? I'm confuse because in the work of Klawoon/Rheinbach, it is claimed that the following operator can be solved in a pure local way: F = \sum_{i=1}^{N} B^i inv(K_BB^i) trans(B^i) With B^i the jump operators and K_BB^i the discretization of the sub domains with the primal nodes. From the notation it follows that EACH local solve takes the whole vector of Lagrange multipliers. But this is not applicable for a good parallel implementation. Any hint on this topic would be helpful for me to understand this problem. Thomas Jed Brown wrote: > On Thu, Apr 14, 2011 at 15:18, Thomas Witkowski > > wrote: > > Has anybody of you implemented the FETI-DP method in PETSc? I > think about to do this for my FEM code, but first I want to > evaluate the effort of the implementation. > > > There are a few implementations out there. Probably most notable is > Axel Klawonn and Oliver Rheinbach's implementation which has been > scaled up to very large problems and computers. My understanding is > that Xuemin Tu did some work on BDDC (equivalent to FETI-DP) using > PETSc. I am not aware of anyone releasing a working FETI-DP > implementation using PETSc, but of course you're welcome to ask these > people if they would share code with you. > > > What sort of problems do you want it for (physics and mesh)? How are > you currently assembling your systems? A fully general FETI-DP > implementation is a lot of work. For a specific class of problems and > variant of FETI-DP, it will still take some effort, but should not be > too much. > > There was a start to a FETI-DP implementation in PETSc quite a while > ago, but it died due to bitrot and different ideas of how we would > like to implement. You can get that code from mercurial: > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/021f379b5eea > > > The fundamental ingredient of these methods is a "partially assembled" > matrix. For a library implementation, the challenges are > > 1. How does the user provide the information necessary to decide what > the coarse space looks like? (It's different for scalar problems, > compressible elasticity, and Stokes, and tricky to do with no > geometric information from the user.) The coefficient structure in the > problem matters a lot when deciding which coarse basis functions to > use, see http://dx.doi.org/10.1016/j.cma.2006.03.023 > > 2. How do you handle primal basis functions with large support (e.g. > rigid body modes of a face)? Two choices here: > http://www.cs.nyu.edu/cs/faculty/widlund/FETI-DP-elasticity_TR.pdf . > > 3. 
How do you make it easy for the user to provide the required > matrix? Ideally, the user would just use plain MatSetValuesLocal() and > run with -mat_type partially-assembled -pc_type fetidp instead of, say > -mat_type baij -pc_type asm. It should work for multiple subdomains > per process and subdomains spanning multiple processes. This can now > be done by implementing MatGetLocalSubMatrix(). The local blocks of > the partially assembled system should be able to use different formats > (e.g. SBAIJ). > > 4. How do you handle more than two levels? This is very important to > use more than about 1000 subdomains in 3D because the coarse problem > just gets too big (unless the coarse problem happens to be > well-conditioned enough that you can use algebraic multigrid). > > > I've wanted to implement FETI-DP in PETSc for almost two years, but > it's never been a high priority. I think I now know how to get enough > flexibility to make it worthwhile to me. I'd be happy to discuss > implementation issues with you. From jed at 59A2.org Wed Apr 20 07:02:43 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 14:02:43 +0200 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: On Wed, Apr 20, 2011 at 13:46, Vijay S. Mahadevan wrote: > Thanks for the clarification Jed. But zeroing more entries than necessary > would still trigger malloc calls. Would it not ? It won't zero more than you allocated but it might zero more than you will actually insert. It doesn't matter. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 20 07:43:46 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 14:43:46 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: <4DAEC9AD.2010509@tu-dresden.de> References: <4DA6F440.4000204@tu-dresden.de> <4DAEC9AD.2010509@tu-dresden.de> Message-ID: Thomas, we should move this discussion to petsc-dev, are you subscribed to that list? On Wed, Apr 20, 2011 at 13:55, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > There one small thing on the implementation details of the FETI-DP, I > cannot figure out. Maybe some of you could help me to understand it, though > it is not directly related to PETSc. Non of the publications says something > about how to distribute the Lagrange multipliers over the processors. Is > there any good way to do it or can it done arbitrarily? > All their work that I have seen assumes a fully redundant set of Lagrange multipliers. In that context, each Lagrange multiplier only ever couples two subdomains together. Either process can then take ownership of that single Lagrange multiplier. > And should be the jump operators B^i be directly assembled or should they > be implemented in a matrix-free way? > Usually these constraints are sparse so I think it is no problem to assume that they are always assembled. > I'm confuse because in the work of Klawoon/Rheinbach, it is claimed that > the following operator can be solved in a pure local way: > > F = \sum_{i=1}^{N} B^i inv(K_BB^i) trans(B^i) > Did they use "F" for this thing? 
Usually F is the FETI-DP operator which involves a Schur complement of the entire partially assembled operator in the dual space. In any case, this thing is not purely local since the jump operators B^i need neighboring values so it has the same communication as a MatMult. > With B^i the jump operators and K_BB^i the discretization of the sub > domains with the primal nodes. > I think you mean "with the primal nodes removed". > From the notation it follows that EACH local solve takes the whole vector > of Lagrange multipliers. But this is not applicable for a good parallel > implementation. Any hint on this topic would be helpful for me to understand > this problem. > I can't tell from their papers how B is stored. It would be natural to simply store B as a normal assembled matrix with a standard row partition of the Lagrange multipliers. Then you would apply the subdomain solve operator using MatMultTranspose(B,XLambdaGlobal,XGlobal); for (i=0; i From bsmith at mcs.anl.gov Wed Apr 20 08:13:36 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Apr 2011 08:13:36 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: On Apr 19, 2011, at 10:08 PM, Vijay S. Mahadevan wrote: >> Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > Barry, just to confirm, are you saying that MatZeroEntries would > nullify the preallocation completely if called before > AssmeblyBegin/End ? I have been doing this quite often before a linear > system assembly and have not noticed extra mallocs during the process. > Is there something that I am misunderstanding in the above statement ? My mistake. Yes if you call MatZeroEntries() on a matrix you have not started putting values in it will not destroy the preallocation information. Barry > I would much appreciate if you can clarify. > > Thanks, > Vijay > > On Tue, Apr 19, 2011 at 9:58 PM, Barry Smith wrote: >> >> On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: >> >>> Here is my sample code: >>> ierr = MatZeroEntries( M ); assert( ierr == 0); >>> ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >>> ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); >>> >>> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? >> >> Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. >> >> Barry >> >> I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. 
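For the application of F itself, a purely illustrative sketch of the pattern Jed outlines above might look like the function below. It is not code from this thread: it assumes one subdomain per process, that B is an ordinary assembled MPIAIJ matrix whose column ownership matches the subdomain interior unknowns, and that kspLocal is a KSP on PETSC_COMM_SELF that factors the local K_BB block; all names are made up, and the calls follow the petsc-3.1 interfaces used elsewhere in this archive.

#include "petscksp.h"

/* Apply F = sum_i B^i inv(K_BB^i) trans(B^i) to a global multiplier vector.
   xGlobal and yGlobal are work vectors laid out like the columns of B. */
PetscErrorCode ApplyF(Mat B,KSP kspLocal,Vec lambda,Vec Flambda,Vec xGlobal,Vec yGlobal)
{
  Vec            xLocal,yLocal;
  PetscScalar    *xa,*ya;
  PetscInt       nlocal;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  /* trans(B) lambda: the only communication, same pattern as a MatMult() */
  ierr = MatMultTranspose(B,lambda,xGlobal);CHKERRQ(ierr);

  /* wrap the local pieces of the parallel work vectors as sequential Vecs */
  ierr = VecGetLocalSize(xGlobal,&nlocal);CHKERRQ(ierr);
  ierr = VecGetArray(xGlobal,&xa);CHKERRQ(ierr);
  ierr = VecGetArray(yGlobal,&ya);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF,nlocal,xa,&xLocal);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF,nlocal,ya,&yLocal);CHKERRQ(ierr);

  /* inv(K_BB^i): an entirely local subdomain solve, no communication */
  ierr = KSPSolve(kspLocal,xLocal,yLocal);CHKERRQ(ierr);

  ierr = VecDestroy(xLocal);CHKERRQ(ierr);
  ierr = VecDestroy(yLocal);CHKERRQ(ierr);
  ierr = VecRestoreArray(xGlobal,&xa);CHKERRQ(ierr);
  ierr = VecRestoreArray(yGlobal,&ya);CHKERRQ(ierr);

  /* B y: again the same communication pattern as a MatMult() */
  ierr = MatMult(B,yGlobal,Flambda);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Only the two multiplications with B communicate; the KSPSolve() on PETSC_COMM_SELF is purely local, which is why no single process ever needs the whole vector of Lagrange multipliers.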
Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. >> >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Wednesday, April 20, 2011 9:40 AM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >>> >>>> Hi, Barry: >>>> >>>> Thanks for the reply. >>>> >>>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >>> >>> Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >>> >>> If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). >>> >>> Barry >>> >>> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >>> >>>> >>>> Any suggestions? >>>> >>>> Thanks, >>>> Debao >>>> -----Original Message----- >>>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>>> Sent: Friday, April 15, 2011 9:25 PM >>>> To: PETSc users list >>>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>>> >>>> >>>> Debao, >>>> >>>> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >>>> >>>> Barry >>>> >>>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>>> >>>>> Dear Petsc: >>>>> >>>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>>> >>>>> My libpetsc.a is built as follows: >>>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>>> 2, make all; >>>>> >>>>> It's very appreciated to get your reply. >>>>> >>>>> Thanks a lot, >>>>> Debao >>>>> >>>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
>>>> >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> From bsmith at mcs.anl.gov Wed Apr 20 08:18:56 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Apr 2011 08:18:56 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> Message-ID: <3FAF21D8-DC2B-40EE-A79B-1FCF4DCAABD6@mcs.anl.gov> On Apr 19, 2011, at 10:21 PM, Debao Shao wrote: > Hi, Barry: > > I'm confused, > 1), if we can't use "MatZeroEntries" before MatAssembly, then, how do we do initialization for M? When you create a sparse matrix it automatically has no non-zero values in it so there is no reason to call MatZeroEntries() on it. But I was wrong it is ok to call MatZeroEntries() on it and it will not destroy the preallocation > 2), if we can't use "MatCopy" before MatAssembly, then, how to fill up M from another matrix? You can copy, say A, to M with MatCopy() but M will get the same nonzero structure as A, if you provided "extra" preallocation information in M that will be lost in the copy. So it is not efficient to copy into a matrix M and then start putting as bunch of new nonzero locations into M. Barry > > Can you give a sample code for the right usage? Thanks very much. 
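As a rough illustration of the ordering Barry describes (this is not code from the thread: the sizes, the tridiagonal fill pattern, and all variable names are made up, and the calls follow the petsc-3.1 interfaces used in this discussion, where MatDestroy()/VecDestroy() still take the object itself):

#include "petscmat.h"

int main(int argc,char **argv)
{
  Mat            A,M;
  Vec            d;
  PetscInt       i,j,n = 100,Istart,Iend;
  PetscScalar    zero = 0.0;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  /* 1. Preallocate: here at most 3 nonzeros per row, on- and off-diagonal. */
  ierr = MatMPIAIJSetPreallocation(A,3,PETSC_NULL,3,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A,3,PETSC_NULL);CHKERRQ(ierr);

  /* 2. Put an explicit 0.0 into EVERY location that may later hold a value,
        so the first assembly does not squeeze those locations out. */
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    for (j=PetscMax(i-1,0); j<=PetscMin(i+1,n-1); j++) {
      ierr = MatSetValues(A,1,&i,1,&j,&zero,INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* 3. Only now operate on the matrix: the nonzero pattern is fixed. */
  ierr = MatGetVecs(A,&d,PETSC_NULL);CHKERRQ(ierr);
  ierr = VecSet(d,1.0);CHKERRQ(ierr);
  ierr = MatZeroEntries(A);CHKERRQ(ierr);
  ierr = MatDiagonalSet(A,d,INSERT_VALUES);CHKERRQ(ierr);

  /* 4. Duplicating keeps the (already final) pattern, so later copies can
        use SAME_NONZERO_PATTERN instead of rebuilding the structure. */
  ierr = MatDuplicate(A,MAT_COPY_VALUES,&M);CHKERRQ(ierr);
  ierr = MatCopy(A,M,SAME_NONZERO_PATTERN);CHKERRQ(ierr);

  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = MatDestroy(M);CHKERRQ(ierr);
  ierr = VecDestroy(d);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

The two key points are that every potentially nonzero location receives an explicit zero before the first MatAssemblyEnd(), and that MatZeroEntries(), MatDiagonalSet() and MatCopy() are only applied once the pattern has been fixed by that assembly.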
> > Regards, > Debao > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 10:59 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > >> Here is my sample code: >> ierr = MatZeroEntries( M ); assert( ierr == 0); >> ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >> ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); >> >> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > Barry > > I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, April 20, 2011 9:40 AM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >> >>> Hi, Barry: >>> >>> Thanks for the reply. >>> >>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >> >> Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >> >> If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). >> >> Barry >> >> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >> >>> >>> Any suggestions? >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Friday, April 15, 2011 9:25 PM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> Debao, >>> >>> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. 
>>> >>> Barry >>> >>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>> >>>> Dear Petsc: >>>> >>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>> >>>> My libpetsc.a is built as follows: >>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>> 2, make all; >>>> >>>> It's very appreciated to get your reply. >>>> >>>> Thanks a lot, >>>> Debao >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From domenico.borzacchiello at univ-st-etienne.fr Wed Apr 20 10:32:31 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Wed, 20 Apr 2011 17:32:31 +0200 (CEST) Subject: [petsc-users] DaSetGetMatrix Message-ID: <02985683a2f41fe8ced5b122db8f4a1c.squirrel@arcon.univ-st-etienne.fr> Hi, I'm running my code (3D Stokes Solver with MAC arrangement) pretty fine so far with a FieldSplit/Schur Preconditioning. I had to write my own DAGetMatrix routine cause it was using too much memory (5 times the required size) for the matrices. The code works with 1 2 3 5 etc procs (I presume with any number of procs for which are only possible 1D cartesian topologies of communicators i.e. any prime numbers) then if I run with 4 procs for example it stops when assembling the MPIAIJ matrix with the following error: [1]PETSC ERROR: Nonconforming object sizes! [1]PETSC ERROR: Local scatter sizes don't match! What could be causing the error? Thank you, Domenico here's the getmatrix function I'm using #undef __FUNCT__ #define __FUNCT__ "DAGetMatrix_User_2" PetscErrorCode DAGetMatrix_User_2(DA da,const MatType mtype,Mat *J) { PetscErrorCode ierr; Mat A; PetscInt xm,ym,zm,dim,dof,starts[3],dims[3]; const MatType Atype; void (*aij)(void)=PETSC_NULL,(*baij)(void)=PETSC_NULL,(*sbaij)(void)=PETSC_NULL; ISLocalToGlobalMapping ltog,ltogb; PetscFunctionBegin; ierr = DAGetInfo(da,&dim, 0,0,0, 0,0,0,&dof,0,0,0);CHKERRQ(ierr); if (dim != 3) SETERRQ(PETSC_ERR_ARG_WRONG,"Expected DA to be 3D"); ierr = DAGetCorners(da,0,0,0,&zm,&ym,&xm);CHKERRQ(ierr); ierr = DAGetISLocalToGlobalMapping(da,<og);CHKERRQ(ierr); ierr = DAGetISLocalToGlobalMappingBlck(da,<ogb);CHKERRQ(ierr); ierr = MatCreate(((PetscObject)da)->comm,&A);CHKERRQ(ierr); ierr = MatSetSizes(A,dof*xm*ym*zm,dof*xm*ym*zm,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr); ierr = MatSetType(A,mtype);CHKERRQ(ierr); ierr = MatSetFromOptions(A);CHKERRQ(ierr); ierr = MatSeqAIJSetPreallocation(A,17,PETSC_NULL);CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(A,17,PETSC_NULL,12,PETSC_NULL);CHKERRQ(ierr); ierr = MatSeqBAIJSetPreallocation(A,dof,7,PETSC_NULL);CHKERRQ(ierr); ierr = MatMPIBAIJSetPreallocation(A,dof,7,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); ierr = MatSeqSBAIJSetPreallocation(A,dof,4,PETSC_NULL);CHKERRQ(ierr); ierr = MatMPISBAIJSetPreallocation(A,dof,4,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); ierr = MatSetDA(A,da); ierr = MatSetFromOptions(A); ierr = MatGetType(A,&Atype); ierr = MatSetBlockSize(A,dof);CHKERRQ(ierr); ierr = MatSetLocalToGlobalMapping(A,ltog);CHKERRQ(ierr); ierr = MatSetLocalToGlobalMappingBlock(A,ltogb);CHKERRQ(ierr); ierr = DAGetGhostCorners(da,&starts[0],&starts[1],&starts[2],&dims[0],&dims[1],&dims[2]);CHKERRQ(ierr); ierr = MatSetStencil(A,dim,dims,starts,dof);CHKERRQ(ierr); *J = A; PetscFunctionReturn(0); } From agrayver at gfz-potsdam.de Wed Apr 20 10:31:56 2011 From: agrayver at gfz-potsdam.de (Alexander 
Grayver) Date: Wed, 20 Apr 2011 17:31:56 +0200 Subject: [petsc-users] complexity of solvers Message-ID: <4DAEFC6C.30906@gfz-potsdam.de> Hello, Probably my question might seem stupid, but I don't know better place to ask. I came across with paper in one of the referenced journal where authors claim that LU decomposition has complexity of O(n^1.5) and one solution using factorized matrix can be calculated in O(n*logn). They have sparse matrix with 13 nnz per row. What I've thought so far is that the complexity of the LU decomposition depends on the sparsity of the matrix and in worst case of dense matrix can be estimated as O(n^3). I have not seen any estimates of the LU decomposition complexity for sparse matrices. Is that possible at all? I also always assume the same situation for iterative solvers with the worst case of O(n^2) when the matrix is dense. Regards, Alexander From jed at 59A2.org Wed Apr 20 10:45:05 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 17:45:05 +0200 Subject: [petsc-users] complexity of solvers In-Reply-To: <4DAEFC6C.30906@gfz-potsdam.de> References: <4DAEFC6C.30906@gfz-potsdam.de> Message-ID: On Wed, Apr 20, 2011 at 17:31, Alexander Grayver wrote: > I came across with paper in one of the referenced journal where authors > claim that LU decomposition has complexity of > O(n^1.5) and one solution using factorized matrix can be calculated in > O(n*logn). > These are the bounds for 2D problems with optimal ordering. For 3D, the bounds are O(n^2) time and O(n^{4/3}) space. Alan George, Joseph Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981. S.C. Eisenstat, M.H. Schultz, A.H. Sherman, Applications of an element model for Gaussian elimination, in: Sparse Matrix Computations (Proc. Symp., Argonne Nat. Lab., Lemont, Ill., 1975), Academic Press, New York, 1976, pp. 85?96. -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Wed Apr 20 11:05:49 2011 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Wed, 20 Apr 2011 18:05:49 +0200 Subject: [petsc-users] complexity of solvers In-Reply-To: References: <4DAEFC6C.30906@gfz-potsdam.de> Message-ID: <4DAF045D.7060609@gfz-potsdam.de> Thanks for references, Jed! Yes, they have 2D problem. Regards, Alexander On 20.04.2011 17:45, Jed Brown wrote: > On Wed, Apr 20, 2011 at 17:31, Alexander Grayver > > wrote: > > I came across with paper in one of the referenced journal where > authors claim that LU decomposition has complexity of > O(n^1.5) and one solution using factorized matrix can be > calculated in O(n*logn). > > > These are the bounds for 2D problems with optimal ordering. For 3D, > the bounds are O(n^2) time and O(n^{4/3}) space. > > Alan George, Joseph Liu, Computer Solution of Large Sparse Positive > Definite Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981. > > S.C. Eisenstat, M.H. Schultz, A.H. Sherman, Applications of an element > model for Gaussian elimination, in: Sparse Matrix Computations (Proc. > Symp., Argonne Nat. Lab., Lemont, Ill., 1975), Academic Press, New > York, 1976, pp. 85?96. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Apr 20 11:31:03 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Apr 2011 11:31:03 -0500 Subject: [petsc-users] DaSetGetMatrix In-Reply-To: <02985683a2f41fe8ced5b122db8f4a1c.squirrel@arcon.univ-st-etienne.fr> References: <02985683a2f41fe8ced5b122db8f4a1c.squirrel@arcon.univ-st-etienne.fr> Message-ID: <8F411CF4-69B6-47E3-B563-75AF52F77BBA@mcs.anl.gov> Please send a complete error report with the entire error message to petsc-maint at mcs.anl.gov without the information we cannot even begin to guess what the issue is. Barry On Apr 20, 2011, at 10:32 AM, domenico.borzacchiello at univ-st-etienne.fr wrote: > Hi, > > I'm running my code (3D Stokes Solver with MAC arrangement) pretty fine so > far with a FieldSplit/Schur Preconditioning. > > I had to write my own DAGetMatrix routine cause it was using too much > memory (5 times the required size) for the matrices. The code works with > 1 2 3 5 etc procs (I presume with any number of procs for which are only > possible 1D cartesian topologies of communicators i.e. any prime numbers) > then if I run with 4 procs for example it stops when assembling the MPIAIJ > matrix with the following error: > > [1]PETSC ERROR: Nonconforming object sizes! > [1]PETSC ERROR: Local scatter sizes don't match! > > What could be causing the error? > > Thank you, > Domenico > > here's the getmatrix function I'm using > > #undef __FUNCT__ > #define __FUNCT__ "DAGetMatrix_User_2" > PetscErrorCode DAGetMatrix_User_2(DA da,const MatType mtype,Mat *J) > { > PetscErrorCode ierr; > Mat A; > PetscInt xm,ym,zm,dim,dof,starts[3],dims[3]; > const MatType Atype; > void > (*aij)(void)=PETSC_NULL,(*baij)(void)=PETSC_NULL,(*sbaij)(void)=PETSC_NULL; > ISLocalToGlobalMapping ltog,ltogb; > > PetscFunctionBegin; > ierr = DAGetInfo(da,&dim, 0,0,0, 0,0,0,&dof,0,0,0);CHKERRQ(ierr); > if (dim != 3) SETERRQ(PETSC_ERR_ARG_WRONG,"Expected DA to be 3D"); > > ierr = DAGetCorners(da,0,0,0,&zm,&ym,&xm);CHKERRQ(ierr); > ierr = DAGetISLocalToGlobalMapping(da,<og);CHKERRQ(ierr); > ierr = DAGetISLocalToGlobalMappingBlck(da,<ogb);CHKERRQ(ierr); > ierr = MatCreate(((PetscObject)da)->comm,&A);CHKERRQ(ierr); > ierr = > MatSetSizes(A,dof*xm*ym*zm,dof*xm*ym*zm,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr); > ierr = MatSetType(A,mtype);CHKERRQ(ierr); > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > ierr = MatSeqAIJSetPreallocation(A,17,PETSC_NULL);CHKERRQ(ierr); > ierr = > MatMPIAIJSetPreallocation(A,17,PETSC_NULL,12,PETSC_NULL);CHKERRQ(ierr); > ierr = MatSeqBAIJSetPreallocation(A,dof,7,PETSC_NULL);CHKERRQ(ierr); > ierr = > MatMPIBAIJSetPreallocation(A,dof,7,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); > ierr = MatSeqSBAIJSetPreallocation(A,dof,4,PETSC_NULL);CHKERRQ(ierr); > ierr = > MatMPISBAIJSetPreallocation(A,dof,4,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); > ierr = MatSetDA(A,da); > ierr = MatSetFromOptions(A); > ierr = MatGetType(A,&Atype); > ierr = MatSetBlockSize(A,dof);CHKERRQ(ierr); > ierr = MatSetLocalToGlobalMapping(A,ltog);CHKERRQ(ierr); > ierr = MatSetLocalToGlobalMappingBlock(A,ltogb);CHKERRQ(ierr); > ierr = > DAGetGhostCorners(da,&starts[0],&starts[1],&starts[2],&dims[0],&dims[1],&dims[2]);CHKERRQ(ierr); > ierr = MatSetStencil(A,dim,dims,starts,dof);CHKERRQ(ierr); > *J = A; > PetscFunctionReturn(0); > } > From Debao.Shao at brion.com Wed Apr 20 20:23:43 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Wed, 20 Apr 2011 18:23:43 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: 
<3FAF21D8-DC2B-40EE-A79B-1FCF4DCAABD6@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> <3FAF21D8-DC2B-40EE-A79B-1FCF4DCAABD6@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E6BE4@EX03> Understand, Barry, Thanks a lot. -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 9:19 PM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 10:21 PM, Debao Shao wrote: > Hi, Barry: > > I'm confused, > 1), if we can't use "MatZeroEntries" before MatAssembly, then, how do we do initialization for M? When you create a sparse matrix it automatically has no non-zero values in it so there is no reason to call MatZeroEntries() on it. But I was wrong it is ok to call MatZeroEntries() on it and it will not destroy the preallocation > 2), if we can't use "MatCopy" before MatAssembly, then, how to fill up M from another matrix? You can copy, say A, to M with MatCopy() but M will get the same nonzero structure as A, if you provided "extra" preallocation information in M that will be lost in the copy. So it is not efficient to copy into a matrix M and then start putting as bunch of new nonzero locations into M. Barry > > Can you give a sample code for the right usage? Thanks very much. > > Regards, > Debao > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 10:59 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > >> Here is my sample code: >> ierr = MatZeroEntries( M ); assert( ierr == 0); >> ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >> ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); >> >> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > Barry > > I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. 
> >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, April 20, 2011 9:40 AM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >> >>> Hi, Barry: >>> >>> Thanks for the reply. >>> >>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >> >> Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >> >> If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). >> >> Barry >> >> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >> >>> >>> Any suggestions? >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Friday, April 15, 2011 9:25 PM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> Debao, >>> >>> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >>> >>> Barry >>> >>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>> >>>> Dear Petsc: >>>> >>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>> >>>> My libpetsc.a is built as follows: >>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>> 2, make all; >>>> >>>> It's very appreciated to get your reply. >>>> >>>> Thanks a lot, >>>> Debao >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). 
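A short sketch of the MatZeroRows() advice quoted in this thread (illustrative only: A is assumed to be already preallocated, filled with explicit zeros where values will arrive later, and assembled; the call signatures are those of petsc-3.1, where MatZeroRows() takes only the diagonal value):

#include "petscmat.h"

/* Zero a set of rows (e.g. Dirichlet rows) without losing the preallocated
   nonzero pattern, so the rows can be refilled later with no new mallocs. */
PetscErrorCode ZeroRowsKeepPattern(Mat A,PetscInt nrows,const PetscInt rows[])
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatSetOption(A,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE);CHKERRQ(ierr);
  /* rows are zeroed and 1.0 is placed on their diagonal; the off-diagonal
     locations stay in the pattern as explicit zeros instead of being removed */
  ierr = MatZeroRows(A,nrows,rows,1.0);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With the option set, refilling those rows on a later MatSetValues()/MatAssemblyEnd() pass should no longer trigger the MatSeqXAIJReallocateAIJ() calls mentioned earlier in the thread.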
From zonexo at gmail.com Thu Apr 21 07:07:06 2011 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 21 Apr 2011 14:07:06 +0200 Subject: [petsc-users] Improving performance for parallel CFD simulation In-Reply-To: References: <4DA818AE.80900@tu-dresden.de> <4DA8232C.6070007@tu-dresden.de> Message-ID: <4DB01DEA.5050009@gmail.com> Hi, Since there is a similar topic on improving performance of CFD using PETSc earlier, I hope to improve my current CFD solver performance too. I wrote it in Fortran90 with MPI. It is a Immersed Boundary Method (IBM) Navier-Stokes Cartesian grid solver in 2D, although I hope to extend it to 3D in the future. I solve the NS equations using fractional step which results in 2 equations - the momentum and Poisson equations. They are then linearized into systems of equations. I currently solve the momentum solver using PETSc with KSPBCGS. For the Poisson equation, I was using hypre's BoomerAMG. But I have changed to using the geometric multigrid solver from hypre since it's slightly faster. Currently, I am dividing my grid along the y direction for MPI into equal size for each processor. I guess this is not very efficient since beyond 4 processors, the scaling factor drops. I think implementing the distributed array should increase performance, is that so? I wonder how difficult it is because most examples are in C and I am not so used to that. I am also using staggered grid but I will most likely changed to a collocated grid arrangement. What other suggestions do you have to improve the solver's performance using PETSc? Thank you very much. Yours sincerely, TAY wee-beng From knepley at gmail.com Thu Apr 21 07:28:45 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Apr 2011 07:28:45 -0500 Subject: [petsc-users] Improving performance for parallel CFD simulation In-Reply-To: <4DB01DEA.5050009@gmail.com> References: <4DA818AE.80900@tu-dresden.de> <4DA8232C.6070007@tu-dresden.de> <4DB01DEA.5050009@gmail.com> Message-ID: On Thu, Apr 21, 2011 at 7:07 AM, TAY wee-beng wrote: > Hi, > > Since there is a similar topic on improving performance of CFD using PETSc > earlier, I hope to improve my current CFD solver performance too. > > I wrote it in Fortran90 with MPI. It is a Immersed Boundary Method (IBM) > Navier-Stokes Cartesian grid solver in 2D, although I hope to extend it to > 3D in the future. I solve the NS equations using fractional step which > results in 2 equations - the momentum and Poisson equations. They are then > linearized into systems of equations. > > I currently solve the momentum solver using PETSc with KSPBCGS. For the > Poisson equation, I was using hypre's BoomerAMG. But I have changed to using > the geometric multigrid solver from hypre since it's slightly faster. > > Currently, I am dividing my grid along the y direction for MPI into equal > size for each processor. I guess this is not very efficient since beyond 4 > processors, the scaling factor drops. > > I think implementing the distributed array should increase performance, is > that so? I wonder how difficult it is because most examples are in C and I > am not so used to that. I am also using staggered grid but I will most > likely changed to a collocated grid arrangement. > It will definitely improve scalability. I don't think conversion should be that hard. Matt What other suggestions do you have to improve the solver's performance using > PETSc? > > Thank you very much. 
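A bare-bones sketch of the distributed-array layout being suggested (C shown; the Fortran90 interface is analogous; the grid size, dof count and stencil width are placeholders rather than values from the poster's solver, and the calls are the DA interface of petsc-3.1):

#include "petscda.h"
#include "petscksp.h"

int main(int argc,char **argv)
{
  DA             da;
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
  /* 2D grid, 1 dof per node (e.g. the pressure Poisson equation), stencil width 1;
     PETSC_DECIDE lets PETSc pick a 2D processor grid instead of slicing along y only */
  ierr = DACreate2d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_STAR,
                    128,128,PETSC_DECIDE,PETSC_DECIDE,1,1,
                    PETSC_NULL,PETSC_NULL,&da);CHKERRQ(ierr);
  ierr = DAGetMatrix(da,MATAIJ,&A);CHKERRQ(ierr);   /* already preallocated for the stencil */
  ierr = DACreateGlobalVector(da,&x);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&b);CHKERRQ(ierr);

  /* ... fill A and b over the local patch from DAGetCorners(), e.g. with
     MatSetValuesStencil() ... */

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_type hypre on the command line */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = DADestroy(da);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}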
> > Yours sincerely, > > TAY wee-beng > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vishy at stat.purdue.edu Thu Apr 21 09:39:40 2011 From: vishy at stat.purdue.edu (S V N Vishwanathan) Date: Thu, 21 Apr 2011 10:39:40 -0400 Subject: [petsc-users] Question on writing a large matrix Message-ID: <87k4ensm6r.wl%vishy@stat.purdue.edu> Hi I am using the attached code to convert a matrix from a rather inefficient ascii format (each line is a row and contains a series of idx:val pairs) to the PETSc binary format. Some of the matrices that I am working with are rather huge (50GB ascii file) and cannot be assembled on a single processor. When I use the attached code the matrix assembly across machines seems to be fairly fast. However, dumping the assembled matrix out to disk seems to be painfully slow. Any suggestions on how to speed things up will be deeply appreciated. vishy -------------- next part -------------- A non-text attachment was scrubbed... Name: libsvm-to-binary.cpp Type: application/octet-stream Size: 15762 bytes Desc: not available URL: From aron.ahmadia at kaust.edu.sa Thu Apr 21 09:44:35 2011 From: aron.ahmadia at kaust.edu.sa (Aron Ahmadia) Date: Thu, 21 Apr 2011 17:44:35 +0300 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <87k4ensm6r.wl%vishy@stat.purdue.edu> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> Message-ID: Hi Vish, What is 'painfully slow'. Do you have a profile or an estimate in terms of GB/s? Have you taken a look at your process's memory allocation and checked to see if it is swapping? My first guess would be that you are exceeding RAM and your program is thrashing as parts of the page table get swapped to and from disk mid-run. A On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan wrote: > Hi > > I am using the attached code to convert a matrix from a rather > inefficient ascii format (each line is a row and contains a series of > idx:val pairs) to the PETSc binary format. Some of the matrices that I > am working with are rather huge (50GB ascii file) and cannot be > assembled on a single processor. When I use the attached code the matrix > assembly across machines seems to be fairly fast. However, dumping the > assembled matrix out to disk seems to be painfully slow. Any suggestions > on how to speed things up will be deeply appreciated. > > vishy > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vishy at stat.purdue.edu Thu Apr 21 11:59:10 2011 From: vishy at stat.purdue.edu (S V N Vishwanathan) Date: Thu, 21 Apr 2011 12:59:10 -0400 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: References: <87k4ensm6r.wl%vishy@stat.purdue.edu> Message-ID: <87hb9rsfq9.wl%vishy@stat.purdue.edu> > What is 'painfully slow'. ?Do you have a profile or an estimate in > terms of GB/s? ?Have you taken a look at your process's memory > allocation and checked to see if it is swapping? ?My first guess would > be that you are exceeding RAM and your program is thrashing as parts > of the page table get swapped to and from disk mid-run. A single machine does not have enough memory to hold the entire matrix. That is why I have to assemble it in parallel. When distributed across 8 machines the assembly seemed to finish in under an hr. 
However, my program tried to write the matrix to file since yesterday night and eventually crashed. The log just indicated [1]PETSC ERROR: Caught signal number 1 Hang up: Some other process (or the batch system) has told this process to end Most likely because it tried to allocate a large chunk of memory and failed. I investigated using a smaller matrix and ran the code with the -info flag (see below). What worries me are these lines: Writing data in binary format to adult9.train.x .... >>>> I call MatView in my code here [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 Is MatView reconstructing the matrix at the root node? In that case the program will definitely fail due to lack of memory. Please let me know if I you need any other information or if I can run any other tests to help investigate. vishy mpiexec -n 2 ./libsvm-to-binary -in ../LibSVM/biclass/adult9/adult9.train.txt -data adult9.train.x -labels adult9.train.y -info [0] PetscInitialize(): PETSc successfully started: number of processors = 2 [1] PetscInitialize(): PETSc successfully started: number of processors = 2 [1] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu [0] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu No libsvm test file specified! Reading libsvm train file at ../LibSVM/biclass/adult9/adult9.train.txt [0] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt [1] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374780 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374782 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 [0] PetscCommDuplicate(): returning tag 2147483642 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 [1] PetscCommDuplicate(): returning tag 2147483642 [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374780 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscCommDuplicate(): returning tag 2147483646 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 [1] PetscCommDuplicate(): returning tag 2147483646 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 0 [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 124; storage space: 225806 unneeded,0 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 [1] Mat_CheckInode(): Found 3257 nodes of 16281. Limit used: 5. 
Using Inode routines [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 124; storage space: 0 unneeded,225786 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 [0] Mat_CheckInode(): Found 16280 nodes out of 16280 rows. Not using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscCommDuplicate(): returning tag 2147483645 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [0] PetscCommDuplicate(): returning tag 2147483638 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 [1] PetscCommDuplicate(): returning tag 2147483645 [1] PetscCommDuplicate(): returning tag 2147483638 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 [1] PetscCommDuplicate(): returning tag 2147483644 [1] PetscCommDuplicate(): returning tag 2147483637 [0] PetscCommDuplicate(): returning tag 2147483644 [0] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): returning tag 2147483632 [0] PetscCommDuplicate(): returning tag 2147483632 [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterCreate(): General case: MPI to Seq [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 0; storage space: 0 unneeded,0 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 Writing data in binary format to adult9.train.x [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 [0] PetscCommDuplicate(): returning tag 2147483628 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 123; storage space: 18409 unneeded,225806 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 [1] PetscCommDuplicate(): returning tag 2147483628 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 [1] PetscCommDuplicate(): returning tag 2147483627 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374777 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374777 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374777 Writing labels in binary format to adult9.train.y [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 [0] PetscCommDuplicate(): returning tag 2147483627 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374782 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374782 [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374782 [1] PetscFinalize(): PetscFinalize() called [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [0] PetscCommDestroy(): Deleting PETSc 
MPI_Comm -2080374780 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 [0] PetscFinalize(): PetscFinalize() called > > On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan wrote: > > Hi > > I am using the attached code to convert a matrix from a rather > inefficient ascii format (each line is a row and contains a series of > idx:val pairs) to the PETSc binary format. Some of the matrices that I > am working with are rather huge (50GB ascii file) and cannot be > assembled on a single processor. When I use the attached code the matrix > assembly across machines seems to be fairly fast. However, dumping the > assembled matrix out to disk seems to be painfully slow. Any suggestions > on how to speed things up will be deeply appreciated. > > vishy > > From bsmith at mcs.anl.gov Thu Apr 21 12:25:33 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Apr 2011 12:25:33 -0500 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <87hb9rsfq9.wl%vishy@stat.purdue.edu> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> <87hb9rsfq9.wl%vishy@stat.purdue.edu> Message-ID: <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> On Apr 21, 2011, at 11:59 AM, S V N Vishwanathan wrote: > >> What is 'painfully slow'. Do you have a profile or an estimate in >> terms of GB/s? Have you taken a look at your process's memory >> allocation and checked to see if it is swapping? My first guess would >> be that you are exceeding RAM and your program is thrashing as parts >> of the page table get swapped to and from disk mid-run. > > A single machine does not have enough memory to hold the entire > matrix. That is why I have to assemble it in parallel. When distributed > across 8 machines the assembly seemed to finish in under an hr. It has not assembled the matrix in an hour. It is working all night to assemble the matrix, the problem is that you are not preallocating the nonzeros per row with MatMPIAIJSetPreallocation() when pre allocation is correct it will always print 0 for Number of mallocs. The actual writing of the parallel matrix to the binary file will take at most minutes. Barry > However, > my program tried to write the matrix to file since yesterday night and > eventually crashed. The log just indicated > > [1]PETSC ERROR: Caught signal number 1 Hang up: Some other process (or the batch system) has told this process to end > > Most likely because it tried to allocate a large chunk of memory and > failed. > > I investigated using a smaller matrix and ran the code with the -info > flag (see below). What worries me are these lines: > > Writing data in binary format to adult9.train.x > .... >>>> I call MatView in my code here > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 > > Is MatView reconstructing the matrix at the root node? In that case the > program will definitely fail due to lack of memory. > > Please let me know if I you need any other information or if I can run > any other tests to help investigate. 
> > vishy > > > > > mpiexec -n 2 ./libsvm-to-binary -in ../LibSVM/biclass/adult9/adult9.train.txt -data adult9.train.x -labels adult9.train.y -info > > [0] PetscInitialize(): PETSc successfully started: number of processors = 2 > [1] PetscInitialize(): PETSc successfully started: number of processors = 2 > [1] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu > [0] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu > No libsvm test file specified! > > Reading libsvm train file at ../LibSVM/biclass/adult9/adult9.train.txt > [0] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt > [1] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374780 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374782 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 > [0] PetscCommDuplicate(): returning tag 2147483642 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 > [1] PetscCommDuplicate(): returning tag 2147483642 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374780 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 > [0] PetscCommDuplicate(): returning tag 2147483646 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 > [1] PetscCommDuplicate(): returning tag 2147483646 > [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 0 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 124; storage space: 225806 unneeded,0 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 > [1] Mat_CheckInode(): Found 3257 nodes of 16281. Limit used: 5. Using Inode routines > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 124; storage space: 0 unneeded,225786 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 > [0] Mat_CheckInode(): Found 16280 nodes out of 16280 rows. 
Not using Inode routines > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 > [0] PetscCommDuplicate(): returning tag 2147483645 > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > [0] PetscCommDuplicate(): returning tag 2147483638 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 > [1] PetscCommDuplicate(): returning tag 2147483645 > [1] PetscCommDuplicate(): returning tag 2147483638 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 > [1] PetscCommDuplicate(): returning tag 2147483644 > [1] PetscCommDuplicate(): returning tag 2147483637 > [0] PetscCommDuplicate(): returning tag 2147483644 > [0] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): returning tag 2147483632 > [0] PetscCommDuplicate(): returning tag 2147483632 > [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter > [0] VecScatterCreate(): General case: MPI to Seq > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 0; storage space: 0 unneeded,0 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 > > Writing data in binary format to adult9.train.x > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 > [0] PetscCommDuplicate(): returning tag 2147483628 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 123; storage space: 18409 unneeded,225806 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 > [1] PetscCommDuplicate(): returning tag 2147483628 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 > [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 > [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 > [1] PetscCommDuplicate(): returning tag 2147483627 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374777 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374777 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374777 > > Writing labels in binary format to adult9.train.y > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 > [0] PetscCommDuplicate(): returning tag 2147483627 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374782 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374782 > [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374782 > [1] PetscFinalize(): PetscFinalize() called > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 > [0] PetscFinalize(): 
PetscFinalize() called > > >> >> On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan wrote: >> >> Hi >> >> I am using the attached code to convert a matrix from a rather >> inefficient ascii format (each line is a row and contains a series of >> idx:val pairs) to the PETSc binary format. Some of the matrices that I >> am working with are rather huge (50GB ascii file) and cannot be >> assembled on a single processor. When I use the attached code the matrix >> assembly across machines seems to be fairly fast. However, dumping the >> assembled matrix out to disk seems to be painfully slow. Any suggestions >> on how to speed things up will be deeply appreciated. >> >> vishy >> >> > From longmin.ran at gmail.com Fri Apr 22 05:31:03 2011 From: longmin.ran at gmail.com (Longmin RAN) Date: Fri, 22 Apr 2011 12:31:03 +0200 Subject: [petsc-users] "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program to hang Message-ID: Dear all, I'm using superlu within petsc to solve systems with symmetric sparse matrix. In superlu manual I read that MMD_AT_PLUS_A column permutation, together with little diagonal pivot threshold, should be used for symmetric mode. So I launch my program with the following options: -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu -mat_superlu_diagpivotthresh 0.001 -mat_superlu_symmetricmode TRUE -mat_superlu_colperm MMD_AT_PLUS_A It seems that "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program to hang: when I deleted this option, my calculation is executed correctly. But It's always interesting to be able to use the column permutation option. Do you guys have any ideas ? Cheers, Longmin From hzhang at mcs.anl.gov Fri Apr 22 09:56:11 2011 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 22 Apr 2011 09:56:11 -0500 Subject: [petsc-users] "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program to hang In-Reply-To: References: Message-ID: Longmin : I cannot reproduce it with petsc example: petsc-dev/src/ksp/ksp/examples/tutorials>./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu -mat_superlu_diagpivotthresh 0.001 -mat_superlu_symmetricmode TRUE -mat_superlu_colperm MMD_AT_PLUS_A Norm of error < 1.e-12 iterations 1 Please use a debugger to check where it hangs. Then use valgrind to check posible memory corruption. Hong > > I'm using superlu within petsc to solve systems with symmetric sparse > matrix. In superlu manual I read that MMD_AT_PLUS_A column > permutation, together with little diagonal pivot threshold, should be > used for symmetric mode. So I launch my program with the following > options: > ?-ksp_type preonly > ?-pc_type lu > ?-pc_factor_mat_solver_package superlu > ?-mat_superlu_diagpivotthresh 0.001 > ?-mat_superlu_symmetricmode TRUE > ?-mat_superlu_colperm MMD_AT_PLUS_A > > It seems that "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program > to hang: when I deleted this option, my calculation is executed > correctly. But It's always interesting to be able to use the column > permutation option. Do you guys have any ideas ? > > > Cheers, > > Longmin > From gaurish108 at gmail.com Fri Apr 22 15:44:55 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Fri, 22 Apr 2011 16:44:55 -0400 Subject: [petsc-users] how good is PETSc+GPU's ? Message-ID: I would like to know how well PETSc works with GPU's and the kind of Speed-ups one can get if one uses PETSc along with GPU's. Has it been used for scientific studies so far? 
[ I think this bit of information ( http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#gpus ) has been that way for a long time, and hence the above question. This 2010 article( http://www.mcs.anl.gov/petsc/petsc-2/features/gpus.pdf ) does not mention any comparative studies of the preliminary implementation of PETSc for use with GPU's with other software libraries. ] Sincere thanks, Gaurish -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Apr 22 16:15:52 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 22 Apr 2011 16:15:52 -0500 Subject: [petsc-users] how good is PETSc+GPU's ? In-Reply-To: References: Message-ID: On Apr 22, 2011, at 3:44 PM, Gaurish Telang wrote: > I would like to know how well PETSc works with GPU's and the kind of Speed-ups one can get if one uses PETSc along with GPU's. > > Has it been used for scientific studies so far? > > [ I think this bit of information ( http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#gpus ) has been that way for a long time, and hence the above question. > > This 2010 article( http://www.mcs.anl.gov/petsc/petsc-2/features/gpus.pdf ) does n > ot mention any comparative studies of the preliminary implementation of PETSc for use with GPU's with other software libraries. ] > There are no such studies. PETSc uses the CUSP and THRUST libraries of Nvidia on the GPU therefor the performance will be the same as using CUSP and THRUST directly or of any other library that uses CUSP and THRUST. Just like with regular CPUs the performance of sparse matrix iterative methods (floating point speedwise) is determined by the hardware so there won't be much difference between different libraries that do the "right thing". If you are trying to decide between two packages to use for solving some algebraic systems you need to compare them yourself, you cannot rely on what people say. If you are deciding between using a package and doing it yourself you might as well use the package since you can always add whatever custom stuff yourself if you think it is better, so there is really no downside to using a package. Barry > Sincere thanks, > > Gaurish From vishy at stat.purdue.edu Sat Apr 23 12:36:35 2011 From: vishy at stat.purdue.edu (S V N Vishwanathan) Date: Sat, 23 Apr 2011 13:36:35 -0400 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> <87hb9rsfq9.wl%vishy@stat.purdue.edu> <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> Message-ID: <87vcy4g998.wl%vishy@stat.purdue.edu> Barry, > It has not assembled the matrix in an hour. It is working all night > to assemble the matrix, the problem is that you are not > preallocating the nonzeros per row with MatMPIAIJSetPreallocation() > when pre allocation is correct it will always print 0 for Number of > mallocs. The actual writing of the parallel matrix to the binary > file will take at most minutes. You were absolutely right! I had not set the preallocation properly and hence the code was painfully slow. I fixed that issue (see attached code) and now it runs much faster. However, I am having a different problem now. When I run the code for smaller matrices (less than a million rows) everything works well. However, when working with large matrices (e.g. 
2.8 million rows x 1157 columns) writing the matrix to file dies with the following message: Fatal error in MPI_Recv: Other MPI error Any hints on how to solve this problem or are deeply appreciated. vishy The output of running the code with the -info flag is as follows: [0] PetscInitialize(): PETSc successfully started: number of processors = 4 [0] PetscInitialize(): Running on machine: rossmann-b001.rcac.purdue.edu [3] PetscInitialize(): PETSc successfully started: number of processors = 4 [3] PetscInitialize(): Running on machine: rossmann-b004.rcac.purdue.edu No libsvm test file specified! Reading libsvm train file at /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [2] PetscInitialize(): PETSc successfully started: number of processors = 4 [2] PetscInitialize(): Running on machine: rossmann-b003.rcac.purdue.edu [3] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [2] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [0] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [1] PetscInitialize(): PETSc successfully started: number of processors = 4 [1] PetscInitialize(): Running on machine: rossmann-b002.rcac.purdue.edu [1] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt m=100000 m=200000 m=300000 m=400000 m=500000 m=600000 m=700000 m=800000 m=900000 m=1000000 m=1100000 m=1200000 m=1300000 m=1400000 m=1500000 m=1600000 m=1700000 m=1800000 m=1900000 m=2000000 m=2100000 m=2200000 m=2300000 m=2400000 m=2500000 m=2600000 m=2700000 m=2800000 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [2] PetscCommDuplicate(): returning tag 2147483647 [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [2] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [2] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [3] PetscCommDuplicate(): returning tag 2147483647 [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [3] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [3] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [1] 
PetscCommDuplicate(): returning tag 2147483647 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [2] PetscCommDuplicate(): returning tag 2147483647 [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [3] PetscCommDuplicate(): returning tag 2147483647 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [3] PetscCommDuplicate(): returning tag 2147483642 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [2] PetscCommDuplicate(): returning tag 2147483642 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [1] PetscCommDuplicate(): returning tag 2147483642 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): returning tag 2147483642 [0] MatSetUpPreallocation(): Warning not preallocating matrix storage [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [0] PetscCommDuplicate(): returning tag 2147483646 [2] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [2] PetscCommDuplicate(): returning tag 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [2] PetscCommDuplicate(): returning tag 2147483646 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [1] PetscCommDuplicate(): returning tag 2147483646 [3] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [3] PetscCommDuplicate(): returning tag 2147483647 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [3] PetscCommDuplicate(): returning tag 2147483646 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 0 [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. 
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 290; storage space: 0 unneeded,202300000 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [1] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [3] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [2] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [0] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [3] PetscCommDuplicate(): returning tag 2147483645 [3] PetscCommDuplicate(): returning tag 2147483638 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [2] PetscCommDuplicate(): returning tag 2147483645 [2] PetscCommDuplicate(): returning tag 2147483638 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [0] PetscCommDuplicate(): returning tag 2147483645 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [0] PetscCommDuplicate(): returning tag 2147483638 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [1] PetscCommDuplicate(): returning tag 2147483645 [1] PetscCommDuplicate(): returning tag 2147483638 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [0] PetscCommDuplicate(): returning tag 2147483644 [0] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [1] PetscCommDuplicate(): returning tag 2147483644 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [3] PetscCommDuplicate(): returning tag 2147483644 [3] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): returning tag 2147483637 [0] PetscCommDuplicate(): returning tag 2147483632 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [2] PetscCommDuplicate(): returning tag 2147483644 [2] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): returning tag 2147483632 [2] PetscCommDuplicate(): returning tag 2147483632 [3] PetscCommDuplicate(): returning tag 2147483632 [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterCreate(): General case: MPI to Seq [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [2] PetscCommDuplicate(): returning tag 2147483628 [3] MatAssemblyEnd_SeqAIJ(): 
Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [3] PetscCommDuplicate(): returning tag 2147483628 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [1] PetscCommDuplicate(): returning tag 2147483628 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 Writing data in binary format to /scratch/lustreA/v/vishy/biclass/ocr.train.x [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): returning tag 2147483628 APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1) -------------- next part -------------- A non-text attachment was scrubbed... Name: libsvm-to-binary.cpp Type: application/octet-stream Size: 15449 bytes Desc: not available URL:
From bsmith at mcs.anl.gov Sat Apr 23 13:39:35 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Apr 2011 13:39:35 -0500 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <87vcy4g998.wl%vishy@stat.purdue.edu> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> <87hb9rsfq9.wl%vishy@stat.purdue.edu> <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> <87vcy4g998.wl%vishy@stat.purdue.edu> Message-ID: <92BAD5B4-F576-4FA5-B4DD-6A3666851CAD@mcs.anl.gov> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#64-bit-indices On Apr 23, 2011, at 12:36 PM, S V N Vishwanathan wrote: > Barry, > >> It has not assembled the matrix in an hour. It is working all night >> to assemble the matrix, the problem is that you are not >> preallocating the nonzeros per row with MatMPIAIJSetPreallocation() >> when pre allocation is correct it will always print 0 for Number of >> mallocs. The actual writing of the parallel matrix to the binary >> file will take at most minutes. > > You were absolutely right! I had not set the preallocation properly and > hence the code was painfully slow. I fixed that issue (see attached > code) and now it runs much faster. However, I am having a different > problem now. When I run the code for smaller matrices (less than a > million rows) everything works well. However, when working with large > matrices (e.g. 2.8 million rows x 1157 columns) writing the matrix to > file dies with the following message: > > Fatal error in MPI_Recv: Other MPI error > > Any hints on how to solve this problem or are deeply appreciated. > > vishy > > The output of running the code with the -info flag is as follows: > > [0] PetscInitialize(): PETSc successfully started: number of processors = 4 > [0] PetscInitialize(): Running on machine: rossmann-b001.rcac.purdue.edu > [3] PetscInitialize(): PETSc successfully started: number of processors = 4 > [3] PetscInitialize(): Running on machine: rossmann-b004.rcac.purdue.edu > No libsvm test file specified!
> > Reading libsvm train file at /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [2] PetscInitialize(): PETSc successfully started: number of processors = 4 > [2] PetscInitialize(): Running on machine: rossmann-b003.rcac.purdue.edu > [3] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [2] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [0] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [1] PetscInitialize(): PETSc successfully started: number of processors = 4 > [1] PetscInitialize(): Running on machine: rossmann-b002.rcac.purdue.edu > [1] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > m=100000 > m=200000 > m=300000 > m=400000 > m=500000 > m=600000 > m=700000 > m=800000 > m=900000 > m=1000000 > m=1100000 > m=1200000 > m=1300000 > m=1400000 > m=1500000 > m=1600000 > m=1700000 > m=1800000 > m=1900000 > m=2000000 > m=2100000 > m=2200000 > m=2300000 > m=2400000 > m=2500000 > m=2600000 > m=2700000 > m=2800000 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [2] PetscCommDuplicate(): returning tag 2147483647 > [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [2] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [2] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [3] PetscCommDuplicate(): returning tag 2147483647 > [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [3] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [3] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [1] 
PetscCommDuplicate(): returning tag 2147483647 > [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [2] PetscCommDuplicate(): returning tag 2147483647 > [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [3] PetscCommDuplicate(): returning tag 2147483647 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [3] PetscCommDuplicate(): returning tag 2147483642 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [2] PetscCommDuplicate(): returning tag 2147483642 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [1] PetscCommDuplicate(): returning tag 2147483642 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): returning tag 2147483642 > [0] MatSetUpPreallocation(): Warning not preallocating matrix storage > [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [0] PetscCommDuplicate(): returning tag 2147483646 > [2] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [2] PetscCommDuplicate(): returning tag 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [2] PetscCommDuplicate(): returning tag 2147483646 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [1] PetscCommDuplicate(): returning tag 2147483646 > [3] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [3] PetscCommDuplicate(): returning tag 2147483647 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [3] PetscCommDuplicate(): returning tag 2147483646 > [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 0 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. 
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 290; storage space: 0 unneeded,202300000 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [1] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [3] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [2] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [0] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [3] PetscCommDuplicate(): returning tag 2147483645 > [3] PetscCommDuplicate(): returning tag 2147483638 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [2] PetscCommDuplicate(): returning tag 2147483645 > [2] PetscCommDuplicate(): returning tag 2147483638 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [0] PetscCommDuplicate(): returning tag 2147483645 > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > [0] PetscCommDuplicate(): returning tag 2147483638 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [1] PetscCommDuplicate(): returning tag 2147483645 > [1] PetscCommDuplicate(): returning tag 2147483638 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [0] PetscCommDuplicate(): returning tag 2147483644 > [0] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [1] PetscCommDuplicate(): returning tag 2147483644 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [3] PetscCommDuplicate(): returning tag 2147483644 > [3] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): returning tag 2147483637 > [0] PetscCommDuplicate(): returning tag 2147483632 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [2] PetscCommDuplicate(): returning tag 2147483644 > [2] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): returning tag 2147483632 > [2] PetscCommDuplicate(): returning tag 2147483632 > [3] PetscCommDuplicate(): returning tag 2147483632 > [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter > [0] VecScatterCreate(): General case: MPI to Seq > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [2] PetscCommDuplicate(): Using internal PETSc communicator 
1140850688 -2080374784 > [2] PetscCommDuplicate(): returning tag 2147483628 > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [3] PetscCommDuplicate(): returning tag 2147483628 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [1] PetscCommDuplicate(): returning tag 2147483628 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > > Writing data in binary format to /scratch/lustreA/v/vishy/biclass/ocr.train.x > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): returning tag 2147483628 > APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1) > >
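The FAQ entry linked above is the relevant one for the failure reported in this thread: according to the -info output, each of the four processes stores 202,300,000 + 606,900,000 = 809,200,000 nonzeros, i.e. about 3.24 billion in total, which exceeds the largest 32-bit PetscInt (2,147,483,647), so the nonzero counts and offsets overflow once the assembled matrix is gathered and written as a single object. The remedy that FAQ describes is rebuilding PETSc with 64-bit indices and recompiling the application against the new build, roughly (the bracketed part stands for whatever configure options were used originally and is not a real flag):

  ./configure --with-64-bit-indices [original configure options]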
1140850688 -2080374784 > [2] PetscCommDuplicate(): returning tag 2147483628 > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [3] PetscCommDuplicate(): returning tag 2147483628 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [1] PetscCommDuplicate(): returning tag 2147483628 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > > Writing data in binary format to /scratch/lustreA/v/vishy/biclass/ocr.train.x > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): returning tag 2147483628 > APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1) > > From bartlomiej.wach at yahoo.pl Sat Apr 23 18:33:27 2011 From: bartlomiej.wach at yahoo.pl (=?utf-8?B?QmFydMWCb21pZWogVw==?=) Date: Sun, 24 Apr 2011 00:33:27 +0100 (BST) Subject: [petsc-users] Identifying processes In-Reply-To: Message-ID: <497548.87377.qm@web28309.mail.ukl.yahoo.com> Hello, I was wondering if anyone could help me to identify processes in parallel execution. I run my app with mpiexec -n 2 and would like to be able to pick a single core to perform a task and be the only one to print instead of having n cores repeat the same thing. PETSC_COMM_WORLD and PETSC_COMM_SELF both cause all processes to print for me, like there is no difference. Thank you Bartholomew -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Apr 23 18:37:59 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Apr 2011 18:37:59 -0500 Subject: [petsc-users] Identifying processes In-Reply-To: <497548.87377.qm@web28309.mail.ukl.yahoo.com> References: <497548.87377.qm@web28309.mail.ukl.yahoo.com> Message-ID: <749EB116-6F30-4C97-99D1-8903539324C3@mcs.anl.gov> PetscMPIInt rank; MPI_Comm_rank(PETSC_COMM_WORLD,&rank); if (!rank) { do something } If both cores do something there there is a mismatch with the mpiexec that you are are running, it may not be the right mpiexec for the MPI includes and library you are using. Barry On Apr 23, 2011, at 6:33 PM, Bart?omiej W wrote: > Hello, > > I was wondering if anyone could help me to identify processes in parallel execution. I run my app with mpiexec -n 2 and would like to be able to pick a single core to perform a task and be the only one to print instead of having n cores repeat the same thing. > > PETSC_COMM_WORLD and PETSC_COMM_SELF both cause all processes to print for me, like there is no difference. 
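A minimal compilable sketch of the rank check Barry describes above (PETSc 3.1-era C API; the variable names and messages are only illustrative). Note also that PetscPrintf() on PETSC_COMM_WORLD already prints from process 0 only, which is often all that is needed:

#include "petsc.h"

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;

  ierr = PetscInitialize(&argc,&argv,(char*)0,(char*)0);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
  if (!rank) {
    /* only process 0 enters this block, e.g. to print a header or read a file */
  }
  /* printed exactly once, from process 0, no explicit rank test needed */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"printed once by rank 0\n");CHKERRQ(ierr);
  /* printed once per process, flushed in rank order */
  ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"hello from rank %d\n",rank);CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

If every process still enters the "if (!rank)" branch, that points to the mpiexec mismatch Barry mentions rather than to the code itself.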
> > Thank you > Bartholomew From ilyascfd at gmail.com Sun Apr 24 07:31:53 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Sun, 24 Apr 2011 15:31:53 +0300 Subject: [petsc-users] setting up DA matrix for 3D periodic domain Message-ID: Hi, Manual pages for "MatSetValuesStencil" says that, "For periodic boundary conditions use negative indices for values to the left (below 0; that are to be obtained by wrapping values from right edge). For values to the right of the last entry using that index plus one etc to obtain values that obtained by wrapping the values from the left edge. This does not work for the DA_NONPERIODIC wrap." According to this explanation, If I would set up a matrix for 3D periodic domain using DAs with DA_ XYZPERIODIC, The code segment given below could handle periodicity "without specifying boundary information within the loop" ? - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - " . . . MatStencil row(4),col(4,7) PetscInt i1,i7 PetscScalar val(7) i1 = 1 i7 = 7 . . . call DACreate3d(...,DA_XYZPERIODIC,DA_STENCIL_STAR, ... ) call DAGetMatrix(...,A,...) call DAGetCorners(da,xs,ys,zs,xm,ym,zm,ierr) do k=zs,zs+zm-1 do j=ys,ys+ym-1 do i=xs,xs+xm-1 val(1) = ... col(MatStencil_i,1) = i col(MatStencil_j,1) = j col(MatStencil_k,1) = k-1 val(2) = ... col(MatStencil_i,2) = i col(MatStencil_j,2) = j-1 col(MatStencil_k,2) = k val(3) = ... col(MatStencil_i,3) = i-1 col(MatStencil_j,3) = j col(MatStencil_k,3) = k val(4) = ... col(MatStencil_i,4) = i col(MatStencil_j,4) = j col(MatStencil_k,4) = k val(5) = ... col(MatStencil_i,5) = i+1 col(MatStencil_j,5) = j col(MatStencil_k,5) = k val(6) = ... col(MatStencil_i,6) = i col(MatStencil_j,6) = j+1 col(MatStencil_k,6) = k val(7) = ... col(MatStencil_i,7) = i col(MatStencil_j,7) = j col(MatStencil_k,7) = k+1 call MatSetValuesStencil(A,i1,row,i7,col,val,INSERT_VALUES,ierr) end do end do end do call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) " If it is so, how PETSc does it ? By inserting cyclic contributions arising from periodicity into the correct locations within PETSc DAs matrix , as it is done serially ? Thank you, Ilyas. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Sun Apr 24 07:35:25 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Sun, 24 Apr 2011 15:35:25 +0300 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Thank you Randall, I guess I will follow the Jed's and Matt's suggestions. Ilyas. 2011/4/19 Randall Mackie > You are right! I just didn't read all the way to the end of your email. > Sorry about that. > So here is a little more code that does it correctly: > > PetscInt, pointer :: ltog(:) > > call DAGetGlobalIndicesF90(da,nloc,ltog,ierr); CHKERRQ(ierr) > > > do kk=zs,zs+zm-1 > do jj=ys,ys+ym-1 > do ii=xs,xs+xm-1 > > row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym > grow=ltog(3*row + 1) > > [all your code here] > > call MatSetValues(A,i1,grow,ic,col,v,INSERT_VALUES, > . ierr); CHKERRQ(ierr) > > [more code here] > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > > > Hope this is a little more helpful. As Jed points out, there are other ways > to do the same > thing (and probably more efficiently than what I've outlined here). > > Randy M. > > > > On Tue, Apr 19, 2011 at 12:00 AM, ilyas ilyas wrote: > >> Hi Randy, >> >> Thank you for your answer. 
>> >> I have already done it. You can see it in my first e-mail. >> >> It does not work properly for all number of processors. >> For certain number of processors, it works correctly, >> not for all number of processors. >> For example, for 1,2,or 3 processors, it's ok. >> For 4 processors, it gives wrong location, so on. >> "Problem" occurs in 3rd dimension ( (kk-gzs)*gxm*gym ) >> >> Here is another suggestion (I have not tried yet) ; >> >> do kk=zs,zs+zm-1 >> do jj=ys,ys+ym-1 >> do ii=xs,xs+xm-1 >> >> row=ii-gxs + (jj-gys)*MX + (kk-gzs)*MX*MY >> >> MX,MY,MZ are global dimensions.This is also what I do serially >> >> Do you think that it is correct or any other suggestions? >> >> Regards, >> Ilyas. >> >> 2011/4/18 Randall Mackie >> >>> Here's how I do it: >>> >>> do kk=zs,zs+zm-1 >>> do jj=ys,ys+ym-1 >>> do ii=xs,xs+xm-1 >>> >>> row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym >>> >>> >>> Good luck, >>> >>> Randy M. >>> >>> >>> >>> On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: >>> >>>> Hi, >>>> Thank you for your suggestion. I will take it into account. >>>> Since changing this structure in my "massive" code may take too much >>>> time, >>>> I would like to know that how "row" is calculated in 3D, independently >>>> from processor numbers. >>>> >>>> Regards, >>>> Ilyas >>>> >>>> 2011/4/18 Matthew Knepley >>>> >>>>> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> In ex14f.F in KSP, "row" variable is calculated either >>>>>> >>>>> >>>>> These are very old. I suggest you use the FormFunctionLocal() approach >>>>> in ex5f.F which >>>>> does not calculate global row numbers when using a DA. >>>>> >>>>> Matt >>>>> >>>>> >>>>>> 349: do 30 j=ys,ys+ym-1 >>>>>> 350: ... >>>>>> 351: do 40 i=xs,xs+xm-1 >>>>>> 352: row = i - gxs + (j - gys)*gxm + 1 >>>>>> >>>>>> or >>>>>> >>>>>> 442: do 50 j=ys,ys+ym-1 >>>>>> 443: ... >>>>>> 444: row = (j - gys)*gxm + xs - gxs >>>>>> 445: do 60 i=xs,xs+xm-1 >>>>>> 446: row = row + 1 >>>>>> >>>>>> How can I calculate "row" in 3D ? >>>>>> >>>>>> I tried this; >>>>>> >>>>>> do k=zs,zs+zm-1 >>>>>> do j=ys,ys+ym-1 >>>>>> do i=xs,xs+xm-1 >>>>>> >>>>>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>>>>> >>>>>> It does not work for certain number of processors. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Ilyas >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sun Apr 24 08:02:46 2011 From: jed at 59A2.org (Jed Brown) Date: Sun, 24 Apr 2011 15:02:46 +0200 Subject: [petsc-users] setting up DA matrix for 3D periodic domain In-Reply-To: References: Message-ID: On Sun, Apr 24, 2011 at 14:31, ilyas ilyas wrote: > According to this explanation, If I would set up a matrix for 3D periodic > domain using DAs with DA_ XYZPERIODIC, > The code segment given below could handle periodicity "without specifying > boundary information within the loop" ? > Yes, that code is fine. PETSc translates the periodic contributions. -------------- next part -------------- An HTML attachment was scrubbed... 
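For readers following along in C, here is a rough equivalent of the Fortran fragment above: a 7-point stencil assembled with MatSetValuesStencil() on a DA created with DA_XYZPERIODIC and one degree of freedom per node, with A obtained from DAGetMatrix(). This is only a sketch with placeholder coefficients (PETSc 3.1-era DA API); the point is that off-edge indices such as i-1 at the left boundary or k+1 at the top are wrapped by PETSc because the DA is periodic, so the loop needs no boundary special cases:

#include "petscda.h"
#include "petscmat.h"

PetscErrorCode AssemblePeriodicStencil(DA da, Mat A)
{
  PetscErrorCode ierr;
  PetscInt       i,j,k,xs,ys,zs,xm,ym,zm;
  MatStencil     row,col[7];
  PetscScalar    v[7];

  PetscFunctionBegin;
  ierr = DAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);
  for (k=zs; k<zs+zm; k++) {
    for (j=ys; j<ys+ym; j++) {
      for (i=xs; i<xs+xm; i++) {
        row.i = i; row.j = j; row.k = k;
        /* placeholder values for a Laplacian-like operator */
        v[0] = -1.0; col[0].i = i;   col[0].j = j;   col[0].k = k-1;
        v[1] = -1.0; col[1].i = i;   col[1].j = j-1; col[1].k = k;
        v[2] = -1.0; col[2].i = i-1; col[2].j = j;   col[2].k = k;
        v[3] =  6.0; col[3].i = i;   col[3].j = j;   col[3].k = k;
        v[4] = -1.0; col[4].i = i+1; col[4].j = j;   col[4].k = k;
        v[5] = -1.0; col[5].i = i;   col[5].j = j+1; col[5].k = k;
        v[6] = -1.0; col[6].i = i;   col[6].j = j;   col[6].k = k+1;
        ierr = MatSetValuesStencil(A,1,&row,7,col,v,INSERT_VALUES);CHKERRQ(ierr);
      }
    }
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}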
URL: From xdliang at gmail.com Sun Apr 24 11:22:34 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Sun, 24 Apr 2011 12:22:34 -0400 Subject: [petsc-users] point-wise vector product Message-ID: Hello everyone, I am wondering what function in petsc computes the pointwise product of two vectors. For example, vin1=[1,2], vin2=[3,4]; I need a function to output vout= vin1.*vin2 = [3,8]. I can write my own function, but I am worrying whether the vin1 and vin2 are known by all the processors (I guess not if they are parallel vectors and distributed). More precisely, when one processor computes vout[i] = vin1[i]*vin2[i], is it possible that vin1[i] or vin2[i] are not known by this particular processor and the program output an NaN or other meaningless vout[i]? Thank you very much! Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sun Apr 24 11:24:06 2011 From: jed at 59A2.org (Jed Brown) Date: Sun, 24 Apr 2011 18:24:06 +0200 Subject: [petsc-users] point-wise vector product In-Reply-To: References: Message-ID: On Sun, Apr 24, 2011 at 18:22, Xiangdong Liang wrote: > I am wondering what function in petsc computes the pointwise product of two > vectors. For example, vin1=[1,2], vin2=[3,4]; I need a function to output > vout= vin1.*vin2 = [3,8]. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Vec/VecPointwiseMult.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From alejandro.aragon at gmail.com Tue Apr 26 01:37:32 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Tue, 26 Apr 2011 08:37:32 +0200 Subject: [petsc-users] KSP solver increases the solution time Message-ID: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Hi all, I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: [1] Solving system... 
0.154203 s KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: type: bjacobi block Jacobi: number of blocks = 3 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object:(sub_) type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 1e-12 using diagonal shift to prevent zero pivot matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Matrix Object: type=seqaij, rows=2020, cols=2020 package used to perform factorization: petsc total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2020, cols=2020 total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=6058, cols=6058 total: nonzeros=365026, allocated nonzeros=509941 using I-node (on process 0) routines: found 676 nodes, limit used is 5 [1] System solved in 51 iterations... 0.543215 s ... ... ... [1] Solving system... 0.302414 s KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: type: bjacobi block Jacobi: number of blocks = 3 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object:(sub_) type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 1e-12 using diagonal shift to prevent zero pivot matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Matrix Object: type=seqaij, rows=2020, cols=2020 package used to perform factorization: petsc total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2020, cols=2020 total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=6058, cols=6058 total: nonzeros=365026, allocated nonzeros=509941 using I-node (on process 0) routines: found 676 nodes, limit used is 5 [1] System solved in 3664 iterations... 42.683 s As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, Alejandro M. Arag?n, Ph.D. 
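Returning briefly to the point-wise product question a few messages up: VecPointwiseMult(w,x,y) computes w[i] = x[i]*y[i] using only locally owned entries, so as long as the three vectors share the same parallel layout no process ever needs values it does not own. A small sketch under that assumption (PETSc 3.1-era API, illustrative names and sizes):

#include "petscvec.h"

PetscErrorCode PointwiseMultDemo(void)
{
  PetscErrorCode ierr;
  Vec            x,y,w;

  PetscFunctionBegin;
  ierr = VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,2,&x);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&y);CHKERRQ(ierr);   /* same parallel layout as x */
  ierr = VecDuplicate(x,&w);CHKERRQ(ierr);
  ierr = VecSetValue(x,0,1.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(x,1,2.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(y,0,3.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(y,1,4.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(x);CHKERRQ(ierr); ierr = VecAssemblyEnd(x);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(y);CHKERRQ(ierr); ierr = VecAssemblyEnd(y);CHKERRQ(ierr);
  ierr = VecPointwiseMult(w,x,y);CHKERRQ(ierr);          /* w = [3, 8] */
  ierr = VecView(w,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(y);CHKERRQ(ierr);
  ierr = VecDestroy(w);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}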
From jed at 59A2.org Tue Apr 26 02:52:17 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 26 Apr 2011 09:52:17 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Message-ID: 2011/4/26 Alejandro Marcos Arag?n > As you can see, the second iteration takes more than 40 seconds to solve. > Could some explain why this is happening and why he number of iterations is > increasing dramatically between solves? What has changed between solves? If this is part of a nonlinear problem, it might have just gotten harder to solve. If the linear system is the same, the right hand side for the first problem was probably degenerate (roughly speaking, having significant energy in only a few Krylov modes). -------------- next part -------------- An HTML attachment was scrubbed... URL: From alejandro.aragon at gmail.com Tue Apr 26 04:45:47 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Tue, 26 Apr 2011 11:45:47 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Message-ID: <471EBF6A-D42A-43F8-9BD7-BF4EAEE4FF4E@gmail.com> Hi Jed, thanks for replying. In fact, the problem I sent the results from is a dynamic problem of a simply supported beam subjected to a constant load at the center, so I'm integrating in time. The material is linear elastic so the stiffness matrix doesn't change in non-zero structure. Of course the right hand side changes, but I don't think this is the problem because at some point it takes it goes back to just a few iterations to solve. The behavior is cyclic, but I don't understand the reason for this. I've noticed the same behavior of the solver also in quasi-static problems (increasing the load gradually but not integrating over time). Alejandro M. Arag?n On Apr 26, 2011, at 9:52 AM, Jed Brown wrote: > 2011/4/26 Alejandro Marcos Arag?n > As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? > > What has changed between solves? If this is part of a nonlinear problem, it might have just gotten harder to solve. If the linear system is the same, the right hand side for the first problem was probably degenerate (roughly speaking, having significant energy in only a few Krylov modes). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Apr 26 04:56:47 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 26 Apr 2011 11:56:47 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <471EBF6A-D42A-43F8-9BD7-BF4EAEE4FF4E@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <471EBF6A-D42A-43F8-9BD7-BF4EAEE4FF4E@gmail.com> Message-ID: 2011/4/26 Alejandro Marcos Arag?n > Hi Jed, thanks for replying. In fact, the problem I sent the results from > is a dynamic problem of a simply supported beam subjected to a constant load > at the center, so I'm integrating in time. > I take it there is no buckling. > The material is linear elastic so the stiffness matrix doesn't change in > non-zero structure. Of course the right hand side changes, but I don't think > this is the problem because at some point it takes it goes back to just a > few iterations to solve. 
The behavior is cyclic, but I don't understand the > reason for this. I've noticed the same behavior of the solver also in > quasi-static problems (increasing the load gradually but not integrating > over time). > Is the convergence relatively smooth? Are you losing a lot in GMRES restarts (every 30 iterations)? If you have a symmetric formulation (including boundary conditions), you can use -ksp_type cg, otherwise try -ksp_gmres_restart 500. Also, try solving the system with a random right hand side. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at purdue.edu Tue Apr 26 01:05:53 2011 From: stali at purdue.edu (Tabrez Ali) Date: Tue, 26 Apr 2011 01:05:53 -0500 Subject: [petsc-users] Error with SBAIJ during KSPSolve Message-ID: <4DB660C1.804@purdue.edu> Hi I am trying to solve a system with constraints (0 on some diagonals). It works fine with AIJ but gives the following error (see below) with SBAIJ Matrices during KSPSolve (sequential). With GMRES it just segfaults. What did I miss? Thanks in advance. Tabrez stali at x61:~/src$ ./a.out References: <4DB660C1.804@purdue.edu> Message-ID: On Tue, Apr 26, 2011 at 08:05, Tabrez Ali wrote: > I am trying to solve a system with constraints (0 on some diagonals). It > works fine with AIJ but gives the following error (see below) with SBAIJ > Matrices during KSPSolve (sequential). With GMRES it just segfaults. You cannot do LU with SBAIJ, try -pc_type cholesky. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Apr 26 07:23:35 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 26 Apr 2011 07:23:35 -0500 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Message-ID: <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> > System solved in 3664 iterations... 42.683 s The preconditioner is simply not up to the task it has been assigned. This number of iterations is problematic. Have you tried -pc_type asm -sub_pc_type lu If that works well you can try -pc_type asm -sub_pc_type ilu and see if that still works. If the matrix is indeed symmetric positive definite you will want to use -ksp_type cg Barry On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: > Hi all, > > I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: > > [1] Solving system... 
0.154203 s > > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: > type: bjacobi > block Jacobi: number of blocks = 3 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object:(sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object:(sub_) > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 1e-12 > using diagonal shift to prevent zero pivot > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > package used to perform factorization: petsc > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=6058, cols=6058 > total: nonzeros=365026, allocated nonzeros=509941 > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > > [1] System solved in 51 iterations... 0.543215 s > ... > ... > ... > > [1] Solving system... 0.302414 s > > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: > type: bjacobi > block Jacobi: number of blocks = 3 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object:(sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object:(sub_) > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 1e-12 > using diagonal shift to prevent zero pivot > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > package used to perform factorization: petsc > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=6058, cols=6058 > total: nonzeros=365026, allocated nonzeros=509941 > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > [1] System solved in 3664 iterations... 42.683 s > > > > As you can see, the second iteration takes more than 40 seconds to solve. 
Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, > > Alejandro M. Arag?n, Ph.D. From alejandro.aragon at gmail.com Tue Apr 26 09:45:53 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Tue, 26 Apr 2011 16:45:53 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> Message-ID: Thank you Barry, your suggestion really helped speed up the program. The maximum number of iterations is 48. I still don't know what the asm pre-conditioner is but I guess I just need to read the manual. I'm trying to add code to do what you suggested automatically, and I found that I can add: PC pc; ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); ierr = PCSetType(pc, PCASM); CHKERR(ierr); However, I cannot find the function to replace the -sub_pc_type. Can you point me where to look? My system may not be symmetric so I can't use the cg option. Thanks again for your response. a? On Apr 26, 2011, at 2:23 PM, Barry Smith wrote: > >> System solved in 3664 iterations... 42.683 s > > The preconditioner is simply not up to the task it has been assigned. This number of iterations is problematic. > > Have you tried -pc_type asm -sub_pc_type lu If that works well you can try -pc_type asm -sub_pc_type ilu and see if that still works. > > If the matrix is indeed symmetric positive definite you will want to use -ksp_type cg > > > > Barry > > > On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: > >> Hi all, >> >> I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: >> >> [1] Solving system... 
0.154203 s >> >> KSP Object: >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object: >> type: bjacobi >> block Jacobi: number of blocks = 3 >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object:(sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object:(sub_) >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 1e-12 >> using diagonal shift to prevent zero pivot >> matrix ordering: natural >> factor fill ratio given 1, needed 1 >> Factored matrix follows: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> package used to perform factorization: petsc >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=6058, cols=6058 >> total: nonzeros=365026, allocated nonzeros=509941 >> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >> >> >> [1] System solved in 51 iterations... 0.543215 s >> ... >> ... >> ... >> >> [1] Solving system... 0.302414 s >> >> KSP Object: >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object: >> type: bjacobi >> block Jacobi: number of blocks = 3 >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object:(sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object:(sub_) >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 1e-12 >> using diagonal shift to prevent zero pivot >> matrix ordering: natural >> factor fill ratio given 1, needed 1 >> Factored matrix follows: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> package used to perform factorization: petsc >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=6058, cols=6058 >> total: nonzeros=365026, allocated nonzeros=509941 >> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >> >> [1] System solved in 3664 iterations... 
42.683 s >> >> >> >> As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, >> >> Alejandro M. Arag?n, Ph.D. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Apr 26 10:26:54 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Apr 2011 10:26:54 -0500 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> Message-ID: 2011/4/26 Alejandro Marcos Arag?n > Thank you Barry, your suggestion really helped speed up the program. The > maximum number of iterations is 48. I still don't know what the asm > pre-conditioner is but I guess I just need to read the manual. I'm trying to > add code to do what you suggested automatically, and I found that I can add: > > PC pc; > ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); > ierr = PCSetType(pc, PCASM); CHKERR(ierr); > > However, I cannot find the function to replace the -sub_pc_type. Can you > point me where to look? My system may not be symmetric so I can't use the cg > option. > 1) Hard coding it in your program does not make sense. You gain nothing, and lose a lot of flexibility. 2) You can do this using http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCASMGetSubKSP.html Matt > Thanks again for your response. > > a? > > > On Apr 26, 2011, at 2:23 PM, Barry Smith wrote: > > > System solved in 3664 iterations... 42.683 s > > > The preconditioner is simply not up to the task it has been assigned. > This number of iterations is problematic. > > Have you tried -pc_type asm -sub_pc_type lu If that works well you > can try -pc_type asm -sub_pc_type ilu and see if that still works. > > If the matrix is indeed symmetric positive definite you will want to use > -ksp_type cg > > > > Barry > > > On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: > > Hi all, > > > I'm using the standard configuration of the KSP solver, but the time it > takes to solve a large system of equations is increasing (because of the > increasing number of iterations?). These are my timing lines and the log > from the KSP solver in two consecutive solves: > > > [1] Solving system... 
0.154203 s > > > KSP Object: > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using PRECONDITIONED norm type for convergence test > > PC Object: > > type: bjacobi > > block Jacobi: number of blocks = 3 > > Local solve is same for all blocks, in the following KSP and PC objects: > > KSP Object:(sub_) > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object:(sub_) > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 1e-12 > > using diagonal shift to prevent zero pivot > > matrix ordering: natural > > factor fill ratio given 1, needed 1 > > Factored matrix follows: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > package used to perform factorization: petsc > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=mpiaij, rows=6058, cols=6058 > > total: nonzeros=365026, allocated nonzeros=509941 > > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > > > [1] System solved in 51 iterations... 0.543215 s > > ... > > ... > > ... > > > [1] Solving system... 
0.302414 s > > > KSP Object: > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using PRECONDITIONED norm type for convergence test > > PC Object: > > type: bjacobi > > block Jacobi: number of blocks = 3 > > Local solve is same for all blocks, in the following KSP and PC objects: > > KSP Object:(sub_) > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object:(sub_) > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 1e-12 > > using diagonal shift to prevent zero pivot > > matrix ordering: natural > > factor fill ratio given 1, needed 1 > > Factored matrix follows: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > package used to perform factorization: petsc > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=mpiaij, rows=6058, cols=6058 > > total: nonzeros=365026, allocated nonzeros=509941 > > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > > [1] System solved in 3664 iterations... 42.683 s > > > > > As you can see, the second iteration takes more than 40 seconds to solve. > Could some explain why this is happening and why he number of iterations is > increasing dramatically between solves? Thank you all, > > > Alejandro M. Arag?n, Ph.D. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From alejandro.aragon at gmail.com Wed Apr 27 02:17:27 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Wed, 27 Apr 2011 09:17:27 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> Message-ID: <7D72624F-0B0D-4494-A230-F814C8C0FB28@gmail.com> I understand that, but I'm trying to provide default behavior for the solver because the default one (no parameters) works very bad in my case. However, I'm stuck because I can't set the same parameters that I obtain with command line arguments "-pc_type asm -sub_pc_type lu". Can someone point me where is the error with the following code? ... ... 
PetscInitialize(&argc, &argv,NULL,NULL); PetscErrorCode ierr = MatCreate(PETSC_COMM_WORLD,&A_);CHKERR(ierr); // create linear solver context ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_);CHKERR(ierr); // initial nonzero guess ierr = KSPSetInitialGuessNonzero(ksp_,PETSC_TRUE); CHKERR(ierr); // set runtime options ierr = KSPSetFromOptions(ksp_);CHKERR(ierr); // set the default preconditioner for this program to be ASM PC pc; ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); ierr = PCSetType(pc, PCASM); CHKERR(ierr); KSP *subksp; /* array of KSP contexts for local subblocks */ PetscInt nlocal,first; /* number of local subblocks, first local subblock */ PC subpc; /* PC context for subblock */ /* Call KSPSetUp() to set the block Jacobi data structures (including creation of an internal KSP context for each block). Note: KSPSetUp() MUST be called before PCASMGetSubKSP(). */ ierr = KSPSetUp(ksp_);CHKERR(ierr); /* Extract the array of KSP contexts for the local blocks */ ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERR(ierr); /* Loop over the local blocks, setting various KSP options for each block. */ for (int i=0; i 2011/4/26 Alejandro Marcos Arag?n > Thank you Barry, your suggestion really helped speed up the program. The maximum number of iterations is 48. I still don't know what the asm pre-conditioner is but I guess I just need to read the manual. I'm trying to add code to do what you suggested automatically, and I found that I can add: > > PC pc; > ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); > ierr = PCSetType(pc, PCASM); CHKERR(ierr); > > However, I cannot find the function to replace the -sub_pc_type. Can you point me where to look? My system may not be symmetric so I can't use the cg option. > > 1) Hard coding it in your program does not make sense. You gain nothing, and lose a lot of flexibility. > > 2) You can do this using > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCASMGetSubKSP.html > > Matt > > Thanks again for your response. > > a? > > > On Apr 26, 2011, at 2:23 PM, Barry Smith wrote: > >> >>> System solved in 3664 iterations... 42.683 s >> >> The preconditioner is simply not up to the task it has been assigned. This number of iterations is problematic. >> >> Have you tried -pc_type asm -sub_pc_type lu If that works well you can try -pc_type asm -sub_pc_type ilu and see if that still works. >> >> If the matrix is indeed symmetric positive definite you will want to use -ksp_type cg >> >> >> >> Barry >> >> >> On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: >> >>> Hi all, >>> >>> I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: >>> >>> [1] Solving system... 
0.154203 s >>> >>> KSP Object: >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000 >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using nonzero initial guess >>> using PRECONDITIONED norm type for convergence test >>> PC Object: >>> type: bjacobi >>> block Jacobi: number of blocks = 3 >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object:(sub_) >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object:(sub_) >>> type: ilu >>> ILU: out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 1e-12 >>> using diagonal shift to prevent zero pivot >>> matrix ordering: natural >>> factor fill ratio given 1, needed 1 >>> Factored matrix follows: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> package used to perform factorization: petsc >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=mpiaij, rows=6058, cols=6058 >>> total: nonzeros=365026, allocated nonzeros=509941 >>> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >>> >>> >>> [1] System solved in 51 iterations... 0.543215 s >>> ... >>> ... >>> ... >>> >>> [1] Solving system... 
0.302414 s >>> >>> KSP Object: >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000 >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using nonzero initial guess >>> using PRECONDITIONED norm type for convergence test >>> PC Object: >>> type: bjacobi >>> block Jacobi: number of blocks = 3 >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object:(sub_) >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object:(sub_) >>> type: ilu >>> ILU: out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 1e-12 >>> using diagonal shift to prevent zero pivot >>> matrix ordering: natural >>> factor fill ratio given 1, needed 1 >>> Factored matrix follows: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> package used to perform factorization: petsc >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=mpiaij, rows=6058, cols=6058 >>> total: nonzeros=365026, allocated nonzeros=509941 >>> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >>> >>> [1] System solved in 3664 iterations... 42.683 s >>> >>> >>> >>> As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, >>> >>> Alejandro M. Arag?n, Ph.D. >> > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Wed Apr 27 04:32:27 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 27 Apr 2011 11:32:27 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <7D72624F-0B0D-4494-A230-F814C8C0FB28@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> <7D72624F-0B0D-4494-A230-F814C8C0FB28@gmail.com> Message-ID: El 27/04/2011, a las 09:17, Alejandro Marcos Arag?n escribi?: > I understand that, but I'm trying to provide default behavior for the solver because the default one (no parameters) works very bad in my case. > However, I'm stuck because I can't set the same parameters that I obtain with command line arguments "-pc_type asm -sub_pc_type lu". > > Can someone point me where is the error with the following code? You should call KSPSetOperators before doing all the setup. Jose > > ... > ... 
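In code, the ordering Jose points out would look roughly like the following (PETSc 3.1-style calls; the function and variable names are illustrative, and matrix assembly plus most of the surrounding setup are omitted). The essential point is that KSPSetOperators() is called before KSPSetUp() and PCASMGetSubKSP():

#include "petscksp.h"

PetscErrorCode SetupASMSolver(Mat A, KSP *outksp)
{
  PetscErrorCode ierr;
  KSP            ksp,*subksp;
  PC             pc,subpc;
  PetscInt       i,nlocal,first;

  PetscFunctionBegin;
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr); /* before any setup */
  ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);      /* default choice, still overridable ... */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* ... from the command line */
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);            /* now the subdomain KSPs exist */
  ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERRQ(ierr);
  for (i=0; i<nlocal; i++) {
    ierr = KSPSetType(subksp[i],KSPPREONLY);CHKERRQ(ierr);
    ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
    ierr = PCSetType(subpc,PCLU);CHKERRQ(ierr);  /* the in-code equivalent of -sub_pc_type lu */
  }
  *outksp = ksp;
  PetscFunctionReturn(0);
}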
> PetscInitialize(&argc, &argv,NULL,NULL); > PetscErrorCode ierr = MatCreate(PETSC_COMM_WORLD,&A_);CHKERR(ierr); > > // create linear solver context > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_);CHKERR(ierr); > > // initial nonzero guess > ierr = KSPSetInitialGuessNonzero(ksp_,PETSC_TRUE); CHKERR(ierr); > > // set runtime options > ierr = KSPSetFromOptions(ksp_);CHKERR(ierr); > > // set the default preconditioner for this program to be ASM > PC pc; > ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); > ierr = PCSetType(pc, PCASM); CHKERR(ierr); > > KSP *subksp; /* array of KSP contexts for local subblocks */ > PetscInt nlocal,first; /* number of local subblocks, first local subblock */ > PC subpc; /* PC context for subblock */ > > /* > Call KSPSetUp() to set the block Jacobi data structures (including > creation of an internal KSP context for each block). > > Note: KSPSetUp() MUST be called before PCASMGetSubKSP(). > */ > ierr = KSPSetUp(ksp_);CHKERR(ierr); > > /* > Extract the array of KSP contexts for the local blocks > */ > ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERR(ierr); > > /* > Loop over the local blocks, setting various KSP options > for each block. > */ > for (int i=0; i ierr = KSPGetPC(subksp[i],&subpc);CHKERR(ierr); > ierr = PCSetType(subpc,PCLU);CHKERR(ierr); > } > > This is the error I get: > > User explicitly sets subdomain solvers. > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Null argument, when expecting valid pointer! > [0]PETSC ERROR: Null Object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: yafeq/a.out on a darwin10. named lsmspc26.epfl.ch by aaragon Wed Apr 27 09:14:25 2011 > [0]PETSC ERROR: Libraries linked from /Users/aaragon/Local/lib > [0]PETSC ERROR: Configure run at Thu Apr 7 17:01:26 2011 > [0]PETSC ERROR: Configure options --prefix=/Users/aaragon/Local --with-mpi-include=/Users/aaragon/Local/include --with-mpi-lib=/Users/aaragon/Local/lib/libmpich.a --with-superlu=1 --with-superlu-include=/Users/aaragon/Local/include/superlu --with-superlu-lib=/Users/aaragon/Local/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/Users/aaragon/Local/include/superlu_dist --with-superlu_dist-lib=/Users/aaragon/Local/lib/libsuperlu_dist.a --with-parmetis=1 --download-parmetis=ifneeded > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: MatGetVecs() line 7265 in src/mat/interface/matrix.c > [0]PETSC ERROR: KSPGetVecs() line 806 in src/ksp/ksp/interface/iterativ.c > [0]PETSC ERROR: KSPSetUp_GMRES() line 94 in src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: KSPSetUp() line 199 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: User provided function() line 397 in "unknowndirectory/"/Users/aaragon/Local/include/cpputils/solver.hpp > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Operation done in wrong order! > [0]PETSC ERROR: Need to call PCSetUP() on PC (or KSPSetUp() on the outer KSP object) before calling here! 
> [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: yafeq/a.out on a darwin10. named lsmspc26.epfl.ch by aaragon Wed Apr 27 09:14:25 2011 > [0]PETSC ERROR: Libraries linked from /Users/aaragon/Local/lib > [0]PETSC ERROR: Configure run at Thu Apr 7 17:01:26 2011 > [0]PETSC ERROR: Configure options --prefix=/Users/aaragon/Local --with-mpi-include=/Users/aaragon/Local/include --with-mpi-lib=/Users/aaragon/Local/lib/libmpich.a --with-superlu=1 --with-superlu-include=/Users/aaragon/Local/include/superlu --with-superlu-lib=/Users/aaragon/Local/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/Users/aaragon/Local/include/superlu_dist --with-superlu_dist-lib=/Users/aaragon/Local/lib/libsuperlu_dist.a --with-parmetis=1 --download-parmetis=ifneeded > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: PCASMGetSubKSP_ASM() line 644 in src/ksp/pc/impls/asm/asm.c > [0]PETSC ERROR: PCASMGetSubKSP() line 926 in src/ksp/pc/impls/asm/asm.c > [0]PETSC ERROR: User provided function() line 402 in "unknowndirectory/"/Users/aaragon/Local/include/cpputils/solver.hpp > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010 > > and the error continues... > From xdliang at gmail.com Wed Apr 27 10:57:13 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Wed, 27 Apr 2011 11:57:13 -0400 Subject: [petsc-users] Changing the diagonal of a matrix via a vector Message-ID: Hello everyone, I am a novice to Petsc and parallel computing. I have created a mpi sparse matrix A (size n-by-n) and a parallel vector b (size n-by-1). Now I want to modify the diagonal of A by adding the values of vector b. Namely, A(i,i) = A(i,i) + b(i) and all the off-diagonal elements remains the same. I am worrying that when I use MatSetValue or MatSetValues, b(i) may not be accessed by some particular processor since VecGetValues can only get values on the same processor. One possible solution I am thinking is converting vector b to a diagonal matrix B and then do the MatAXPY operation. However, using MatSetValue to set diagonal elements of B, B(i,i) = b(i), still faces the similar problem. Can anyone give me some suggestion? Thanks. Best, Xiangdong P.S. When I compiled my program, I get warnings like that: warning: return makes pointer from integer without a cast. Actually, these lines are standard Petsc functions like that: ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = VecDestroy(x); CHKERRQ(ierr); How can I get rid of these warnings? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at 59A2.org Wed Apr 27 11:03:07 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 27 Apr 2011 18:03:07 +0200 Subject: [petsc-users] Changing the diagonal of a matrix via a vector In-Reply-To: References: Message-ID: On Wed, Apr 27, 2011 at 17:57, Xiangdong Liang wrote: > I am a novice to Petsc and parallel computing. I have created a mpi sparse > matrix A (size n-by-n) and a parallel vector b (size n-by-1). Now I want to > modify the diagonal of A by adding the values of vector b. Namely, A(i,i) = > A(i,i) + b(i) and all the off-diagonal elements remains the same. I am > worrying that when I use MatSetValue or MatSetValues, b(i) may not be > accessed by some particular processor since VecGetValues can only get values > on the same processor. One possible solution I am thinking is converting > vector b to a diagonal matrix B and then do the MatAXPY operation. However, > using MatSetValue to set diagonal elements of B, B(i,i) = b(i), still > faces the similar problem. Can anyone give me some suggestion? Thanks. > MatDiagonalSet(A,b,ADD_VALUES) http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Mat/MatDiagonalSet.html > P.S. When I compiled my program, I get warnings like that: warning: return > makes pointer from integer without a cast. Actually, these lines are > standard Petsc functions like that: > > ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); > ierr = VecDestroy(x); CHKERRQ(ierr); > > How can I get rid of these warnings? > Either make your function return int (or PetscErrorCode), passing "return values" back through arguments or use CHKERRV (worse because errors won't propagate up correctly). http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Sys/CHKERRQ.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdliang at gmail.com Wed Apr 27 11:41:22 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Wed, 27 Apr 2011 12:41:22 -0400 Subject: [petsc-users] Changing the diagonal of a matrix via a vector In-Reply-To: References: Message-ID: On Wed, Apr 27, 2011 at 12:03 PM, Jed Brown wrote: > On Wed, Apr 27, 2011 at 17:57, Xiangdong Liang wrote: > >> I am a novice to Petsc and parallel computing. I have created a mpi sparse >> matrix A (size n-by-n) and a parallel vector b (size n-by-1). Now I want to >> modify the diagonal of A by adding the values of vector b. Namely, A(i,i) = >> A(i,i) + b(i) and all the off-diagonal elements remains the same. I am >> worrying that when I use MatSetValue or MatSetValues, b(i) may not be >> accessed by some particular processor since VecGetValues can only get values >> on the same processor. One possible solution I am thinking is converting >> vector b to a diagonal matrix B and then do the MatAXPY operation. However, >> using MatSetValue to set diagonal elements of B, B(i,i) = b(i), still >> faces the similar problem. Can anyone give me some suggestion? Thanks. >> > > MatDiagonalSet(A,b,ADD_VALUES) > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Mat/MatDiagonalSet.html > > >> P.S. When I compiled my program, I get warnings like that: warning: return >> makes pointer from integer without a cast. Actually, these lines are >> standard Petsc functions like that: >> >> ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); >> ierr = VecDestroy(x); CHKERRQ(ierr); >> >> How can I get rid of these warnings? 
>> > > Either make your function return int (or PetscErrorCode), passing "return > values" back through arguments or use CHKERRV (worse because errors won't > propagate up correctly). > > Thanks a lot, Jed. I am using Petsc's built-in function, VecCreate and MatCreate. they are supposed to return PetscErrorCode. However, I still get "warning: return makes pointer from integer without a cast" for these built-in functions. > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Sys/CHKERRQ.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 27 12:41:43 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 27 Apr 2011 19:41:43 +0200 Subject: [petsc-users] Changing the diagonal of a matrix via a vector In-Reply-To: References: Message-ID: On Wed, Apr 27, 2011 at 18:41, Xiangdong Liang wrote: > Thanks a lot, Jed. I am using Petsc's built-in function, VecCreate and > MatCreate. they are supposed to return PetscErrorCode. However, I still get > "warning: return makes pointer from integer without a cast" for these > built-in functions. There is a "return" inside the CHKERRQ macro. You either need to make *your* function return PetscErrorCode or use a different checking macro (e.g. CHKERRABORT). -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdliang at gmail.com Wed Apr 27 15:35:03 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Wed, 27 Apr 2011 16:35:03 -0400 Subject: [petsc-users] create a new large vector via combining existing small vectors Message-ID: Hello everyone, I have a problem with creating a new large vector via combining existing small vectors. Suppose I have two vectors v1 and v2 (size n-by-1) already. I want to have a new vector vout (size 2n-by-1) with vout(1:n) = v1 and vout(n+1:2*n) = v2. Is there any quick way to create vout with Petsc's built in functions? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Apr 27 16:09:28 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Apr 2011 16:09:28 -0500 Subject: [petsc-users] create a new large vector via combining existing small vectors In-Reply-To: References: Message-ID: On Apr 27, 2011, at 3:35 PM, Xiangdong Liang wrote: > Hello everyone, > > I have a problem with creating a new large vector via combining existing small vectors. Suppose I have two vectors v1 and v2 (size n-by-1) already. I want to have a new vector vout (size 2n-by-1) with vout(1:n) = v1 and vout(n+1:2*n) = v2. Is there any quick way to create vout with Petsc's built in functions? Thanks. No. You can use VecCreate() and then a couple of VecScatters to get the entries from the two small vectors to the large one. Barry > > Best, > Xiangdong From ilyascfd at gmail.com Thu Apr 28 05:36:18 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Thu, 28 Apr 2011 13:36:18 +0300 Subject: [petsc-users] setting up DA matrix for 3D periodic domain In-Reply-To: References: Message-ID: Jed, Sorry for my late response. Thank you very much. Ilyas. 2011/4/24 Jed Brown > On Sun, Apr 24, 2011 at 14:31, ilyas ilyas wrote: > >> According to this explanation, If I would set up a matrix for 3D periodic >> domain using DAs with DA_ XYZPERIODIC, >> The code segment given below could handle periodicity "without specifying >> boundary information within the loop" ? >> > > Yes, that code is fine. PETSc translates the periodic contributions. 
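A sketch of the VecScatter approach Barry suggests above for building vout = [v1; v2] from two parallel vectors of global length n. This is PETSc 3.1-era code (VecScatterDestroy/ISDestroy take the object rather than a pointer); the names and the choice of simply mirroring v1's local sizes in the new vector are illustrative, since the scatter routes global indices to whichever process owns them:

#include "petscvec.h"

PetscErrorCode ConcatenateVecs(Vec v1, Vec v2, Vec *vout)
{
  PetscErrorCode ierr;
  PetscInt       n,nlocal,low,high;
  IS             from,to;
  VecScatter     scat;
  Vec            w;

  PetscFunctionBegin;
  ierr = VecGetSize(v1,&n);CHKERRQ(ierr);
  ierr = VecGetLocalSize(v1,&nlocal);CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD,2*nlocal,2*n,&w);CHKERRQ(ierr);

  /* scatter v1 into global entries 0..n-1 of w */
  ierr = VecGetOwnershipRange(v1,&low,&high);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,low,1,&from);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,low,1,&to);CHKERRQ(ierr);
  ierr = VecScatterCreate(v1,from,w,to,&scat);CHKERRQ(ierr);
  ierr = VecScatterBegin(scat,v1,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scat,v1,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(scat);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = ISDestroy(to);CHKERRQ(ierr);

  /* scatter v2 into global entries n..2n-1 of w */
  ierr = VecGetOwnershipRange(v2,&low,&high);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,low,1,&from);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,n+low,1,&to);CHKERRQ(ierr);
  ierr = VecScatterCreate(v2,from,w,to,&scat);CHKERRQ(ierr);
  ierr = VecScatterBegin(scat,v2,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scat,v2,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(scat);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = ISDestroy(to);CHKERRQ(ierr);

  *vout = w;
  PetscFunctionReturn(0);
}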
> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Thu Apr 28 06:03:11 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Thu, 28 Apr 2011 14:03:11 +0300 Subject: [petsc-users] working with different size of arrays in a single DA Message-ID: Hi, May be, It is a simple question, but I am little bit confused. If I have two different size of arrays (one is cell-based which is from 1 to N, other one is face-based which is from 1 to N+1). How can I create and work with them within a single DA structure, for example in evaluating a function or setting up a matrix ? Regards, Ilyas. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 28 06:50:16 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 28 Apr 2011 13:50:16 +0200 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: On Thu, Apr 28, 2011 at 13:03, ilyas ilyas wrote: > May be, It is a simple question, but I am little bit confused. > If I have two different size of arrays (one is cell-based which is from 1 > to N, > other one is face-based which is from 1 to N+1). > How can I create and work with them within a single DA structure, > for example in evaluating a function or setting up a matrix ? > Two choices: 1. Increase the block size (number of components per node) and just write the identity into the equatiosn for the "N+1" cell (which does not exist). This will normally give better memory performance and the few extra trivial equations around the margin are not a big deal. 2. Use two separate DAs and make the parallel decomposition compatible. You can put them together into one system using DMComposite. This is usually overkill for staggered grids, but extends to general multi-physics problems. Support for this option is better in petsc-dev, see, for example, http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/snes/examples/tutorials/ex28.c.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Thu Apr 28 08:41:43 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Thu, 28 Apr 2011 16:41:43 +0300 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: Dear Jed, 2011/4/28 Jed Brown > On Thu, Apr 28, 2011 at 13:03, ilyas ilyas wrote: > >> May be, It is a simple question, but I am little bit confused. >> If I have two different size of arrays (one is cell-based which is from 1 >> to N, >> other one is face-based which is from 1 to N+1). >> How can I create and work with them within a single DA structure, >> for example in evaluating a function or setting up a matrix ? >> > > Two choices: > > 1. Increase the block size (number of components per node) and just write > the identity into the equatiosn for the "N+1" cell (which does not exist). > This will normally give better memory performance and the few extra trivial > equations around the margin are not a big deal. > Would you please explain more the first option? > > 2. Use two separate DAs and make the parallel decomposition compatible. You > can put them together into one system using DMComposite. This is usually > overkill for staggered grids, but extends to general multi-physics problems. > Support for this option is better in petsc-dev, see, for example, > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/snes/examples/tutorials/ex28.c.html > Thank you, Ilyas. 
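Returning to the earlier question about stacking v1 and v2 (each of global size n) into vout of size 2n: below is a hedged sketch of the VecScatter approach Barry describes. The helper name and the choice to move each process's locally owned rows are illustrative assumptions, and the Destroy calls follow the 3.1-era signatures used in this thread.

#include "petscvec.h"

/* Copy all of v (global size n) into rows [offset, offset+n) of the larger
   vector vout, which the caller has already created, e.g. with
   VecCreate()/VecSetSizes(vout, PETSC_DECIDE, 2*n)/VecSetFromOptions(). */
static PetscErrorCode ScatterIntoBlock(Vec v, Vec vout, PetscInt offset)
{
  PetscErrorCode ierr;
  PetscInt       rstart, rend;
  IS             from, to;
  VecScatter     ctx;

  ierr = VecGetOwnershipRange(v, &rstart, &rend);CHKERRQ(ierr);
  /* each process moves the rows it owns in v ... */
  ierr = ISCreateStride(PETSC_COMM_SELF, rend-rstart, rstart, 1, &from);CHKERRQ(ierr);
  /* ... to the same rows shifted by offset in vout */
  ierr = ISCreateStride(PETSC_COMM_SELF, rend-rstart, offset+rstart, 1, &to);CHKERRQ(ierr);
  ierr = VecScatterCreate(v, from, vout, to, &ctx);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx, v, vout, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx, v, vout, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(ctx);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = ISDestroy(to);CHKERRQ(ierr);
  return 0;
}

Calling ScatterIntoBlock(v1, vout, 0) and then ScatterIntoBlock(v2, vout, n) gives vout(1:n) = v1 and vout(n+1:2n) = v2.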
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bartlomiej.wach at yahoo.pl Thu Apr 28 09:02:23 2011 From: bartlomiej.wach at yahoo.pl (=?utf-8?B?QmFydMWCb21pZWogVw==?=) Date: Thu, 28 Apr 2011 15:02:23 +0100 (BST) Subject: [petsc-users] Large matrixes on single machine Message-ID: <458933.34056.qm@web28313.mail.ukl.yahoo.com> Hello, I was trying to allocate a sparse AIJ matrix of over 800 entries MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); (with proper nonzeros vector) results in an Maximum memory PetscMalloc()ed 315699888 OS cannot compute size of entire process (in ubuntu) Can this be dealt with somehow? I am aware of the 4gb limitation of memory. If not, how would one run it on several machines? Thank You very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 28 09:33:52 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 28 Apr 2011 16:33:52 +0200 Subject: [petsc-users] Large matrixes on single machine In-Reply-To: <458933.34056.qm@web28313.mail.ukl.yahoo.com> References: <458933.34056.qm@web28313.mail.ukl.yahoo.com> Message-ID: On Thu, Apr 28, 2011 at 16:02, Bart?omiej W wrote: > Hello, > > I was trying to allocate a sparse AIJ matrix of over 800 entries > > MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); > > (with proper nonzeros vector) > results in an > Maximum memory PetscMalloc()ed 315699888 OS cannot compute size of entire > process > (in ubuntu) What was in the nnz array? If you don't expect the problem to exceed the addressable memory, then the array is probably corrupt. If you really mean to be solving a very large problem, you will have to get a 64-bit machine and configure --with-64-bit-indices, or run in parallel. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 28 09:36:16 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 28 Apr 2011 16:36:16 +0200 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: On Thu, Apr 28, 2011 at 15:41, ilyas ilyas wrote: > Would you please explain more the first option? You create a system with larger block size. If you have two node-centered values and one cell-centered, you would use a block size of 3. Each center would be associated with the node to its lower left, for example. There will be a fringe of "cell-centers" that extend out of domain on the right and top, you use x = 0 for these equations and the Jacobian for those equations will just have a 1 on the diagonal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Apr 28 11:54:59 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 28 Apr 2011 11:54:59 -0500 (CDT) Subject: [petsc-users] Large matrixes on single machine In-Reply-To: References: <458933.34056.qm@web28313.mail.ukl.yahoo.com> Message-ID: On Thu, 28 Apr 2011, Jed Brown wrote: > On Thu, Apr 28, 2011 at 16:02, Bart?omiej W wrote: > > > Hello, > > > > I was trying to allocate a sparse AIJ matrix of over 800 entries > > > > MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); > > > > (with proper nonzeros vector) > > results in an > > Maximum memory PetscMalloc()ed 315699888 OS cannot compute size of entire > > process > > (in ubuntu) > > > What was in the nnz array? If you don't expect the problem to exceed the > addressable memory, then the array is probably corrupt. 
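For reference, this is roughly what the preallocation call being discussed looks like when the nnz array is well formed; the tridiagonal fill pattern and the function name are purely illustrative and not taken from the original post.

#include "petscmat.h"

/* Sequential AIJ matrix of size N with an exact per-row nonzero count. */
PetscErrorCode BuildTridiagonal(PetscInt N, Mat *L)
{
  PetscErrorCode ierr;
  PetscInt       i, *nnz;

  ierr = PetscMalloc(N*sizeof(PetscInt), &nnz);CHKERRQ(ierr);
  for (i = 0; i < N; i++) nnz[i] = (i == 0 || i == N-1) ? 2 : 3;

  ierr = MatCreate(PETSC_COMM_SELF, L);CHKERRQ(ierr);
  ierr = MatSetSizes(*L, N, N, N, N);CHKERRQ(ierr);
  ierr = MatSetType(*L, MATSEQAIJ);CHKERRQ(ierr);
  /* when nnz is supplied, the scalar count (PETSC_NULL/0 in the post above)
     is ignored; a corrupted nnz array is what makes the allocation blow up */
  ierr = MatSeqAIJSetPreallocation(*L, 0, nnz);CHKERRQ(ierr);
  ierr = PetscFree(nnz);CHKERRQ(ierr);
  return 0;
}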
If you really mean > to be solving a very large problem, you will have to get a 64-bit machine > and configure --with-64-bit-indices, or run in parallel. What was the complete error message? The above says '315MB in use'. So was the code trying to allocate 3GB - when it failed? Also How many total non-zeros in the matrix? Satish From xdliang at gmail.com Fri Apr 29 10:50:06 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Fri, 29 Apr 2011 11:50:06 -0400 Subject: [petsc-users] questions about parallel vectors Message-ID: Hello everyone, I am trying to create a sparse matrix A, which depends on a parallel vector v. For example, my function looks like this: Mat myfun(MPI_Comm comm, Vec v, other parameters). When I set the value A(i,j) = v[k], v[k] may not be obtained by VecGetValues since that operation can only get values on the same processor. I am thinking 1) create v as an array and pass this array into myfun. 2) create another vector v2, which is a full copy of parallel v through VecScatter. 3) when I first create the initial vec v, using VecCreate(PETSC_COMM_SELF,v) or VecCreateSeq. Does this guarantee that all the processors creating matrix A have all the components of vector v? I think 1) and 2) are going to work, but not sure about option 3). I have no idea which would have better performance. Can you give me some suggestions on how to handle this problem? Thanks. Another quick question, what is the difference between PetscViewerSetFormat and PetscViewerPushFormat? Best, Xiangdong From abhyshr at mcs.anl.gov Fri Apr 29 14:41:25 2011 From: abhyshr at mcs.anl.gov (Shri) Date: Fri, 29 Apr 2011 14:41:25 -0500 (CDT) Subject: [petsc-users] questions about parallel vectors In-Reply-To: Message-ID: <175206172.8268.1304106085650.JavaMail.root@zimbra.anl.gov> ----- Original Message ----- > Hello everyone, > > I am trying to create a sparse matrix A, which depends on a parallel > vector v. For example, my function looks like this: Mat > myfun(MPI_Comm comm, Vec v, other parameters). When I set the value > A(i,j) = v[k], v[k] may not be obtained by VecGetValues since that > operation can only get values on the same processor. I am thinking > > 1) create v as an array and pass this array into myfun. > 2) create another vector v2, which is a full copy of parallel v > through VecScatter. Do you need all the vector elements on each processor to set the matrix values or just a subset of them? ' If you need all the vector elements then you can use VecScattertoAll http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Vec/VecScatterCreateToAll.html If you only need a subset then you could create v as a ghosted vector. See the example http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/vec/vec/examples/tutorials/ex9.c.html > 3) when I first create the initial vec v, using > VecCreate(PETSC_COMM_SELF,v) or VecCreateSeq. Does this guarantee that > all the processors creating matrix A have all the components of vector > v? > > I think 1) and 2) are going to work, but not sure about option 3). I > have no idea which would have better performance. Can you give me some > suggestions on how to handle this problem? Thanks. > > Another quick question, what is the difference between > PetscViewerSetFormat and PetscViewerPushFormat? 
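Here is a minimal sketch of the VecScatterCreateToAll route Shri suggests above, for the case where every process really does need every entry of v while assembling A. The helper name is an assumption; when only a few off-process entries are needed, the ghosted-vector example (ex9.c) linked above is the better fit.

#include "petscvec.h"

/* Gather a full sequential copy of the parallel vector v on every process. */
static PetscErrorCode GatherVecToAll(Vec v, Vec *vall)
{
  PetscErrorCode ierr;
  VecScatter     ctx;

  ierr = VecScatterCreateToAll(v, &ctx, vall);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx, v, *vall, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx, v, *vall, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(ctx);CHKERRQ(ierr);   /* 3.1-era signature */
  return 0;
}

Because *vall is a sequential copy of the whole vector, VecGetArray(*vall,&a) lets any v[k] be read by its global index k while calling MatSetValues(); destroy *vall with VecDestroy() after assembly.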
> > Best, > Xiangdong From ilyascfd at gmail.com Sat Apr 30 06:17:43 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Sat, 30 Apr 2011 14:17:43 +0300 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: Thank you Jed, I guess the thing that makes it complicated application of boundary conditions. Since XYZGHOSTED is not available in fortran, I am using XYZPERIODIC in order to implement bc's, as it is done in ex31.c in SNES Combining XYZPERIODIC with larger block size is still not much clear for me. By the way, what is the current status of DMDACreate3D and other DMDA routines and their fortran support ? According to ex11f90.F and ex22f.F in the manual page of DMDACreate3d , these routine(s) provide ghost cell support for fortran in 3D ? If ghost cell support is available, implementation would be relatively easy Cheers, Ilyas 2011/4/28 Jed Brown > On Thu, Apr 28, 2011 at 15:41, ilyas ilyas wrote: > >> Would you please explain more the first option? > > > You create a system with larger block size. If you have two node-centered > values and one cell-centered, you would use a block size of 3. Each center > would be associated with the node to its lower left, for example. There will > be a fringe of "cell-centers" that extend out of domain on the right and > top, you use x = 0 for these equations and the Jacobian for those equations > will just have a 1 on the diagonal. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sat Apr 30 10:45:06 2011 From: jed at 59A2.org (Jed Brown) Date: Sat, 30 Apr 2011 17:45:06 +0200 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: On Sat, Apr 30, 2011 at 13:17, ilyas ilyas wrote: > I guess the thing that makes it complicated application of boundary > conditions. > Since XYZGHOSTED is not available in fortran, > I am using XYZPERIODIC in order to implement bc's, as it is done in ex31.c > in SNES > Combining XYZPERIODIC with larger block size is still not much clear for > me. > > By the way, what is the current status of DMDACreate3D > and other DMDA routines and their fortran support ? > According to ex11f90.F and ex22f.F in the manual page of DMDACreate3d , > these routine(s) provide ghost cell support for fortran in 3D ? > If ghost cell support is available, implementation would be relatively easy > There is Fortran support for periodic boundaries. Also, petsc-dev has support for ghost cells even when they do not imply any wrapping, see DMDA_BOUNDARY_GHOST. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/DM/DMDACreate3d.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at geology.wisc.edu Sat Apr 30 12:58:52 2011 From: stali at geology.wisc.edu (Tabrez Ali) Date: Sat, 30 Apr 2011 12:58:52 -0500 Subject: [petsc-users] Preallocation (Unstructured FE) Message-ID: <4DBC4DDC.1010604@geology.wisc.edu> Petsc Developers/Users I having some performance issues with preallocation in a fully unstructured FE code. It would be very helpful if those using FE codes can comment. For a problem of size 100K nodes and 600K tet elements (on 1 cpu) 1. If I calculate the _exact_ number of non-zeros per row (using a running list in Fortran) by looping over nodes & elements, the code takes 17 mins (to calculate nnz's/per row, assemble and solve). 2. 
If I don't use a running list and simply get the average of the max number of nodes a node might be connected to (again by looping over nodes & elements but not using a running list) then it takes 8 mins. 3. If I just magically guess the right value calculated in 2 and use that as the average nnz per row then it only takes 25 secs. Basically in all cases Assembly and Solve are very fast (a few seconds) but the nnz calculation itself (in 1 and 2) takes a long time. How can this be cut down? Is there a heuristic way to estimate the number (as done in 3), even if it slightly overestimates the nnz's per row, or are there efficient ways to do steps 1 or 2? Right now I have do i=1,num_nodes; do j=1,num_elements ... which obviously is slow for large numbers of nodes/elements. Thanks in advance Tabrez
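On the nnz question above: the quadratic cost comes from scanning all elements once per node. The usual fix is to invert the loops and visit each element exactly once, accumulating a neighbour list per node, so an exact per-row count costs a single pass over the mesh. Below is a hedged C sketch of that idea (the same structure carries over directly to Fortran); the 4-node tet connectivity layout, 0-based numbering, one scalar unknown per node and all names are assumptions for illustration only.

#include <stdlib.h>
#include <string.h>

/* nnz[i] = number of nonzeros in row i (unique mesh neighbours of node i,
   plus the diagonal), computed in one sweep over the elements. */
static void CountNonzerosPerRow(int nnode, int nelem, const int *elem, int *nnz)
{
  int e, a, b, i, *count, *start, *adj;

  /* pass 1: each tet contributes 3 candidate neighbours to each of its 4 nodes */
  count = (int*)calloc(nnode, sizeof(int));
  for (e = 0; e < nelem; e++)
    for (a = 0; a < 4; a++) count[elem[4*e+a]] += 3;

  /* pass 2: gather the candidate neighbours (duplicates allowed for now) */
  start = (int*)malloc((nnode+1)*sizeof(int));
  start[0] = 0;
  for (i = 0; i < nnode; i++) start[i+1] = start[i] + count[i];
  adj = (int*)malloc(start[nnode]*sizeof(int));
  memset(count, 0, nnode*sizeof(int));
  for (e = 0; e < nelem; e++)
    for (a = 0; a < 4; a++)
      for (b = 0; b < 4; b++)
        if (a != b) {
          int na = elem[4*e+a];
          adj[start[na] + count[na]++] = elem[4*e+b];
        }

  /* pass 3: sort each short per-node list and count the unique entries */
  for (i = 0; i < nnode; i++) {
    int lo = start[i], n = count[i], j, uniq = 0;
    for (j = lo+1; j < lo+n; j++) {            /* insertion sort */
      int v = adj[j], k = j-1;
      while (k >= lo && adj[k] > v) { adj[k+1] = adj[k]; k--; }
      adj[k+1] = v;
    }
    for (j = lo; j < lo+n; j++)
      if (j == lo || adj[j] != adj[j-1]) uniq++;
    nnz[i] = uniq + 1;                         /* + diagonal */
  }
  free(adj); free(start); free(count);
}

With one unknown per node the resulting nnz array can be passed straight to MatSeqAIJSetPreallocation(); since every element is touched only once, the count should take seconds rather than minutes even for the 100K-node / 600K-element mesh described.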