From C.Klaij at marin.nl Wed Feb 1 01:42:38 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 1 Feb 2012 07:42:38 +0000 Subject: [petsc-users] access to matnest block (0,1) ? Message-ID: >> Then what would be the best way to create the IS in this case? >> Can it somehow be deduced from the separate blocks? >> > >The ISs define the row space of the blocks inside the global matrix. Following ex28 I made two stride ISs to define the row space but I'm still having trouble, this is for 2 procs: Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="a00_", type=mpiaij, rows=24, cols=24 (0,1) : prefix="a01_", type=mpiaij, rows=24, cols=12 (1,0) : prefix="a10_", type=mpiaij, rows=12, cols=24 (1,1) : prefix="a11_", type=mpiaij, rows=12, cols=12 [0] Index set is permutation [0] Number of indices in (stride) set 12 [0] 0 0 [0] 1 1 [0] 2 2 [0] 3 3 [0] 4 4 [0] 5 5 [0] 6 6 [0] 7 7 [0] 8 8 [0] 9 9 [0] 10 10 [0] 11 11 [1] Number of indices in (stride) set 12 [1] 0 12 [1] 1 13 [1] 2 14 [1] 3 15 [1] 4 16 [1] 5 17 [1] 6 18 [1] 7 19 [1] 8 20 [1] 9 21 [1] 10 22 [1] 11 23 [0] Number of indices in (stride) set 6 [0] 0 24 [0] 1 25 [0] 2 26 [0] 3 27 [0] 4 28 [0] 5 29 [1] Number of indices in (stride) set 6 [1] 0 30 [1] 1 31 [1] 2 32 [1] 3 33 [1] 4 34 [1] 5 35 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Arguments are incompatible! [0]PETSC ERROR: Could not find index set! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./matnest-trycpp on a linux_64b named lin0133 by cklaij Wed Feb 1 08:33:40 2012 [0]PETSC ERROR: Libraries linked from /opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5/lib [0]PETSC ERROR: Configure run at Thu Jan 26 13:44:12 2012 [0]PETSC ERROR: Configure options --prefix=/opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5 --with-mpi-dir=/opt/refresco/64bit_intelv11.1_openmpi/openmpi-1.4.4 --with-x=1 --with-mpe=0 --with-debugging=1 --with-clanguage=c++ --with-hypre-include=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/include --with-hypre-lib=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/lib/libHYPRE.a --with-ml-include=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/include --with-ml-lib=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/lib/libml.a --with-blas-lapack-dir=/opt/intel/mkl [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatNestFindIS() line 292 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/impls/nest/matnest.c [0]PETSC ERROR: MatNestFindSubMat() line 350 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/impls/nest/matnest.c [0]PETSC ERROR: MatGetSubMatrix_Nest() line 366 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/impls/nest/matnest.c [0]PETSC ERROR: MatGetSubMatrix() line 7135 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_FieldSplit() line 377 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: PCSetUp() line 819 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 260 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 379 in /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/ksp/interface/itfunc.c dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From Thomas.Witkowski at tu-dresden.de Wed Feb 1 04:02:28 2012 From: Thomas.Witkowski at tu-dresden.de (Thomas Witkowski) Date: Wed, 01 Feb 2012 11:02:28 +0100 Subject: [petsc-users] Preconditioning of laplace equation with boomeramg Message-ID: <20120201110228.e1mm4ni3jgogow0o@mail.zih.tu-dresden.de> I'm about to write a preconditioner for a 2x2 block matrix coming from the FEM discretization (2D, equidistance mesh) of a 4th order PDE. The (0,0) block is the discrete laplace with Neumann boundary conditions. I want to precondition this block by using some amg iterations. Just for tests, the preconditioner should solve in each step the neumann problem with some rhs using a gmres solver with hypre/boomeramg preconditioner up to a given tolerance. But in most iterations it ends up with DIVERGED_DTOL. What could be the reason for? Is AMG is linear preconditioner? The constant null space of the matrix is set via KSPSetNullSpace. Thomas From loic.gouarin at math.u-psud.fr Wed Feb 1 04:47:01 2012 From: loic.gouarin at math.u-psud.fr (gouarin) Date: Wed, 01 Feb 2012 11:47:01 +0100 Subject: [petsc-users] MG on fieldsplit Message-ID: <4F291825.40000@math.u-psud.fr> Hi, I sent some email on Stokes solver using DMComposite grid (one for the velocity and an other one for the pressure). 
Now, I try to use PCMG to precondition the velocity part with fieldsplit and PETSc command line options only. This is an example of my options: ./test -stokes_ksp_type fgmres -stokes_pc_type fieldsplit -stokes_fieldsplit_0_ksp_type preonly -stokes_fieldsplit_0_pc_type mg -stokes_fieldsplit_0_pc_mg_levels 2 -stokes_fieldsplit_0_pc_mg_galerkin ... -stokes_fieldsplit_1_ksp_type preonly -stokes_fieldsplit_1_pc_type jacobi It doesn't work because we need to call PCFieldSplitGetSubKSP to get the subpc[0] to create the interpolation matrices. So, we have to call KSPSetUp before PCFieldSplitGetSubKSP. But, if we call KSPSetUp, we have an error because the interpolation matrices are not created ! But If I do: ------ PCSetDM(pc, dom->get_pack()); PCSetType(pc, PCFIELDSPLIT); KSPSetUp(solver); PCFieldSplitGetSubKSP(pc, &nsplits, &subksp); PetscMalloc(sizeof(PC)*nsplits, &subpc); KSPSetType(subksp[0], KSPPREONLY); KSPGetPC(subksp[0], &subpc[0]); PCSetType(subpc[0], PCMG); create the interpolation matrices KSPSetFromOptions(solver); ------ It works but we can't set the mg_levels with the command line options anymore. Moreover, I have some warning of this type: Option left: name:-stokes_fieldsplit_0_pc_mg_multiplicative_cycles value: 3 Is there a way to do that only with the command line option ? Thanks, Loic -- Loic Gouarin Laboratoire de Math?matiques Universit? Paris-Sud B?timent 425 91405 Orsay Cedex France Tel: (+33) 1 69 15 60 14 Fax: (+33) 1 69 15 67 18 From mike.hui.zhang at hotmail.com Wed Feb 1 05:02:19 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Wed, 1 Feb 2012 12:02:19 +0100 Subject: [petsc-users] Row Orderings of Mat and Vec In-Reply-To: References: Message-ID: On Jan 31, 2012, at 1:23 PM, Jed Brown wrote: > On Tue, Jan 31, 2012 at 04:11, Hui Zhang wrote: > Suppose using MatCreate() and MatSetSizes() we obtain a Mat A, and then > using VecCreate() and VecSetSizes() with the same rows' arguments as A > we obtain a Vec b. Let A_i (b_i) consists of the rows owned by processor i. > > Can I believe that under the ordering of petsc, the following equality holds, > A= [A_0 > A_1 > ... > A_N], > b= [b_0 > b_1 > ... > b_N], > i.e. A_i, b_i with smaller i--the processor number goes first? > > Matrices and vectors always have contiguous row partitions, yes. > > > My question comes from assembly of linear system. Suppose under my application > ordering(AO) the system is Ax=b. I can get an AO from A using > > MatGetOwnershipRanges(A,&Istart,&Iend); > AOCreateBasic( -, -,app_ind, Istart..Iend, &ao_1); > > and assemble A. Under the petsc ordering, A becomes P_1*A mathematically, > > I don't think I follow. This AO does not transform A in any way. You can use the AO to translate application indices to PETSc indices to give to MatSetValues(). > > In most cases, I would recommend redistributing the mesh according to the partition and then using local indices (e.g. MatSetValuesLocal()) during assembly. You might use AO for that setup step, but it usually doesn't make sense to use _during_ assembly. thanks, do you mean that it is faster to use LocaltoGlobalMapping & SetValuesLocal than to use AO & SetValues globally? Is there a big difference in performance? > > with > P_1 the permutation corresponding to ao_1. In a similar way, we have that b > becomes P_2 b under the petsc ordering. I want to make sure P_1 and P_2 are > the same so arises my question. > > Thanks a lot! > Hui > > > -------------- next part -------------- An HTML attachment was scrubbed... 
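To make the local-indexing approach discussed just above concrete, here is a minimal sketch of assembling with ISLocalToGlobalMapping and MatSetValuesLocal instead of AO plus global MatSetValues. This is only an illustration (the trivial local-to-global map, the sizes, and all variable names are made up for the example; in a real code the map comes from the mesh partition), and the calling sequences follow the petsc-3.2 API used elsewhere in this thread as best I can tell:

------
#include <petscmat.h>

int main(int argc,char **argv)
{
  Mat                    A;
  ISLocalToGlobalMapping ltog;
  PetscInt               i,n = 5,rstart,idx[5];
  PetscScalar            one = 1.0;
  PetscErrorCode         ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
  /* each process owns n rows; preallocate 1 diagonal + 1 off-diagonal entry per row */
  ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD,n,n,PETSC_DETERMINE,PETSC_DETERMINE,
                         1,PETSC_NULL,1,PETSC_NULL,&A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&rstart,PETSC_NULL);CHKERRQ(ierr);
  /* toy local-to-global map: local index i lives at global row rstart+i */
  for (i=0; i<n; i++) idx[i] = rstart + i;
  ierr = ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,&ltog);CHKERRQ(ierr);
  ierr = MatSetLocalToGlobalMapping(A,ltog,ltog);CHKERRQ(ierr);
  /* assembly now uses local indices; PETSc translates them and ships any
     off-process contributions itself, so no AO lookup is needed per entry */
  for (i=0; i<n; i++) {
    ierr = MatSetValuesLocal(A,1,&i,1,&i,&one,ADD_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = ISLocalToGlobalMappingDestroy(&ltog);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}
------

The point of the pattern is that any AO translation happens once, while building the map, rather than inside the assembly loop.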
URL: From dave.mayhem23 at gmail.com Wed Feb 1 05:32:49 2012 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 1 Feb 2012 12:32:49 +0100 Subject: [petsc-users] Preconditioning of laplace equation with boomeramg In-Reply-To: <20120201110228.e1mm4ni3jgogow0o@mail.zih.tu-dresden.de> References: <20120201110228.e1mm4ni3jgogow0o@mail.zih.tu-dresden.de> Message-ID: What's happening with the solve on the coarse grid? Are you using a direct solver? Are you sure the null space is removed from the coarse grid operator as well? On 1 February 2012 11:02, Thomas Witkowski wrote: > I'm about to write a preconditioner for a 2x2 block matrix coming from the > FEM discretization (2D, equidistance mesh) of a 4th order PDE. The (0,0) > block is the discrete laplace with Neumann boundary conditions. I want to > precondition this block by using some amg iterations. Just for tests, the > preconditioner should solve in each step the neumann problem with some rhs > using a gmres solver with hypre/boomeramg preconditioner up to a given > tolerance. But in most iterations it ends up with DIVERGED_DTOL. What could > be the reason for? Is AMG is linear preconditioner? The constant null space > of the matrix is set via KSPSetNullSpace. > > Thomas From thomas.witkowski at tu-dresden.de Wed Feb 1 06:42:14 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Wed, 01 Feb 2012 13:42:14 +0100 Subject: [petsc-users] Preconditioning of laplace equation with boomeramg In-Reply-To: References: <20120201110228.e1mm4ni3jgogow0o@mail.zih.tu-dresden.de> Message-ID: <4F293326.2070108@tu-dresden.de> Dave, thank you for the hint. Yes, direct solver was set on coarse space. I changed it to Jacobi and it works as expected! Thomas Am 01.02.2012 12:32, schrieb Dave May: > What's happening with the solve on the coarse grid? > Are you using a direct solver? > Are you sure the null space is removed from the coarse grid operator as well? > > > On 1 February 2012 11:02, Thomas Witkowski > wrote: >> I'm about to write a preconditioner for a 2x2 block matrix coming from the >> FEM discretization (2D, equidistance mesh) of a 4th order PDE. The (0,0) >> block is the discrete laplace with Neumann boundary conditions. I want to >> precondition this block by using some amg iterations. Just for tests, the >> preconditioner should solve in each step the neumann problem with some rhs >> using a gmres solver with hypre/boomeramg preconditioner up to a given >> tolerance. But in most iterations it ends up with DIVERGED_DTOL. What could >> be the reason for? Is AMG is linear preconditioner? The constant null space >> of the matrix is set via KSPSetNullSpace. >> >> Thomas From bsmith at mcs.anl.gov Wed Feb 1 08:05:52 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 1 Feb 2012 08:05:52 -0600 Subject: [petsc-users] Row Orderings of Mat and Vec In-Reply-To: References: Message-ID: On Feb 1, 2012, at 5:02 AM, Hui Zhang wrote: > > On Jan 31, 2012, at 1:23 PM, Jed Brown wrote: > >> On Tue, Jan 31, 2012 at 04:11, Hui Zhang wrote: >> Suppose using MatCreate() and MatSetSizes() we obtain a Mat A, and then >> tion and then using local indices (e.g. MatSetValuesLocal()) during assembly. You might use AO for that setup step, but it usually doesn't make sense to use _during_ assembly. > > thanks, do you mean that it is faster to use LocaltoGlobalMapping & SetValuesLocal > than to use AO & SetValues globally? Is there a big difference in performance? 
Yes, using the SetValuesLocal approach will be much faster for large problems than using the AO to map to global indices and calling SetValues. Barry > > >> >> with >> P_1 the permutation corresponding to ao_1. In a similar way, we have that b >> becomes P_2 b under the petsc ordering. I want to make sure P_1 and P_2 are >> the same so arises my question. >> >> Thanks a lot! >> Hui >> >> >> > From jedbrown at mcs.anl.gov Wed Feb 1 08:39:42 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 1 Feb 2012 08:39:42 -0600 Subject: [petsc-users] access to matnest block (0,1) ? In-Reply-To: References: Message-ID: How did you create the MatNest? Did you use the same index sets? On Wed, Feb 1, 2012 at 01:42, Klaij, Christiaan wrote: > >> Then what would be the best way to create the IS in this case? > >> Can it somehow be deduced from the separate blocks? > >> > > > >The ISs define the row space of the blocks inside the global matrix. > > Following ex28 I made two stride ISs to define the row space > but I'm still having trouble, this is for 2 procs: > > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="a00_", type=mpiaij, rows=24, cols=24 > (0,1) : prefix="a01_", type=mpiaij, rows=24, cols=12 > (1,0) : prefix="a10_", type=mpiaij, rows=12, cols=24 > (1,1) : prefix="a11_", type=mpiaij, rows=12, cols=12 > > [0] Index set is permutation > [0] Number of indices in (stride) set 12 > [0] 0 0 > [0] 1 1 > [0] 2 2 > [0] 3 3 > [0] 4 4 > [0] 5 5 > [0] 6 6 > [0] 7 7 > [0] 8 8 > [0] 9 9 > [0] 10 10 > [0] 11 11 > [1] Number of indices in (stride) set 12 > [1] 0 12 > [1] 1 13 > [1] 2 14 > [1] 3 15 > [1] 4 16 > [1] 5 17 > [1] 6 18 > [1] 7 19 > [1] 8 20 > [1] 9 21 > [1] 10 22 > [1] 11 23 > This really can't be right. Modify ex28 to have it print out the index sets. If distributed evenly over two procs, we might expect the first index set to hold [0..11; 18..29] and the second to hold [12..17; 30..35]. The way you are addressing would force a non-contiguous row partition. > [0] Number of indices in (stride) set 6 > [0] 0 24 > [0] 1 25 > [0] 2 26 > [0] 3 27 > [0] 4 28 > [0] 5 29 > [1] Number of indices in (stride) set 6 > [1] 0 30 > [1] 1 31 > [1] 2 32 > [1] 3 33 > [1] 4 34 > [1] 5 35 > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Arguments are incompatible! > [0]PETSC ERROR: Could not find index set! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 > CDT 2011 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./matnest-trycpp on a linux_64b named lin0133 by cklaij > Wed Feb 1 08:33:40 2012 > [0]PETSC ERROR: Libraries linked from > /opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5/lib > [0]PETSC ERROR: Configure run at Thu Jan 26 13:44:12 2012 > [0]PETSC ERROR: Configure options > --prefix=/opt/refresco/64bit_intelv11.1_openmpi/petsc-3.2-p5 > --with-mpi-dir=/opt/refresco/64bit_intelv11.1_openmpi/openmpi-1.4.4 > --with-x=1 --with-mpe=0 --with-debugging=1 --with-clanguage=c++ > --with-hypre-include=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/include > --with-hypre-lib=/opt/refresco/64bit_intelv11.1_openmpi/hypre-2.7.0b/lib/libHYPRE.a > --with-ml-include=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/include > --with-ml-lib=/opt/refresco/64bit_intelv11.1_openmpi/ml-6.2/lib/libml.a > --with-blas-lapack-dir=/opt/intel/mkl > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatNestFindIS() line 292 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: MatNestFindSubMat() line 350 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: MatGetSubMatrix_Nest() line 366 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: MatGetSubMatrix() line 7135 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_FieldSplit() line 377 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: PCSetUp() line 819 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 260 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 379 in > /home/CKlaij/ReFRESCO/Libraries/build/petsc-3.2-p5/src/ksp/ksp/interface/itfunc.c > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Wed Feb 1 09:14:37 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Wed, 1 Feb 2012 16:14:37 +0100 Subject: [petsc-users] Row Orderings of Mat and Vec In-Reply-To: References: Message-ID: > > On Feb 1, 2012, at 5:02 AM, Hui Zhang wrote: > >> >> On Jan 31, 2012, at 1:23 PM, Jed Brown wrote: >> >>> On Tue, Jan 31, 2012 at 04:11, Hui Zhang wrote: >>> Suppose using MatCreate() and MatSetSizes() we obtain a Mat A, and then >>> tion and then using local indices (e.g. MatSetValuesLocal()) during assembly. You might use AO for that setup step, but it usually doesn't make sense to use _during_ assembly. >> >> thanks, do you mean that it is faster to use LocaltoGlobalMapping & SetValuesLocal >> than to use AO & SetValues globally? Is there a big difference in performance? > > Yes, using the SetValuesLocal approach will be much faster for large problems than using the AO to map to global indices and calling SetValues. > > Barry Thank you! Can I use SetValuesLocal to set values beyond LocalOwnershipRange? > >> >> >>> >>> with >>> P_1 the permutation corresponding to ao_1. 
In a similar way, we have that b >>> becomes P_2 b under the petsc ordering. I want to make sure P_1 and P_2 are >>> the same so arises my question. >>> >>> Thanks a lot! >>> Hui >>> >>> >>> >> > From jedbrown at mcs.anl.gov Wed Feb 1 09:32:20 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 1 Feb 2012 09:32:20 -0600 Subject: [petsc-users] MG on fieldsplit In-Reply-To: <4F291825.40000@math.u-psud.fr> References: <4F291825.40000@math.u-psud.fr> Message-ID: On Wed, Feb 1, 2012 at 04:47, gouarin wrote: > Hi, > > I sent some email on Stokes solver using DMComposite grid (one for the > velocity and an other one for the pressure). > > Now, I try to use PCMG to precondition the velocity part with fieldsplit > and PETSc command line options only. > > This is an example of my options: > > ./test -stokes_ksp_type fgmres > -stokes_pc_type fieldsplit > -stokes_fieldsplit_0_ksp_type preonly > -stokes_fieldsplit_0_pc_type mg > -stokes_fieldsplit_0_pc_mg_**levels 2 > -stokes_fieldsplit_0_pc_mg_**galerkin > ... > -stokes_fieldsplit_1_ksp_type preonly > -stokes_fieldsplit_1_pc_type jacobi > > It doesn't work because we need to call PCFieldSplitGetSubKSP to get the > subpc[0] to create the interpolation matrices. So, we have to call > KSPSetUp before PCFieldSplitGetSubKSP. But, if we call KSPSetUp, we have an > error because the interpolation matrices are not created ! > The interpolation matrices for PCMG? Those should be constructed by the DM. It looks from your code like you have a DMComposite, in which case you shouldn't have to do anything special. Try "make runex28_4". (It's broken with a change that went into petsc-dev a few days ago. I'll be able to push the fix whenever I find a wireless network in this airport that doesn't block ssh.) > > But If I do: > > ------ > PCSetDM(pc, dom->get_pack()); > PCSetType(pc, PCFIELDSPLIT); > KSPSetUp(solver); > PCFieldSplitGetSubKSP(pc, &nsplits, &subksp); > PetscMalloc(sizeof(PC)***nsplits, &subpc); > > KSPSetType(subksp[0], KSPPREONLY); > KSPGetPC(subksp[0], &subpc[0]); > PCSetType(subpc[0], PCMG); > > create the interpolation matrices > > KSPSetFromOptions(solver); > ------ > > It works but we can't set the mg_levels with the command line options > anymore. Moreover, I have some warning of this type: > > Option left: name:-stokes_fieldsplit_0_pc_**mg_multiplicative_cycles > value: 3 > > Is there a way to do that only with the command line option ? > > Thanks, > Loic > > -- > Loic Gouarin > Laboratoire de Math?matiques > Universit? Paris-Sud > B?timent 425 > 91405 Orsay Cedex > France > Tel: (+33) 1 69 15 60 14 > Fax: (+33) 1 69 15 67 18 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Wed Feb 1 09:52:17 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Wed, 1 Feb 2012 16:52:17 +0100 Subject: [petsc-users] Row Orderings of Mat and Vec In-Reply-To: References: Message-ID: >> On Feb 1, 2012, at 5:02 AM, Hui Zhang wrote: >> >>> >>> On Jan 31, 2012, at 1:23 PM, Jed Brown wrote: >>> >>>> On Tue, Jan 31, 2012 at 04:11, Hui Zhang wrote: >>>> Suppose using MatCreate() and MatSetSizes() we obtain a Mat A, and then >>>> tion and then using local indices (e.g. MatSetValuesLocal()) during assembly. You might use AO for that setup step, but it usually doesn't make sense to use _during_ assembly. >>> >>> thanks, do you mean that it is faster to use LocaltoGlobalMapping & SetValuesLocal >>> than to use AO & SetValues globally? Is there a big difference in performance? 
>> >> Yes, using the SetValuesLocal approach will be much faster for large problems than using the AO to map to global indices and calling SetValues. >> >> Barry > > Thank you! Can I use SetValuesLocal to set values beyond LocalOwnershipRange? It is ok. I found I can. > >> >>> >>> >>>> >>>> with >>>> P_1 the permutation corresponding to ao_1. In a similar way, we have that b >>>> becomes P_2 b under the petsc ordering. I want to make sure P_1 and P_2 are >>>> the same so arises my question. >>>> >>>> Thanks a lot! >>>> Hui >>>> >>>> >>>> >>> >> > From wumengda at gmail.com Wed Feb 1 16:44:17 2012 From: wumengda at gmail.com (Mengda Wu) Date: Wed, 1 Feb 2012 17:44:17 -0500 Subject: [petsc-users] VecCreate and MatCreate guarantee zero all entries on creation? Message-ID: Hello all, I have a simple questions. Do VecCreate and MatCreate guarantee zero all entries on creation? Thanks, Mengda -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Feb 1 16:47:58 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 1 Feb 2012 16:47:58 -0600 Subject: [petsc-users] VecCreate and MatCreate guarantee zero all entries on creation? In-Reply-To: References: Message-ID: On Feb 1, 2012, at 4:44 PM, Mengda Wu wrote: > Hello all, > > I have a simple questions. Do VecCreate and MatCreate guarantee zero all entries on creation? Yes > > Thanks, > Mengda From C.Klaij at marin.nl Thu Feb 2 03:14:37 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 2 Feb 2012 09:14:37 +0000 Subject: [petsc-users] access to matnest block (0,1) ? Message-ID: > How did you create the MatNest? Did you use the same index sets? This is in short what I did: Mat A,subA[4]; MatCreate(PETSC_COMM_WORLD,&subA[0]); MatSetSizes(subA[0],PETSC_DECIDE,PETSC_DECIDE,2*nx*ny,2*nx*ny); /* set type, set prefix, set values here */ MatCreate(PETSC_COMM_WORLD,&subA[1]); MatSetSizes(subA[1],PETSC_DECIDE,PETSC_DECIDE,2*nx*ny,nx*ny); /* set type, set prefix, set values */ MatTranspose(subA[1],MAT_INITIAL_MATRIX,&subA[2]); MatCreate(PETSC_COMM_WORLD,&subA[3]); MatSetSizes(subA[3],PETSC_DECIDE,PETSC_DECIDE,nx*ny,nx*ny); /* set type, set prefix, set values */ MatCreateNest(PETSC_COMM_WORLD,2,PETSC_NULL,2,PETSC_NULL,subA,&A); > This really can't be right. Modify ex28 to have it print out the index > sets. If distributed evenly over two procs, we might expect the first index > set to hold [0..11; 18..29] and the second to hold [12..17; 30..35]. The > way you are addressing would force a non-contiguous row partition. The partitioning corresponds to the grid. With the 3x4 grid and 2 procs, I have 6 cells per proc. subA[0] corresponds to two variables per cell (u and v) so 12 rows on proc0 and 12 on proc1. SubA[3] has one variable per cell (p) so 6 rows on proc0 and 6 on proc1. I tried to create the matching ISs but something's wrong. What would be the right way to do it? dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From loic.gouarin at math.u-psud.fr Thu Feb 2 05:03:36 2012 From: loic.gouarin at math.u-psud.fr (gouarin) Date: Thu, 02 Feb 2012 12:03:36 +0100 Subject: [petsc-users] MG on fieldsplit In-Reply-To: References: <4F291825.40000@math.u-psud.fr> Message-ID: <4F2A6D88.8030703@math.u-psud.fr> Hi Jed, thanks for your reply. 
On 01/02/2012 16:32, Jed Brown wrote: > On Wed, Feb 1, 2012 at 04:47, gouarin > wrote: > > Hi, > > I sent some email on Stokes solver using DMComposite grid (one for > the velocity and an other one for the pressure). > > Now, I try to use PCMG to precondition the velocity part with > fieldsplit and PETSc command line options only. > > This is an example of my options: > > ./test -stokes_ksp_type fgmres > -stokes_pc_type fieldsplit > -stokes_fieldsplit_0_ksp_type preonly > -stokes_fieldsplit_0_pc_type mg > -stokes_fieldsplit_0_pc_mg_ levels 2 > -stokes_fieldsplit_0_pc_mg_ galerkin > ... > -stokes_fieldsplit_1_ksp_type preonly > -stokes_fieldsplit_1_pc_type jacobi > > It doesn't work because we need to call PCFieldSplitGetSubKSP to > get the subpc[0] to create the interpolation matrices. So, we > have to call KSPSetUp before PCFieldSplitGetSubKSP. But, if we > call KSPSetUp, we have an error because the interpolation matrices > are not created ! > > > The interpolation matrices for PCMG? Those should be constructed by > the DM. It looks from your code like you have a DMComposite, in which > case you shouldn't have to do anything special. > > Try "make runex28_4". (It's broken with a change that went into > petsc-dev a few days ago. I'll be able to push the fix whenever I find > a wireless network in this airport that doesn't block ssh.) It seems that I have forgotten to initialize DMSetInitialGuess, DMSetFunction, DMSetJacobian. I don't understand what is the best way to use 4Q1-Q1 elements for the Stokes problem: - Construct the matrices for each DM in the DMComposite - Construct the global matrix and the global preconditioner - Use MatNest There are a lot of possibilities with PETSc with new add-ons and it is difficult to find our way with only the documentation. Best regards, Loic > > But If I do: > > ------ > PCSetDM(pc, dom->get_pack()); > PCSetType(pc, PCFIELDSPLIT); > KSPSetUp(solver); > PCFieldSplitGetSubKSP(pc, &nsplits, &subksp); > PetscMalloc(sizeof(PC)* nsplits, &subpc); > > KSPSetType(subksp[0], KSPPREONLY); > KSPGetPC(subksp[0], &subpc[0]); > PCSetType(subpc[0], PCMG); > > create the interpolation matrices > > KSPSetFromOptions(solver); > ------ > > It works but we can't set the mg_levels with the command line > options anymore. Moreover, I have some warning of this type: > > Option left: name:-stokes_fieldsplit_0_pc_ > mg_multiplicative_cycles value: 3 > > Is there a way to do that only with the command line option ? > > Thanks, > Loic > > -- > Loic Gouarin > Laboratoire de Math?matiques > Universit? Paris-Sud > B?timent 425 > 91405 Orsay Cedex > France > Tel: (+33) 1 69 15 60 14 > Fax: (+33) 1 69 15 67 18 > > -- Loic Gouarin Laboratoire de Math?matiques Universit? Paris-Sud B?timent 425 91405 Orsay Cedex France Tel: (+33) 1 69 15 60 14 Fax: (+33) 1 69 15 67 18 -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Thu Feb 2 05:47:54 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 02 Feb 2012 12:47:54 +0100 Subject: [petsc-users] Richardson with direct solver does not converge in preconditioned residual norm Message-ID: <4F2A77EA.7080906@tu-dresden.de> For some tests I use richardson iterations with direct solver for preconditioning (if everything is fine, richardson should be replaces by preonly). 
-ksp_type richardson -pc_type lu -pc_factor_mat_solver_package mumps -ksp_monitor -ksp_monitor_true_residual -ksp_max_it 10 For some matrices I see that it converges fine in true residual norm but not in the preconditioned one: 0 KSP Residual norm 1.540366130785e+05 0 KSP preconditioned resid norm 1.540366130785e+05 true resid norm 4.257656834616e+04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.355077212761e+05 1 KSP preconditioned resid norm 1.355077212761e+05 true resid norm 1.468758291284e-11 ||r(i)||/||b|| 3.449686877867e-16 2 KSP Residual norm 3.775360693480e+05 2 KSP preconditioned resid norm 3.775360693480e+05 true resid norm 5.008860262312e-12 ||r(i)||/||b|| 1.176435879376e-16 3 KSP Residual norm 1.714431257209e+05 3 KSP preconditioned resid norm 1.714431257209e+05 true resid norm 5.365631839419e-12 ||r(i)||/||b|| 1.260231166541e-16 4 KSP Residual norm 7.164219897555e+04 4 KSP preconditioned resid norm 7.164219897555e+04 true resid norm 5.582291603774e-12 ||r(i)||/||b|| 1.311118256030e-16 5 KSP Residual norm 2.480147180914e+05 5 KSP preconditioned resid norm 2.480147180914e+05 true resid norm 5.464714292269e-12 ||r(i)||/||b|| 1.283502758569e-16 6 KSP Residual norm 1.749548383255e+05 6 KSP preconditioned resid norm 1.749548383255e+05 true resid norm 6.601924132117e-12 ||r(i)||/||b|| 1.550600339239e-16 7 KSP Residual norm 1.873773824295e+05 7 KSP preconditioned resid norm 1.873773824295e+05 true resid norm 6.368611865551e-12 ||r(i)||/||b|| 1.495802060366e-16 8 KSP Residual norm 2.610223461339e+05 8 KSP preconditioned resid norm 2.610223461339e+05 true resid norm 8.365362648969e-12 ||r(i)||/||b|| 1.964780858090e-16 9 KSP Residual norm 2.459609758347e+05 9 KSP preconditioned resid norm 2.459609758347e+05 true resid norm 8.427381039077e-12 ||r(i)||/||b|| 1.979347177668e-16 10 KSP Residual norm 1.611793769272e+05 10 KSP preconditioned resid norm 1.611793769272e+05 true resid norm 8.325158481093e-12 ||r(i)||/||b|| 1.955338066095e-16 Would could be the reason for? Thomas From thomas.witkowski at tu-dresden.de Thu Feb 2 06:09:02 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 02 Feb 2012 13:09:02 +0100 Subject: [petsc-users] Use boomeramg to solve system of PDEs Message-ID: <4F2A7CDE.1010103@tu-dresden.de> The documentation of boomeramg mention that it's possible to solve also matrices arising from the discretization of system of PDEs. But there is no more information on it. What should I do to make use of it in PETSc? Thomas From mark.adams at columbia.edu Thu Feb 2 07:43:49 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Thu, 2 Feb 2012 08:43:49 -0500 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: <4F2A7CDE.1010103@tu-dresden.de> References: <4F2A7CDE.1010103@tu-dresden.de> Message-ID: Use MatSetBlockSize(mat,ndof); and that info will get passed down to HYPRE. Mark On Feb 2, 2012, at 7:09 AM, Thomas Witkowski wrote: > The documentation of boomeramg mention that it's possible to solve also matrices arising from the discretization of system of PDEs. But there is no more information on it. What should I do to make use of it in PETSc? > > Thomas > From jedbrown at mcs.anl.gov Thu Feb 2 08:07:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 2 Feb 2012 16:07:35 +0200 Subject: [petsc-users] access to matnest block (0,1) ? 
In-Reply-To: References: Message-ID: On Thu, Feb 2, 2012 at 11:14, Klaij, Christiaan wrote: > Mat A,subA[4]; > > MatCreate(PETSC_COMM_WORLD,&subA[0]); > MatSetSizes(subA[0],PETSC_DECIDE,PETSC_DECIDE,2*nx*ny,2*nx*ny); > /* set type, set prefix, set values here */ > > MatCreate(PETSC_COMM_WORLD,&subA[1]); > MatSetSizes(subA[1],PETSC_DECIDE,PETSC_DECIDE,2*nx*ny,nx*ny); > /* set type, set prefix, set values */ > > MatTranspose(subA[1],MAT_INITIAL_MATRIX,&subA[2]); > > MatCreate(PETSC_COMM_WORLD,&subA[3]); > MatSetSizes(subA[3],PETSC_DECIDE,PETSC_DECIDE,nx*ny,nx*ny); > /* set type, set prefix, set values */ > > MatCreateNest(PETSC_COMM_WORLD,2,PETSC_NULL,2,PETSC_NULL,subA,&A); > I should add a MatNestGetISs() so you can get out the automatically-created ISs. They will have the structure described below. If you want to work with the released version, you should create the ISs yourself. > > > > This really can't be right. Modify ex28 to have it print out the index > > sets. If distributed evenly over two procs, we might expect the first > index > > set to hold [0..11; 18..29] and the second to hold [12..17; 30..35]. The > > way you are addressing would force a non-contiguous row partition. > > The partitioning corresponds to the grid. With the 3x4 grid and 2 > procs, I have 6 cells per proc. subA[0] corresponds to two > variables per cell (u and v) so 12 rows on proc0 and 12 on > proc1. SubA[3] has one variable per cell (p) so 6 rows on proc0 > and 6 on proc1. I tried to create the matching ISs but something's wrong. > What would be the right way to do it? > A has a contigous distribution, so the ISs must respect that. Did you try creating the index sets described above? Please explain "something's wrong". -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Feb 2 08:23:53 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 2 Feb 2012 16:23:53 +0200 Subject: [petsc-users] MG on fieldsplit In-Reply-To: <4F2A6D88.8030703@math.u-psud.fr> References: <4F291825.40000@math.u-psud.fr> <4F2A6D88.8030703@math.u-psud.fr> Message-ID: On Thu, Feb 2, 2012 at 13:03, gouarin wrote: > It seems that I have forgotten to initialize DMSetInitialGuess, > DMSetFunction, DMSetJacobian. > > I don't understand what is the best way to use 4Q1-Q1 elements for the > Stokes problem: > src/ksp/ksp/examples/tutorials/ex43.c (2D) and ex42.c (3D) use these elements for variable-viscosity Stokes. > > - Construct the matrices for each DM in the DMComposite > - Construct the global matrix and the global preconditioner > - Use MatNest > DMComposite can allocate for AIJ or Nest. Just set the matrix type, e.g. -dm_mat_type nest, ex28.c does this (see runex28_3). But preallocation for off-diagonal blocks (of MatNest or AIJ) is still messy and I haven't figured out a good API other than to write an efficient dynamic preallocation to make it unnecessary. A lot of users like Jacobian-free Newton-Krylov with the true Jacobian applied by finite differencing (-snes_mf_operator) in which case the current code is fine because we only assemble diagonal blocks. > > There are a lot of possibilities with PETSc with new add-ons and it is > difficult to find our way with only the documentation. > -------------- next part -------------- An HTML attachment was scrubbed... 
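Coming back to the matnest block question above: for the released 3.2, the contiguous-per-process index sets Jed describes (velocity rows [0..11; 18..29], pressure rows [12..17; 30..35] on two processes) can be built by hand with ISCreateStride. A sketch only; subA[] is assumed to be the array of blocks assembled as shown earlier in the thread, and the names nu, np, is_row are mine:

------
/* Each process owns nu+np contiguous rows of the nested matrix:
   rank 0 gets velocity rows 0..11 and pressure rows 12..17,
   rank 1 gets velocity rows 18..29 and pressure rows 30..35.   */
PetscInt    nu = 12, np = 6, rstart;
PetscMPIInt rank;
IS          is_row[2];
Mat         A;

ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
rstart = rank*(nu+np);
ierr = ISCreateStride(PETSC_COMM_WORLD,nu,rstart,1,&is_row[0]);CHKERRQ(ierr);
ierr = ISCreateStride(PETSC_COMM_WORLD,np,rstart+nu,1,&is_row[1]);CHKERRQ(ierr);
/* the same ISs can serve as column ISs because the column layout matches */
ierr = MatCreateNest(PETSC_COMM_WORLD,2,is_row,2,is_row,subA,&A);CHKERRQ(ierr);
------

Handing these same ISs to PCFieldSplitSetIS should then make the fieldsplit lookup match the layout stored in the nest, avoiding the "Could not find index set" error quoted above.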
URL: From jedbrown at mcs.anl.gov Thu Feb 2 08:27:47 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 2 Feb 2012 16:27:47 +0200 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: References: <4F2A7CDE.1010103@tu-dresden.de> Message-ID: On Thu, Feb 2, 2012 at 15:43, Mark F. Adams wrote: > Use MatSetBlockSize(mat,ndof); and that info will get passed down to HYPRE. This assumes that you have ordered your equations so that all components are next to each other. This is usually practical for systems in H^1 (e.g. compressible elasticity), but for systems with constraints (mixed methods for Stokes) or systems in H(div) or H(curl), you need more specialized methods. Hypre has some support for H(curl), but it's for a specific discretization and you would have to use their custom interface. Unfortunately, those systems always tend to be a lot of work, so the alternative is to use fieldsplit (also "auxilliary space") preconditioning to transform those systems into ones that can be solved with standard methods. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Feb 2 08:32:57 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 2 Feb 2012 16:32:57 +0200 Subject: [petsc-users] Richardson with direct solver does not converge in preconditioned residual norm In-Reply-To: <4F2A77EA.7080906@tu-dresden.de> References: <4F2A77EA.7080906@tu-dresden.de> Message-ID: Is this problem singular? On Thu, Feb 2, 2012 at 13:47, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > For some tests I use richardson iterations with direct solver for > preconditioning (if everything is fine, richardson should be replaces by > preonly). > > -ksp_type richardson -pc_type lu -pc_factor_mat_solver_package mumps > -ksp_monitor -ksp_monitor_true_residual -ksp_max_it 10 > > For some matrices I see that it converges fine in true residual norm but > not in the preconditioned one: > > 0 KSP Residual norm 1.540366130785e+05 > 0 KSP preconditioned resid norm 1.540366130785e+05 true resid norm > 4.257656834616e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP Residual norm 1.355077212761e+05 > 1 KSP preconditioned resid norm 1.355077212761e+05 true resid norm > 1.468758291284e-11 ||r(i)||/||b|| 3.449686877867e-16 > 2 KSP Residual norm 3.775360693480e+05 > 2 KSP preconditioned resid norm 3.775360693480e+05 true resid norm > 5.008860262312e-12 ||r(i)||/||b|| 1.176435879376e-16 > 3 KSP Residual norm 1.714431257209e+05 > 3 KSP preconditioned resid norm 1.714431257209e+05 true resid norm > 5.365631839419e-12 ||r(i)||/||b|| 1.260231166541e-16 > 4 KSP Residual norm 7.164219897555e+04 > 4 KSP preconditioned resid norm 7.164219897555e+04 true resid norm > 5.582291603774e-12 ||r(i)||/||b|| 1.311118256030e-16 > 5 KSP Residual norm 2.480147180914e+05 > 5 KSP preconditioned resid norm 2.480147180914e+05 true resid norm > 5.464714292269e-12 ||r(i)||/||b|| 1.283502758569e-16 > 6 KSP Residual norm 1.749548383255e+05 > 6 KSP preconditioned resid norm 1.749548383255e+05 true resid norm > 6.601924132117e-12 ||r(i)||/||b|| 1.550600339239e-16 > 7 KSP Residual norm 1.873773824295e+05 > 7 KSP preconditioned resid norm 1.873773824295e+05 true resid norm > 6.368611865551e-12 ||r(i)||/||b|| 1.495802060366e-16 > 8 KSP Residual norm 2.610223461339e+05 > 8 KSP preconditioned resid norm 2.610223461339e+05 true resid norm > 8.365362648969e-12 ||r(i)||/||b|| 1.964780858090e-16 > 9 KSP Residual norm 2.459609758347e+05 > 9 KSP preconditioned resid norm 
2.459609758347e+05 true resid norm > 8.427381039077e-12 ||r(i)||/||b|| 1.979347177668e-16 > 10 KSP Residual norm 1.611793769272e+05 > 10 KSP preconditioned resid norm 1.611793769272e+05 true resid norm > 8.325158481093e-12 ||r(i)||/||b|| 1.955338066095e-16 > > > Would could be the reason for? > > Thomas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From loic.gouarin at math.u-psud.fr Thu Feb 2 09:24:30 2012 From: loic.gouarin at math.u-psud.fr (gouarin) Date: Thu, 02 Feb 2012 16:24:30 +0100 Subject: [petsc-users] MG on fieldsplit In-Reply-To: References: <4F291825.40000@math.u-psud.fr> <4F2A6D88.8030703@math.u-psud.fr> Message-ID: <4F2AAAAE.4070304@math.u-psud.fr> On 02/02/2012 15:23, Jed Brown wrote: > On Thu, Feb 2, 2012 at 13:03, gouarin > wrote: > > It seems that I have forgotten to initialize DMSetInitialGuess, > DMSetFunction, DMSetJacobian. > > I don't understand what is the best way to use 4Q1-Q1 elements for > the Stokes problem: > > > src/ksp/ksp/examples/tutorials/ex43.c (2D) and ex42.c (3D) use these > elements for variable-viscosity Stokes. No. In those examples, you have the same grid for the velocity and the pressure. If I use 4Q1_Q1 elements, I have not the same grid. I have one for the velocity and an other for the pressure in the DMComposite. And for me, it is the difficulty because as you say after, I have to do my own preallocation step to have the good off-diagonal blocks. This is why I asked what is the best way to construct my matrix. I hoped that now it is not necessary to do this preallocation. This is also a difficulty to use a multigrid only on the velocity. Thanks, Loic > > - Construct the matrices for each DM in the DMComposite > - Construct the global matrix and the global preconditioner > - Use MatNest > > > DMComposite can allocate for AIJ or Nest. Just set the matrix type, > e.g. -dm_mat_type nest, ex28.c does this (see runex28_3). > > But preallocation for off-diagonal blocks (of MatNest or AIJ) is still > messy and I haven't figured out a good API other than to write an > efficient dynamic preallocation to make it unnecessary. A lot of users > like Jacobian-free Newton-Krylov with the true Jacobian applied by > finite differencing (-snes_mf_operator) in which case the current code > is fine because we only assemble diagonal blocks. > > > There are a lot of possibilities with PETSc with new add-ons and > it is difficult to find our way with only the documentation. > > -- Loic Gouarin Laboratoire de Math?matiques Universit? Paris-Sud B?timent 425 91405 Orsay Cedex France Tel: (+33) 1 69 15 60 14 Fax: (+33) 1 69 15 67 18 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Feb 2 09:41:06 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 2 Feb 2012 17:41:06 +0200 Subject: [petsc-users] MG on fieldsplit In-Reply-To: <4F2AAAAE.4070304@math.u-psud.fr> References: <4F291825.40000@math.u-psud.fr> <4F2A6D88.8030703@math.u-psud.fr> <4F2AAAAE.4070304@math.u-psud.fr> Message-ID: On Thu, Feb 2, 2012 at 17:24, gouarin wrote: > No. In those examples, you have the same grid for the velocity and the > pressure. If I use 4Q1_Q1 elements, I have not the same grid. I have one > for the velocity and an other for the pressure in the DMComposite. And for > me, it is the difficulty because as you say after, I have to do my own > preallocation step to have the good off-diagonal blocks. > Okay, I misinterpreted your notation. 
For your mixed elements, you would couple with DMComposite. > > This is why I asked what is the best way to construct my matrix. I hoped > that now it is not necessary to do this preallocation. > As long as dynamic preallocation is slow, you need to provide it. > This is also a difficulty to use a multigrid only on the velocity. > What is hard about using multigrid only on velocity? -pc_type fieldsplit -fieldsplit_velocity_pc_type mg -fieldsplit_velocity_pc_mg_levels 3 will do geometric MG on the velocity block. Note that your elements are slightly non-standard, so depending on how you want to work, you might provide your own coarsening and interpolation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From loic.gouarin at math.u-psud.fr Thu Feb 2 10:04:52 2012 From: loic.gouarin at math.u-psud.fr (gouarin) Date: Thu, 02 Feb 2012 17:04:52 +0100 Subject: [petsc-users] MG on fieldsplit In-Reply-To: References: <4F291825.40000@math.u-psud.fr> <4F2A6D88.8030703@math.u-psud.fr> <4F2AAAAE.4070304@math.u-psud.fr> Message-ID: <4F2AB424.3000008@math.u-psud.fr> On 02/02/2012 16:41, Jed Brown wrote: > On Thu, Feb 2, 2012 at 17:24, gouarin > wrote: > > No. In those examples, you have the same grid for the velocity and > the pressure. If I use 4Q1_Q1 elements, I have not the same grid. > I have one for the velocity and an other for the pressure in the > DMComposite. And for me, it is the difficulty because as you say > after, I have to do my own preallocation step to have the good > off-diagonal blocks. > > > Okay, I misinterpreted your notation. For your mixed elements, you > would couple with DMComposite. > > > This is why I asked what is the best way to construct my matrix. I > hoped that now it is not necessary to do this preallocation. > > > As long as dynamic preallocation is slow, you need to provide it. > > This is also a difficulty to use a multigrid only on the velocity. > > > What is hard about using multigrid only on velocity? > > -pc_type fieldsplit -fieldsplit_velocity_pc_type mg > -fieldsplit_velocity_pc_mg_levels 3 > > will do geometric MG on the velocity block. Note that your elements > are slightly non-standard, so depending on how you want to work, you > might provide your own coarsening and interpolation. It is not hard to use multigrid on velocity if I provide the interpolation as I said in my first email and as you say here. But it is for me too intrusive because I have to do a special case to use a multigrid pcmg whereas I can use boomeramg easily. I don't know if what I say is clear ... The idea is to test many solvers on this problem. Thanks for your replies. Loic -- Loic Gouarin Laboratoire de Math?matiques Universit? Paris-Sud B?timent 425 91405 Orsay Cedex France Tel: (+33) 1 69 15 60 14 Fax: (+33) 1 69 15 67 18 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Feb 2 11:32:05 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 2 Feb 2012 11:32:05 -0600 Subject: [petsc-users] MG on fieldsplit In-Reply-To: <4F2AB424.3000008@math.u-psud.fr> References: <4F291825.40000@math.u-psud.fr> <4F2A6D88.8030703@math.u-psud.fr> <4F2AAAAE.4070304@math.u-psud.fr> <4F2AB424.3000008@math.u-psud.fr> Message-ID: Geometric multigrid needs more information than algebraic multigrid. It's a fact or life. In exchange, it can often be more efficient. You likely notice that algebraic multigrid applied directly to the Stokes problem fails, so you need to do more anyway. 
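For the record, "providing your own coarsening and interpolation" on the velocity split, as suggested above, comes down to a handful of PCMG calls once the sub-KSP is in hand (e.g. via PCFieldSplitGetSubKSP after KSPSetUp, as in Loic's snippet). A sketch only; subksp[0], nlevels and the user-assembled interp[] matrices are assumptions for illustration:

------
PC       pcmg;
PetscInt l, nlevels = 3;            /* illustrative number of levels */
Mat      interp[3];                 /* interp[l]: level l-1 -> level l, built by the user */

ierr = KSPGetPC(subksp[0],&pcmg);CHKERRQ(ierr);
ierr = PCSetType(pcmg,PCMG);CHKERRQ(ierr);
ierr = PCMGSetLevels(pcmg,nlevels,PETSC_NULL);CHKERRQ(ierr);
ierr = PCMGSetGalerkin(pcmg,PETSC_TRUE);CHKERRQ(ierr);   /* coarse operators from R*A*P */
for (l=1; l<nlevels; l++) {
  ierr = PCMGSetInterpolation(pcmg,l,interp[l]);CHKERRQ(ierr);
}
------

With a DM attached to the split (as in ex28), PCMG can build the interpolation itself, which is what makes the pure command-line route work without this code.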
If you are happy with the performance of using FieldSplit and applying BoomerAMG (or PCGAMG, or PCML) to the viscous (velocity-velocity) part, then use that. If you aren't happy with it, then try other methods, but some of those will need more information. On Thu, Feb 2, 2012 at 10:04, gouarin wrote: > It is not hard to use multigrid on velocity if I provide the interpolation > as I said in my first email and as you say here. But it is for me too > intrusive because I have to do a special case to use a multigrid pcmg > whereas I can use boomeramg easily. I don't know if what I say is clear ... > > The idea is to test many solvers on this problem. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rudolph at berkeley.edu Thu Feb 2 23:20:02 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Thu, 2 Feb 2012 21:20:02 -0800 Subject: [petsc-users] Valgrind and uninitialized values Message-ID: <61DB5B01-724B-42F3-96E4-0B203AE0F019@berkeley.edu> I am trying to track down a memory corruption bug using valgrind, but I am having to wade through lots and lots of error messages similar to this one, which I believe are either spurious or related to some problem in Petsc and not in my code (please correct me if I'm wrong!) ==18162== Conditional jump or move depends on uninitialised value(s) ==18162== at 0x7258D73: MPIDI_CH3U_Handle_recv_req (ch3u_handle_recv_req.c:99) ==18162== by 0x724F2E1: MPIDI_CH3I_SMP_read_progress (ch3_smp_progress.c:656) ==18162== by 0x72462D4: MPIDI_CH3I_Progress (ch3_progress.c:185) ==18162== by 0x72AC52B: MPIC_Wait (helper_fns.c:518) ==18162== by 0x72AC314: MPIC_Sendrecv (helper_fns.c:163) ==18162== by 0x7218E18: MPIR_Allgather_OSU (allgather_osu.c:524) ==18162== by 0x7217099: PMPI_Allgather (allgather.c:840) ==18162== by 0x649FFD: PetscLayoutSetUp (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x631C32: VecCreate_MPI_Private (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x63253E: VecCreate_MPI (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x5E3279: VecSetType (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x6329CC: VecCreate_Standard (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x5E3279: VecSetType (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x93BFE7: DMCreateGlobalVector_DA (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x8C7F68: DMCreateGlobalVector (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x93B6DA: VecDuplicate_MPI_DA (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x5CF419: VecDuplicate (in /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) ==18162== by 0x438F04: initializeNodalFields (nodalFields.c:35) ==18162== by 0x434730: main (main_isotropic_convection.c:123) The relevant line of code is: 33: ierr = DMCreateGlobalVector(grid->da, &nodalFields->lastT); CHKERRQ(ierr); 34: ierr = PetscObjectSetName((PetscObject) nodalFields->lastT, "lastT");CHKERRQ(ierr); 35: ierr = VecDuplicate( nodalFields->lastT, &nodalFields->thisT);CHKERRQ(ierr); I am using icc with petsc-3.2 on the Intel Westmere cluster at TACC. Petsc was compiled with debugging enabled. Thanks for your help. 
Max From knepley at gmail.com Thu Feb 2 23:25:06 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 2 Feb 2012 23:25:06 -0600 Subject: [petsc-users] Valgrind and uninitialized values In-Reply-To: <61DB5B01-724B-42F3-96E4-0B203AE0F019@berkeley.edu> References: <61DB5B01-724B-42F3-96E4-0B203AE0F019@berkeley.edu> Message-ID: On Thu, Feb 2, 2012 at 11:20 PM, Max Rudolph wrote: > I am trying to track down a memory corruption bug using valgrind, but I am > having to wade through lots and lots of error messages similar to this one, > which I believe are either spurious or related to some problem in Petsc and > not in my code (please correct me if I'm wrong!) > Are you using ---download-mpich? It should be valgrind clean. These are a pain, but you can make a supressions file as well. Matt > ==18162== Conditional jump or move depends on uninitialised value(s) > ==18162== at 0x7258D73: MPIDI_CH3U_Handle_recv_req > (ch3u_handle_recv_req.c:99) > ==18162== by 0x724F2E1: MPIDI_CH3I_SMP_read_progress > (ch3_smp_progress.c:656) > ==18162== by 0x72462D4: MPIDI_CH3I_Progress (ch3_progress.c:185) > ==18162== by 0x72AC52B: MPIC_Wait (helper_fns.c:518) > ==18162== by 0x72AC314: MPIC_Sendrecv (helper_fns.c:163) > ==18162== by 0x7218E18: MPIR_Allgather_OSU (allgather_osu.c:524) > ==18162== by 0x7217099: PMPI_Allgather (allgather.c:840) > ==18162== by 0x649FFD: PetscLayoutSetUp (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x631C32: VecCreate_MPI_Private (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x63253E: VecCreate_MPI (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x5E3279: VecSetType (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x6329CC: VecCreate_Standard (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x5E3279: VecSetType (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x93BFE7: DMCreateGlobalVector_DA (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x8C7F68: DMCreateGlobalVector (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x93B6DA: VecDuplicate_MPI_DA (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x5CF419: VecDuplicate (in > /work/01038/max/gk_0.1mm/gk_conv_50_vg/iso-convect-p) > ==18162== by 0x438F04: initializeNodalFields (nodalFields.c:35) > ==18162== by 0x434730: main (main_isotropic_convection.c:123) > > The relevant line of code is: > 33: ierr = DMCreateGlobalVector(grid->da, &nodalFields->lastT); > CHKERRQ(ierr); > 34: ierr = PetscObjectSetName((PetscObject) nodalFields->lastT, > "lastT");CHKERRQ(ierr); > 35: ierr = VecDuplicate( nodalFields->lastT, > &nodalFields->thisT);CHKERRQ(ierr); > > I am using icc with petsc-3.2 on the Intel Westmere cluster at TACC. Petsc > was compiled with debugging enabled. Thanks for your help. > > Max -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rudolph at berkeley.edu Thu Feb 2 23:42:48 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Thu, 2 Feb 2012 21:42:48 -0800 Subject: [petsc-users] Valgrind and uninitialized values Message-ID: No, I am using the version of Petsc supplied on the lonestar machine, which was compiled with their MPI build. 
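For reference, a suppressions entry for the MPI-internal report quoted above could look roughly like the following; it is written from the backtrace in the original message, the entry name is arbitrary, and the file is passed with valgrind's --suppressions option (or --gen-suppressions=all can be used to have valgrind print ready-made entries):

------
{
   mpi_smp_read_progress_uninit
   Memcheck:Cond
   fun:MPIDI_CH3U_Handle_recv_req
   fun:MPIDI_CH3I_SMP_read_progress
   fun:MPIDI_CH3I_Progress
   ...
}
------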
I will look into making a suppressions file. Thanks for your help. Max > > > I am trying to track down a memory corruption bug using valgrind, but I am > > > having to wade through lots and lots of error messages similar to this one, > > > which I believe are either spurious or related to some problem in Petsc and > > > not in my code (please correct me if I'm wrong!) > > > > Are you using ---download-mpich? It should be valgrind clean. These are a > pain, but you can make a supressions > file as well. > > Matt > From thomas.witkowski at tu-dresden.de Fri Feb 3 01:48:07 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 08:48:07 +0100 Subject: [petsc-users] Richardson with direct solver does not converge in preconditioned residual norm In-Reply-To: References: <4F2A77EA.7080906@tu-dresden.de> Message-ID: <4F2B9137.7090403@tu-dresden.de> Shouldn't be, but it seems that is is close to singular in computer arithmetic. I would like to understand we it's so. The matrix is a 2x2 block matrix with no coupling between the main blocks. I know that this does not make much sense but its for tests only and I would like to add some couplings later. Both blocks are nonsingular and easy solvable with direct solvers. But when adding both together, the condition number rise to something around 10^23. Is it only a question of scaling both matrices to the same order? Thomas Am 02.02.2012 15:32, schrieb Jed Brown: > Is this problem singular? > > On Thu, Feb 2, 2012 at 13:47, Thomas Witkowski > > wrote: > > For some tests I use richardson iterations with direct solver for > preconditioning (if everything is fine, richardson should be > replaces by preonly). > > -ksp_type richardson -pc_type lu -pc_factor_mat_solver_package > mumps -ksp_monitor -ksp_monitor_true_residual -ksp_max_it 10 > > For some matrices I see that it converges fine in true residual > norm but not in the preconditioned one: > > 0 KSP Residual norm 1.540366130785e+05 > 0 KSP preconditioned resid norm 1.540366130785e+05 true resid > norm 4.257656834616e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP Residual norm 1.355077212761e+05 > 1 KSP preconditioned resid norm 1.355077212761e+05 true resid > norm 1.468758291284e-11 ||r(i)||/||b|| 3.449686877867e-16 > 2 KSP Residual norm 3.775360693480e+05 > 2 KSP preconditioned resid norm 3.775360693480e+05 true resid > norm 5.008860262312e-12 ||r(i)||/||b|| 1.176435879376e-16 > 3 KSP Residual norm 1.714431257209e+05 > 3 KSP preconditioned resid norm 1.714431257209e+05 true resid > norm 5.365631839419e-12 ||r(i)||/||b|| 1.260231166541e-16 > 4 KSP Residual norm 7.164219897555e+04 > 4 KSP preconditioned resid norm 7.164219897555e+04 true resid > norm 5.582291603774e-12 ||r(i)||/||b|| 1.311118256030e-16 > 5 KSP Residual norm 2.480147180914e+05 > 5 KSP preconditioned resid norm 2.480147180914e+05 true resid > norm 5.464714292269e-12 ||r(i)||/||b|| 1.283502758569e-16 > 6 KSP Residual norm 1.749548383255e+05 > 6 KSP preconditioned resid norm 1.749548383255e+05 true resid > norm 6.601924132117e-12 ||r(i)||/||b|| 1.550600339239e-16 > 7 KSP Residual norm 1.873773824295e+05 > 7 KSP preconditioned resid norm 1.873773824295e+05 true resid > norm 6.368611865551e-12 ||r(i)||/||b|| 1.495802060366e-16 > 8 KSP Residual norm 2.610223461339e+05 > 8 KSP preconditioned resid norm 2.610223461339e+05 true resid > norm 8.365362648969e-12 ||r(i)||/||b|| 1.964780858090e-16 > 9 KSP Residual norm 2.459609758347e+05 > 9 KSP preconditioned resid norm 2.459609758347e+05 true resid > norm 
8.427381039077e-12 ||r(i)||/||b|| 1.979347177668e-16 > 10 KSP Residual norm 1.611793769272e+05 > 10 KSP preconditioned resid norm 1.611793769272e+05 true resid > norm 8.325158481093e-12 ||r(i)||/||b|| 1.955338066095e-16 > > > Would could be the reason for? > > Thomas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Feb 3 02:20:52 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 09:20:52 +0100 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: References: <4F2A7CDE.1010103@tu-dresden.de> Message-ID: <4F2B98E4.3070606@tu-dresden.de> I'll try it. Is there any difference between using MATMPIAIJ and MatSetBlockSize, and using MATMPIBAIJ? Thomas Am 02.02.2012 14:43, schrieb Mark F. Adams: > Use MatSetBlockSize(mat,ndof); and that info will get passed down to HYPRE. > Mark > > On Feb 2, 2012, at 7:09 AM, Thomas Witkowski wrote: > >> The documentation of boomeramg mention that it's possible to solve also matrices arising from the discretization of system of PDEs. But there is no more information on it. What should I do to make use of it in PETSc? >> >> Thomas >> From jedbrown at mcs.anl.gov Fri Feb 3 03:54:44 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 3 Feb 2012 12:54:44 +0300 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: <4F2B98E4.3070606@tu-dresden.de> References: <4F2A7CDE.1010103@tu-dresden.de> <4F2B98E4.3070606@tu-dresden.de> Message-ID: On Fri, Feb 3, 2012 at 11:20, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > I'll try it. Is there any difference between using MATMPIAIJ and > MatSetBlockSize, and using MATMPIBAIJ? There should not be a semantic difference. BAIJ is more memory and time efficient for any kernels used by PETSc, but it cannot be used directly by BoomerAMG, so it needs to be converted (increasing storage). So depending on how much time you spend in different parts of code, and how much memory you have available, either one could be faster. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 3 03:57:21 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 3 Feb 2012 12:57:21 +0300 Subject: [petsc-users] Richardson with direct solver does not converge in preconditioned residual norm In-Reply-To: <4F2B9137.7090403@tu-dresden.de> References: <4F2A77EA.7080906@tu-dresden.de> <4F2B9137.7090403@tu-dresden.de> Message-ID: On Fri, Feb 3, 2012 at 10:48, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Shouldn't be, but it seems that is is close to singular in computer > arithmetic. I would like to understand we it's so. The matrix is a 2x2 > block matrix with no coupling between the main blocks. I know that this > does not make much sense but its for tests only and I would like to add > some couplings later. Both blocks are nonsingular and easy solvable with > direct solvers. But when adding both together, the condition number rise to > something around 10^23. Is it only a question of scaling both matrices to > the same order? If it's *very* poorly scaled, then yes, it could be. You can try to correct it with -ksp_diagonal_scale -ksp_diagonal_scale_fix. It seems more likely to me that it's a null space issue. How many near-zero eigenvalues are there? Perhaps you effectively have an all-Neumann boundary condition (e.g. 
incompressible flow with all Dirichlet velocity boundary conditions leaves the pressure undetermined up to a constant). -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Feb 3 05:22:49 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 12:22:49 +0100 Subject: [petsc-users] Richardson with direct solver does not converge in preconditioned residual norm In-Reply-To: References: <4F2A77EA.7080906@tu-dresden.de> <4F2B9137.7090403@tu-dresden.de> Message-ID: <4F2BC389.8000201@tu-dresden.de> Yes, you are right, it's an issue with the null space due to boundary conditions. Thomas Am 03.02.2012 10:57, schrieb Jed Brown: > On Fri, Feb 3, 2012 at 10:48, Thomas Witkowski > > wrote: > > Shouldn't be, but it seems that is is close to singular in > computer arithmetic. I would like to understand we it's so. The > matrix is a 2x2 block matrix with no coupling between the main > blocks. I know that this does not make much sense but its for > tests only and I would like to add some couplings later. Both > blocks are nonsingular and easy solvable with direct solvers. But > when adding both together, the condition number rise to something > around 10^23. Is it only a question of scaling both matrices to > the same order? > > > If it's *very* poorly scaled, then yes, it could be. You can try to > correct it with -ksp_diagonal_scale -ksp_diagonal_scale_fix. > > It seems more likely to me that it's a null space issue. How many > near-zero eigenvalues are there? Perhaps you effectively have an > all-Neumann boundary condition (e.g. incompressible flow with all > Dirichlet velocity boundary conditions leaves the pressure > undetermined up to a constant). -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Feb 3 06:34:33 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 13:34:33 +0100 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: References: <4F2A7CDE.1010103@tu-dresden.de> Message-ID: <4F2BD459.3060606@tu-dresden.de> I did it, but gmres with boomeramg diverges. The system has three unknowns per mesh node. Each block operator is either a Laplace or the mass matrix. So each block by-itself is solvable with amg. Thus it follows that the overall system is solvable? In my case the system is not symmetric and indefinite. The boundary conditions are Neuman everywhere, but the global matrix has an empty null space. As the local blocks (in the case of the discrete Laplace) have constant null space I set -pc_hypre_boomeramg_relax_type_coarse Jacobi for boomeramg not to make direct solves on coarse grid. Is there any theoretical reason that AMG cannot work in this case or is it a question of just the right settings for the solver? Thomas Am 02.02.2012 14:43, schrieb Mark F. Adams: > Use MatSetBlockSize(mat,ndof); and that info will get passed down to HYPRE. > Mark > > On Feb 2, 2012, at 7:09 AM, Thomas Witkowski wrote: > >> The documentation of boomeramg mention that it's possible to solve also matrices arising from the discretization of system of PDEs. But there is no more information on it. What should I do to make use of it in PETSc? 
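As a concrete illustration of the MatSetBlockSize() suggestion quoted above, here is a minimal sketch. Only MatSetBlockSize() and the -pc_hypre_boomeramg_relax_type_coarse option come from this thread; the block-diagonal stand-in assembly, the sizes, and the solver options listed in the comment are illustrative assumptions, not the poster's actual code.

#include <petscksp.h>

/* Sketch only: a trivial stand-in assembly so the example runs. The point is
   MatSetBlockSize(), which is how the "ndof unknowns per node" information
   reaches BoomerAMG when the matrix is stored node-wise as (MPI)AIJ. */
int main(int argc,char **argv)
{
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PetscInt       i,ndof = 3,nnodes_local = 100,nlocal,rstart,rend;
  PetscScalar    one = 1.0;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
  nlocal = ndof*nnodes_local;              /* node-wise ordering, 3 dofs per node */
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetType(A,MATAIJ);CHKERRQ(ierr);
  ierr = MatSetBlockSize(A,ndof);CHKERRQ(ierr);     /* passed down to hypre */
  ierr = MatSeqAIJSetPreallocation(A,1,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A,1,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
  for (i=rstart; i<rend; i++) {                     /* stand-in: identity fill */
    ierr = MatSetValues(A,1,&i,1,&i,&one,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatGetVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecSet(b,one);CHKERRQ(ierr);
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  /* run-time options as discussed in the thread, e.g.
     -ksp_type gmres -pc_type hypre -pc_hypre_type boomeramg
     -pc_hypre_boomeramg_relax_type_coarse Jacobi -ksp_monitor_true_residual */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}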
>> >> Thomas >> From jedbrown at mcs.anl.gov Fri Feb 3 06:43:34 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 3 Feb 2012 15:43:34 +0300 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: <4F2BD459.3060606@tu-dresden.de> References: <4F2A7CDE.1010103@tu-dresden.de> <4F2BD459.3060606@tu-dresden.de> Message-ID: On Fri, Feb 3, 2012 at 15:34, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > I did it, but gmres with boomeramg diverges. The system has three unknowns > per mesh node. Each block operator is either a Laplace or the mass matrix. > So each block by-itself is solvable with amg. Thus it follows that the > overall system is solvable? In my case the system is not symmetric and > indefinite. The boundary conditions are Neuman everywhere, but the global > matrix has an empty null space. As the local blocks (in the case of the > discrete Laplace) have constant null space I set -pc_hypre_boomeramg_relax_ > **type_coarse Jacobi for boomeramg not to make direct solves on coarse > grid. Is there any theoretical reason that AMG cannot work in this case or > is it a question of just the right settings for the solver? > How did you order dofs? How are the blocks coupled? AMG is more delicate and generally less robust for systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Feb 3 07:03:20 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 14:03:20 +0100 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: References: <4F2A7CDE.1010103@tu-dresden.de> <4F2BD459.3060606@tu-dresden.de> Message-ID: <4F2BDB18.80702@tu-dresden.de> Am 03.02.2012 13:43, schrieb Jed Brown: > How did you order dofs? Node-wise, thus the first three unknows are the three dofs of the first mesh node, and so on. > > How are the blocks coupled? The system writes L 0 M M L 0 L M L With L = discrete laplace, M = mass matrix, 0 = empty matrix > > AMG is more delicate and generally less robust for systems. Is this different with geometric multigrid? Thomas From jedbrown at mcs.anl.gov Fri Feb 3 07:11:07 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 3 Feb 2012 16:11:07 +0300 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: <4F2BDB18.80702@tu-dresden.de> References: <4F2A7CDE.1010103@tu-dresden.de> <4F2BD459.3060606@tu-dresden.de> <4F2BDB18.80702@tu-dresden.de> Message-ID: On Fri, Feb 3, 2012 at 16:03, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > The system writes > > L 0 M > M L 0 > L M L > > With L = discrete laplace, M = mass matrix, 0 = empty matrix > Hmm, what are the relative scales of these equations? > > >> AMG is more delicate and generally less robust for systems. >> > Is this different with geometric multigrid? > Null space issues are usually easier with geometric, but constructing low energy interpolants can require custom work. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thomas.witkowski at tu-dresden.de Fri Feb 3 07:19:06 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 14:19:06 +0100 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: References: <4F2A7CDE.1010103@tu-dresden.de> <4F2BD459.3060606@tu-dresden.de> <4F2BDB18.80702@tu-dresden.de> Message-ID: <4F2BDECA.9030307@tu-dresden.de> Am 03.02.2012 14:11, schrieb Jed Brown: > On Fri, Feb 3, 2012 at 16:03, Thomas Witkowski > > wrote: > > The system writes > > L 0 M > M L 0 > L M L > > With L = discrete laplace, M = mass matrix, 0 = empty matrix > > > Hmm, what are the relative scales of these equations? > Just for the test I remove all scalings (so no timestep-sizes or other constants are multiplied). L and M are the standard matrices discretized with linear FEMs on a 2D box (no local mesh adaptivity). Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Feb 3 07:25:11 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 03 Feb 2012 14:25:11 +0100 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: References: <4F2A7CDE.1010103@tu-dresden.de> <4F2BD459.3060606@tu-dresden.de> <4F2BDB18.80702@tu-dresden.de> Message-ID: <4F2BE037.2040802@tu-dresden.de> Is there (good) literature that provides more information about solving systems of PDE with geometric/algebraic multigrid? Am 03.02.2012 14:11, schrieb Jed Brown: > On Fri, Feb 3, 2012 at 16:03, Thomas Witkowski > > wrote: > > The system writes > > L 0 M > M L 0 > L M L > > With L = discrete laplace, M = mass matrix, 0 = empty matrix > > > Hmm, what are the relative scales of these equations? > > > > AMG is more delicate and generally less robust for systems. > > Is this different with geometric multigrid? > > > Null space issues are usually easier with geometric, but constructing > low energy interpolants can require custom work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Fri Feb 3 08:07:37 2012 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 3 Feb 2012 14:07:37 +0000 Subject: [petsc-users] access to matnest block (0,1) ? Message-ID: > I should add a MatNestGetISs() so you can get out the automatically-created > ISs. They will have the structure described below. If you want to work with > the released version, you should create the ISs yourself. As a user, having MatNestGetISs would be most convenient. > A has a contigous distribution, so the ISs must respect that. Did you try > creating the index sets described above? Please explain "something's wrong". Finally I understand the relation between the nesting and the index sets. After replacing my incorrect ISs with the correct ones, everything's fine (no more [0]PETSC ERROR: Arguments are incompatible! [0]PETSC ERROR: Could not find index set!). Thanks a lot for your help, Jed, I really appreciate it. dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From balay at mcs.anl.gov Fri Feb 3 08:53:10 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 3 Feb 2012 08:53:10 -0600 (CST) Subject: [petsc-users] Valgrind and uninitialized values In-Reply-To: References: Message-ID: Its easier to install a seprate petsc with --download-mpich - for valgrind purposes.. 
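A sketch of that workflow, with made-up arch, executable, and file names: either configure a separate debug PETSc against MPICH for valgrind runs, or keep the system Open MPI and collect its known-spurious reports into a suppressions file once.

./configure PETSC_ARCH=linux-debug-mpich --with-debugging=1 --download-mpich

# or, staying with the existing Open MPI, record its spurious reports once:
mpiexec -n 2 valgrind --leak-check=full --gen-suppressions=all ./myapp 2> valgrind.log
# copy the suppression blocks from valgrind.log into openmpi.supp, then reuse them:
mpiexec -n 2 valgrind --leak-check=full --suppressions=openmpi.supp ./myapp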
Satish On Thu, 2 Feb 2012, Max Rudolph wrote: > No, I am using the version of Petsc supplied on the lonestar machine, which was compiled with their MPI build. I will look into making a suppressions file. Thanks for your help. > > Max > > > > > > I am trying to track down a memory corruption bug using valgrind, but I am > > > > > having to wade through lots and lots of error messages similar to this one, > > > > > which I believe are either spurious or related to some problem in Petsc and > > > > > not in my code (please correct me if I'm wrong!) > > > > > > > Are you using ---download-mpich? It should be valgrind clean. These are a > > pain, but you can make a supressions > > file as well. > > > > Matt > > > > From mark.adams at columbia.edu Fri Feb 3 09:09:01 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Fri, 3 Feb 2012 10:09:01 -0500 Subject: [petsc-users] Use boomeramg to solve system of PDEs In-Reply-To: <4F2BE037.2040802@tu-dresden.de> References: <4F2A7CDE.1010103@tu-dresden.de> <4F2BD459.3060606@tu-dresden.de> <4F2BDB18.80702@tu-dresden.de> <4F2BE037.2040802@tu-dresden.de> Message-ID: <552DBF40-0060-4A2F-B2C4-E43C5686914F@columbia.edu> I recommend the Trottenberg "Multrigrid" book. This will not work unscaled, the off diagonal terms as large as the diagonal, its is very unsymmetric. The time term on the diagonal might save you but that is fragile. I'm sure someone has worked through how to attack this as a block system but I can see that its almost lower triangular so block Gauss-Seidel iterations looks like a good place to start. Mark On Feb 3, 2012, at 8:25 AM, Thomas Witkowski wrote: > Is there (good) literature that provides more information about solving systems of PDE with geometric/algebraic multigrid? > > Am 03.02.2012 14:11, schrieb Jed Brown: >> >> On Fri, Feb 3, 2012 at 16:03, Thomas Witkowski wrote: >> The system writes >> >> L 0 M >> M L 0 >> L M L >> >> With L = discrete laplace, M = mass matrix, 0 = empty matrix >> >> Hmm, what are the relative scales of these equations? >> >> >> >> AMG is more delicate and generally less robust for systems. >> Is this different with geometric multigrid? >> >> Null space issues are usually easier with geometric, but constructing low energy interpolants can require custom work. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From margarita.satraki at gmail.com Fri Feb 3 16:35:35 2012 From: margarita.satraki at gmail.com (Margarita Satraki) Date: Fri, 3 Feb 2012 22:35:35 +0000 Subject: [petsc-users] efficiency of MatSetValues Message-ID: Hello, I'm a bit confused regarding the efficiency of MatSetValues in Petsc and I was hoping for some help. I'm using Petsc 3.2-p6 on Fedora11. I'm attaching a simple code that displays the time needed for some simple cases using MatSetValues. 1st case: The matrix is composed by copying a different matrix and then MatSetValues is used to insert elements at the same nonzero pattern (diagonal) as the initial matrix (with defining the number of nonzeros per row). 2nd case: The matrix is composed by copying a different matrix and then MatSetValues is used to insert elements at a different nonzero pattern than the initial matrix (diagonal-1st off diagonal) (with defining the number of nonzeros per row). 3rd case: Same as 2nd but without defining the number of nonzeros per row. It seems that only the 1st case gives good results in the sense that by increasing the size of the matrix you increase the time needed by MatSetValues linearly. 
Both the 2nd and the 3rd case give similar results, much worse than the 1st. I understand that the 1st case has the advantage because of accurate memory allocation but shouldn't the 2ndcase be better than the 3rd since it at least defines the number of nonzeros per row so it again allocates memory more accurately? Many thanks for any help, Margarita /////////////////////////////////////////////////////// DPhil candidate University of Oxford -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: time.cpp Type: text/x-c++src Size: 3665 bytes Desc: not available URL: From margarita.satraki at gmail.com Fri Feb 3 16:41:12 2012 From: margarita.satraki at gmail.com (Margarita Satraki) Date: Fri, 3 Feb 2012 22:41:12 +0000 Subject: [petsc-users] efficiency of MatSetValues Message-ID: Hello, I'm a bit confused regarding the efficiency of MatSetValues in Petsc and I was hoping for some help. I'm using Petsc 3.2-p6 on Fedora11. I'm attaching a simple code that displays the time needed for some simple cases using MatSetValues. 1st case: The matrix is composed by copying a different matrix and then MatSetValues is used to insert elements at the same nonzero pattern (diagonal) as the initial matrix (with defining the number of nonzeros per row). 2nd case: The matrix is composed by copying a different matrix and then MatSetValues is used to insert elements at a different nonzero pattern than the initial matrix (diagonal-1st off diagonal) (with defining the number of nonzeros per row). 3rd case: Same as 2nd but without defining the number of nonzeros per row. It seems that only the 1st case gives good results in the sense that by increasing the size of the matrix you increase the time needed by MatSetValues linearly. Both the 2nd and the 3rd case give similar results, much worse than the 1st. I understand that the 1st case has the advantage because of accurate memory allocation but shouldn't the 2ndcase be better than the 3rd since it at least defines the number of nonzeros per row so it again allocates memory more accurately? Many thanks for any help, Margarita /////////////////////////////////////////////////////// DPhil candidate University of Oxford -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: time.cpp Type: text/x-c++src Size: 3665 bytes Desc: not available URL: From jedbrown at mcs.anl.gov Fri Feb 3 16:41:39 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 4 Feb 2012 01:41:39 +0300 Subject: [petsc-users] efficiency of MatSetValues In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 01:35, Margarita Satraki wrote: > It seems that only the 1st case gives good results in the sense that by > increasing the size of the matrix you increase the time needed by > MatSetValues linearly. Both the 2nd and the 3rd case give similar results, > much worse than the 1st. I understand that the 1st case has the advantage > because of accurate memory allocation but shouldn't the 2ndcase be better > than the 3rd since it at least defines the number of nonzeros per row so it > again allocates memory more accurately? Those nonzeros are in the wrong place and PETSc does not know that you want to "delete" the old entries. Just preallocate the correct number of nonzeros and it will be fast, don't bother with copying in a "similar" matrix. 
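To make the advice above concrete, here is a minimal sketch of the preallocate-then-insert pattern for the diagonal-plus-first-off-diagonal fill from the example; the size n and the inserted values are illustrative assumptions, not taken from the attached time.cpp.

#include <petscmat.h>

int main(int argc,char **argv)
{
  Mat            A;
  PetscInt       i,n = 1000,rstart,rend,ncols,cols[2];
  PetscScalar    vals[2] = {1.0,1.0};
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetType(A,MATAIJ);CHKERRQ(ierr);
  /* Preallocate the pattern that will actually be filled (diagonal plus
     first off-diagonal, so at most 2 nonzeros per row); no copying of a
     "similar" matrix is needed. The unused call is a no-op. */
  ierr = MatSeqAIJSetPreallocation(A,2,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A,2,PETSC_NULL,1,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
  for (i=rstart; i<rend; i++) {
    ncols = 0;
    cols[ncols++] = i;                    /* diagonal */
    if (i < n-1) cols[ncols++] = i+1;     /* first off-diagonal */
    /* every insertion lands in a preallocated slot, so this stays fast */
    ierr = MatSetValues(A,1,&i,ncols,cols,vals,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}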
-------------- next part -------------- An HTML attachment was scrubbed... URL: From margarita.satraki at gmail.com Fri Feb 3 16:46:54 2012 From: margarita.satraki at gmail.com (Margarita Satraki) Date: Fri, 3 Feb 2012 22:46:54 +0000 Subject: [petsc-users] efficiency of MatSetValues In-Reply-To: References: Message-ID: Hi Jed, Thanks for the reply. I've defined 2 nonzeros per row so it should consider them to be in the correct place (1 for diagonal and 1 for the 1st off diagonal). I do not want to delete anything, just experiment with inserting new entries. This is a simple example to demonstrate my problem with a more complicated code of nonlinear elasticity. Margarita On 3 February 2012 22:41, Jed Brown wrote: > On Sat, Feb 4, 2012 at 01:35, Margarita Satraki < > margarita.satraki at gmail.com> wrote: > >> It seems that only the 1st case gives good results in the sense that by >> increasing the size of the matrix you increase the time needed by >> MatSetValues linearly. Both the 2nd and the 3rd case give similar results, >> much worse than the 1st. I understand that the 1st case has the advantage >> because of accurate memory allocation but shouldn't the 2ndcase be better >> than the 3rd since it at least defines the number of nonzeros per row so it >> again allocates memory more accurately? > > > Those nonzeros are in the wrong place and PETSc does not know that you > want to "delete" the old entries. > > Just preallocate the correct number of nonzeros and it will be fast, don't > bother with copying in a "similar" matrix. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vkuhlem at emory.edu Fri Feb 3 16:47:28 2012 From: vkuhlem at emory.edu (Kuhlemann, Verena) Date: Fri, 3 Feb 2012 22:47:28 +0000 Subject: [petsc-users] A^TA Message-ID: Hi, one quick question: I have a matrix P and need to compute P^t*P. Is there a function available for this? I couldn't find anything. Right now I am computing P^t*P by using MatPtAP with A=identity, but I was hoping there is another way where I don't need to set up the identity matrix. Thanks, Verena ________________________________ This e-mail message (including any attachments) is for the sole use of the intended recipient(s) and may contain confidential and privileged information. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message (including any attachments) is strictly prohibited. If you have received this message in error, please contact the sender by reply e-mail message and destroy all copies of the original message (including attachments). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 3 16:54:18 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 4 Feb 2012 01:54:18 +0300 Subject: [petsc-users] A^TA In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 01:47, Kuhlemann, Verena wrote: > Hi, > > one quick question: I have a matrix P and need to compute P^t*P. > Is there a function available for this? I couldn't find anything. > Right now I am computing P^t*P by using MatPtAP with A=identity, > but I was hoping there is another way where I don't need to set up > the identity matrix. > Use petsc-dev and MatTransposeMatMult(P,P,...,&PtP) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vkuhlem at emory.edu Fri Feb 3 16:58:21 2012 From: vkuhlem at emory.edu (Kuhlemann, Verena) Date: Fri, 3 Feb 2012 22:58:21 +0000 Subject: [petsc-users] A^TA In-Reply-To: References: , Message-ID: Thanks, I'll try that. ________________________________ From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Jed Brown [jedbrown at mcs.anl.gov] Sent: Friday, February 03, 2012 5:54 PM To: PETSc users list Subject: Re: [petsc-users] A^TA On Sat, Feb 4, 2012 at 01:47, Kuhlemann, Verena > wrote: Hi, one quick question: I have a matrix P and need to compute P^t*P. Is there a function available for this? I couldn't find anything. Right now I am computing P^t*P by using MatPtAP with A=identity, but I was hoping there is another way where I don't need to set up the identity matrix. Use petsc-dev and MatTransposeMatMult(P,P,...,&PtP) ________________________________ This e-mail message (including any attachments) is for the sole use of the intended recipient(s) and may contain confidential and privileged information. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message (including any attachments) is strictly prohibited. If you have received this message in error, please contact the sender by reply e-mail message and destroy all copies of the original message (including attachments). -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 3 16:58:45 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Feb 2012 16:58:45 -0600 Subject: [petsc-users] efficiency of MatSetValues In-Reply-To: References: Message-ID: On Fri, Feb 3, 2012 at 4:46 PM, Margarita Satraki < margarita.satraki at gmail.com> wrote: > Hi Jed, > > Thanks for the reply. > I've defined 2 nonzeros per row so it should consider them to be in the > correct place (1 for diagonal and 1 for the 1st off diagonal). I do not > want to delete anything, just experiment with inserting new entries. This > is a simple example to demonstrate my problem with a more complicated code > of nonlinear elasticity. Assembly of a matrix compresses it, throwing away extra allocated places that were not used. Matt > Margarita > > > On 3 February 2012 22:41, Jed Brown wrote: > >> On Sat, Feb 4, 2012 at 01:35, Margarita Satraki < >> margarita.satraki at gmail.com> wrote: >> >>> It seems that only the 1st case gives good results in the sense that by >>> increasing the size of the matrix you increase the time needed by >>> MatSetValues linearly. Both the 2nd and the 3rd case give similar results, >>> much worse than the 1st. I understand that the 1st case has the advantage >>> because of accurate memory allocation but shouldn't the 2ndcase be better >>> than the 3rd since it at least defines the number of nonzeros per row so it >>> again allocates memory more accurately? >> >> >> Those nonzeros are in the wrong place and PETSc does not know that you >> want to "delete" the old entries. >> >> Just preallocate the correct number of nonzeros and it will be fast, >> don't bother with copying in a "similar" matrix. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From margarita.satraki at gmail.com Fri Feb 3 17:05:27 2012 From: margarita.satraki at gmail.com (Margarita Satraki) Date: Fri, 3 Feb 2012 23:05:27 +0000 Subject: [petsc-users] efficiency of MatSetValues In-Reply-To: References: Message-ID: Does the same happen with MatDuplicate? It overwrites the MatSeqAIJSetPreallocation? In this case, can I redefine the nonzero places? I need it for the case of creating the preconditioner by changing the jacobian slightly. Margarita On 3 February 2012 22:58, Matthew Knepley wrote: > On Fri, Feb 3, 2012 at 4:46 PM, Margarita Satraki < > margarita.satraki at gmail.com> wrote: > >> Hi Jed, >> >> Thanks for the reply. >> I've defined 2 nonzeros per row so it should consider them to be in the >> correct place (1 for diagonal and 1 for the 1st off diagonal). I do not >> want to delete anything, just experiment with inserting new entries. This >> is a simple example to demonstrate my problem with a more complicated code >> of nonlinear elasticity. > > > Assembly of a matrix compresses it, throwing away extra allocated places > that were not used. > > Matt > > > >> Margarita >> >> >> On 3 February 2012 22:41, Jed Brown wrote: >> >>> On Sat, Feb 4, 2012 at 01:35, Margarita Satraki < >>> margarita.satraki at gmail.com> wrote: >>> >>>> It seems that only the 1st case gives good results in the sense that by >>>> increasing the size of the matrix you increase the time needed by >>>> MatSetValues linearly. Both the 2nd and the 3rd case give similar results, >>>> much worse than the 1st. I understand that the 1st case has the advantage >>>> because of accurate memory allocation but shouldn't the 2ndcase be better >>>> than the 3rd since it at least defines the number of nonzeros per row so it >>>> again allocates memory more accurately? >>> >>> >>> Those nonzeros are in the wrong place and PETSc does not know that you >>> want to "delete" the old entries. >>> >>> Just preallocate the correct number of nonzeros and it will be fast, >>> don't bother with copying in a "similar" matrix. >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 3 17:27:40 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Feb 2012 17:27:40 -0600 Subject: [petsc-users] efficiency of MatSetValues In-Reply-To: References: Message-ID: On Fri, Feb 3, 2012 at 5:05 PM, Margarita Satraki < margarita.satraki at gmail.com> wrote: > Does the same happen with MatDuplicate? It overwrites the > MatSeqAIJSetPreallocation? > Yes. > In this case, can I redefine the nonzero places? > No. > I need it for the case of creating the preconditioner by changing the > jacobian slightly. > The right way to do this is just fully allocate the preconditioner. We have benchmarked this hundreds of times, and it is negligible. Thanks, Matt > Margarita > > On 3 February 2012 22:58, Matthew Knepley wrote: > >> On Fri, Feb 3, 2012 at 4:46 PM, Margarita Satraki < >> margarita.satraki at gmail.com> wrote: >> >>> Hi Jed, >>> >>> Thanks for the reply. >>> I've defined 2 nonzeros per row so it should consider them to be in the >>> correct place (1 for diagonal and 1 for the 1st off diagonal). I do not >>> want to delete anything, just experiment with inserting new entries. 
This >>> is a simple example to demonstrate my problem with a more complicated code >>> of nonlinear elasticity. >> >> >> Assembly of a matrix compresses it, throwing away extra allocated places >> that were not used. >> >> Matt >> >> >> >>> Margarita >>> >>> >>> On 3 February 2012 22:41, Jed Brown wrote: >>> >>>> On Sat, Feb 4, 2012 at 01:35, Margarita Satraki < >>>> margarita.satraki at gmail.com> wrote: >>>> >>>>> It seems that only the 1st case gives good results in the sense that >>>>> by increasing the size of the matrix you increase the time needed by >>>>> MatSetValues linearly. Both the 2nd and the 3rd case give similar results, >>>>> much worse than the 1st. I understand that the 1st case has the advantage >>>>> because of accurate memory allocation but shouldn't the 2ndcase be better >>>>> than the 3rd since it at least defines the number of nonzeros per row so it >>>>> again allocates memory more accurately? >>>> >>>> >>>> Those nonzeros are in the wrong place and PETSc does not know that you >>>> want to "delete" the old entries. >>>> >>>> Just preallocate the correct number of nonzeros and it will be fast, >>>> don't bother with copying in a "similar" matrix. >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From margarita.satraki at gmail.com Fri Feb 3 17:32:28 2012 From: margarita.satraki at gmail.com (Margarita Satraki) Date: Fri, 3 Feb 2012 23:32:28 +0000 Subject: [petsc-users] efficiency of MatSetValues In-Reply-To: References: Message-ID: Great, thanks for your help! Margarita On 3 February 2012 23:27, Matthew Knepley wrote: > On Fri, Feb 3, 2012 at 5:05 PM, Margarita Satraki < > margarita.satraki at gmail.com> wrote: > >> Does the same happen with MatDuplicate? It overwrites the >> MatSeqAIJSetPreallocation? >> > > Yes. > > >> In this case, can I redefine the nonzero places? >> > > No. > > >> I need it for the case of creating the preconditioner by changing the >> jacobian slightly. >> > > The right way to do this is just fully allocate the preconditioner. We > have benchmarked > this hundreds of times, and it is negligible. > > Thanks, > > Matt > > >> Margarita >> >> On 3 February 2012 22:58, Matthew Knepley wrote: >> >>> On Fri, Feb 3, 2012 at 4:46 PM, Margarita Satraki < >>> margarita.satraki at gmail.com> wrote: >>> >>>> Hi Jed, >>>> >>>> Thanks for the reply. >>>> I've defined 2 nonzeros per row so it should consider them to be in the >>>> correct place (1 for diagonal and 1 for the 1st off diagonal). I do not >>>> want to delete anything, just experiment with inserting new entries. This >>>> is a simple example to demonstrate my problem with a more complicated code >>>> of nonlinear elasticity. >>> >>> >>> Assembly of a matrix compresses it, throwing away extra allocated places >>> that were not used. 
>>> >>> Matt >>> >>> >>> >>>> Margarita >>>> >>>> >>>> On 3 February 2012 22:41, Jed Brown wrote: >>>> >>>>> On Sat, Feb 4, 2012 at 01:35, Margarita Satraki < >>>>> margarita.satraki at gmail.com> wrote: >>>>> >>>>>> It seems that only the 1st case gives good results in the sense that >>>>>> by increasing the size of the matrix you increase the time needed by >>>>>> MatSetValues linearly. Both the 2nd and the 3rd case give similar results, >>>>>> much worse than the 1st. I understand that the 1st case has the advantage >>>>>> because of accurate memory allocation but shouldn't the 2ndcase be better >>>>>> than the 3rd since it at least defines the number of nonzeros per row so it >>>>>> again allocates memory more accurately? >>>>> >>>>> >>>>> Those nonzeros are in the wrong place and PETSc does not know that you >>>>> want to "delete" the old entries. >>>>> >>>>> Just preallocate the correct number of nonzeros and it will be fast, >>>>> don't bother with copying in a "similar" matrix. >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredva at ifi.uio.no Mon Feb 6 08:33:23 2012 From: fredva at ifi.uio.no (Fredrik Heffer Valdmanis) Date: Mon, 6 Feb 2012 15:33:23 +0100 Subject: [petsc-users] VECMPICUSP with ghosted vector Message-ID: Hi, In FEniCS, we use ghosted vectors for parallel computations, with the functions VecCreateGhost VecGhostGetLocalForm As I am integrating the new GPU-based vectors and matrices in FEniCS, I want the ghosted vectors to be of type VECMPICUSP. I have tried to do this by calling VecSetType after creating the vector, but that makes VecGhostGetLocalForm give an error. Is it possible to set the type to be mpicusp when using ghost vectors, without changing much of the code? If so, how? If not, how would you recommend I proceed to work with mpicusp vectors in this context? Thanks! -- Fredrik -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Feb 6 11:09:06 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 6 Feb 2012 11:09:06 -0600 Subject: [petsc-users] VECMPICUSP with ghosted vector In-Reply-To: References: Message-ID: <13AEC6C8-F05D-4275-83A3-03DBDE6A5146@mcs.anl.gov> Fredrik, This question belongs on petsc-dev at mcs.anl.gov since it involves additions/extensions to PETSc so I am moving the discussion over to there. We have not done the required work to have ghosted vectors work with CUSP yet, so this will require some additions to PETSc. We can help you with that process but since the PETSc team does not have a CUSP person developing PETSc full time you will need to actual contribute some code but I'll try to guide you in the right direction. The first observation is that ghosted vectors in PETSc are actually handled with largely the same code as VECMPI vectors (with just no ghost points by default) so in theory little work needs to be done to get the functionality you need. What makes the needed changes non-trivial is the current interface where one calls VecCreateGhost() to create the vectors. 
This is one of our "easy" interfaces and it is somewhat legacy in that there is no way to control the types of the vectors since it creates everything about the vector in one step. Note that we have the same issues with regard to the pthread versions of the PETSc vectors and ghosting. So before we even talk about what code to change/add we need to decide on the interface. Presumably you want to be able to decide at runtime whether to use regular VECMPI, VECMPICUSP or VECMPIPTHREAD in your ghosted vectors. How do we get that information in there? An additional argument to VecCreateGhost() (ugly?)? Options database (by calling VecSetFromOptions() ?), other ways? So for example one could have: VecCreateGhost(......) VecSetFromOptions(......) to set the specific type cusp or pthread? What about VecCreateGhost(......) VecSetType(......,VECMPICUSP); which as you note doesn't currently work. Note that the PTHREAD version needs to do its own memory allocation so essentially has to undo much of what VecCreateGhost() already did, is that a bad thing? Or do we get rid of VecCreateGhost() completely and change the model to something like VecCreate() VecSetType() VecSetGhosted() or VecCreate() VecSetTypeFromOptions() VecSetGhosted() or even VecCreate() VecSetGhosted() which will just default to regular MPI ghosted. this model allows a clean implementation that doesn't require undoing previously built internals. Everyone chime in with observations so we can figure out any refactorizations needed. Barry On Feb 6, 2012, at 8:33 AM, Fredrik Heffer Valdmanis wrote: > Hi, > > In FEniCS, we use ghosted vectors for parallel computations, with the functions > > VecCreateGhost > VecGhostGetLocalForm > > As I am integrating the new GPU-based vectors and matrices in FEniCS, I want the ghosted vectors to be of type VECMPICUSP. I have tried to do this by calling VecSetType after creating the vector, but that makes VecGhostGetLocalForm give an error. > > Is it possible to set the type to be mpicusp when using ghost vectors, without changing much of the code? If so, how? > > If not, how would you recommend I proceed to work with mpicusp vectors in this context? > > Thanks! > > -- > Fredrik From friedmud at gmail.com Mon Feb 6 23:06:29 2012 From: friedmud at gmail.com (Derek Gaston) Date: Mon, 6 Feb 2012 22:06:29 -0700 Subject: [petsc-users] Hang at PetscLayoutSetUp() Message-ID: Hello all, I'm running some largish finite element calculations at the moment (50 Million to 400 Million DoFs on up to 10,000 processors) using a code based on PETSc (obviously!) and while most of the simulations are working well, every now again I seem to run into a hang in the setup phase of the simulation. I've attached GDB several times and it seems to alway be hanging in PetscLayoutSetUp() during matrix creation. 
Here is the top of a stack trace showing what I mean: #0 0x00002aac9d86cef2 in opal_progress () from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libopen-pal.so.0 #1 0x00002aac9d16a0c4 in ompi_request_default_wait_all () from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 #2 0x00002aac9d1da9ee in ompi_coll_tuned_sendrecv_actual () from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 #3 0x00002aac9d1e2716 in ompi_coll_tuned_allgather_intra_bruck () from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 #4 0x00002aac9d1db439 in ompi_coll_tuned_allgather_intra_dec_fixed () from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 #5 0x00002aac9d1827e6 in PMPI_Allgather () from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 #6 0x0000000000508184 in PetscLayoutSetUp () #7 0x00000000005b9f39 in MatMPIAIJSetPreallocation_MPIAIJ () #8 0x00000000005c1317 in MatCreateMPIAIJ () As you can see, I'm currently using openMPI (even though I do have access to others) along with the intel compiler (this is a mostly C++ code). This problem doesn't exhibit itself on any smaller problems (we run TONS of runs all the time in the 10,000-5,000,000 DoF range on 1-3000 procs) and only seems to come up on these larger runs. I'm starting to suspect that it's an openMPI issue. Has anyone seen anything like this before? Here are some specs for my current environment PETSc 3.1-p8 (I know, I know....) OpenMPI 1.4.4 intel compilers 12.1.1 Modified Redhat with 2.6.18 Kernel QDR Infiniband Thanks for any help! Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 6 23:27:05 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 7 Feb 2012 08:27:05 +0300 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 08:06, Derek Gaston wrote: > Hello all, > > I'm running some largish finite element calculations at the moment (50 > Million to 400 Million DoFs on up to 10,000 processors) using a code based > on PETSc (obviously!) and while most of the simulations are working well, > every now again I seem to run into a hang in the setup phase of the > simulation. > > I've attached GDB several times and it seems to alway be hanging > in PetscLayoutSetUp() during matrix creation. Here is the top of a stack > trace showing what I mean: > > #0 0x00002aac9d86cef2 in opal_progress () from > /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libopen-pl.so.0 > #1 0x00002aac9d16a0c4 in ompi_request_default_wait_all () from > /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 > #2 0x00002aac9d1da9ee in ompi_coll_tuned_sendrecv_actual () from > /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 > #3 0x00002aac9d1e2716 in ompi_coll_tuned_allgather_intra_bruck () from > /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 > #4 0x00002aac9d1db439 in ompi_coll_tuned_allgather_intra_dec_fixed () > from /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 > #5 0x00002aac9d1827e6 in PMPI_Allgather () from > /apps/local/openmpi/1.4.4/intel-12.1.1/opt/lib/libmpi.so.0 > #6 0x0000000000508184 in PetscLayoutSetUp () > #7 0x00000000005b9f39 in MatMPIAIJSetPreallocation_MPIAIJ () > #8 0x00000000005c1317 in MatCreateMPIAIJ () > Are _all_ the processes making it here? > > As you can see, I'm currently using openMPI (even though I do have access > to others) along with the intel compiler (this is a mostly C++ code). 
This > problem doesn't exhibit itself on any smaller problems (we run TONS of runs > all the time in the 10,000-5,000,000 DoF range on 1-3000 procs) and only > seems to come up on these larger runs. > > I'm starting to suspect that it's an openMPI issue. Has anyone seen > anything like this before? > > Here are some specs for my current environment > > PETSc 3.1-p8 (I know, I know....) > OpenMPI 1.4.4 > intel compilers 12.1.1 > Modified Redhat with 2.6.18 Kernel > QDR Infiniband > > Thanks for any help! > > Derek > -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Mon Feb 6 23:34:05 2012 From: friedmud at gmail.com (Derek Gaston) Date: Mon, 6 Feb 2012 22:34:05 -0700 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Mon, Feb 6, 2012 at 10:27 PM, Jed Brown wrote: > > Are _all_ the processes making it here? > Sigh. I knew someone was going to ask that ;-) I'll have to write a short script to grab the stack trace from every one of the 10,000 processes to see where they are and try to find any anomalies. Anyone have a script (or pieces of one) to do this that they wouldn't mind sharing? I did spot check quite a few and they were all in the same spot. Now here comes the weirdness: I left one of these processes attached in GDB for quite a while (10+ minutes) after the whole job had been hung for over an hour. When I noticed that I had left it attached I detached GDB and.... the job started right up! That is: it moved on past this problem! How is that for some weirdness. It might have just been coincidence... or maybe me stalling that process for a bit by attaching GDB nudged some communication in the right direction... I don't know. I know that's not terribly scientific. I'll have to wait until the next job hangs before I can do more inspection, but when (not if) that happens I'll post back with more info. Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Feb 7 00:20:16 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 7 Feb 2012 09:20:16 +0300 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 08:34, Derek Gaston wrote: > On Mon, Feb 6, 2012 at 10:27 PM, Jed Brown wrote: > >> >> Are _all_ the processes making it here? >> > > Sigh. I knew someone was going to ask that ;-) > > I'll have to write a short script to grab the stack trace from every one > of the 10,000 processes to see where they are and try to find > any anomalies. Anyone have a script (or pieces of one) to do this that > they wouldn't mind sharing? > > I did spot check quite a few and they were all in the same spot. > > Now here comes the weirdness: I left one of these processes attached in > GDB for quite a while (10+ minutes) after the whole job had been hung for > over an hour. When I noticed that I had left it attached I detached GDB > and.... the job started right up! That is: it moved on past this problem! > How is that for some weirdness. It might have just been coincidence... or > maybe me stalling that process for a bit by attaching GDB nudged some > communication in the right direction... I don't know. > Hmm, progress semantics of MPI should ensure completion. Stalling the process with gdb should not change anything (assuming you weren't actually making changes with gdb). Can you run with MPICH2? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jroman at dsic.upv.es Tue Feb 7 01:45:57 2012 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 7 Feb 2012 08:45:57 +0100 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: <3491C895-12B3-43D7-A2F3-7932E24BFCA3@dsic.upv.es> El 07/02/2012, a las 06:34, Derek Gaston escribi?: > On Mon, Feb 6, 2012 at 10:27 PM, Jed Brown wrote: > > Are _all_ the processes making it here? > > Sigh. I knew someone was going to ask that ;-) > > I'll have to write a short script to grab the stack trace from every one of the 10,000 processes to see where they are and try to find any anomalies. Anyone have a script (or pieces of one) to do this that they wouldn't mind sharing? Try with PADB: http://padb.pittman.org.uk/ Jose > > I did spot check quite a few and they were all in the same spot. > > Now here comes the weirdness: I left one of these processes attached in GDB for quite a while (10+ minutes) after the whole job had been hung for over an hour. When I noticed that I had left it attached I detached GDB and.... the job started right up! That is: it moved on past this problem! How is that for some weirdness. It might have just been coincidence... or maybe me stalling that process for a bit by attaching GDB nudged some communication in the right direction... I don't know. > > I know that's not terribly scientific. I'll have to wait until the next job hangs before I can do more inspection, but when (not if) that happens I'll post back with more info. > > Derek From recrusader at gmail.com Tue Feb 7 10:19:20 2012 From: recrusader at gmail.com (recrusader) Date: Tue, 7 Feb 2012 10:19:20 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: Dear Matt, I got the help from Thrust developers. The problem is in PETSc, we define CUSPARRAY using cusp::array1d. Generaly, PetscScalar is std::complex. However, the CUSPARRAY array cannot be decorated at GPU side. We need cusp::array1d. Is there any simple method to process this in PETSc? Thank you very much. Best, Yujie On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: > On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: > >> Thank you very much, Matt, >> >> You mean the headers of the simple codes, I further simply the codes as > > > This is a question for the CUSP mailing list. 
> > Thanks, > > Matt > > >> " >> #include >> #include >> >> int main(void) >> { >> cusp::array1d, cusp::host_memory> *x; >> >> x=new cusp::array1d, cusp::host_memory>(2,0.0); >> >> std::complex alpha(1,2.0); >> cusp::blas::scal(*x,alpha); >> >> return 0; >> }" >> >> I got the same compilation results " >> login1$ nvcc gputest.cu -o gputest >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >> warning: calling a __host__ function from a __host__ __device__ >> function is not allowed >> detected during: >> instantiation of "void >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> T=std::complex, T2=std::complex]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >> here >> instantiation of "InputIterator >> thrust::detail::host::for_each(InputIterator, InputIterator, >> UnaryFunction) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >> here >> instantiation of "InputIterator >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> UnaryFunction, thrust::host_space_tag) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >> here >> instantiation of "InputIterator >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> [with InputIterator=thrust::detail::normal_iterator >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >> here >> instantiation of "void thrust::for_each(InputIterator, >> InputIterator, UnaryFunction) [with >> InputIterator=thrust::detail::normal_iterator *>, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> (367): here >> instantiation of "void >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> [with ForwardIterator=thrust::detail::normal_iterator >> *>, ScalarType=std::complex]" >> (748): here >> instantiation of "void cusp::blas::scal(Array &, >> ScalarType) [with Array=cusp::array1d, >> cusp::host_memory>, ScalarType=std::complex]" >> gputest.cu(25): here >> >> " >> >> Thanks a lot. >> >> Best, >> Yujie >> >> On 1/29/12, Matthew Knepley wrote: >> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >> wrote: >> > >> >> Dear PETSc developers, >> >> >> >> With your help, I can successfully PETSc-deve with enabling GPU and >> >> complex number. >> >> However, when I compiled the codes, I met some errors. I also tried to >> >> use simple codes to realize the same function. However, the errors >> >> disappear. 
One example is as follows: >> >> >> >> for the function "VecScale_SeqCUSP" >> >> "#undef __FUNCT__ >> >> #define __FUNCT__ "VecScale_SeqCUSP" >> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >> >> { >> >> CUSPARRAY *xarray; >> >> PetscErrorCode ierr; >> >> >> >> PetscFunctionBegin; >> >> if (alpha == 0.0) { >> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >> >> } else if (alpha != 1.0) { >> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >> >> try { >> >> cusp::blas::scal(*xarray,alpha); >> >> } catch(char* ex) { >> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >> >> } >> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >> >> } >> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >> >> PetscFunctionReturn(0); >> >> } " >> >> >> >> When I compiled PETSc-dev, I met the following errors: >> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >> >> calling a __host__ function from a __host__ __device__ function is not >> >> allowed >> >> detected during: >> >> instantiation of "void >> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> >> T=std::complex, T2=PetscScalar]" >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >> >> here >> >> instantiation of "void >> >> thrust::detail::device::cuda::for_each_n_closure> >> Size, UnaryFunction>::operator()() [with >> >> >> >> >> RandomAccessIterator=thrust::detail::normal_iterator>, >> >> Size=long, >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> >> here >> >> instantiation of "void >> >> >> >> >> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >> >> [with >> >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> >> long, cusp::blas::detail::SCAL>>]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >> >> here >> >> instantiation of "size_t >> >> >> >> >> thrust::detail::device::cuda::detail::closure_launcher_base> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >> >> >> >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> >> long, cusp::blas::detail::SCAL>>, >> >> launch_by_value=true]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >> >> here >> >> instantiation of "thrust::pair >> >> >> >> >> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >> >> [with >> >> >> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >> >> long, cusp::blas::detail::SCAL>>, Size=long]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >> >> here >> >> [ 6 instantiation contexts not shown ] >> >> instantiation of "InputIterator >> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> >> UnaryFunction, thrust::device_space_tag) [with >> >> >> >> >> InputIterator=thrust::detail::normal_iterator>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >> >> instantiation of "InputIterator >> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> >> [with >> >> >> InputIterator=thrust::detail::normal_iterator>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> 
/opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >> >> instantiation of "void thrust::for_each(InputIterator, >> >> InputIterator, UnaryFunction) [with >> >> >> >> >> InputIterator=thrust::detail::normal_iterator>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> (367): here >> >> instantiation of "void >> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> >> [with >> >> >> ForwardIterator=thrust::detail::normal_iterator>, >> >> ScalarType=std::complex]" >> >> (748): here >> >> instantiation of "void cusp::blas::scal(Array &, >> >> ScalarType) [with Array=cusp::array1d> >> cusp::device_memory>, ScalarType=std::complex]" >> >> veccusp.cu(1185): here >> >> >> >> >> >> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> >> error: a value of type "int" cannot be assigned to an entity of type >> >> "_ZNSt7complexIdE9_ComplexTE" >> >> >> >> " >> >> However, I further realize simiar codes as >> >> " >> >> #include >> >> #include >> >> #include >> >> #include >> >> #include >> >> #include >> >> >> >> int main(void) >> >> { >> >> cusp::array1d, cusp::host_memory> *x; >> >> >> >> x=new cusp::array1d, cusp::host_memory>(2,0.0); >> >> >> >> std::complex alpha(1,2.0); >> >> cusp::blas::scal(*x,alpha); >> >> >> >> return 0; >> >> } >> >> " >> >> >> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >> >> warning information as follows: >> >> " >> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >> >> warning: calling a __host__ function from a __host__ __device__ >> >> function is not allowed >> >> detected during: >> >> instantiation of "void >> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >> >> T=std::complex, T2=std::complex]" >> >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >> >> here >> >> instantiation of "InputIterator >> >> thrust::detail::host::for_each(InputIterator, InputIterator, >> >> UnaryFunction) [with >> >> InputIterator=thrust::detail::normal_iterator *>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> >> >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >> >> here >> >> instantiation of "InputIterator >> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >> >> UnaryFunction, thrust::host_space_tag) [with >> >> InputIterator=thrust::detail::normal_iterator *>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >> >> here >> >> instantiation of "InputIterator >> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >> >> [with >> InputIterator=thrust::detail::normal_iterator >> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >> >> here >> >> instantiation of "void thrust::for_each(InputIterator, >> >> InputIterator, UnaryFunction) [with >> >> InputIterator=thrust::detail::normal_iterator *>, >> >> UnaryFunction=cusp::blas::detail::SCAL>]" >> >> (367): here >> >> instantiation of "void >> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >> >> [with >> ForwardIterator=thrust::detail::normal_iterator >> >> *>, ScalarType=std::complex]" >> >> (748): here >> >> instantiation of "void cusp::blas::scal(Array &, >> >> ScalarType) [with Array=cusp::array1d, >> >> cusp::host_memory>, ScalarType=std::complex]" >> >> gputest.cu(25): here >> >> >> >> " >> >> There are not errors 
like >> >> >> >> >> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >> >> error: a value of type "int" cannot be assigned to an entity of type >> >> "_ZNSt7complexIdE9_ComplexTE" " >> >> >> >> Furthermore, the warning information is also different between >> >> PETSc-dev and simple codes. >> >> >> >> Could you give me some suggestion for this errors? Thank you very much. >> >> >> > >> > The headers are complicated to get right. The whole point of what we >> did is >> > to give a way to use GPU >> > simply through the existing PETSc linear algebra interface. >> > >> > Matt >> > >> > >> >> Best, >> >> Yujie >> >> >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments is infinitely more interesting than any results to which >> their >> > experiments lead. >> > -- Norbert Wiener >> > >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 7 11:13:44 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 7 Feb 2012 11:13:44 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 10:19 AM, recrusader wrote: > Dear Matt, > > I got the help from Thrust developers. > The problem is > in PETSc, we define CUSPARRAY using cusp::array1d cusp::device_memory>. Generaly, PetscScalar is std::complex. > However, the CUSPARRAY array cannot be decorated at GPU side. We need > cusp::array1d. > Is there any simple method to process this in PETSc? > This reply does not make any sense to me. What do you mean by the word "decorated"? If you mean that they only support the complex type cusp::complex, then I advise you to configure for real numbers. We do not support cusp::complex as our complex type. Thanks, Matt > Thank you very much. > > Best, > Yujie > > On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: > >> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >> >>> Thank you very much, Matt, >>> >>> You mean the headers of the simple codes, I further simply the codes as >> >> >> This is a question for the CUSP mailing list. 
>> >> Thanks, >> >> Matt >> >> >>> " >>> #include >>> #include >>> >>> int main(void) >>> { >>> cusp::array1d, cusp::host_memory> *x; >>> >>> x=new cusp::array1d, cusp::host_memory>(2,0.0); >>> >>> std::complex alpha(1,2.0); >>> cusp::blas::scal(*x,alpha); >>> >>> return 0; >>> }" >>> >>> I got the same compilation results " >>> login1$ nvcc gputest.cu -o gputest >>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>> warning: calling a __host__ function from a __host__ __device__ >>> function is not allowed >>> detected during: >>> instantiation of "void >>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>> T=std::complex, T2=std::complex]" >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>> here >>> instantiation of "InputIterator >>> thrust::detail::host::for_each(InputIterator, InputIterator, >>> UnaryFunction) [with >>> InputIterator=thrust::detail::normal_iterator *>, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>> here >>> instantiation of "InputIterator >>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>> UnaryFunction, thrust::host_space_tag) [with >>> InputIterator=thrust::detail::normal_iterator *>, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>> here >>> instantiation of "InputIterator >>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>> [with InputIterator=thrust::detail::normal_iterator >>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>> here >>> instantiation of "void thrust::for_each(InputIterator, >>> InputIterator, UnaryFunction) [with >>> InputIterator=thrust::detail::normal_iterator *>, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> (367): here >>> instantiation of "void >>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>> [with >>> ForwardIterator=thrust::detail::normal_iterator >>> *>, ScalarType=std::complex]" >>> (748): here >>> instantiation of "void cusp::blas::scal(Array &, >>> ScalarType) [with Array=cusp::array1d, >>> cusp::host_memory>, ScalarType=std::complex]" >>> gputest.cu(25): here >>> >>> " >>> >>> Thanks a lot. >>> >>> Best, >>> Yujie >>> >>> On 1/29/12, Matthew Knepley wrote: >>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>> wrote: >>> > >>> >> Dear PETSc developers, >>> >> >>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>> >> complex number. >>> >> However, when I compiled the codes, I met some errors. I also tried to >>> >> use simple codes to realize the same function. However, the errors >>> >> disappear. 
One example is as follows: >>> >> >>> >> for the function "VecScale_SeqCUSP" >>> >> "#undef __FUNCT__ >>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>> >> { >>> >> CUSPARRAY *xarray; >>> >> PetscErrorCode ierr; >>> >> >>> >> PetscFunctionBegin; >>> >> if (alpha == 0.0) { >>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>> >> } else if (alpha != 1.0) { >>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>> >> try { >>> >> cusp::blas::scal(*xarray,alpha); >>> >> } catch(char* ex) { >>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>> >> } >>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>> >> } >>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>> >> PetscFunctionReturn(0); >>> >> } " >>> >> >>> >> When I compiled PETSc-dev, I met the following errors: >>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >>> >> calling a __host__ function from a __host__ __device__ function is not >>> >> allowed >>> >> detected during: >>> >> instantiation of "void >>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>> >> T=std::complex, T2=PetscScalar]" >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>> >> here >>> >> instantiation of "void >>> >> thrust::detail::device::cuda::for_each_n_closure>> >> Size, UnaryFunction>::operator()() [with >>> >> >>> >> >>> RandomAccessIterator=thrust::detail::normal_iterator>, >>> >> Size=long, >>> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>> >> here >>> >> instantiation of "void >>> >> >>> >> >>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>> >> [with >>> >> >>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>> >> long, cusp::blas::detail::SCAL>>]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>> >> here >>> >> instantiation of "size_t >>> >> >>> >> >>> thrust::detail::device::cuda::detail::closure_launcher_base>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>> >> >>> >> >>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>> >> long, cusp::blas::detail::SCAL>>, >>> >> launch_by_value=true]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>> >> here >>> >> instantiation of "thrust::pair >>> >> >>> >> >>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>> >> [with >>> >> >>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>> >> here >>> >> [ 6 instantiation contexts not shown ] >>> >> instantiation of "InputIterator >>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>> >> UnaryFunction, thrust::device_space_tag) [with >>> >> >>> >> >>> InputIterator=thrust::detail::normal_iterator>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >>> >> instantiation of "InputIterator >>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>> >> [with >>> >> >>> 
InputIterator=thrust::detail::normal_iterator>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >>> >> instantiation of "void thrust::for_each(InputIterator, >>> >> InputIterator, UnaryFunction) [with >>> >> >>> >> >>> InputIterator=thrust::detail::normal_iterator>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> (367): here >>> >> instantiation of "void >>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>> >> [with >>> >> >>> ForwardIterator=thrust::detail::normal_iterator>, >>> >> ScalarType=std::complex]" >>> >> (748): here >>> >> instantiation of "void cusp::blas::scal(Array &, >>> >> ScalarType) [with Array=cusp::array1d>> >> cusp::device_memory>, ScalarType=std::complex]" >>> >> veccusp.cu(1185): here >>> >> >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>> >> error: a value of type "int" cannot be assigned to an entity of type >>> >> "_ZNSt7complexIdE9_ComplexTE" >>> >> >>> >> " >>> >> However, I further realize simiar codes as >>> >> " >>> >> #include >>> >> #include >>> >> #include >>> >> #include >>> >> #include >>> >> #include >>> >> >>> >> int main(void) >>> >> { >>> >> cusp::array1d, cusp::host_memory> *x; >>> >> >>> >> x=new cusp::array1d, >>> cusp::host_memory>(2,0.0); >>> >> >>> >> std::complex alpha(1,2.0); >>> >> cusp::blas::scal(*x,alpha); >>> >> >>> >> return 0; >>> >> } >>> >> " >>> >> >>> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >>> >> warning information as follows: >>> >> " >>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>> >> warning: calling a __host__ function from a __host__ __device__ >>> >> function is not allowed >>> >> detected during: >>> >> instantiation of "void >>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>> >> T=std::complex, T2=std::complex]" >>> >> >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>> >> here >>> >> instantiation of "InputIterator >>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>> >> UnaryFunction) [with >>> >> InputIterator=thrust::detail::normal_iterator *>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> >>> >> >>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>> >> here >>> >> instantiation of "InputIterator >>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>> >> UnaryFunction, thrust::host_space_tag) [with >>> >> InputIterator=thrust::detail::normal_iterator *>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>> >> here >>> >> instantiation of "InputIterator >>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>> >> [with >>> InputIterator=thrust::detail::normal_iterator >>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>> >> here >>> >> instantiation of "void thrust::for_each(InputIterator, >>> >> InputIterator, UnaryFunction) [with >>> >> InputIterator=thrust::detail::normal_iterator *>, >>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>> >> (367): here >>> >> instantiation of "void >>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>> >> [with >>> ForwardIterator=thrust::detail::normal_iterator >>> >> *>, ScalarType=std::complex]" >>> >> (748): here >>> >> 
instantiation of "void cusp::blas::scal(Array &, >>> >> ScalarType) [with Array=cusp::array1d, >>> >> cusp::host_memory>, ScalarType=std::complex]" >>> >> gputest.cu(25): here >>> >> >>> >> " >>> >> There are not errors like >>> >> >>> >> >>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>> >> error: a value of type "int" cannot be assigned to an entity of type >>> >> "_ZNSt7complexIdE9_ComplexTE" " >>> >> >>> >> Furthermore, the warning information is also different between >>> >> PETSc-dev and simple codes. >>> >> >>> >> Could you give me some suggestion for this errors? Thank you very >>> much. >>> >> >>> > >>> > The headers are complicated to get right. The whole point of what we >>> did is >>> > to give a way to use GPU >>> > simply through the existing PETSc linear algebra interface. >>> > >>> > Matt >>> > >>> > >>> >> Best, >>> >> Yujie >>> >> >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> > experiments is infinitely more interesting than any results to which >>> their >>> > experiments lead. >>> > -- Norbert Wiener >>> > >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Feb 7 11:15:59 2012 From: recrusader at gmail.com (recrusader) Date: Tue, 7 Feb 2012 11:15:59 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: The following is the reply from Thrust developers, "The issue here is that the member functions of std::complex are not decorated with __host__ __device__ and therefore the compiler complains when asked to instantiate such functions in other __host__ __device__ code. If you switch std::complex to cusp::complex (which *is* decorated with __host__ __device__) then the problem should disappear." Thanks. On Tue, Feb 7, 2012 at 11:13 AM, Matthew Knepley wrote: > On Tue, Feb 7, 2012 at 10:19 AM, recrusader wrote: > >> Dear Matt, >> >> I got the help from Thrust developers. >> The problem is >> in PETSc, we define CUSPARRAY using cusp::array1d> cusp::device_memory>. Generaly, PetscScalar is std::complex. >> However, the CUSPARRAY array cannot be decorated at GPU side. We need >> cusp::array1d. >> Is there any simple method to process this in PETSc? >> > > This reply does not make any sense to me. What do you mean by the word > "decorated"? If you mean that they only support the > complex type cusp::complex, then I advise you to configure for real > numbers. We do not support cusp::complex as our complex > type. > > Thanks, > > Matt > > >> Thank you very much. >> >> Best, >> Yujie >> >> On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: >> >>> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >>> >>>> Thank you very much, Matt, >>>> >>>> You mean the headers of the simple codes, I further simply the codes as >>> >>> >>> This is a question for the CUSP mailing list. 
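(To make the Thrust developers' use of "decorated" concrete: the sketch below is not from the thread; the functor name scale_by and the headers <cusp/complex.h>, <thrust/device_vector.h> and <thrust/for_each.h> are this sketch's own choices. It mimics the shape of cusp::blas::detail::SCAL: operator() is marked __host__ __device__, so every function it calls must also be callable from device code. cusp::complex carries those markings; std::complex does not, which is exactly the diagnostic in the logs above. Compiled with nvcc like the test program, the active lines should build cleanly, while the commented std::complex variant should reproduce the "calling a __host__ function from a __host__ __device__ function" message.)

#include <complex>
#include <cusp/complex.h>
#include <thrust/device_vector.h>
#include <thrust/for_each.h>

// Same shape as cusp::blas::detail::SCAL: a stored scalar applied to each
// element.  operator() is compiled for host and device.
template <typename T>
struct scale_by
{
  T alpha;
  explicit scale_by(T a) : alpha(a) {}

  __host__ __device__
  void operator()(T &x) const { x *= alpha; }   // calls T::operator*=
};

int main(void)
{
  // cusp::complex<double>'s operators are declared __host__ __device__,
  // so instantiating them inside device code is allowed.
  thrust::device_vector< cusp::complex<double> > ok(4, cusp::complex<double>(1.0, 0.0));
  thrust::for_each(ok.begin(), ok.end(),
                   scale_by< cusp::complex<double> >(cusp::complex<double>(1.0, 2.0)));

  // std::complex<double>'s member functions are host-only; asking nvcc to
  // instantiate them for the device is what the thread's warning/error is
  // about, so this variant is left commented out.
  // thrust::device_vector< std::complex<double> > bad(4);
  // thrust::for_each(bad.begin(), bad.end(),
  //                  scale_by< std::complex<double> >(std::complex<double>(1.0, 2.0)));

  return 0;
}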
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> " >>>> #include >>>> #include >>>> >>>> int main(void) >>>> { >>>> cusp::array1d, cusp::host_memory> *x; >>>> >>>> x=new cusp::array1d, cusp::host_memory>(2,0.0); >>>> >>>> std::complex alpha(1,2.0); >>>> cusp::blas::scal(*x,alpha); >>>> >>>> return 0; >>>> }" >>>> >>>> I got the same compilation results " >>>> login1$ nvcc gputest.cu -o gputest >>>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>> warning: calling a __host__ function from a __host__ __device__ >>>> function is not allowed >>>> detected during: >>>> instantiation of "void >>>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>> T=std::complex, T2=std::complex]" >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>> here >>>> instantiation of "InputIterator >>>> thrust::detail::host::for_each(InputIterator, InputIterator, >>>> UnaryFunction) [with >>>> InputIterator=thrust::detail::normal_iterator *>, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>> here >>>> instantiation of "InputIterator >>>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>> UnaryFunction, thrust::host_space_tag) [with >>>> InputIterator=thrust::detail::normal_iterator *>, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>> here >>>> instantiation of "InputIterator >>>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>> [with InputIterator=thrust::detail::normal_iterator >>>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>> here >>>> instantiation of "void thrust::for_each(InputIterator, >>>> InputIterator, UnaryFunction) [with >>>> InputIterator=thrust::detail::normal_iterator *>, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> (367): here >>>> instantiation of "void >>>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>>> [with >>>> ForwardIterator=thrust::detail::normal_iterator >>>> *>, ScalarType=std::complex]" >>>> (748): here >>>> instantiation of "void cusp::blas::scal(Array &, >>>> ScalarType) [with Array=cusp::array1d, >>>> cusp::host_memory>, ScalarType=std::complex]" >>>> gputest.cu(25): here >>>> >>>> " >>>> >>>> Thanks a lot. >>>> >>>> Best, >>>> Yujie >>>> >>>> On 1/29/12, Matthew Knepley wrote: >>>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>>> wrote: >>>> > >>>> >> Dear PETSc developers, >>>> >> >>>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>>> >> complex number. >>>> >> However, when I compiled the codes, I met some errors. I also tried >>>> to >>>> >> use simple codes to realize the same function. However, the errors >>>> >> disappear. 
One example is as follows: >>>> >> >>>> >> for the function "VecScale_SeqCUSP" >>>> >> "#undef __FUNCT__ >>>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>>> >> { >>>> >> CUSPARRAY *xarray; >>>> >> PetscErrorCode ierr; >>>> >> >>>> >> PetscFunctionBegin; >>>> >> if (alpha == 0.0) { >>>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>>> >> } else if (alpha != 1.0) { >>>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>> >> try { >>>> >> cusp::blas::scal(*xarray,alpha); >>>> >> } catch(char* ex) { >>>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>>> >> } >>>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>> >> } >>>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>>> >> PetscFunctionReturn(0); >>>> >> } " >>>> >> >>>> >> When I compiled PETSc-dev, I met the following errors: >>>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): warning: >>>> >> calling a __host__ function from a __host__ __device__ function is >>>> not >>>> >> allowed >>>> >> detected during: >>>> >> instantiation of "void >>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>> >> T=std::complex, T2=PetscScalar]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>>> >> here >>>> >> instantiation of "void >>>> >> >>>> thrust::detail::device::cuda::for_each_n_closure>>> >> Size, UnaryFunction>::operator()() [with >>>> >> >>>> >> >>>> RandomAccessIterator=thrust::detail::normal_iterator>, >>>> >> Size=long, >>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>> >> here >>>> >> instantiation of "void >>>> >> >>>> >> >>>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>>> >> [with >>>> >> >>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>> >> long, cusp::blas::detail::SCAL>>]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>>> >> here >>>> >> instantiation of "size_t >>>> >> >>>> >> >>>> thrust::detail::device::cuda::detail::closure_launcher_base>>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>>> >> >>>> >> >>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>> >> long, cusp::blas::detail::SCAL>>, >>>> >> launch_by_value=true]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>>> >> here >>>> >> instantiation of "thrust::pair >>>> >> >>>> >> >>>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>>> >> [with >>>> >> >>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>>> >> here >>>> >> [ 6 instantiation contexts not shown ] >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>> >> UnaryFunction, thrust::device_space_tag) [with >>>> >> >>>> >> >>>> InputIterator=thrust::detail::normal_iterator>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >>>> >> instantiation of "InputIterator 
>>>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>> >> [with >>>> >> >>>> InputIterator=thrust::detail::normal_iterator>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >>>> >> instantiation of "void thrust::for_each(InputIterator, >>>> >> InputIterator, UnaryFunction) [with >>>> >> >>>> >> >>>> InputIterator=thrust::detail::normal_iterator>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> (367): here >>>> >> instantiation of "void >>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>> ScalarType) >>>> >> [with >>>> >> >>>> ForwardIterator=thrust::detail::normal_iterator>, >>>> >> ScalarType=std::complex]" >>>> >> (748): here >>>> >> instantiation of "void cusp::blas::scal(Array &, >>>> >> ScalarType) [with Array=cusp::array1d>>> >> cusp::device_memory>, ScalarType=std::complex]" >>>> >> veccusp.cu(1185): here >>>> >> >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>> >> "_ZNSt7complexIdE9_ComplexTE" >>>> >> >>>> >> " >>>> >> However, I further realize simiar codes as >>>> >> " >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> >>>> >> int main(void) >>>> >> { >>>> >> cusp::array1d, cusp::host_memory> *x; >>>> >> >>>> >> x=new cusp::array1d, >>>> cusp::host_memory>(2,0.0); >>>> >> >>>> >> std::complex alpha(1,2.0); >>>> >> cusp::blas::scal(*x,alpha); >>>> >> >>>> >> return 0; >>>> >> } >>>> >> " >>>> >> >>>> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >>>> >> warning information as follows: >>>> >> " >>>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>> >> warning: calling a __host__ function from a __host__ __device__ >>>> >> function is not allowed >>>> >> detected during: >>>> >> instantiation of "void >>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>> >> T=std::complex, T2=std::complex]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>> >> here >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>>> >> UnaryFunction) [with >>>> >> InputIterator=thrust::detail::normal_iterator >>>> *>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>> >> here >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>> >> UnaryFunction, thrust::host_space_tag) [with >>>> >> InputIterator=thrust::detail::normal_iterator >>>> *>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>> >> here >>>> >> instantiation of "InputIterator >>>> >> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>> >> [with >>>> InputIterator=thrust::detail::normal_iterator >>>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> >>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>> >> here >>>> >> instantiation of "void thrust::for_each(InputIterator, >>>> >> InputIterator, UnaryFunction) [with >>>> >> InputIterator=thrust::detail::normal_iterator >>>> *>, >>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>> >> (367): here >>>> >> 
instantiation of "void >>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>> ScalarType) >>>> >> [with >>>> ForwardIterator=thrust::detail::normal_iterator >>>> >> *>, ScalarType=std::complex]" >>>> >> (748): here >>>> >> instantiation of "void cusp::blas::scal(Array &, >>>> >> ScalarType) [with Array=cusp::array1d, >>>> >> cusp::host_memory>, ScalarType=std::complex]" >>>> >> gputest.cu(25): here >>>> >> >>>> >> " >>>> >> There are not errors like >>>> >> >>>> >> >>>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>> >> "_ZNSt7complexIdE9_ComplexTE" " >>>> >> >>>> >> Furthermore, the warning information is also different between >>>> >> PETSc-dev and simple codes. >>>> >> >>>> >> Could you give me some suggestion for this errors? Thank you very >>>> much. >>>> >> >>>> > >>>> > The headers are complicated to get right. The whole point of what we >>>> did is >>>> > to give a way to use GPU >>>> > simply through the existing PETSc linear algebra interface. >>>> > >>>> > Matt >>>> > >>>> > >>>> >> Best, >>>> >> Yujie >>>> >> >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> > experiments is infinitely more interesting than any results to which >>>> their >>>> > experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 7 11:17:47 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 7 Feb 2012 11:17:47 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 11:15 AM, recrusader wrote: > The following is the reply from Thrust developers, > "The issue here is that the member functions of std::complex are not > decorated with __host__ __device__ and therefore the compiler complains > when asked to instantiate such functions in other __host__ __device__ code. > If you switch std::complex to cusp::complex (which *is* decorated with > __host__ __device__) then the problem should disappear." > That makes sense, and my reply is the same. PETSc does not support that complex type. You can try to typedef PetscScalar to that type, but I have no idea what problems it might cause. Matt > Thanks. > > On Tue, Feb 7, 2012 at 11:13 AM, Matthew Knepley wrote: > >> On Tue, Feb 7, 2012 at 10:19 AM, recrusader wrote: >> >>> Dear Matt, >>> >>> I got the help from Thrust developers. >>> The problem is >>> in PETSc, we define CUSPARRAY using cusp::array1d>> cusp::device_memory>. Generaly, PetscScalar is std::complex. >>> However, the CUSPARRAY array cannot be decorated at GPU side. We need >>> cusp::array1d. >>> Is there any simple method to process this in PETSc? >>> >> >> This reply does not make any sense to me. What do you mean by the word >> "decorated"? If you mean that they only support the >> complex type cusp::complex, then I advise you to configure for real >> numbers. 
We do not support cusp::complex as our complex >> type. >> >> Thanks, >> >> Matt >> >> >>> Thank you very much. >>> >>> Best, >>> Yujie >>> >>> On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: >>> >>>> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >>>> >>>>> Thank you very much, Matt, >>>>> >>>>> You mean the headers of the simple codes, I further simply the codes as >>>> >>>> >>>> This is a question for the CUSP mailing list. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> " >>>>> #include >>>>> #include >>>>> >>>>> int main(void) >>>>> { >>>>> cusp::array1d, cusp::host_memory> *x; >>>>> >>>>> x=new cusp::array1d, cusp::host_memory>(2,0.0); >>>>> >>>>> std::complex alpha(1,2.0); >>>>> cusp::blas::scal(*x,alpha); >>>>> >>>>> return 0; >>>>> }" >>>>> >>>>> I got the same compilation results " >>>>> login1$ nvcc gputest.cu -o gputest >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>> warning: calling a __host__ function from a __host__ __device__ >>>>> function is not allowed >>>>> detected during: >>>>> instantiation of "void >>>>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>> T=std::complex, T2=std::complex]" >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>> here >>>>> instantiation of "InputIterator >>>>> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>> UnaryFunction) [with >>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>> here >>>>> instantiation of "InputIterator >>>>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>> UnaryFunction, thrust::host_space_tag) [with >>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>> here >>>>> instantiation of "InputIterator >>>>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>>> [with >>>>> InputIterator=thrust::detail::normal_iterator >>>>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>> here >>>>> instantiation of "void thrust::for_each(InputIterator, >>>>> InputIterator, UnaryFunction) [with >>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> (367): here >>>>> instantiation of "void >>>>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>>>> [with >>>>> ForwardIterator=thrust::detail::normal_iterator >>>>> *>, ScalarType=std::complex]" >>>>> (748): here >>>>> instantiation of "void cusp::blas::scal(Array &, >>>>> ScalarType) [with Array=cusp::array1d, >>>>> cusp::host_memory>, ScalarType=std::complex]" >>>>> gputest.cu(25): here >>>>> >>>>> " >>>>> >>>>> Thanks a lot. >>>>> >>>>> Best, >>>>> Yujie >>>>> >>>>> On 1/29/12, Matthew Knepley wrote: >>>>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>>>> wrote: >>>>> > >>>>> >> Dear PETSc developers, >>>>> >> >>>>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>>>> >> complex number. >>>>> >> However, when I compiled the codes, I met some errors. I also tried >>>>> to >>>>> >> use simple codes to realize the same function. However, the errors >>>>> >> disappear. 
One example is as follows: >>>>> >> >>>>> >> for the function "VecScale_SeqCUSP" >>>>> >> "#undef __FUNCT__ >>>>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>>>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>>>> >> { >>>>> >> CUSPARRAY *xarray; >>>>> >> PetscErrorCode ierr; >>>>> >> >>>>> >> PetscFunctionBegin; >>>>> >> if (alpha == 0.0) { >>>>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>>>> >> } else if (alpha != 1.0) { >>>>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>> >> try { >>>>> >> cusp::blas::scal(*xarray,alpha); >>>>> >> } catch(char* ex) { >>>>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>>>> >> } >>>>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>> >> } >>>>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>>>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>>>> >> PetscFunctionReturn(0); >>>>> >> } " >>>>> >> >>>>> >> When I compiled PETSc-dev, I met the following errors: >>>>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): >>>>> warning: >>>>> >> calling a __host__ function from a __host__ __device__ function is >>>>> not >>>>> >> allowed >>>>> >> detected during: >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>> >> T=std::complex, T2=PetscScalar]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>>>> >> here >>>>> >> instantiation of "void >>>>> >> >>>>> thrust::detail::device::cuda::for_each_n_closure>>>> >> Size, UnaryFunction>::operator()() [with >>>>> >> >>>>> >> >>>>> RandomAccessIterator=thrust::detail::normal_iterator>, >>>>> >> Size=long, >>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>> >> here >>>>> >> instantiation of "void >>>>> >> >>>>> >> >>>>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>>>> >> [with >>>>> >> >>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>> >> long, cusp::blas::detail::SCAL>>]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>>>> >> here >>>>> >> instantiation of "size_t >>>>> >> >>>>> >> >>>>> thrust::detail::device::cuda::detail::closure_launcher_base>>>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>>>> >> >>>>> >> >>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>> >> long, cusp::blas::detail::SCAL>>, >>>>> >> launch_by_value=true]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>>>> >> here >>>>> >> instantiation of "thrust::pair >>>>> >> >>>>> >> >>>>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>>>> >> [with >>>>> >> >>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>>>> >> here >>>>> >> [ 6 instantiation contexts not shown ] >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>> >> UnaryFunction, thrust::device_space_tag) [with >>>>> >> >>>>> >> >>>>> InputIterator=thrust::detail::normal_iterator>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> 
/opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>> UnaryFunction) >>>>> >> [with >>>>> >> >>>>> InputIterator=thrust::detail::normal_iterator>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): here >>>>> >> instantiation of "void thrust::for_each(InputIterator, >>>>> >> InputIterator, UnaryFunction) [with >>>>> >> >>>>> >> >>>>> InputIterator=thrust::detail::normal_iterator>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> (367): here >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>> ScalarType) >>>>> >> [with >>>>> >> >>>>> ForwardIterator=thrust::detail::normal_iterator>, >>>>> >> ScalarType=std::complex]" >>>>> >> (748): here >>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>> >> ScalarType) [with Array=cusp::array1d>>>> >> cusp::device_memory>, ScalarType=std::complex]" >>>>> >> veccusp.cu(1185): here >>>>> >> >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>>> >> "_ZNSt7complexIdE9_ComplexTE" >>>>> >> >>>>> >> " >>>>> >> However, I further realize simiar codes as >>>>> >> " >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> #include >>>>> >> >>>>> >> int main(void) >>>>> >> { >>>>> >> cusp::array1d, cusp::host_memory> *x; >>>>> >> >>>>> >> x=new cusp::array1d, >>>>> cusp::host_memory>(2,0.0); >>>>> >> >>>>> >> std::complex alpha(1,2.0); >>>>> >> cusp::blas::scal(*x,alpha); >>>>> >> >>>>> >> return 0; >>>>> >> } >>>>> >> " >>>>> >> >>>>> >> When I complied it using "nvcc gputest.cu -o gputest", I only meet >>>>> >> warning information as follows: >>>>> >> " >>>>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>> >> warning: calling a __host__ function from a __host__ __device__ >>>>> >> function is not allowed >>>>> >> detected during: >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>> >> T=std::complex, T2=std::complex]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>> >> here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>> >> UnaryFunction) [with >>>>> >> InputIterator=thrust::detail::normal_iterator >>>>> *>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>> >> here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>> >> UnaryFunction, thrust::host_space_tag) [with >>>>> >> InputIterator=thrust::detail::normal_iterator >>>>> *>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>> >> here >>>>> >> instantiation of "InputIterator >>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>> UnaryFunction) >>>>> >> [with >>>>> InputIterator=thrust::detail::normal_iterator >>>>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> >>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>> >> here >>>>> >> instantiation of "void 
thrust::for_each(InputIterator, >>>>> >> InputIterator, UnaryFunction) [with >>>>> >> InputIterator=thrust::detail::normal_iterator >>>>> *>, >>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>> >> (367): here >>>>> >> instantiation of "void >>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>> ScalarType) >>>>> >> [with >>>>> ForwardIterator=thrust::detail::normal_iterator >>>>> >> *>, ScalarType=std::complex]" >>>>> >> (748): here >>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>> >> ScalarType) [with Array=cusp::array1d, >>>>> >> cusp::host_memory>, ScalarType=std::complex]" >>>>> >> gputest.cu(25): here >>>>> >> >>>>> >> " >>>>> >> There are not errors like >>>>> >> >>>>> >> >>>>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>> >> error: a value of type "int" cannot be assigned to an entity of type >>>>> >> "_ZNSt7complexIdE9_ComplexTE" " >>>>> >> >>>>> >> Furthermore, the warning information is also different between >>>>> >> PETSc-dev and simple codes. >>>>> >> >>>>> >> Could you give me some suggestion for this errors? Thank you very >>>>> much. >>>>> >> >>>>> > >>>>> > The headers are complicated to get right. The whole point of what we >>>>> did is >>>>> > to give a way to use GPU >>>>> > simply through the existing PETSc linear algebra interface. >>>>> > >>>>> > Matt >>>>> > >>>>> > >>>>> >> Best, >>>>> >> Yujie >>>>> >> >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > What most experimenters take for granted before they begin their >>>>> > experiments is infinitely more interesting than any results to which >>>>> their >>>>> > experiments lead. >>>>> > -- Norbert Wiener >>>>> > >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Feb 7 11:20:31 2012 From: recrusader at gmail.com (recrusader) Date: Tue, 7 Feb 2012 11:20:31 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: Whether is it possible to find an efficient mechanism to do the conversion between std::complex and cusp::complex when the conversion is necessary. Thanks. Yujie On Tue, Feb 7, 2012 at 11:17 AM, Matthew Knepley wrote: > On Tue, Feb 7, 2012 at 11:15 AM, recrusader wrote: > >> The following is the reply from Thrust developers, >> "The issue here is that the member functions of std::complex are not >> decorated with __host__ __device__ and therefore the compiler complains >> when asked to instantiate such functions in other __host__ __device__ code. >> If you switch std::complex to cusp::complex (which *is* decorated with >> __host__ __device__) then the problem should disappear." >> > > That makes sense, and my reply is the same. PETSc does not support that > complex type. You can try to typedef > PetscScalar to that type, but I have no idea what problems it might cause. 
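(On the conversion question asked here: one conservative route, which does not rely on any particular converting constructor being available, is an element-wise copy through the real/imag accessors, staged in a host-side cusp array and then moved with cusp's cross-memory-space copy construction. The helper name to_device and the headers are this sketch's own; whether a plain reinterpret_cast between std::complex<double> and cusp::complex<double> would be safe instead depends on both types laying out {real, imag} contiguously, which is an assumption and not something stated in the thread.)

#include <complex>
#include <vector>
#include <cusp/complex.h>
#include <cusp/array1d.h>

// Copy std::complex<double> host data into a cusp device array, converting
// each element explicitly.  The host-to-device transfer happens in the
// cross-memory-space copy construction on the return line.
static cusp::array1d<cusp::complex<double>, cusp::device_memory>
to_device(const std::vector< std::complex<double> > &host)
{
  cusp::array1d<cusp::complex<double>, cusp::host_memory> tmp(host.size());
  for (size_t i = 0; i < host.size(); ++i)
    tmp[i] = cusp::complex<double>(host[i].real(), host[i].imag());
  return cusp::array1d<cusp::complex<double>, cusp::device_memory>(tmp);
}

int main(void)
{
  std::vector< std::complex<double> > v(4, std::complex<double>(1.0, 2.0));
  cusp::array1d<cusp::complex<double>, cusp::device_memory> d = to_device(v);
  return d.size() == 4 ? 0 : 1;
}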
> > Matt > > >> Thanks. >> >> On Tue, Feb 7, 2012 at 11:13 AM, Matthew Knepley wrote: >> >>> On Tue, Feb 7, 2012 at 10:19 AM, recrusader wrote: >>> >>>> Dear Matt, >>>> >>>> I got the help from Thrust developers. >>>> The problem is >>>> in PETSc, we define CUSPARRAY using cusp::array1d>>> cusp::device_memory>. Generaly, PetscScalar is std::complex. >>>> However, the CUSPARRAY array cannot be decorated at GPU side. We need >>>> cusp::array1d. >>>> Is there any simple method to process this in PETSc? >>>> >>> >>> This reply does not make any sense to me. What do you mean by the word >>> "decorated"? If you mean that they only support the >>> complex type cusp::complex, then I advise you to configure for real >>> numbers. We do not support cusp::complex as our complex >>> type. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you very much. >>>> >>>> Best, >>>> Yujie >>>> >>>> On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: >>>> >>>>> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >>>>> >>>>>> Thank you very much, Matt, >>>>>> >>>>>> You mean the headers of the simple codes, I further simply the codes >>>>>> as >>>>> >>>>> >>>>> This is a question for the CUSP mailing list. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> " >>>>>> #include >>>>>> #include >>>>>> >>>>>> int main(void) >>>>>> { >>>>>> cusp::array1d, cusp::host_memory> *x; >>>>>> >>>>>> x=new cusp::array1d, >>>>>> cusp::host_memory>(2,0.0); >>>>>> >>>>>> std::complex alpha(1,2.0); >>>>>> cusp::blas::scal(*x,alpha); >>>>>> >>>>>> return 0; >>>>>> }" >>>>>> >>>>>> I got the same compilation results " >>>>>> login1$ nvcc gputest.cu -o gputest >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>>> warning: calling a __host__ function from a __host__ __device__ >>>>>> function is not allowed >>>>>> detected during: >>>>>> instantiation of "void >>>>>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>>> T=std::complex, T2=std::complex]" >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>>> here >>>>>> instantiation of "InputIterator >>>>>> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>>> UnaryFunction) [with >>>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>>> here >>>>>> instantiation of "InputIterator >>>>>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>>> UnaryFunction, thrust::host_space_tag) [with >>>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>>> here >>>>>> instantiation of "InputIterator >>>>>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>>>> [with >>>>>> InputIterator=thrust::detail::normal_iterator >>>>>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>>> here >>>>>> instantiation of "void thrust::for_each(InputIterator, >>>>>> InputIterator, UnaryFunction) [with >>>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> (367): here >>>>>> instantiation of "void >>>>>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, ScalarType) >>>>>> [with >>>>>> ForwardIterator=thrust::detail::normal_iterator >>>>>> *>, ScalarType=std::complex]" >>>>>> (748): 
here >>>>>> instantiation of "void cusp::blas::scal(Array &, >>>>>> ScalarType) [with Array=cusp::array1d, >>>>>> cusp::host_memory>, ScalarType=std::complex]" >>>>>> gputest.cu(25): here >>>>>> >>>>>> " >>>>>> >>>>>> Thanks a lot. >>>>>> >>>>>> Best, >>>>>> Yujie >>>>>> >>>>>> On 1/29/12, Matthew Knepley wrote: >>>>>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>>>>> wrote: >>>>>> > >>>>>> >> Dear PETSc developers, >>>>>> >> >>>>>> >> With your help, I can successfully PETSc-deve with enabling GPU and >>>>>> >> complex number. >>>>>> >> However, when I compiled the codes, I met some errors. I also >>>>>> tried to >>>>>> >> use simple codes to realize the same function. However, the errors >>>>>> >> disappear. One example is as follows: >>>>>> >> >>>>>> >> for the function "VecScale_SeqCUSP" >>>>>> >> "#undef __FUNCT__ >>>>>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>>>>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>>>>> >> { >>>>>> >> CUSPARRAY *xarray; >>>>>> >> PetscErrorCode ierr; >>>>>> >> >>>>>> >> PetscFunctionBegin; >>>>>> >> if (alpha == 0.0) { >>>>>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>>>>> >> } else if (alpha != 1.0) { >>>>>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>>> >> try { >>>>>> >> cusp::blas::scal(*xarray,alpha); >>>>>> >> } catch(char* ex) { >>>>>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>>>>> >> } >>>>>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>>> >> } >>>>>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>>>>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>>>>> >> PetscFunctionReturn(0); >>>>>> >> } " >>>>>> >> >>>>>> >> When I compiled PETSc-dev, I met the following errors: >>>>>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): >>>>>> warning: >>>>>> >> calling a __host__ function from a __host__ __device__ function is >>>>>> not >>>>>> >> allowed >>>>>> >> detected during: >>>>>> >> instantiation of "void >>>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>>> >> T=std::complex, T2=PetscScalar]" >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>>>>> >> here >>>>>> >> instantiation of "void >>>>>> >> >>>>>> thrust::detail::device::cuda::for_each_n_closure>>>>> >> Size, UnaryFunction>::operator()() [with >>>>>> >> >>>>>> >> >>>>>> RandomAccessIterator=thrust::detail::normal_iterator>, >>>>>> >> Size=long, >>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>>> >> here >>>>>> >> instantiation of "void >>>>>> >> >>>>>> >> >>>>>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>>>>> >> [with >>>>>> >> >>>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>>> >> long, cusp::blas::detail::SCAL>>]" >>>>>> >> >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>>>>> >> here >>>>>> >> instantiation of "size_t >>>>>> >> >>>>>> >> >>>>>> thrust::detail::device::cuda::detail::closure_launcher_base>>>>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>>>>> >> >>>>>> >> >>>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>>> >> long, cusp::blas::detail::SCAL>>, >>>>>> >> launch_by_value=true]" >>>>>> >> >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>>>>> >> 
here >>>>>> >> instantiation of "thrust::pair >>>>>> >> >>>>>> >> >>>>>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>>>>> >> [with >>>>>> >> >>>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>>>>> >> >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>>>>> >> here >>>>>> >> [ 6 instantiation contexts not shown ] >>>>>> >> instantiation of "InputIterator >>>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>>> >> UnaryFunction, thrust::device_space_tag) [with >>>>>> >> >>>>>> >> >>>>>> InputIterator=thrust::detail::normal_iterator>, >>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): >>>>>> here >>>>>> >> instantiation of "InputIterator >>>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>>> UnaryFunction) >>>>>> >> [with >>>>>> >> >>>>>> InputIterator=thrust::detail::normal_iterator>, >>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): >>>>>> here >>>>>> >> instantiation of "void thrust::for_each(InputIterator, >>>>>> >> InputIterator, UnaryFunction) [with >>>>>> >> >>>>>> >> >>>>>> InputIterator=thrust::detail::normal_iterator>, >>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> (367): here >>>>>> >> instantiation of "void >>>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>>> ScalarType) >>>>>> >> [with >>>>>> >> >>>>>> ForwardIterator=thrust::detail::normal_iterator>, >>>>>> >> ScalarType=std::complex]" >>>>>> >> (748): here >>>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>>> >> ScalarType) [with Array=cusp::array1d>>>>> >> cusp::device_memory>, ScalarType=std::complex]" >>>>>> >> veccusp.cu(1185): here >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>>> >> error: a value of type "int" cannot be assigned to an entity of >>>>>> type >>>>>> >> "_ZNSt7complexIdE9_ComplexTE" >>>>>> >> >>>>>> >> " >>>>>> >> However, I further realize simiar codes as >>>>>> >> " >>>>>> >> #include >>>>>> >> #include >>>>>> >> #include >>>>>> >> #include >>>>>> >> #include >>>>>> >> #include >>>>>> >> >>>>>> >> int main(void) >>>>>> >> { >>>>>> >> cusp::array1d, cusp::host_memory> *x; >>>>>> >> >>>>>> >> x=new cusp::array1d, >>>>>> cusp::host_memory>(2,0.0); >>>>>> >> >>>>>> >> std::complex alpha(1,2.0); >>>>>> >> cusp::blas::scal(*x,alpha); >>>>>> >> >>>>>> >> return 0; >>>>>> >> } >>>>>> >> " >>>>>> >> >>>>>> >> When I complied it using "nvcc gputest.cu -o gputest", I only >>>>>> meet >>>>>> >> warning information as follows: >>>>>> >> " >>>>>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>>> >> warning: calling a __host__ function from a __host__ __device__ >>>>>> >> function is not allowed >>>>>> >> detected during: >>>>>> >> instantiation of "void >>>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>>> >> T=std::complex, T2=std::complex]" >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>>> >> here >>>>>> >> instantiation of "InputIterator >>>>>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>>> >> UnaryFunction) [with >>>>>> >> InputIterator=thrust::detail::normal_iterator >>>>>> *>, >>>>>> >> 
UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>>> >> here >>>>>> >> instantiation of "InputIterator >>>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>>> >> UnaryFunction, thrust::host_space_tag) [with >>>>>> >> InputIterator=thrust::detail::normal_iterator >>>>>> *>, >>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>>> >> here >>>>>> >> instantiation of "InputIterator >>>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>>> UnaryFunction) >>>>>> >> [with >>>>>> InputIterator=thrust::detail::normal_iterator >>>>>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> >>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>>> >> here >>>>>> >> instantiation of "void thrust::for_each(InputIterator, >>>>>> >> InputIterator, UnaryFunction) [with >>>>>> >> InputIterator=thrust::detail::normal_iterator >>>>>> *>, >>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>> >> (367): here >>>>>> >> instantiation of "void >>>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>>> ScalarType) >>>>>> >> [with >>>>>> ForwardIterator=thrust::detail::normal_iterator >>>>>> >> *>, ScalarType=std::complex]" >>>>>> >> (748): here >>>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>>> >> ScalarType) [with Array=cusp::array1d, >>>>>> >> cusp::host_memory>, ScalarType=std::complex]" >>>>>> >> gputest.cu(25): here >>>>>> >> >>>>>> >> " >>>>>> >> There are not errors like >>>>>> >> >>>>>> >> >>>>>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>>> >> error: a value of type "int" cannot be assigned to an entity of >>>>>> type >>>>>> >> "_ZNSt7complexIdE9_ComplexTE" " >>>>>> >> >>>>>> >> Furthermore, the warning information is also different between >>>>>> >> PETSc-dev and simple codes. >>>>>> >> >>>>>> >> Could you give me some suggestion for this errors? Thank you very >>>>>> much. >>>>>> >> >>>>>> > >>>>>> > The headers are complicated to get right. The whole point of what >>>>>> we did is >>>>>> > to give a way to use GPU >>>>>> > simply through the existing PETSc linear algebra interface. >>>>>> > >>>>>> > Matt >>>>>> > >>>>>> > >>>>>> >> Best, >>>>>> >> Yujie >>>>>> >> >>>>>> > >>>>>> > >>>>>> > >>>>>> > -- >>>>>> > What most experimenters take for granted before they begin their >>>>>> > experiments is infinitely more interesting than any results to >>>>>> which their >>>>>> > experiments lead. >>>>>> > -- Norbert Wiener >>>>>> > >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Feb 7 11:29:19 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 7 Feb 2012 11:29:19 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 11:20 AM, recrusader wrote: > Whether is it possible to find an efficient mechanism to do the conversion > between std::complex and cusp::complex when the conversion is necessary. > That does not matter. This is a compile error. We are not going to change this right now, and it seems like you are not going make the necessary changes, so I would say that complex numbers are not supported with our GPU code right now. The change would involve using cusp::complex for PetscScalar, and I am not sure how much work that would entail. Matt > Thanks. > Yujie > > On Tue, Feb 7, 2012 at 11:17 AM, Matthew Knepley wrote: > >> On Tue, Feb 7, 2012 at 11:15 AM, recrusader wrote: >> >>> The following is the reply from Thrust developers, >>> "The issue here is that the member functions of std::complex are not >>> decorated with __host__ __device__ and therefore the compiler complains >>> when asked to instantiate such functions in other __host__ __device__ code. >>> If you switch std::complex to cusp::complex (which *is* decorated with >>> __host__ __device__) then the problem should disappear." >>> >> >> That makes sense, and my reply is the same. PETSc does not support that >> complex type. You can try to typedef >> PetscScalar to that type, but I have no idea what problems it might cause. >> >> Matt >> >> >>> Thanks. >>> >>> On Tue, Feb 7, 2012 at 11:13 AM, Matthew Knepley wrote: >>> >>>> On Tue, Feb 7, 2012 at 10:19 AM, recrusader wrote: >>>> >>>>> Dear Matt, >>>>> >>>>> I got the help from Thrust developers. >>>>> The problem is >>>>> in PETSc, we define CUSPARRAY using cusp::array1d>>>> cusp::device_memory>. Generaly, PetscScalar is std::complex. >>>>> However, the CUSPARRAY array cannot be decorated at GPU side. We need >>>>> cusp::array1d. >>>>> Is there any simple method to process this in PETSc? >>>>> >>>> >>>> This reply does not make any sense to me. What do you mean by the word >>>> "decorated"? If you mean that they only support the >>>> complex type cusp::complex, then I advise you to configure for real >>>> numbers. We do not support cusp::complex as our complex >>>> type. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thank you very much. >>>>> >>>>> Best, >>>>> Yujie >>>>> >>>>> On Sun, Jan 29, 2012 at 1:20 PM, Matthew Knepley wrote: >>>>> >>>>>> On Sun, Jan 29, 2012 at 1:05 PM, recrusader wrote: >>>>>> >>>>>>> Thank you very much, Matt, >>>>>>> >>>>>>> You mean the headers of the simple codes, I further simply the codes >>>>>>> as >>>>>> >>>>>> >>>>>> This is a question for the CUSP mailing list. 
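(To give that last statement some shape: the following is a hypothetical illustration only, not PETSc source; the names HostScalar, DeviceArray and scale_on_device are invented for this sketch. It shows the narrower variant of the change, keeping the host-side scalar as std::complex<double>, the role PetscScalar plays in a complex build, while only the device-side container uses cusp::complex<double>, with the scalar converted at the kernel boundary. The open question Matt raises would remain: every host-to-device copy of vector data, not just scalar arguments, would need the same kind of conversion or a layout-compatibility assumption.)

// Hypothetical illustration only: not PETSc source.
#include <complex>
#include <cusp/complex.h>
#include <cusp/array1d.h>
#include <cusp/blas.h>

typedef std::complex<double>                                       HostScalar;  // plays the role of PetscScalar
typedef cusp::array1d<cusp::complex<double>, cusp::device_memory>  DeviceArray; // plays the role of CUSPARRAY

static void scale_on_device(DeviceArray &x, HostScalar alpha)
{
  // With cusp::complex elements, cusp::blas::scal instantiates its SCAL
  // functor over a type whose operators are __host__ __device__, so the
  // nvcc error from the thread no longer applies; only the scalar is
  // converted at the boundary.
  cusp::blas::scal(x, cusp::complex<double>(alpha.real(), alpha.imag()));
}

int main(void)
{
  DeviceArray x(8, cusp::complex<double>(1.0, 0.0));
  scale_on_device(x, HostScalar(0.0, 1.0));
  return 0;
}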
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> " >>>>>>> #include >>>>>>> #include >>>>>>> >>>>>>> int main(void) >>>>>>> { >>>>>>> cusp::array1d, cusp::host_memory> *x; >>>>>>> >>>>>>> x=new cusp::array1d, >>>>>>> cusp::host_memory>(2,0.0); >>>>>>> >>>>>>> std::complex alpha(1,2.0); >>>>>>> cusp::blas::scal(*x,alpha); >>>>>>> >>>>>>> return 0; >>>>>>> }" >>>>>>> >>>>>>> I got the same compilation results " >>>>>>> login1$ nvcc gputest.cu -o gputest >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>>>> warning: calling a __host__ function from a __host__ __device__ >>>>>>> function is not allowed >>>>>>> detected during: >>>>>>> instantiation of "void >>>>>>> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>>>> T=std::complex, T2=std::complex]" >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>>>> here >>>>>>> instantiation of "InputIterator >>>>>>> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>>>> UnaryFunction) [with >>>>>>> InputIterator=thrust::detail::normal_iterator >>>>>>> *>, >>>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>>>> here >>>>>>> instantiation of "InputIterator >>>>>>> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>>>> UnaryFunction, thrust::host_space_tag) [with >>>>>>> InputIterator=thrust::detail::normal_iterator >>>>>>> *>, >>>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): >>>>>>> here >>>>>>> instantiation of "InputIterator >>>>>>> thrust::detail::for_each(InputIterator, InputIterator, UnaryFunction) >>>>>>> [with >>>>>>> InputIterator=thrust::detail::normal_iterator >>>>>>> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>>>> here >>>>>>> instantiation of "void thrust::for_each(InputIterator, >>>>>>> InputIterator, UnaryFunction) [with >>>>>>> InputIterator=thrust::detail::normal_iterator >>>>>>> *>, >>>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> (367): here >>>>>>> instantiation of "void >>>>>>> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>>>> ScalarType) >>>>>>> [with >>>>>>> ForwardIterator=thrust::detail::normal_iterator >>>>>>> *>, ScalarType=std::complex]" >>>>>>> (748): here >>>>>>> instantiation of "void cusp::blas::scal(Array &, >>>>>>> ScalarType) [with Array=cusp::array1d, >>>>>>> cusp::host_memory>, ScalarType=std::complex]" >>>>>>> gputest.cu(25): here >>>>>>> >>>>>>> " >>>>>>> >>>>>>> Thanks a lot. >>>>>>> >>>>>>> Best, >>>>>>> Yujie >>>>>>> >>>>>>> On 1/29/12, Matthew Knepley wrote: >>>>>>> > On Sun, Jan 29, 2012 at 12:53 PM, recrusader >>>>>>> wrote: >>>>>>> > >>>>>>> >> Dear PETSc developers, >>>>>>> >> >>>>>>> >> With your help, I can successfully PETSc-deve with enabling GPU >>>>>>> and >>>>>>> >> complex number. >>>>>>> >> However, when I compiled the codes, I met some errors. I also >>>>>>> tried to >>>>>>> >> use simple codes to realize the same function. However, the errors >>>>>>> >> disappear. 
One example is as follows: >>>>>>> >> >>>>>>> >> for the function "VecScale_SeqCUSP" >>>>>>> >> "#undef __FUNCT__ >>>>>>> >> #define __FUNCT__ "VecScale_SeqCUSP" >>>>>>> >> PetscErrorCode VecScale_SeqCUSP(Vec xin, PetscScalar alpha) >>>>>>> >> { >>>>>>> >> CUSPARRAY *xarray; >>>>>>> >> PetscErrorCode ierr; >>>>>>> >> >>>>>>> >> PetscFunctionBegin; >>>>>>> >> if (alpha == 0.0) { >>>>>>> >> ierr = VecSet_SeqCUSP(xin,alpha);CHKERRQ(ierr); >>>>>>> >> } else if (alpha != 1.0) { >>>>>>> >> ierr = VecCUSPGetArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>>>> >> try { >>>>>>> >> cusp::blas::scal(*xarray,alpha); >>>>>>> >> } catch(char* ex) { >>>>>>> >> SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"CUSP error: %s", ex); >>>>>>> >> } >>>>>>> >> ierr = VecCUSPRestoreArrayReadWrite(xin,&xarray);CHKERRQ(ierr); >>>>>>> >> } >>>>>>> >> ierr = WaitForGPU();CHKERRCUSP(ierr); >>>>>>> >> ierr = PetscLogFlops(xin->map->n);CHKERRQ(ierr); >>>>>>> >> PetscFunctionReturn(0); >>>>>>> >> } " >>>>>>> >> >>>>>>> >> When I compiled PETSc-dev, I met the following errors: >>>>>>> >> " /opt/apps/cuda/4.0/cuda/include/cusp/detail/blas.inl(134): >>>>>>> warning: >>>>>>> >> calling a __host__ function from a __host__ __device__ function >>>>>>> is not >>>>>>> >> allowed >>>>>>> >> detected during: >>>>>>> >> instantiation of "void >>>>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>>>> >> T=std::complex, T2=PetscScalar]" >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/for_each.inl(72): >>>>>>> >> here >>>>>>> >> instantiation of "void >>>>>>> >> >>>>>>> thrust::detail::device::cuda::for_each_n_closure>>>>>> >> Size, UnaryFunction>::operator()() [with >>>>>>> >> >>>>>>> >> >>>>>>> RandomAccessIterator=thrust::detail::normal_iterator>, >>>>>>> >> Size=long, >>>>>>> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>>>> >> here >>>>>>> >> instantiation of "void >>>>>>> >> >>>>>>> >> >>>>>>> thrust::detail::device::cuda::detail::launch_closure_by_value(NullaryFunction) >>>>>>> >> [with >>>>>>> >> >>>>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>>>> >> long, cusp::blas::detail::SCAL>>]" >>>>>>> >> >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(71): >>>>>>> >> here >>>>>>> >> instantiation of "size_t >>>>>>> >> >>>>>>> >> >>>>>>> thrust::detail::device::cuda::detail::closure_launcher_base>>>>>> >> launch_by_value>::block_size_with_maximal_occupancy(size_t) [with >>>>>>> >> >>>>>>> >> >>>>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>>>> >> long, cusp::blas::detail::SCAL>>, >>>>>>> >> launch_by_value=true]" >>>>>>> >> >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(136): >>>>>>> >> here >>>>>>> >> instantiation of "thrust::pair >>>>>>> >> >>>>>>> >> >>>>>>> thrust::detail::device::cuda::detail::closure_launcher::configuration_with_maximal_occupancy(Size) >>>>>>> >> [with >>>>>>> >> >>>>>>> NullaryFunction=thrust::detail::device::cuda::for_each_n_closure>, >>>>>>> >> long, cusp::blas::detail::SCAL>>, Size=long]" >>>>>>> >> >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(145): >>>>>>> >> here >>>>>>> >> [ 6 instantiation contexts not shown ] >>>>>>> >> instantiation of "InputIterator >>>>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>>>> 
>> UnaryFunction, thrust::device_space_tag) [with >>>>>>> >> >>>>>>> >> >>>>>>> InputIterator=thrust::detail::normal_iterator>, >>>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(51): >>>>>>> here >>>>>>> >> instantiation of "InputIterator >>>>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>>>> UnaryFunction) >>>>>>> >> [with >>>>>>> >> >>>>>>> InputIterator=thrust::detail::normal_iterator>, >>>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> /opt/apps/cuda/4.0/cuda/include/thrust/detail/for_each.inl(67): >>>>>>> here >>>>>>> >> instantiation of "void thrust::for_each(InputIterator, >>>>>>> >> InputIterator, UnaryFunction) [with >>>>>>> >> >>>>>>> >> >>>>>>> InputIterator=thrust::detail::normal_iterator>, >>>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> (367): here >>>>>>> >> instantiation of "void >>>>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>>>> ScalarType) >>>>>>> >> [with >>>>>>> >> >>>>>>> ForwardIterator=thrust::detail::normal_iterator>, >>>>>>> >> ScalarType=std::complex]" >>>>>>> >> (748): here >>>>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>>>> >> ScalarType) [with Array=cusp::array1d>>>>>> >> cusp::device_memory>, ScalarType=std::complex]" >>>>>>> >> veccusp.cu(1185): here >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>>>> >> error: a value of type "int" cannot be assigned to an entity of >>>>>>> type >>>>>>> >> "_ZNSt7complexIdE9_ComplexTE" >>>>>>> >> >>>>>>> >> " >>>>>>> >> However, I further realize simiar codes as >>>>>>> >> " >>>>>>> >> #include >>>>>>> >> #include >>>>>>> >> #include >>>>>>> >> #include >>>>>>> >> #include >>>>>>> >> #include >>>>>>> >> >>>>>>> >> int main(void) >>>>>>> >> { >>>>>>> >> cusp::array1d, cusp::host_memory> *x; >>>>>>> >> >>>>>>> >> x=new cusp::array1d, >>>>>>> cusp::host_memory>(2,0.0); >>>>>>> >> >>>>>>> >> std::complex alpha(1,2.0); >>>>>>> >> cusp::blas::scal(*x,alpha); >>>>>>> >> >>>>>>> >> return 0; >>>>>>> >> } >>>>>>> >> " >>>>>>> >> >>>>>>> >> When I complied it using "nvcc gputest.cu -o gputest", I only >>>>>>> meet >>>>>>> >> warning information as follows: >>>>>>> >> " >>>>>>> >> /opt/apps/cuda/4.0/cuda/bin/../include/cusp/detail/blas.inl(134): >>>>>>> >> warning: calling a __host__ function from a __host__ __device__ >>>>>>> >> function is not allowed >>>>>>> >> detected during: >>>>>>> >> instantiation of "void >>>>>>> >> cusp::blas::detail::SCAL::operator()(T2 &) [with >>>>>>> >> T=std::complex, T2=std::complex]" >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/host/for_each.inl(37): >>>>>>> >> here >>>>>>> >> instantiation of "InputIterator >>>>>>> >> thrust::detail::host::for_each(InputIterator, InputIterator, >>>>>>> >> UnaryFunction) [with >>>>>>> >> >>>>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/dispatch/for_each.h(46): >>>>>>> >> here >>>>>>> >> instantiation of "InputIterator >>>>>>> >> thrust::detail::dispatch::for_each(InputIterator, InputIterator, >>>>>>> >> UnaryFunction, thrust::host_space_tag) [with >>>>>>> >> >>>>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(51): 
>>>>>>> >> here >>>>>>> >> instantiation of "InputIterator >>>>>>> >> thrust::detail::for_each(InputIterator, InputIterator, >>>>>>> UnaryFunction) >>>>>>> >> [with >>>>>>> InputIterator=thrust::detail::normal_iterator >>>>>>> >> *>, UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> >>>>>>> /opt/apps/cuda/4.0/cuda/bin/../include/thrust/detail/for_each.inl(67): >>>>>>> >> here >>>>>>> >> instantiation of "void thrust::for_each(InputIterator, >>>>>>> >> InputIterator, UnaryFunction) [with >>>>>>> >> >>>>>>> InputIterator=thrust::detail::normal_iterator *>, >>>>>>> >> UnaryFunction=cusp::blas::detail::SCAL>]" >>>>>>> >> (367): here >>>>>>> >> instantiation of "void >>>>>>> >> cusp::blas::detail::scal(ForwardIterator, ForwardIterator, >>>>>>> ScalarType) >>>>>>> >> [with >>>>>>> ForwardIterator=thrust::detail::normal_iterator >>>>>>> >> *>, ScalarType=std::complex]" >>>>>>> >> (748): here >>>>>>> >> instantiation of "void cusp::blas::scal(Array &, >>>>>>> >> ScalarType) [with Array=cusp::array1d, >>>>>>> >> cusp::host_memory>, ScalarType=std::complex]" >>>>>>> >> gputest.cu(25): here >>>>>>> >> >>>>>>> >> " >>>>>>> >> There are not errors like >>>>>>> >> >>>>>>> >> >>>>>>> "/opt/apps/cuda/4.0/cuda/include/thrust/detail/device/cuda/detail/launch_closure.inl(51): >>>>>>> >> error: a value of type "int" cannot be assigned to an entity of >>>>>>> type >>>>>>> >> "_ZNSt7complexIdE9_ComplexTE" " >>>>>>> >> >>>>>>> >> Furthermore, the warning information is also different between >>>>>>> >> PETSc-dev and simple codes. >>>>>>> >> >>>>>>> >> Could you give me some suggestion for this errors? Thank you very >>>>>>> much. >>>>>>> >> >>>>>>> > >>>>>>> > The headers are complicated to get right. The whole point of what >>>>>>> we did is >>>>>>> > to give a way to use GPU >>>>>>> > simply through the existing PETSc linear algebra interface. >>>>>>> > >>>>>>> > Matt >>>>>>> > >>>>>>> > >>>>>>> >> Best, >>>>>>> >> Yujie >>>>>>> >> >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > -- >>>>>>> > What most experimenters take for granted before they begin their >>>>>>> > experiments is infinitely more interesting than any results to >>>>>>> which their >>>>>>> > experiments lead. >>>>>>> > -- Norbert Wiener >>>>>>> > >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
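For reference, a sketch of the Thrust developers' suggestion quoted in this thread, applied to the standalone test program (compiled with nvcc as a .cu file; the header names assume the CUSP 0.2/0.3-era layout, and this does not by itself fix the PETSc build, where PetscScalar is still std::complex):

#include <cusp/array1d.h>
#include <cusp/complex.h>
#include <cusp/blas.h>

int main(void)
{
  typedef cusp::complex<double> Complex;

  /* cusp::complex is decorated with __host__ __device__, so the SCAL functor
     inside cusp::blas::scal can be instantiated for device execution */
  cusp::array1d<Complex, cusp::device_memory> x(2, Complex(0.0, 0.0));

  Complex alpha(1.0, 2.0);
  cusp::blas::scal(x, alpha);

  return 0;
}

The host_memory variant compiles with the same change; only the element type differs from the original test.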
URL: From jack.poulson at gmail.com Tue Feb 7 12:10:22 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 7 Feb 2012 12:10:22 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 11:29 AM, Matthew Knepley wrote: > On Tue, Feb 7, 2012 at 11:20 AM, recrusader wrote: > >> Whether is it possible to find an efficient mechanism to do the >> conversion between std::complex and cusp::complex when the conversion is >> necessary. >> > > That does not matter. This is a compile error. We are not going to change > this right now, and it seems like you are not going > make the necessary changes, so I would say that complex numbers are not > supported with our GPU code right now. The > change would involve using cusp::complex for PetscScalar, and I am not > sure how much work that would entail. > > Matt > Matt, You might be interested to hear that the C++03 standard states that "The effect of instantiating the template complex for any type other than float, double or long double is unspecified". Thus, complex quad precision with it is probably a bad idea and, if I'm not mistaken, the standard does not state that the class must store data in the form double real, imag; so this could potentially break interfaces (e.g., to BLAS or LAPACK). Maybe it would be worthwhile to avoid usage of std::complex and simultaneously fix the compatibility issue with cusp::complex. I recently ripped std::complex out of Elemental for the above reasons. Jack -------------- next part -------------- An HTML attachment was scrubbed... URL: From chetan.jhurani at gmail.com Tue Feb 7 13:19:13 2012 From: chetan.jhurani at gmail.com (Chetan Jhurani) Date: Tue, 7 Feb 2012 11:19:13 -0800 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: <4f317923.aa95320a.6248.078d@mx.google.com> I've faced similar issues with std::complex, but the material below, which is not in the C++ standard, has suppressed my fears. Not eliminated though. http://fftw.org/doc/Complex-numbers.html (has a broken link, corrected below) http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1388.pdf Chetan From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Jack Poulson Sent: Tuesday, February 07, 2012 10:10 AM To: PETSc users list Subject: Re: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number On Tue, Feb 7, 2012 at 11:29 AM, Matthew Knepley wrote: On Tue, Feb 7, 2012 at 11:20 AM, recrusader wrote: Whether is it possible to find an efficient mechanism to do the conversion between std::complex and cusp::complex when the conversion is necessary. That does not matter. This is a compile error. We are not going to change this right now, and it seems like you are not going make the necessary changes, so I would say that complex numbers are not supported with our GPU code right now. The change would involve using cusp::complex for PetscScalar, and I am not sure how much work that would entail. Matt Matt, You might be interested to hear that the C++03 standard states that "The effect of instantiating the template complex for any type other than float, double or long double is unspecified". 
Thus, complex quad precision with it is probably a bad idea and, if I'm not mistaken, the standard does not state that the class must store data in the form double real, imag; so this could potentially break interfaces (e.g., to BLAS or LAPACK). Maybe it would be worthwhile to avoid usage of std::complex and simultaneously fix the compatibility issue with cusp::complex. I recently ripped std::complex out of Elemental for the above reasons. Jack -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Feb 7 13:26:10 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 7 Feb 2012 13:26:10 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: Message-ID: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> Jack, PetscScalar is defined in petscmath.h to be one of several things: typedef std::complex PetscScalar; typedef std::complex PetscScalar; typedef float complex PetscScalar; typedef double complex PetscScalar; typedef float PetscScalar; typedef double PetscScalar; typedef __float128 PetscScalar; Matt's point is that WE (the guys hacking on PETSc everyday) are not going to add other possibilities for cusp and complex at this point in time. We simply don't have the time/reason to add all this functionality now. If recursader wants that functionality he is free to hack the code and add it; if he does it in a clean way that he can provide patches then we will put the patches into petsc-dev. We simply don't have the resources to add all stuff to PETSc that anyone wants anytime and we have to focus on adding functionality that is commonly needed and will be widely used (especially within DOE). Barry On Feb 7, 2012, at 12:10 PM, Jack Poulson wrote: > On Tue, Feb 7, 2012 at 11:29 AM, Matthew Knepley wrote: > On Tue, Feb 7, 2012 at 11:20 AM, recrusader wrote: > Whether is it possible to find an efficient mechanism to do the conversion between std::complex and cusp::complex when the conversion is necessary. > > That does not matter. This is a compile error. We are not going to change this right now, and it seems like you are not going > make the necessary changes, so I would say that complex numbers are not supported with our GPU code right now. The > change would involve using cusp::complex for PetscScalar, and I am not sure how much work that would entail. > > Matt > > Matt, > > You might be interested to hear that the C++03 standard states that "The effect of instantiating the template complex for any type other than float, double or long double is unspecified". Thus, complex quad precision with it is probably a bad idea and, if I'm not mistaken, the standard does not state that the class must store data in the form > > double real, imag; > > so this could potentially break interfaces (e.g., to BLAS or LAPACK). Maybe it would be worthwhile to avoid usage of std::complex and simultaneously fix the compatibility issue with cusp::complex. I recently ripped std::complex out of Elemental for the above reasons. > > Jack From jack.poulson at gmail.com Tue Feb 7 14:08:45 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 7 Feb 2012 14:08:45 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: <4f317923.aa95320a.6248.078d@mx.google.com> References: <4f317923.aa95320a.6248.078d@mx.google.com> Message-ID: Thank you for the relevant links! 
The downside is that there does not seem to be a proposal to allow for more general base types within std::complex, e.g., __float128. This is actually my major complaint (that and fear of unstable complex arithmetic, especially division). Jack On Tue, Feb 7, 2012 at 1:19 PM, Chetan Jhurani wrote: > I've faced similar issues with std::complex, but the material below, > > which is not in the C++ standard, has suppressed my fears. Not > > eliminated though. > > > > http://fftw.org/doc/Complex-numbers.html (has a broken link, corrected > below) > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1388.pdf > > > > Chetan > > > > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Jack Poulson > Sent: Tuesday, February 07, 2012 10:10 AM > To: PETSc users list > Subject: Re: [petsc-users] one compilation error in PETSc-dev with > enabling GPU and complex number > > > > On Tue, Feb 7, 2012 at 11:29 AM, Matthew Knepley > wrote: > > On Tue, Feb 7, 2012 at 11:20 AM, recrusader wrote: > > Whether is it possible to find an efficient mechanism to do the conversion > between std::complex and cusp::complex when the conversion is necessary. > > > > That does not matter. This is a compile error. We are not going to change > this right now, and it seems like you are not going > > make the necessary changes, so I would say that complex numbers are not > supported with our GPU code right now. The > > change would involve using cusp::complex for PetscScalar, and I am not > sure how much work that would entail. > > > > Matt > > > Matt, > > > You might be interested to hear that the C++03 standard states that "The > effect of instantiating the template complex for any type other than float, > double or long double is unspecified". Thus, complex quad precision with it > is probably a bad idea and, if I'm not mistaken, the standard does not > state that the class must store data in the form > > double real, imag; > > so this could potentially break interfaces (e.g., to BLAS or LAPACK). > Maybe it would be worthwhile to avoid usage of std::complex and > simultaneously fix the compatibility issue with cusp::complex. I recently > ripped std::complex out of Elemental for the above reasons. > > Jack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Tue Feb 7 14:12:44 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 7 Feb 2012 14:12:44 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> Message-ID: Barry, Believe me, I apologize if it came across as me trying to dump something else on your plates. I was simply trying to point out some shortcomings of standard approaches to complex numbers. I know that PETSc supports quad precision for real numbers, but it seems that there isn't yet a good way to do so for the complex case, as it would require a custom complex class. If not, please let me know!
Jack On Tue, Feb 7, 2012 at 1:26 PM, Barry Smith wrote: > > Jack, > > PetscScalar is defined in petscmath.h to be one of several things: > > typedef std::complex PetscScalar; > typedef std::complex PetscScalar; > typedef float complex PetscScalar; > typedef double complex PetscScalar; > typedef float PetscScalar; > typedef double PetscScalar; > typedef __float128 PetscScalar; > > Matt's point is that WE (the guys hacking on PETSc everyday) are not going > to add other possibilities for cusp and complex at this point in time. We > simply don't have the time/reason to add all this functionality now. If > recursader wants that functionality he is free to hack the code and add it; > if he does it in a clean way that he can provide patches then we will put > the patches into petsc-dev. We simply don't have the resources to add all > stuff to PETSc that anyone wants anytime and we have to focus on adding > functionality that is commonly needed and will be widely used (especially > within DOE). > > Barry > > > > On Feb 7, 2012, at 12:10 PM, Jack Poulson wrote: > > > On Tue, Feb 7, 2012 at 11:29 AM, Matthew Knepley > wrote: > > On Tue, Feb 7, 2012 at 11:20 AM, recrusader > wrote: > > Whether is it possible to find an efficient mechanism to do the > conversion between std::complex and cusp::complex when the conversion is > necessary. > > > > That does not matter. This is a compile error. We are not going to > change this right now, and it seems like you are not going > > make the necessary changes, so I would say that complex numbers are not > supported with our GPU code right now. The > > change would involve using cusp::complex for PetscScalar, and I am not > sure how much work that would entail. > > > > Matt > > > > Matt, > > > > You might be interested to hear that the C++03 standard states that "The > effect of instantiating the template complex for any type other than float, > double or long double is unspecified". Thus, complex quad precision with it > is probably a bad idea and, if I'm not mistaken, the standard does not > state that the class must store data in the form > > > > double real, imag; > > > > so this could potentially break interfaces (e.g., to BLAS or LAPACK). > Maybe it would be worthwhile to avoid usage of std::complex and > simultaneously fix the compatibility issue with cusp::complex. I recently > ripped std::complex out of Elemental for the above reasons. > > > > Jack > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Feb 7 14:18:38 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 7 Feb 2012 14:18:38 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> Message-ID: <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> On Feb 7, 2012, at 2:12 PM, Jack Poulson wrote: > Barry, > > Believe me, I apologize if it came across as me trying to dump something else on your plates. I was simply trying to point out some shortcomings of standard approaches to complex numbers. No apology needed. > > I know that PETSc supports quad precision for real numbers, but it seems that there isn't yet a good way to do so for the complex case, as it would require a custom complex class. If not, please let me know! I have not tried complex quad precision. If you are right than it may not be possible right out of the box with std:complex. 
We are not in position to provide a proper complex class for this case (for the same reasons I listed before) so if someone wants PETSc with complex quad they may have their work cut out for them. Barry > > Jack > > On Tue, Feb 7, 2012 at 1:26 PM, Barry Smith wrote: > > Jack, > > PetscScalar is defined in petscmath.h to be one of several things: > > typedef std::complex PetscScalar; > typedef std::complex PetscScalar; > typedef float complex PetscScalar; > typedef double complex PetscScalar; > typedef float PetscScalar; > typedef double PetscScalar; > typedef __float128 PetscScalar; > > Matt's point is that WE (the guys hacking on PETSc everyday) are not going to add other possibilities for cusp and complex at this point in time. We simply don't have the time/reason to add all this functionality now. If recursader wants that functionality he is free to hack the code and add it; if he does it in a clean way that he can provide patches then we will put the patches into petsc-dev. We simply don't have the resources to add all stuff to PETSc that anyone wants anytime and we have to focus on adding functionality that is commonly needed and will be widely used (especially within DOE). > > Barry > > > > On Feb 7, 2012, at 12:10 PM, Jack Poulson wrote: > > > On Tue, Feb 7, 2012 at 11:29 AM, Matthew Knepley wrote: > > On Tue, Feb 7, 2012 at 11:20 AM, recrusader wrote: > > Whether is it possible to find an efficient mechanism to do the conversion between std::complex and cusp::complex when the conversion is necessary. > > > > That does not matter. This is a compile error. We are not going to change this right now, and it seems like you are not going > > make the necessary changes, so I would say that complex numbers are not supported with our GPU code right now. The > > change would involve using cusp::complex for PetscScalar, and I am not sure how much work that would entail. > > > > Matt > > > > Matt, > > > > You might be interested to hear that the C++03 standard states that "The effect of instantiating the template complex for any type other than float, double or long double is unspecified". Thus, complex quad precision with it is probably a bad idea and, if I'm not mistaken, the standard does not state that the class must store data in the form > > > > double real, imag; > > > > so this could potentially break interfaces (e.g., to BLAS or LAPACK). Maybe it would be worthwhile to avoid usage of std::complex and simultaneously fix the compatibility issue with cusp::complex. I recently ripped std::complex out of Elemental for the above reasons. > > > > Jack > > From jedbrown at mcs.anl.gov Tue Feb 7 15:34:34 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 8 Feb 2012 00:34:34 +0300 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> Message-ID: On Tue, Feb 7, 2012 at 23:18, Barry Smith wrote: > I have not tried complex quad precision. If you are right than it may not > be possible right out of the box with std:complex. We are not in > position to provide a proper complex class for this case (for the same > reasons I listed before) so if someone wants PETSc with complex quad they > may have their work cut out for them. It's spelled __complex128. 
http://gcc.gnu.org/onlinedocs/gcc-4.6.2/libquadmath.pdf Perhaps it's a shame they didn't use C99 complex __float128, but such is life. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Tue Feb 7 15:43:41 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 7 Feb 2012 15:43:41 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> Message-ID: On Tue, Feb 7, 2012 at 3:34 PM, Jed Brown wrote: > On Tue, Feb 7, 2012 at 23:18, Barry Smith wrote: > >> I have not tried complex quad precision. If you are right than it may not >> be possible right out of the box with std:complex. We are not in >> position to provide a proper complex class for this case (for the same >> reasons I listed before) so if someone wants PETSc with complex quad they >> may have their work cut out for them. > > > It's spelled __complex128. > http://gcc.gnu.org/onlinedocs/gcc-4.6.2/libquadmath.pdf > > Perhaps it's a shame they didn't use C99 complex __float128, but such is > life. > Good to know; I didn't know libquadmath provided that. The catch is that it does not allow one to easily extend templates that previously handled 32-bit and 64-bit base types. On the other hand, a custom complex class could easily just call the built-in functions for __complex128 when instantiated with a base type of __float128. Jack -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Feb 7 15:46:58 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 8 Feb 2012 00:46:58 +0300 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> Message-ID: On Wed, Feb 8, 2012 at 00:43, Jack Poulson wrote: > Good to know; I didn't know libquadmath provided that. The catch is that > it does not allow one to easily extend templates that previously handled > 32-bit and 64-bit base types. On the other hand, a custom complex class > could easily just call the built-in functions for __complex128 when > instantiated with a base type of __float128. It'll work fine with PETSc's typedef of PetscReal and PetscScalar. I never like that C++ crap anyway. ;-D (You could template over the real and complex types separately, but I guess you are assuming a certain syntax for converting between reals and complex.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack.poulson at gmail.com Tue Feb 7 16:12:54 2012 From: jack.poulson at gmail.com (Jack Poulson) Date: Tue, 7 Feb 2012 16:12:54 -0600 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> Message-ID: On Tue, Feb 7, 2012 at 3:46 PM, Jed Brown wrote: > On Wed, Feb 8, 2012 at 00:43, Jack Poulson wrote: > >> Good to know; I didn't know libquadmath provided that. The catch is that >> it does not allow one to easily extend templates that previously handled >> 32-bit and 64-bit base types. On the other hand, a custom complex class >> could easily just call the built-in functions for __complex128 when >> instantiated with a base type of __float128. 
> > > It'll work fine with PETSc's typedef of PetscReal and PetscScalar. I never > like that C++ crap anyway. ;-D > > (You could template over the real and complex types separately, but I > guess you are assuming a certain syntax for converting between reals and > complex.) > Yes, one could argue that PetscScalar can behave like a custom complex class, but with only a single instantiation choice through the preprocessor where each implementation was completely separate. If this type of logic was pulled into the language instead of lying at the preprocessor stage, then it would result in a custom complex class like I am arguing for. Either way std::complex is not being used for quads. Any way, it sounds like the current approach will work just fine for PETSc, so I will shut up. The only other argument would be if someone wanted to compute using some exotic datatype like the Gaussian integers, but that would require a subset of the functionality of standard complex class that only assumed a ring instead of a field. Jack -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Feb 7 16:19:01 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 8 Feb 2012 01:19:01 +0300 Subject: [petsc-users] one compilation error in PETSc-dev with enabling GPU and complex number In-Reply-To: References: <92BFC31B-068B-4EC7-A7C4-401E3F2A3496@mcs.anl.gov> <00C34071-4B02-461B-AA8F-DF68F0BAFDBA@mcs.anl.gov> Message-ID: On Wed, Feb 8, 2012 at 01:12, Jack Poulson wrote: > Yes, one could argue that PetscScalar can behave like a custom complex > class, but with only a single instantiation choice through the preprocessor > where each implementation was completely separate. If this type of logic > was pulled into the language instead of lying at the preprocessor stage, > then it would result in a custom complex class like I am arguing for. > Either way std::complex is not being used for quads. > I think the typedefs are good enough for the 99%, but I was just teasing. > > Any way, it sounds like the current approach will work just fine for > PETSc, so I will shut up. The only other argument would be if someone > wanted to compute using some exotic datatype like the Gaussian integers, > but that would require a subset of the functionality of standard complex > class that only assumed a ring instead of a field. > ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Tue Feb 7 20:48:05 2012 From: friedmud at gmail.com (Derek Gaston) Date: Tue, 7 Feb 2012 19:48:05 -0700 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: <3491C895-12B3-43D7-A2F3-7932E24BFCA3@dsic.upv.es> References: <3491C895-12B3-43D7-A2F3-7932E24BFCA3@dsic.upv.es> Message-ID: On Tue, Feb 7, 2012 at 12:45 AM, Jose E. Roman wrote: > > Try with PADB: http://padb.pittman.org.uk/ > Jose Thanks! That looks like just the thing I need! Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Tue Feb 7 20:52:21 2012 From: friedmud at gmail.com (Derek Gaston) Date: Tue, 7 Feb 2012 19:52:21 -0700 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Mon, Feb 6, 2012 at 11:20 PM, Jed Brown wrote: > > Hmm, progress semantics of MPI should ensure completion. Stalling the > process with gdb should not change anything (assuming you weren't actually > making changes with gdb). Can you run with MPICH2? > Ok - an update on this. 
I recompiled my whole stack with mvapich2... and it still is hanging in the same place: #0 0x00002b336a732f40 in PMI_Get_rank () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #1 0x00002b336a6bf453 in MPIDI_CH3I_MRAILI_Cq_poll () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #2 0x00002b336a675818 in MPIDI_CH3I_read_progress () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #3 0x00002b336a67485b in MPIDI_CH3I_Progress () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #4 0x00002b336a6bea96 in MPIC_Wait () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #5 0x00002b336a6be9db in MPIC_Sendrecv () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #6 0x00002b336a6be8aa in MPIC_Sendrecv_ft () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #7 0x00002b336a652db1 in MPIR_Allgather_intra_MV2 () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #8 0x00002b336a652965 in MPIR_Allgather_MV2 () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #9 0x00002b336a651846 in MPIR_Allgather_impl () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #10 0x00002b336a6517b1 in PMPI_Allgather () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 #11 0x00000000004a1f23 in PetscLayoutSetUp () #12 0x000000000054e469 in MatMPIAIJSetPreallocation_MPIAIJ () #13 0x000000000055584a in MatCreateMPIAIJ () It's been hung there for about 35 minutes. This particular job has ~100 million DoFs with 512 MPI processes. Any ideas? Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 7 20:56:45 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 7 Feb 2012 20:56:45 -0600 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 8:52 PM, Derek Gaston wrote: > On Mon, Feb 6, 2012 at 11:20 PM, Jed Brown wrote: > >> >> Hmm, progress semantics of MPI should ensure completion. Stalling the >> process with gdb should not change anything (assuming you weren't actually >> making changes with gdb). Can you run with MPICH2? >> > > Ok - an update on this. I recompiled my whole stack with mvapich2... 
and > it still is hanging in the same place: > > #0 0x00002b336a732f40 in PMI_Get_rank () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #1 0x00002b336a6bf453 in MPIDI_CH3I_MRAILI_Cq_poll () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #2 0x00002b336a675818 in MPIDI_CH3I_read_progress () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #3 0x00002b336a67485b in MPIDI_CH3I_Progress () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #4 0x00002b336a6bea96 in MPIC_Wait () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #5 0x00002b336a6be9db in MPIC_Sendrecv () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #6 0x00002b336a6be8aa in MPIC_Sendrecv_ft () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #7 0x00002b336a652db1 in MPIR_Allgather_intra_MV2 () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #8 0x00002b336a652965 in MPIR_Allgather_MV2 () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #9 0x00002b336a651846 in MPIR_Allgather_impl () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #10 0x00002b336a6517b1 in PMPI_Allgather () from > /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3 > #11 0x00000000004a1f23 in PetscLayoutSetUp () > #12 0x000000000054e469 in MatMPIAIJSetPreallocation_MPIAIJ () > #13 0x000000000055584a in MatCreateMPIAIJ () > > It's been hung there for about 35 minutes. > > This particular job has ~100 million DoFs with 512 MPI processes. Any > ideas? > Same question: are you sure every process is there. I will bet $10 there is at least one missing. Matt > Derek > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Tue Feb 7 21:00:51 2012 From: friedmud at gmail.com (Derek Gaston) Date: Tue, 7 Feb 2012 20:00:51 -0700 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 7:56 PM, Matthew Knepley wrote: > > Same question: are you sure every process is there. I will bet $10 there > is at least one missing. > Ok - I'll get that info - I tried to compile PADB a moment ago... but it didn't compile. It configured just fine... but compile is a no-go. It doesn't look like too bad of an error, but I don't have time to fix it now so I'll just script something up with GDB. At least I got the processor count down for this job so it won't be too much data to sort through. Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Tue Feb 7 21:17:00 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Tue, 7 Feb 2012 21:17:00 -0600 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: Since the processor count is down, can you run it with debugging and see what the arguments to PetscLayout are? Dmitry On Tue, Feb 7, 2012 at 9:00 PM, Derek Gaston wrote: > > > On Tue, Feb 7, 2012 at 7:56 PM, Matthew Knepley wrote: > >> >> Same question: are you sure every process is there. I will bet $10 there >> is at least one missing. >> > > Ok - I'll get that info - I tried to compile PADB a moment ago... but it > didn't compile. It configured just fine... but compile is a no-go. 
It > doesn't look like too bad of an error, but I don't have time to fix it now > so I'll just script something up with GDB. At least I got the processor > count down for this job so it won't be too much data to sort through. > > Derek > -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Wed Feb 8 04:34:37 2012 From: friedmud at gmail.com (Derek Gaston) Date: Wed, 8 Feb 2012 03:34:37 -0700 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 7:56 PM, Matthew Knepley wrote: > Same question: are you sure every process is there. I will bet $10 there > is at least one missing. > Well, of course you guys were right! I was able to write a short piece of python (pasted at the bottom of this email in case others find it interesting / useful) to look through the stack trace of every one of my processes in given job on the cluster and tell me which ones weren't in the place I was expecting them to be. Once I identified these processes I was able to ssh to the individual nodes and use gdb to attach to those processes and get stacktraces out to figure out what in the heck they were doing. Two things came up, both in a library our software depends on: 1. There was an inadvertent vector "localize" operation happening on the solution vector. This means that we were making a complete _local_ copy of the parallel solution vector to every processor! That would sometimes fail with 400 Million Dofs and 8000+ MPI ;-) 2. A small optimization problem having to do with threading and allocation / deallocation of small vectors. This was just slowing some of the nodes down so much that it would look like the processes had hung. After fixing up both of these things the jobs are now moving. Thanks for the suggestion ;-) Below is the python script I used to figure this stuff out. It works by calling the script with the job # of the job you want to analyze. It is set up for PBS, so if your cluster doesn't use PBS you'll have to disregard the top part where I'm just parsing out the list of nodes the job is running on. "fission-\d\d\d\d" is the regex pattern my cluster's nodes are named in. MatCreateMPIAIJ was what I was looking for in the stack trace (I wanted to see if every process had made it there). gastdr was my username, replace with yours. "marmot" was the name of the executable I was running. "bad_hosts" gets filled up with the number of processes owned by you on each node that have a stack trace containing the string you were looking for. Then at the end I analyzed that to see if it matched how many MPI per node I was running (in this case 4). Any host that had less than 4 processes on it that were where I was expecting them to be got spit out at the end. Then it was time to ssh to that node and attach to the processes and figure out what was going on. It's 3:30AM here right now, so you'll have to excuse some of the rough edges in the script. I really just hacked it together for myself but thought others might find some pieces useful from it. Oh, and yes, I did use os.popen()... after all these years I still find it more straightforward to use than any of the subprocesses stuff in Python. It has been deprecated for a _long_ time now... but I hope they never remove it ;-) Happy hunting all! 
Derek ------- import os import sys import re command = "qstat -n " + sys.argv[1] output = os.popen(command).readlines() regex = re.compile("(fission-\d\d\d\d)") hosts = [] for line in output: f = regex.findall(line) for i in f: hosts.append(i) matcreates = 0 bad_hosts = {} #host = hosts[0] for host in hosts: command = "ssh " + host + " \"ps aux | grep 'gastdr .*marmot' | grep -v grep | awk '{print \$2}' | xargs -I {} gdb --batch --pid={} -ex bt | grep 'MatCreateMPIAIJ' 2>/dev/null \"" lines = os.popen(command).readlines() for line in lines: if line.find("MatCreateMPIAIJ") != -1: matcreates = matcreates + 1 if host in bad_hosts: bad_hosts[host] += 1 else: bad_hosts[host] = 1 print bad_hosts print "Num matches: " + str(matcreates) print "Bad Hosts: " for host, num in bad_hosts.items(): if num != 4: print host -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Wed Feb 8 07:41:38 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Wed, 8 Feb 2012 07:41:38 -0600 Subject: [petsc-users] Hang at PetscLayoutSetUp() In-Reply-To: References: Message-ID: Good to hear you got it sorted out! I understand that PetscLayoutSetup() and VecScatterCreate() problems were unrelated, is that right? The PetscLayoutSetup() was hanging because some nodes took too long to enter the call -- they were waiting for mutex locks somewhere else; VecScatterCreate() was "hanging" due to the sheer number of indices in the IS (and building it as a PtoP, while it should have been MPI_ToAll)? Dmitry. On Wed, Feb 8, 2012 at 4:34 AM, Derek Gaston wrote: > On Tue, Feb 7, 2012 at 7:56 PM, Matthew Knepley wrote: > >> Same question: are you sure every process is there. I will bet $10 there >> is at least one missing. >> > > Well, of course you guys were right! I was able to write a short piece of > python (pasted at the bottom of this email in case others find it > interesting / useful) to look through the stack trace of every one of my > processes in given job on the cluster and tell me which ones weren't in the > place I was expecting them to be. Once I identified these processes I was > able to ssh to the individual nodes and use gdb to attach to those > processes and get stacktraces out to figure out what in the heck they were > doing. > > Two things came up, both in a library our software depends on: > > 1. There was an inadvertent vector "localize" operation happening on the > solution vector. This means that we were making a complete _local_ copy of > the parallel solution vector to every processor! That would sometimes fail > with 400 Million Dofs and 8000+ MPI ;-) > > 2. A small optimization problem having to do with threading and > allocation / deallocation of small vectors. This was just slowing some of > the nodes down so much that it would look like the processes had hung. > > After fixing up both of these things the jobs are now moving. > > Thanks for the suggestion ;-) > > Below is the python script I used to figure this stuff out. It works by > calling the script with the job # of the job you want to analyze. It is > set up for PBS, so if your cluster doesn't use PBS you'll have to disregard > the top part where I'm just parsing out the list of nodes the job is > running on. "fission-\d\d\d\d" is the regex pattern my cluster's nodes are > named in. > > MatCreateMPIAIJ was what I was looking for in the stack trace (I wanted to > see if every process had made it there). gastdr was my username, replace > with yours. 
"marmot" was the name of the executable I was running. > "bad_hosts" gets filled up with the number of processes owned by you on > each node that have a stack trace containing the string you were looking > for. Then at the end I analyzed that to see if it matched how many MPI per > node I was running (in this case 4). Any host that had less than 4 > processes on it that were where I was expecting them to be got spit out at > the end. Then it was time to ssh to that node and attach to the processes > and figure out what was going on. > > It's 3:30AM here right now, so you'll have to excuse some of the rough > edges in the script. I really just hacked it together for myself but > thought others might find some pieces useful from it. Oh, and yes, I did > use os.popen()... after all these years I still find it more > straightforward to use than any of the subprocesses stuff in Python. It > has been deprecated for a _long_ time now... but I hope they never remove > it ;-) > > Happy hunting all! > > Derek > > > ------- > > import os > import sys > import re > > command = "qstat -n " + sys.argv[1] > > output = os.popen(command).readlines() > > regex = re.compile("(fission-\d\d\d\d)") > > hosts = [] > > for line in output: > f = regex.findall(line) > for i in f: > hosts.append(i) > > matcreates = 0 > bad_hosts = {} > #host = hosts[0] > for host in hosts: > command = "ssh " + host + " \"ps aux | grep 'gastdr .*marmot' | grep -v > grep | awk '{print \$2}' | xargs -I {} gdb --batch --pid={} -ex bt | grep > 'MatCreateMPIAIJ' 2>/dev/null \"" > lines = os.popen(command).readlines() > for line in lines: > if line.find("MatCreateMPIAIJ") != -1: > matcreates = matcreates + 1 > if host in bad_hosts: > bad_hosts[host] += 1 > else: > bad_hosts[host] = 1 > > print bad_hosts > > print "Num matches: " + str(matcreates) > > print "Bad Hosts: " > for host, num in bad_hosts.items(): > if num != 4: > print host > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Wed Feb 8 14:53:30 2012 From: recrusader at gmail.com (recrusader) Date: Wed, 8 Feb 2012 14:53:30 -0600 Subject: [petsc-users] code changes from CPU to GPU Message-ID: Dear PETSc developers, I have FEM codes. It works very well in CPU computation. Now, I add '-vec_type seqcusp -mat_type aijcusp' when running the codes, I met the following errors. My question that whether there is an example to demonstrate how to revise the codes for the conversion from CPU to GPU. In addition, 'seqcusp' and 'seqaijcusp' are used when vec and mat are saved in one GPU card? Thank you very much. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Vector type seqcusp does not have local representation! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 10:28:33 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /work/01820/ylu/libmesh_svn01232012/examples/myproj_sp1/myproj-opt on a westmere- named c300-205.ls4.tacc.utexas.edu by ylu Wed Feb 8 14:34:46 2012 [0]PETSC ERROR: Libraries linked from /opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/westmere-cuda/lib [0]PETSC ERROR: Configure run at Fri Dec 16 11:27:43 2011 [0]PETSC ERROR: Configure options --with-x=0 -with-pic --with-external-packages-dir=/var/tmp/petsc-3.2-buildroot//opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/externalpackages-cuda --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 --with-scalar-type=real --with-dynamic-loading=0 --with-shared-libraries=0 --with-chaco=1 --download-chaco=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/Chaco-2.2.tar.gz --with-spai=1 --download-spai=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spai_3.0-mar-06.tar.gz --with-hypre=1 --download-hypre=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/hypre-2.6.0b.tar.gz --with-mumps=1 --download-mumps=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/MUMPS_4.9.2.tar.gz --with-scalapack=1 --download-scalapack=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/scalapack.tgz --with-blacs=1 --download-blacs=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/blacs-dev.tar.gz --with-spooles=1 --download-spooles=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spooles-2.2-dec-2008.tar.gz --with-superlu=1 --download-superlu=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_4.1-December_20_2010.tar.gz --with-superlu_dist=1 --download-superlu_dist=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_DIST_2.5-December_21_2010.tar.gz --with-parmetis=1 --download-parmetis=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/ParMetis-dev-p3.tar.gz --with-debugging=no --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t --with-mpiexec=mpirun_rsh --with-cuda=1 --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ --with-cusp-dir=/opt/apps/cuda/4.0/cuda/ --with-thrust-dir=/opt/apps/cuda/4.0/cuda/ --COPTFLAGS=-xW --CXXOPTFLAGS=-xW --FOPTFLAGS=-xW [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: VecGhostGetLocalForm() line 82 in src/vec/vec/impls/mpi/commonmpvec.c [0]PETSC ERROR: zero() line 974 in "unknowndirectory/"/work/01820/ylu/libmesh_svn01232012/include/numerics/petsc_vector.h application called MPI_Abort(comm=0x84000002, 62) - process 0 Best, Yujie From bsmith at mcs.anl.gov Wed Feb 8 14:58:28 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 8 Feb 2012 14:58:28 -0600 Subject: [petsc-users] code changes from CPU to GPU In-Reply-To: References: Message-ID: The "ghosted" forms of vectors is not supported currently with GPUs. Barry On Feb 8, 2012, at 2:53 PM, recrusader wrote: > Dear PETSc developers, > > I have FEM codes. It works very well in CPU computation. > Now, I add '-vec_type seqcusp -mat_type aijcusp' when running the > codes, I met the following errors. My question that whether there is > an example to demonstrate how to revise the codes for the conversion > from CPU to GPU. > > In addition, 'seqcusp' and 'seqaijcusp' are used when vec and mat are > saved in one GPU card? > > Thank you very much. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Vector type seqcusp does not have local representation! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 > 10:28:33 CDT 2011 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: > /work/01820/ylu/libmesh_svn01232012/examples/myproj_sp1/myproj-opt on > a westmere- named c300-205.ls4.tacc.utexas.edu by ylu Wed Feb 8 > 14:34:46 2012 > [0]PETSC ERROR: Libraries linked from > /opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/westmere-cuda/lib > [0]PETSC ERROR: Configure run at Fri Dec 16 11:27:43 2011 > [0]PETSC ERROR: Configure options --with-x=0 -with-pic > --with-external-packages-dir=/var/tmp/petsc-3.2-buildroot//opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/externalpackages-cuda > --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 > --with-scalar-type=real --with-dynamic-loading=0 > --with-shared-libraries=0 --with-chaco=1 > --download-chaco=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/Chaco-2.2.tar.gz > --with-spai=1 --download-spai=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spai_3.0-mar-06.tar.gz > --with-hypre=1 --download-hypre=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/hypre-2.6.0b.tar.gz > --with-mumps=1 --download-mumps=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/MUMPS_4.9.2.tar.gz > --with-scalapack=1 > --download-scalapack=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/scalapack.tgz > --with-blacs=1 --download-blacs=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/blacs-dev.tar.gz > --with-spooles=1 > --download-spooles=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spooles-2.2-dec-2008.tar.gz > --with-superlu=1 > --download-superlu=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_4.1-December_20_2010.tar.gz > --with-superlu_dist=1 > --download-superlu_dist=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_DIST_2.5-December_21_2010.tar.gz > --with-parmetis=1 > --download-parmetis=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/ParMetis-dev-p3.tar.gz > --with-debugging=no > --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t > --with-mpiexec=mpirun_rsh --with-cuda=1 > --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ > --with-cusp-dir=/opt/apps/cuda/4.0/cuda/ > --with-thrust-dir=/opt/apps/cuda/4.0/cuda/ --COPTFLAGS=-xW > --CXXOPTFLAGS=-xW --FOPTFLAGS=-xW > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: VecGhostGetLocalForm() line 82 in > src/vec/vec/impls/mpi/commonmpvec.c > [0]PETSC ERROR: zero() line 974 in > "unknowndirectory/"/work/01820/ylu/libmesh_svn01232012/include/numerics/petsc_vector.h > application called MPI_Abort(comm=0x84000002, 62) - process 0 > > Best, > Yujie From recrusader at gmail.com Wed Feb 8 15:03:24 2012 From: recrusader at gmail.com (recrusader) Date: Wed, 8 Feb 2012 15:03:24 -0600 Subject: [petsc-users] code changes from CPU to GPU In-Reply-To: References: Message-ID: Thanks, Barry. On Wed, Feb 8, 2012 at 2:58 PM, Barry Smith wrote: > > The "ghosted" forms of vectors is not supported currently with GPUs. > > Barry > > On Feb 8, 2012, at 2:53 PM, recrusader wrote: > > > Dear PETSc developers, > > > > I have FEM codes. It works very well in CPU computation. 
> > Now, I add '-vec_type seqcusp -mat_type aijcusp' when running the > > codes, I met the following errors. My question that whether there is > > an example to demonstrate how to revise the codes for the conversion > > from CPU to GPU. > > > > In addition, 'seqcusp' and 'seqaijcusp' are used when vec and mat are > > saved in one GPU card? > > > > Thank you very much. > > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: Invalid argument! > > [0]PETSC ERROR: Vector type seqcusp does not have local representation! > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 > > 10:28:33 CDT 2011 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: > > /work/01820/ylu/libmesh_svn01232012/examples/myproj_sp1/myproj-opt on > > a westmere- named c300-205.ls4.tacc.utexas.edu by ylu Wed Feb 8 > > 14:34:46 2012 > > [0]PETSC ERROR: Libraries linked from > > /opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/westmere-cuda/lib > > [0]PETSC ERROR: Configure run at Fri Dec 16 11:27:43 2011 > > [0]PETSC ERROR: Configure options --with-x=0 -with-pic > > > --with-external-packages-dir=/var/tmp/petsc-3.2-buildroot//opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/externalpackages-cuda > > --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 > > --with-scalar-type=real --with-dynamic-loading=0 > > --with-shared-libraries=0 --with-chaco=1 > > > --download-chaco=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/Chaco-2.2.tar.gz > > --with-spai=1 > --download-spai=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spai_3.0-mar-06.tar.gz > > --with-hypre=1 > --download-hypre=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/hypre-2.6.0b.tar.gz > > --with-mumps=1 > --download-mumps=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/MUMPS_4.9.2.tar.gz > > --with-scalapack=1 > > > --download-scalapack=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/scalapack.tgz > > --with-blacs=1 > --download-blacs=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/blacs-dev.tar.gz > > --with-spooles=1 > > > --download-spooles=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spooles-2.2-dec-2008.tar.gz > > --with-superlu=1 > > > --download-superlu=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_4.1-December_20_2010.tar.gz > > --with-superlu_dist=1 > > > --download-superlu_dist=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_DIST_2.5-December_21_2010.tar.gz > > --with-parmetis=1 > > > --download-parmetis=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/ParMetis-dev-p3.tar.gz > > --with-debugging=no > > --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t > > --with-mpiexec=mpirun_rsh --with-cuda=1 > > --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ > > --with-cusp-dir=/opt/apps/cuda/4.0/cuda/ > > --with-thrust-dir=/opt/apps/cuda/4.0/cuda/ --COPTFLAGS=-xW > > --CXXOPTFLAGS=-xW --FOPTFLAGS=-xW > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: VecGhostGetLocalForm() line 82 in > > src/vec/vec/impls/mpi/commonmpvec.c > > [0]PETSC ERROR: zero() line 974 in > > > 
"unknowndirectory/"/work/01820/ylu/libmesh_svn01232012/include/numerics/petsc_vector.h > > application called MPI_Abort(comm=0x84000002, 62) - process 0 > > > > Best, > > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Wed Feb 8 18:20:47 2012 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Wed, 08 Feb 2012 19:20:47 -0500 Subject: [petsc-users] TS question Message-ID: <4F33115F.2030908@cims.nyu.edu> Hi, Folks -- I was wondering if there was a reference for the time-stepping method implemented by TSARKIMEX2E? Thanks! -- Boyce From jedbrown at mcs.anl.gov Wed Feb 8 19:21:23 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 9 Feb 2012 04:21:23 +0300 Subject: [petsc-users] TS question In-Reply-To: <4F33115F.2030908@cims.nyu.edu> References: <4F33115F.2030908@cims.nyu.edu> Message-ID: On Thu, Feb 9, 2012 at 03:20, Boyce Griffith wrote: > I was wondering if there was a reference for the time-stepping method > implemented by TSARKIMEX2E? Emil, is there something to cite yet? Boyce, we are working on a paper with several new schemes and comparisons of methods for a few applications. How has your experience been with this method? -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Wed Feb 8 20:56:34 2012 From: recrusader at gmail.com (recrusader) Date: Wed, 8 Feb 2012 20:56:34 -0600 Subject: [petsc-users] code changes from CPU to GPU In-Reply-To: References: Message-ID: Dear Barry, I just found your discussion about this problem. Will you add relevant functions for it recently? Thanks a lot. Best, Yujie On Wed, Feb 8, 2012 at 2:58 PM, Barry Smith wrote: > > The "ghosted" forms of vectors is not supported currently with GPUs. > > Barry > > On Feb 8, 2012, at 2:53 PM, recrusader wrote: > > > Dear PETSc developers, > > > > I have FEM codes. It works very well in CPU computation. > > Now, I add '-vec_type seqcusp -mat_type aijcusp' when running the > > codes, I met the following errors. My question that whether there is > > an example to demonstrate how to revise the codes for the conversion > > from CPU to GPU. > > > > In addition, 'seqcusp' and 'seqaijcusp' are used when vec and mat are > > saved in one GPU card? > > > > Thank you very much. > > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: Invalid argument! > > [0]PETSC ERROR: Vector type seqcusp does not have local representation! > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 > > 10:28:33 CDT 2011 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: > > /work/01820/ylu/libmesh_svn01232012/examples/myproj_sp1/myproj-opt on > > a westmere- named c300-205.ls4.tacc.utexas.edu by ylu Wed Feb 8 > > 14:34:46 2012 > > [0]PETSC ERROR: Libraries linked from > > /opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/westmere-cuda/lib > > [0]PETSC ERROR: Configure run at Fri Dec 16 11:27:43 2011 > > [0]PETSC ERROR: Configure options --with-x=0 -with-pic > > > --with-external-packages-dir=/var/tmp/petsc-3.2-buildroot//opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/externalpackages-cuda > > --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 > > --with-scalar-type=real --with-dynamic-loading=0 > > --with-shared-libraries=0 --with-chaco=1 > > > --download-chaco=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/Chaco-2.2.tar.gz > > --with-spai=1 > --download-spai=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spai_3.0-mar-06.tar.gz > > --with-hypre=1 > --download-hypre=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/hypre-2.6.0b.tar.gz > > --with-mumps=1 > --download-mumps=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/MUMPS_4.9.2.tar.gz > > --with-scalapack=1 > > > --download-scalapack=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/scalapack.tgz > > --with-blacs=1 > --download-blacs=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/blacs-dev.tar.gz > > --with-spooles=1 > > > --download-spooles=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spooles-2.2-dec-2008.tar.gz > > --with-superlu=1 > > > --download-superlu=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_4.1-December_20_2010.tar.gz > > --with-superlu_dist=1 > > > --download-superlu_dist=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_DIST_2.5-December_21_2010.tar.gz > > --with-parmetis=1 > > > --download-parmetis=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/ParMetis-dev-p3.tar.gz > > --with-debugging=no > > --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t > > --with-mpiexec=mpirun_rsh --with-cuda=1 > > --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ > > --with-cusp-dir=/opt/apps/cuda/4.0/cuda/ > > --with-thrust-dir=/opt/apps/cuda/4.0/cuda/ --COPTFLAGS=-xW > > --CXXOPTFLAGS=-xW --FOPTFLAGS=-xW > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: VecGhostGetLocalForm() line 82 in > > src/vec/vec/impls/mpi/commonmpvec.c > > [0]PETSC ERROR: zero() line 974 in > > > "unknowndirectory/"/work/01820/ylu/libmesh_svn01232012/include/numerics/petsc_vector.h > > application called MPI_Abort(comm=0x84000002, 62) - process 0 > > > > Best, > > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 8 21:06:33 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 8 Feb 2012 21:06:33 -0600 Subject: [petsc-users] code changes from CPU to GPU In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 8:56 PM, recrusader wrote: > Dear Barry, > > I just found your discussion about this problem. > Will you add relevant functions for it recently? > No Matt > Thanks a lot. > > Best, > Yujie > > On Wed, Feb 8, 2012 at 2:58 PM, Barry Smith wrote: > >> >> The "ghosted" forms of vectors is not supported currently with GPUs. >> >> Barry >> >> On Feb 8, 2012, at 2:53 PM, recrusader wrote: >> >> > Dear PETSc developers, >> > >> > I have FEM codes. It works very well in CPU computation. 
>> > Now, I add '-vec_type seqcusp -mat_type aijcusp' when running the >> > codes, I met the following errors. My question that whether there is >> > an example to demonstrate how to revise the codes for the conversion >> > from CPU to GPU. >> > >> > In addition, 'seqcusp' and 'seqaijcusp' are used when vec and mat are >> > saved in one GPU card? >> > >> > Thank you very much. >> > >> > [0]PETSC ERROR: --------------------- Error Message >> > ------------------------------------ >> > [0]PETSC ERROR: Invalid argument! >> > [0]PETSC ERROR: Vector type seqcusp does not have local representation! >> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 3, Fri Sep 30 >> > 10:28:33 CDT 2011 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. >> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: >> > /work/01820/ylu/libmesh_svn01232012/examples/myproj_sp1/myproj-opt on >> > a westmere- named c300-205.ls4.tacc.utexas.edu by ylu Wed Feb 8 >> > 14:34:46 2012 >> > [0]PETSC ERROR: Libraries linked from >> > /opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/westmere-cuda/lib >> > [0]PETSC ERROR: Configure run at Fri Dec 16 11:27:43 2011 >> > [0]PETSC ERROR: Configure options --with-x=0 -with-pic >> > >> --with-external-packages-dir=/var/tmp/petsc-3.2-buildroot//opt/apps/intel11_1/mvapich2_1_6/petsc/3.2/externalpackages-cuda >> > --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 >> > --with-scalar-type=real --with-dynamic-loading=0 >> > --with-shared-libraries=0 --with-chaco=1 >> > >> --download-chaco=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/Chaco-2.2.tar.gz >> > --with-spai=1 >> --download-spai=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spai_3.0-mar-06.tar.gz >> > --with-hypre=1 >> --download-hypre=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/hypre-2.6.0b.tar.gz >> > --with-mumps=1 >> --download-mumps=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/MUMPS_4.9.2.tar.gz >> > --with-scalapack=1 >> > >> --download-scalapack=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/scalapack.tgz >> > --with-blacs=1 >> --download-blacs=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/blacs-dev.tar.gz >> > --with-spooles=1 >> > >> --download-spooles=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/spooles-2.2-dec-2008.tar.gz >> > --with-superlu=1 >> > >> --download-superlu=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_4.1-December_20_2010.tar.gz >> > --with-superlu_dist=1 >> > >> --download-superlu_dist=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_DIST_2.5-December_21_2010.tar.gz >> > --with-parmetis=1 >> > >> --download-parmetis=/home1/0000/build/rpms/SOURCES/petsc-externalpackages/ParMetis-dev-p3.tar.gz >> > --with-debugging=no >> > --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t >> > --with-mpiexec=mpirun_rsh --with-cuda=1 >> > --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ >> > --with-cusp-dir=/opt/apps/cuda/4.0/cuda/ >> > --with-thrust-dir=/opt/apps/cuda/4.0/cuda/ --COPTFLAGS=-xW >> > --CXXOPTFLAGS=-xW --FOPTFLAGS=-xW >> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: VecGhostGetLocalForm() line 82 in >> > 
src/vec/vec/impls/mpi/commonmpvec.c >> > [0]PETSC ERROR: zero() line 974 in >> > >> "unknowndirectory/"/work/01820/ylu/libmesh_svn01232012/include/numerics/petsc_vector.h >> > application called MPI_Abort(comm=0x84000002, 62) - process 0 >> > >> > Best, >> > Yujie >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From buyong.huier at gmail.com Thu Feb 9 08:09:20 2012 From: buyong.huier at gmail.com (Hui Zhang) Date: Thu, 9 Feb 2012 15:09:20 +0100 Subject: [petsc-users] Is sparsity of Vec exploited in MatMult? Message-ID: <8F08F2A4-8AD5-434C-830A-C546B78B2CAA@hotmail.com> Hi, Is sparsity of Vec exploited in MatMult? I'm implementing some domain decomposition methods. The residual is non-zero or significant only at very few nodes. Does MatMult knows that most entries of the Vec are zeros? Is there a threshold to regard an value as zero? Thanks, Hui From bsmith at mcs.anl.gov Thu Feb 9 08:17:33 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 9 Feb 2012 08:17:33 -0600 Subject: [petsc-users] Is sparsity of Vec exploited in MatMult? In-Reply-To: <8F08F2A4-8AD5-434C-830A-C546B78B2CAA@hotmail.com> References: <8F08F2A4-8AD5-434C-830A-C546B78B2CAA@hotmail.com> Message-ID: <32FD17A9-9823-4F1E-8C6F-50148E362623@mcs.anl.gov> There is no concept of sparsity in vectors in PETSc. If much of your Vec is zero entries then it is expected that you define subproblems that work with only the relevant part of the vectors in your algorithms and not expect it to come "for free" from the usual PETSc operations. Barry On Feb 9, 2012, at 8:09 AM, Hui Zhang wrote: > Hi, > > Is sparsity of Vec exploited in MatMult? I'm implementing some domain decomposition > methods. The residual is non-zero or significant only at very few nodes. Does > MatMult knows that most entries of the Vec are zeros? Is there a threshold to > regard an value as zero? > > Thanks, > Hui From knepley at gmail.com Thu Feb 9 08:26:00 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Feb 2012 08:26:00 -0600 Subject: [petsc-users] Is sparsity of Vec exploited in MatMult? In-Reply-To: <8F08F2A4-8AD5-434C-830A-C546B78B2CAA@hotmail.com> References: <8F08F2A4-8AD5-434C-830A-C546B78B2CAA@hotmail.com> Message-ID: On Thu, Feb 9, 2012 at 8:09 AM, Hui Zhang wrote: > Hi, > > Is sparsity of Vec exploited in MatMult? I'm implementing some domain > decomposition > methods. The residual is non-zero or significant only at very few nodes. > Does > MatMult knows that most entries of the Vec are zeros? Is there a threshold > to > regard an value as zero? > No. Matt > Thanks, > Hui -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Feb 9 17:10:57 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 9 Feb 2012 15:10:57 -0800 Subject: [petsc-users] VecScatter Question Message-ID: Hi guys, I'm just wondering if I understand how the VecScatter works. 
Considering (petsc 3.2-p6 manual page 53): VecScatterCreate(Vec x,IS ix,Vec y,IS iy,VecScatter *ctx); VecScatterBegin(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER FORWARD); VecScatterEnd(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER FORWARD); VecScatterDestroy(VecScatter *ctx); is the following statement correct? VecScatter looks into "ix" and "iy" index sets and `matches' the global indecies between the two to copy data from vector "x" to vector "y". For example, if "ix" maps local index "1" to global index "10", VecScatter looks inside "iy" to find a local index that is mapped to global index "10" and sends the data accordingly to the correct processor. Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Feb 9 17:17:06 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Feb 2012 17:17:06 -0600 (CST) Subject: [petsc-users] VecScatter Question In-Reply-To: References: Message-ID: On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: > Hi guys, > > I'm just wondering if I understand how the VecScatter works. Considering > (petsc 3.2-p6 manual page 53): > > VecScatterCreate(Vec x,IS ix,Vec y,IS iy,VecScatter *ctx); > VecScatterBegin(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER FORWARD); > VecScatterEnd(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER FORWARD); > VecScatterDestroy(VecScatter *ctx); > > is the following statement correct? > > VecScatter looks into "ix" and "iy" index sets and `matches' the global > indecies between the two to copy data from vector "x" to vector "y". For > example, if "ix" maps local index "1" to global index "10", VecScatter > looks inside "iy" to find a local index that is mapped to global index "10" > and sends the data accordingly to the correct processor. nope - it means - if you have x[10],y[10]: ix = {1,5,9} iy = {0,2,1} Then you get: y[0] = x[1] y[2] = x[5] y[1] = x[9] [all numbers above are global indices] Satish > > Thanks, > Mohammad > From mirzadeh at gmail.com Thu Feb 9 17:29:16 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 9 Feb 2012 15:29:16 -0800 Subject: [petsc-users] VecScatter Question In-Reply-To: References: Message-ID: So this actually means it somehow matches the local indecies? What i mean by that is, y[ iy[i] ] = x[ ix[i] ]. Is that why ix and iy should have the same size? Mohammad On Thu, Feb 9, 2012 at 3:17 PM, Satish Balay wrote: > On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: > > > Hi guys, > > > > I'm just wondering if I understand how the VecScatter works. Considering > > (petsc 3.2-p6 manual page 53): > > > > VecScatterCreate(Vec x,IS ix,Vec y,IS iy,VecScatter *ctx); > > VecScatterBegin(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER > FORWARD); > > VecScatterEnd(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER FORWARD); > > VecScatterDestroy(VecScatter *ctx); > > > > is the following statement correct? > > > > VecScatter looks into "ix" and "iy" index sets and `matches' the global > > indecies between the two to copy data from vector "x" to vector "y". For > > example, if "ix" maps local index "1" to global index "10", VecScatter > > looks inside "iy" to find a local index that is mapped to global index > "10" > > and sends the data accordingly to the correct processor. 
> > nope - it means - if you have x[10],y[10]: > > ix = {1,5,9} > iy = {0,2,1} > > > Then you get: > y[0] = x[1] > y[2] = x[5] > y[1] = x[9] > > [all numbers above are global indices] > > Satish > > > > > Thanks, > > Mohammad > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Feb 9 17:42:58 2012 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Feb 2012 17:42:58 -0600 (CST) Subject: [petsc-users] VecScatter Question In-Reply-To: References: Message-ID: On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: > So this actually means it somehow matches the local indices? Hm - no local indices here. > What i mean by that is, > > y[ iy[i] ] = x[ ix[i] ]. Sure - with global indices. > > Is that why ix and iy should have the same size? When you specify data movement - you specify both source and destination for each element that is to be moved. If you are moving n elements - you have n sources, and n destination values - hence ix[n], iy[n] Satish > > Mohammad > > On Thu, Feb 9, 2012 at 3:17 PM, Satish Balay wrote: > > > On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: > > > > > Hi guys, > > > > > > I'm just wondering if I understand how the VecScatter works. Considering > > > (petsc 3.2-p6 manual page 53): > > > > > > VecScatterCreate(Vec x,IS ix,Vec y,IS iy,VecScatter *ctx); > > > VecScatterBegin(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER > > FORWARD); > > > VecScatterEnd(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER FORWARD); > > > VecScatterDestroy(VecScatter *ctx); > > > > > > is the following statement correct? > > > > > > VecScatter looks into "ix" and "iy" index sets and `matches' the global > > > indecies between the two to copy data from vector "x" to vector "y". For > > > example, if "ix" maps local index "1" to global index "10", VecScatter > > > looks inside "iy" to find a local index that is mapped to global index > > "10" > > > and sends the data accordingly to the correct processor. > > > > nope - it means - if you have x[10],y[10]: > > > > ix = {1,5,9} > > iy = {0,2,1} > > > > > > Then you get: > > y[0] = x[1] > > y[2] = x[5] > > y[1] = x[9] > > > > [all numbers above are global indices] > > > > Satish > > > > > > > > Thanks, > > > Mohammad > > > > > > > > From mirzadeh at gmail.com Thu Feb 9 17:52:53 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 9 Feb 2012 15:52:53 -0800 Subject: [petsc-users] VecScatter Question In-Reply-To: References: Message-ID: Thanks Satish. That all makes sense. Truth is I am trying to move data from a parallel vector to a serial one on rank 0. I make an index set, in parallel, with stride 1 and use it for both source and destination and i believe that should do the job ... but it does not. Somehow it seems that the values from other processors do not get inserted in the serial vector on rank 0! This seems it should be so simple and I'm not sure if there is a bug somewhere in my code or I'm still confused about IS and VecScatter! Mohammad On Thu, Feb 9, 2012 at 3:42 PM, Satish Balay wrote > On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: > > > So this actually means it somehow matches the local indices? > > Hm - no local indices here. > > > What i mean by that is, > > > > y[ iy[i] ] = x[ ix[i] ]. > > Sure - with global indices. > > > > > Is that why ix and iy should have the same size? > > When you specify data movement - you specify both source and > destination for each element that is to be moved. 
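As a concrete sketch of that semantics (this is not taken from any code in this thread; it assumes x and y are existing Vecs of global size 10 and that each entry to be moved is listed by exactly one process):

  IS             ix,iy;
  VecScatter     ctx;
  const PetscInt from[] = {1,5,9};  /* global indices read from x  */
  const PetscInt to[]   = {0,2,1};  /* global indices written to y */
  PetscErrorCode ierr;

  ierr = ISCreateGeneral(PETSC_COMM_WORLD,3,from,PETSC_COPY_VALUES,&ix);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD,3,to,PETSC_COPY_VALUES,&iy);CHKERRQ(ierr);
  ierr = VecScatterCreate(x,ix,y,iy,&ctx);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx,x,y,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx,x,y,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(&ctx);CHKERRQ(ierr);
  ierr = ISDestroy(&ix);CHKERRQ(ierr);
  ierr = ISDestroy(&iy);CHKERRQ(ierr);

After the scatter, y[0]=x[1], y[2]=x[5], y[1]=x[9], exactly as in the table above, regardless of which process owns which entry.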
If you are moving n > elements - you have n sources, and n destination values - hence ix[n], > iy[n] > > Satish > > > > > Mohammad > > > > On Thu, Feb 9, 2012 at 3:17 PM, Satish Balay wrote: > > > > > On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: > > > > > > > Hi guys, > > > > > > > > I'm just wondering if I understand how the VecScatter works. > Considering > > > > (petsc 3.2-p6 manual page 53): > > > > > > > > VecScatterCreate(Vec x,IS ix,Vec y,IS iy,VecScatter *ctx); > > > > VecScatterBegin(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER > > > FORWARD); > > > > VecScatterEnd(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER > FORWARD); > > > > VecScatterDestroy(VecScatter *ctx); > > > > > > > > is the following statement correct? > > > > > > > > VecScatter looks into "ix" and "iy" index sets and `matches' the > global > > > > indecies between the two to copy data from vector "x" to vector "y". > For > > > > example, if "ix" maps local index "1" to global index "10", > VecScatter > > > > looks inside "iy" to find a local index that is mapped to global > index > > > "10" > > > > and sends the data accordingly to the correct processor. > > > > > > nope - it means - if you have x[10],y[10]: > > > > > > ix = {1,5,9} > > > iy = {0,2,1} > > > > > > > > > Then you get: > > > y[0] = x[1] > > > y[2] = x[5] > > > y[1] = x[9] > > > > > > [all numbers above are global indices] > > > > > > Satish > > > > > > > > > > > Thanks, > > > > Mohammad > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Feb 9 18:07:20 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 9 Feb 2012 16:07:20 -0800 Subject: [petsc-users] VecScatter Question In-Reply-To: References: Message-ID: Alright just got it :D The IS needs to be build in serial otherwise each processor only copies the portion it owns! Thanks, Mohammad On Thu, Feb 9, 2012 at 3:52 PM, Mohammad Mirzadeh wrote: > Thanks Satish. That all makes sense. > > Truth is I am trying to move data from a parallel vector to a serial one > on rank 0. I make an index set, in parallel, with stride 1 and use it for > both source and destination and i believe that should do the job ... but it > does not. Somehow it seems that the values from other processors do not get > inserted in the serial vector on rank 0! > > This seems it should be so simple and I'm not sure if there is a bug > somewhere in my code or I'm still confused about IS and VecScatter! > > Mohammad > > > On Thu, Feb 9, 2012 at 3:42 PM, Satish Balay wrote > > On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: >> >> > So this actually means it somehow matches the local indices? >> >> Hm - no local indices here. >> >> > What i mean by that is, >> > >> > y[ iy[i] ] = x[ ix[i] ]. >> >> Sure - with global indices. >> >> > >> > Is that why ix and iy should have the same size? >> >> When you specify data movement - you specify both source and >> destination for each element that is to be moved. If you are moving n >> elements - you have n sources, and n destination values - hence ix[n], >> iy[n] >> >> Satish >> >> > >> > Mohammad >> > >> > On Thu, Feb 9, 2012 at 3:17 PM, Satish Balay wrote: >> > >> > > On Thu, 9 Feb 2012, Mohammad Mirzadeh wrote: >> > > >> > > > Hi guys, >> > > > >> > > > I'm just wondering if I understand how the VecScatter works. 
>> Considering >> > > > (petsc 3.2-p6 manual page 53): >> > > > >> > > > VecScatterCreate(Vec x,IS ix,Vec y,IS iy,VecScatter *ctx); >> > > > VecScatterBegin(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER >> > > FORWARD); >> > > > VecScatterEnd(VecScatter ctx,Vec x,Vec y,INSERT VALUES,SCATTER >> FORWARD); >> > > > VecScatterDestroy(VecScatter *ctx); >> > > > >> > > > is the following statement correct? >> > > > >> > > > VecScatter looks into "ix" and "iy" index sets and `matches' the >> global >> > > > indecies between the two to copy data from vector "x" to vector >> "y". For >> > > > example, if "ix" maps local index "1" to global index "10", >> VecScatter >> > > > looks inside "iy" to find a local index that is mapped to global >> index >> > > "10" >> > > > and sends the data accordingly to the correct processor. >> > > >> > > nope - it means - if you have x[10],y[10]: >> > > >> > > ix = {1,5,9} >> > > iy = {0,2,1} >> > > >> > > >> > > Then you get: >> > > y[0] = x[1] >> > > y[2] = x[5] >> > > y[1] = x[9] >> > > >> > > [all numbers above are global indices] >> > > >> > > Satish >> > > >> > > > >> > > > Thanks, >> > > > Mohammad >> > > > >> > > >> > > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fredva at ifi.uio.no Fri Feb 10 01:59:03 2012 From: fredva at ifi.uio.no (Fredrik Heffer Valdmanis) Date: Fri, 10 Feb 2012 08:59:03 +0100 Subject: [petsc-users] VECMPICUSP with ghosted vector In-Reply-To: <13AEC6C8-F05D-4275-83A3-03DBDE6A5146@mcs.anl.gov> References: <13AEC6C8-F05D-4275-83A3-03DBDE6A5146@mcs.anl.gov> Message-ID: 2012/2/6 Barry Smith > > > Fredrik, > > This question belongs on petsc-dev at mcs.anl.gov since it involves > additions/extensions to PETSc so I am moving the discussion over to there. > > We have not done the required work to have ghosted vectors work with > CUSP yet, so this will require some additions to PETSc. We can help you > with that process but since the PETSc team does not have a CUSP person > developing PETSc full time you will need to actual contribute some code but > I'll try to guide you in the right direction. > > The first observation is that ghosted vectors in PETSc are actually > handled with largely the same code as VECMPI vectors (with just no ghost > points by default) so in theory little work needs to be done to get the > functionality you need. What makes the needed changes non-trivial is the > current interface where one calls VecCreateGhost() to create the vectors. > This is one of our "easy" interfaces and it is somewhat legacy in that > there is no way to control the types of the vectors since it creates > everything about the vector in one step. Note that we have the same > issues with regard to the pthread versions of the PETSc vectors and > ghosting. > > So before we even talk about what code to change/add we need to decide > on the interface. Presumably you want to be able to decide at runtime > whether to use regular VECMPI, VECMPICUSP or VECMPIPTHREAD in your ghosted > vectors. How do we get that information in there? An additional argument to > VecCreateGhost() (ugly?)? Options database (by calling VecSetFromOptions() > ?), other ways? So for example one could have: > > VecCreateGhost(......) > VecSetFromOptions(......) > > to set the specific type cusp or pthread? What about > > VecCreateGhost(......) > VecSetType(......,VECMPICUSP); > > which as you note doesn't currently work. 
Note that the PTHREAD version > needs to do its own memory allocation so essentially has to undo much of > what VecCreateGhost() already did, is that a bad thing? > > Or do we get rid of VecCreateGhost() completely and change the model to > something like > > VecCreate() > VecSetType() > VecSetGhosted() > > or > > VecCreate() > VecSetTypeFromOptions() > VecSetGhosted() > > or even > > VecCreate() > VecSetGhosted() which will just default to regular MPI ghosted. > > this model allows a clean implementation that doesn't require undoing > previously built internals. > > > Everyone chime in with observations so we can figure out any > refactorizations needed. > > Hi Barry, Thanks for your thorough answer. I am very much interested in contributing. I am however a little pressed on time for this project, as I am doing it as part of my master thesis which is due May 1st. I am afraid this might be out of scope for me, but I am willing to give it a try. Regarding the user interface, I prefer the suggestions that lie as close to the original vector interface as possible, ie. VecCreate() VecSetType() VecSetGhosted() And that VecSetGhosted sets the type to mpi unless the type is already explicitly set by the user. I suppose others should give their opinion as well before we decide. -- Fredrik -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckontzialis at lycos.com Fri Feb 10 06:24:46 2012 From: ckontzialis at lycos.com (Konstantinos Kontzialis) Date: Fri, 10 Feb 2012 14:24:46 +0200 Subject: [petsc-users] PetscViewerASCIISynchronizedPrintf Message-ID: <4F350C8E.2090207@lycos.com> Dear all, I am trying to print data from all the processes in a single file and I use the following command ierr = PetscViewerASCIIOpen(sys.comm, "cp.dat", &viewer1); CHKERRQ(ierr); ierr = PetscViewerASCIISynchronizedAllow(viewer1, PETSC_TRUE); CHKERRQ(ierr); . . /* Some user operations */ . . . ierr = PetscViewerASCIISynchronizedPrintf(viewer1, "%le\t%le\t%le\n", cp, x[0], x[1]); CHKERRQ(ierr); ierr = PetscViewerFlush(viewer1); CHKERRQ(ierr); but at run time the code hangs. What should I do? Kostas From bsmith at mcs.anl.gov Fri Feb 10 08:17:05 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 10 Feb 2012 08:17:05 -0600 Subject: [petsc-users] PetscViewerASCIISynchronizedPrintf In-Reply-To: <4F350C8E.2090207@lycos.com> References: <4F350C8E.2090207@lycos.com> Message-ID: <9572D099-F022-464B-88A8-B4EA68263759@mcs.anl.gov> On Feb 10, 2012, at 6:24 AM, Konstantinos Kontzialis wrote: > Dear all, > > I am trying to print data from all the processes in a single file and I use the following command > > ierr = PetscViewerASCIIOpen(sys.comm, "cp.dat", &viewer1); > CHKERRQ(ierr); > > ierr = PetscViewerASCIISynchronizedAllow(viewer1, PETSC_TRUE); > CHKERRQ(ierr); > > . > . > > /* Some user operations */ > . > . > . > ierr = PetscViewerASCIISynchronizedPrintf(viewer1, > "%le\t%le\t%le\n", cp, x[0], x[1]); > CHKERRQ(ierr); > > ierr = PetscViewerFlush(viewer1); > CHKERRQ(ierr); > > but at run time the code hangs. What should I do? > > Kostas ASCII IO should only be used very sparingly. If you have lots of print statements and many processes it may appear to hang because it is taking forever to process the IO. You need to use binary IO for large amounts of data and or many processes. 
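For example, something along these lines (a sketch only; cp_vec is a hypothetical parallel Vec holding the values you want to save, it is not from your code):

  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscViewerBinaryOpen(sys.comm,"cp.dat",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
  ierr = VecView(cp_vec,viewer);CHKERRQ(ierr);      /* collective: every process in sys.comm must call it */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

The resulting binary file can then be read back with VecLoad() or post-processed outside PETSc.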
Barry From knepley at gmail.com Fri Feb 10 08:53:19 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Feb 2012 08:53:19 -0600 Subject: [petsc-users] PetscViewerASCIISynchronizedPrintf In-Reply-To: <9572D099-F022-464B-88A8-B4EA68263759@mcs.anl.gov> References: <4F350C8E.2090207@lycos.com> <9572D099-F022-464B-88A8-B4EA68263759@mcs.anl.gov> Message-ID: On Fri, Feb 10, 2012 at 8:17 AM, Barry Smith wrote: > > On Feb 10, 2012, at 6:24 AM, Konstantinos Kontzialis wrote: > > > Dear all, > > > > I am trying to print data from all the processes in a single file and I > use the following command > > > > ierr = PetscViewerASCIIOpen(sys.comm, "cp.dat", &viewer1); > > CHKERRQ(ierr); > > > > ierr = PetscViewerASCIISynchronizedAllow(viewer1, PETSC_TRUE); > > CHKERRQ(ierr); > > > > . > > . > > > > /* Some user operations */ > > . > > . > > . > > ierr = PetscViewerASCIISynchronizedPrintf(viewer1, > > "%le\t%le\t%le\n", cp, x[0], x[1]); > > CHKERRQ(ierr); > > > > ierr = PetscViewerFlush(viewer1); > > CHKERRQ(ierr); > > > > but at run time the code hangs. What should I do? > > > > Kostas > > ASCII IO should only be used very sparingly. If you have lots of print > statements and many processes it may appear to hang because it is taking > forever to process the IO. You need to use binary IO for large amounts of > data and or many processes. Also, those operations are collective, so you need every process to call that. Matt > > Barry -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 10 14:24:36 2012 From: recrusader at gmail.com (recrusader) Date: Fri, 10 Feb 2012 14:24:36 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP Message-ID: Dear PETSc developers, I use Libmesh with PETSc for my FEM simulation. The program works when using MATSEQCUSP. However, When I test MATMPICUSP with 2 GPU cards, I met the following errors: " [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: Argument out of range! [1]PETSC ERROR: New nonzero at (1,5) caused a malloc! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Development HG revision: b4086d236fb35071ea565635e24ea99d70deeaac HG Date: Sat Jan 21 21:50:39 2012 -0600 [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [1]PETSC ERROR: See docs/index.html for manual pages. 
[1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: /work/01820/ylu/libmesh_svn01232012_petscme/examples/myproj_sp1/myproj-opt on a linux named c300-202.ls4.tacc.utexas.edu by ylu Fri Feb 10 13:52:39 2012 [1]PETSC ERROR: Libraries linked from /work/01820/ylu/petsc-dev01222012_real/linux/lib [1]PETSC ERROR: Configure run at Thu Feb 9 10:42:41 2012 [1]PETSC ERROR: Configure options --with-clanguage=C++ --with-debugging=1 --with-shared-libraries=1 --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t/ --with-cuda=1 --with-cusp=1 --with-thrust=1 --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ --with-cusp-dir=/work/01820/ylu --with-valgrind-dir=/opt/apps/valgrind/3.6.0/ [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: MatSetValues_MPIAIJ() line 505 in src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: MatSetValues() line 1119 in src/mat/interface/matrix.c [1]PETSC ERROR: add_matrix() line 468 in "unknowndirectory/"src/numerics/petsc_matrix.C application called MPI_Abort(comm=0x84000004, 63) - process 1 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: New nonzero at (2,54) caused a malloc! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Development HG revision: b4086d236fb35071ea565635e24ea99d70deeaac HG Date: Sat Jan 21 21:50:39 2012 -0600 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /work/01820/ylu/libmesh_svn01232012_petscme/examples/myproj_sp1/myproj-opt on a linux named c300-202.ls4.tacc.utexas.edu by ylu Fri Feb 10 13:52:39 2012 [0]PETSC ERROR: Libraries linked from /work/01820/ylu/petsc-dev01222012_real/linux/lib [0]PETSC ERROR: Configure run at Thu Feb 9 10:42:41 2012 [0]PETSC ERROR: Configure options --with-clanguage=C++ --with-debugging=1 --with-shared-libraries=1 --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/intel11_1/mvapich2/1.6 --with-blas-lapack-dir=/opt/apps/intel/11.1/mkl/lib/em64t/ --with-cuda=1 --with-cusp=1 --with-thrust=1 --with-cuda-dir=/opt/apps/cuda/4.0/cuda/ --with-cusp-dir=/work/01820/ylu --with-valgrind-dir=/opt/apps/valgrind/3.6.0/ [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatSetValues_MPIAIJ() line 505 in src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: MatSetValues() line 1119 in src/mat/interface/matrix.c [0]PETSC ERROR: add_matrix() line 468 in "unknowndirectory/"src/numerics/petsc_matrix.C application called MPI_Abort(comm=0xC4000000, 63) - process 0 " It seem MatSetValues sets some values out of range. In parallel FEM simulation, I partition the mesh into 2 submeshes. Test is performed with 2GPU cards/2 CPU cores. Each CPU is responsible for one submesh for matrix assembly (I think MatSetValues should set the values within the submesh, otherwise it should generate errors like the above). However, I don't understand why there are errors. Is anything wrong? 
Thank you very much, Yujie From jedbrown at mcs.anl.gov Fri Feb 10 14:26:44 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 10 Feb 2012 14:26:44 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 14:24, recrusader wrote: > I use Libmesh with PETSc for my FEM simulation. > The program works when using MATSEQCUSP. However, When I test > MATMPICUSP with 2 GPU cards, I met the following errors: > " > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: Argument out of range! > [1]PETSC ERROR: New nonzero at (1,5) caused a malloc! > Can you try with plain MATMPIAIJ? I suspect you are not preallocating correctly. *Preallocation routines now automatically set MAT_NEW_NONZERO_ALLOCATION_ERR, if you intentionally preallocate less than necessary then use MatSetOption(mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) to disable the error generation.* http://www.mcs.anl.gov/petsc/documentation/changes/dev.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 10 14:35:05 2012 From: recrusader at gmail.com (recrusader) Date: Fri, 10 Feb 2012 14:35:05 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Dear Jed, MATMPIAIJ works ;(. Best, Yujie On Fri, Feb 10, 2012 at 2:26 PM, Jed Brown wrote: > On Fri, Feb 10, 2012 at 14:24, recrusader wrote: > >> I use Libmesh with PETSc for my FEM simulation. >> The program works when using MATSEQCUSP. However, When I test >> MATMPICUSP with 2 GPU cards, I met the following errors: >> " >> [1]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [1]PETSC ERROR: Argument out of range! >> [1]PETSC ERROR: New nonzero at (1,5) caused a malloc! >> > > Can you try with plain MATMPIAIJ? I suspect you are not preallocating > correctly. > > *Preallocation routines now automatically set > MAT_NEW_NONZERO_ALLOCATION_ERR, if you intentionally preallocate less than > necessary then use > MatSetOption(mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) to disable the > error generation.* > > http://www.mcs.anl.gov/petsc/documentation/changes/dev.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 10 14:44:23 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 10 Feb 2012 14:44:23 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 14:35, recrusader wrote: > MATMPIAIJ works ;(. Can you send a reproducible test case? (Instructions for how to run a Libmesh example that would cause this problem is okay, producing it with a PETSc example would be better.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 10 14:49:14 2012 From: recrusader at gmail.com (recrusader) Date: Fri, 10 Feb 2012 14:49:14 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Since MATMPIAIJ works (I didn't change anything. Just set the vec and mat types to mpicusp and mpiaijcusp for GPU), I think the problem is likely from MatSetValues_MPIAIJ(). Which PETSc examples can test this function? I can try to use it to test. Thanks a lot, Jed. Best, Yujie On Fri, Feb 10, 2012 at 2:44 PM, Jed Brown wrote: > On Fri, Feb 10, 2012 at 14:35, recrusader wrote: > >> MATMPIAIJ works ;(. 
> > > Can you send a reproducible test case? (Instructions for how to run a > Libmesh example that would cause this problem is okay, producing it with a > PETSc example would be better.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 10 14:51:50 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 10 Feb 2012 14:51:50 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 14:49, recrusader wrote: > Since MATMPIAIJ works (I didn't change anything. Just set the vec and mat > types to mpicusp and mpiaijcusp for GPU), I think the problem is likely > from MatSetValues_MPIAIJ(). > > Which PETSc examples can test this function? Try src/ksp/ksp/examples/tutorials/ex43.c and src/snes/examples/tutorials/ex48.c -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 10 15:12:30 2012 From: recrusader at gmail.com (recrusader) Date: Fri, 10 Feb 2012 15:12:30 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Dear Jed, The first example works. However, the example uses MatSetValuesStencil() not MatSetValues(). Are they same? Thanks a lot, Yujie On Fri, Feb 10, 2012 at 2:51 PM, Jed Brown wrote: > On Fri, Feb 10, 2012 at 14:49, recrusader wrote: > >> Since MATMPIAIJ works (I didn't change anything. Just set the vec and mat >> types to mpicusp and mpiaijcusp for GPU), I think the problem is likely >> from MatSetValues_MPIAIJ(). >> >> Which PETSc examples can test this function? > > > Try src/ksp/ksp/examples/tutorials/ex43.c and > src/snes/examples/tutorials/ex48.c > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 10 15:14:15 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Feb 2012 15:14:15 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 3:12 PM, recrusader wrote: > Dear Jed, > > The first example works. However, the example uses MatSetValuesStencil() > not MatSetValues(). Are they same? > MatSetValuesStencil() calls MatSetValues(). I suspect that your MPIAIJ matrix does not have the option set to throw an error when inserting a new nonzero, and your MPICUSP matrix does. Matt > Thanks a lot, > Yujie > > On Fri, Feb 10, 2012 at 2:51 PM, Jed Brown wrote: > >> On Fri, Feb 10, 2012 at 14:49, recrusader wrote: >> >>> Since MATMPIAIJ works (I didn't change anything. Just set the vec and >>> mat types to mpicusp and mpiaijcusp for GPU), I think the problem is likely >>> from MatSetValues_MPIAIJ(). >>> >>> Which PETSc examples can test this function? >> >> >> Try src/ksp/ksp/examples/tutorials/ex43.c and >> src/snes/examples/tutorials/ex48.c >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From recrusader at gmail.com Fri Feb 10 23:10:34 2012 From: recrusader at gmail.com (recrusader) Date: Fri, 10 Feb 2012 23:10:34 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Dear Matt, I added the print codes in libmesh after creating the matrix as follows: " ierr = MatCreateMPIAIJ (libMesh::COMM_WORLD, m_local, n_local, m_global, n_global, PETSC_NULL, (int*) &n_nz[0], PETSC_NULL, (int*) &n_oz[0], &_mat); CHKERRABORT(libMesh::COMM_WORLD,ierr); MatSetOption(_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); //by Yujie std::cout<<"MatSetOption"< wrote: > On Fri, Feb 10, 2012 at 3:12 PM, recrusader wrote: > >> Dear Jed, >> >> The first example works. However, the example uses MatSetValuesStencil() >> not MatSetValues(). Are they same? >> > > MatSetValuesStencil() calls MatSetValues(). I suspect that your MPIAIJ > matrix does not have the option set to > throw an error when inserting a new nonzero, and your MPICUSP matrix does. > > Matt > > >> Thanks a lot, >> Yujie >> >> On Fri, Feb 10, 2012 at 2:51 PM, Jed Brown wrote: >> >>> On Fri, Feb 10, 2012 at 14:49, recrusader wrote: >>> >>>> Since MATMPIAIJ works (I didn't change anything. Just set the vec and >>>> mat types to mpicusp and mpiaijcusp for GPU), I think the problem is likely >>>> from MatSetValues_MPIAIJ(). >>>> >>>> Which PETSc examples can test this function? >>> >>> >>> Try src/ksp/ksp/examples/tutorials/ex43.c and >>> src/snes/examples/tutorials/ex48.c >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 10 23:30:20 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 10 Feb 2012 23:30:20 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 11:10 PM, recrusader wrote: > Dear Matt, > > I added the print codes in libmesh after creating the matrix as follows: > " ierr = MatCreateMPIAIJ (libMesh::COMM_WORLD, > m_local, n_local, > m_global, n_global, > PETSC_NULL, (int*) &n_nz[0], > PETSC_NULL, (int*) &n_oz[0], &_mat); > CHKERRABORT(libMesh::COMM_WORLD,ierr); > > MatSetOption(_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); //by > Yujie > std::cout<<"MatSetOption"< > I run the same codes in CPU and GPU modes (the same parameters except that > GPU uses '-vec_type mpicusp -mat_type mpiaijcusp'). I can find > "MatSetOption" output from both the modes. Does that mean that the codes > set the options for both the modes? > Thank you very much. > Yes, so you should have no problem with allocation errors. Partial reports like this help no one. It would be somewhat helpful to include a stack trace, to verify that after this change you see an error. If might actually enable us to find your error if you sent a small test code which failed. Thanks, Matt > Best, > Yujie > > > On Fri, Feb 10, 2012 at 3:14 PM, Matthew Knepley wrote: > >> On Fri, Feb 10, 2012 at 3:12 PM, recrusader wrote: >> >>> Dear Jed, >>> >>> The first example works. However, the example uses MatSetValuesStencil() >>> not MatSetValues(). Are they same? >>> >> >> MatSetValuesStencil() calls MatSetValues(). I suspect that your MPIAIJ >> matrix does not have the option set to >> throw an error when inserting a new nonzero, and your MPICUSP matrix does. 
>> >> Matt >> >> >>> Thanks a lot, >>> Yujie >>> >>> On Fri, Feb 10, 2012 at 2:51 PM, Jed Brown wrote: >>> >>>> On Fri, Feb 10, 2012 at 14:49, recrusader wrote: >>>> >>>>> Since MATMPIAIJ works (I didn't change anything. Just set the vec and >>>>> mat types to mpicusp and mpiaijcusp for GPU), I think the problem is likely >>>>> from MatSetValues_MPIAIJ(). >>>>> >>>>> Which PETSc examples can test this function? >>>> >>>> >>>> Try src/ksp/ksp/examples/tutorials/ex43.c and >>>> src/snes/examples/tutorials/ex48.c >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sat Feb 11 00:00:33 2012 From: recrusader at gmail.com (recrusader) Date: Sat, 11 Feb 2012 00:00:33 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Dear Matt, in http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex43.c.html Why are two matrices set to MATAIJ? if I set MATAIJCUSP, is them changed? 1468: DMCreateMatrix (da_Stokes,MATAIJ ,&A); 1469: DMCreateMatrix (da_Stokes,MATAIJ ,&B); Thanks a lot. Best, Yujie On Fri, Feb 10, 2012 at 11:30 PM, Matthew Knepley wrote: > On Fri, Feb 10, 2012 at 11:10 PM, recrusader wrote: > >> Dear Matt, >> >> I added the print codes in libmesh after creating the matrix as follows: >> " ierr = MatCreateMPIAIJ (libMesh::COMM_WORLD, >> m_local, n_local, >> m_global, n_global, >> PETSC_NULL, (int*) &n_nz[0], >> PETSC_NULL, (int*) &n_oz[0], &_mat); >> CHKERRABORT(libMesh::COMM_WORLD,ierr); >> >> MatSetOption(_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); //by >> Yujie >> std::cout<<"MatSetOption"<> >> I run the same codes in CPU and GPU modes (the same parameters except >> that GPU uses '-vec_type mpicusp -mat_type mpiaijcusp'). I can find >> "MatSetOption" output from both the modes. Does that mean that the codes >> set the options for both the modes? >> Thank you very much. >> > > Yes, so you should have no problem with allocation errors. Partial reports > like this help no one. It would be > somewhat helpful to include a stack trace, to verify that after this > change you see an error. If might actually > enable us to find your error if you sent a small test code which failed. > > Thanks, > > Matt > > >> Best, >> Yujie >> >> >> On Fri, Feb 10, 2012 at 3:14 PM, Matthew Knepley wrote: >> >>> On Fri, Feb 10, 2012 at 3:12 PM, recrusader wrote: >>> >>>> Dear Jed, >>>> >>>> The first example works. However, the example uses >>>> MatSetValuesStencil() not MatSetValues(). Are they same? >>>> >>> >>> MatSetValuesStencil() calls MatSetValues(). I suspect that your MPIAIJ >>> matrix does not have the option set to >>> throw an error when inserting a new nonzero, and your MPICUSP matrix >>> does. >>> >>> Matt >>> >>> >>>> Thanks a lot, >>>> Yujie >>>> >>>> On Fri, Feb 10, 2012 at 2:51 PM, Jed Brown wrote: >>>> >>>>> On Fri, Feb 10, 2012 at 14:49, recrusader wrote: >>>>> >>>>>> Since MATMPIAIJ works (I didn't change anything. Just set the vec and >>>>>> mat types to mpicusp and mpiaijcusp for GPU), I think the problem is likely >>>>>> from MatSetValues_MPIAIJ(). 
>>>>>> >>>>>> Which PETSc examples can test this function? >>>>> >>>>> >>>>> Try src/ksp/ksp/examples/tutorials/ex43.c and >>>>> src/snes/examples/tutorials/ex48.c >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Feb 11 04:43:41 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 11 Feb 2012 04:43:41 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Sat, Feb 11, 2012 at 00:00, recrusader wrote: > in > http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex43.c.html > > Why are two matrices set to MATAIJ? if I set MATAIJCUSP, is them changed? > > 1468: DMCreateMatrix > (da_Stokes,MATAIJ > ,&A); > > 1469: DMCreateMatrix (da_Stokes,MATAIJ ,&B); > > You have to follow the code down to where those matrices are used. 1478: AssembleA_Stokes(A,da_Stokes,da_prop,properties); 1479: AssembleA_PCStokes(B,da_Stokes,da_prop,properties); This uses a "block preconditioner", the B matrix is intentionally different from the A matrix in order to handle the saddle point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Sat Feb 11 06:06:09 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Sat, 11 Feb 2012 13:06:09 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains Message-ID: About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' in a subcommunicator consisting of processors supporting the subdomain 's'? The source code of PCGASMCreateSubdomains2D() seemingly does so. Thanks, Hui From ckontzialis at lycos.com Sat Feb 11 07:27:48 2012 From: ckontzialis at lycos.com (Konstantinos Kontzialis) Date: Sat, 11 Feb 2012 15:27:48 +0200 Subject: [petsc-users] TSSetPostStep Message-ID: <4F366CD4.3030809@lycos.com> Dear all, I use the TSSetPostStep function to apply a function at the end of each time step. I code the following: ierr = TSCreate(sys.comm, &sys.ts); CHKERRQ(ierr); ierr = TSSetApplicationContext(sys.ts, &sys); CHKERRQ(ierr); ierr = TSSetProblemType(sys.ts, TS_NONLINEAR); CHKERRQ(ierr); ierr = TSSetSolution(sys.ts, sys.gsv); CHKERRQ(ierr); ierr = TSSetPostStep(sys.ts, limiter_implicit); CHKERRQ(ierr); ierr = TSSetIFunction(sys.ts, sys.gres0, base_residual_implicit, &sys); CHKERRQ(ierr); ierr = TSGetSNES(sys.ts, &sys.snes); CHKERRQ(ierr); ierr = MatCreateSNESMF(sys.snes, &sys.J); CHKERRQ(ierr); and in the function : limiter_implicit set in TSSetPostStep I just put the command: puts("out\n"). But nothing prints on the the output at the end of each step. Any suggestions? Kostas From jedbrown at mcs.anl.gov Sat Feb 11 07:42:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 11 Feb 2012 07:42:35 -0600 Subject: [petsc-users] TSSetPostStep In-Reply-To: <4F366CD4.3030809@lycos.com> References: <4F366CD4.3030809@lycos.com> Message-ID: Are you calling TSSolve() or TSStep()? What version of PETSc? 
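The post-step function is only invoked from inside the stepping loop, so nothing will print unless TSStep()/TSSolve() is actually reached. For reference, a minimal sketch of the expected callback shape (the body here is only an illustration, not your limiter):

  static PetscErrorCode limiter_implicit(TS ts)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    /* user data is available via TSGetApplicationContext(ts,...) if needed */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"out\n");CHKERRQ(ierr);  /* once per completed step */
    PetscFunctionReturn(0);
  }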
On Sat, Feb 11, 2012 at 07:27, Konstantinos Kontzialis < ckontzialis at lycos.com> wrote: > Dear all, > > I use the TSSetPostStep function to apply a function at the end of each > time step. > I code the following: > > ierr = TSCreate(sys.comm, &sys.ts); > CHKERRQ(ierr); > > ierr = TSSetApplicationContext(sys.**ts, &sys); > CHKERRQ(ierr); > > ierr = TSSetProblemType(sys.ts, TS_NONLINEAR); > CHKERRQ(ierr); > > ierr = TSSetSolution(sys.ts, sys.gsv); > CHKERRQ(ierr); > > ierr = TSSetPostStep(sys.ts, limiter_implicit); > CHKERRQ(ierr); > > ierr = TSSetIFunction(sys.ts, sys.gres0, base_residual_implicit, &sys); > CHKERRQ(ierr); > > ierr = TSGetSNES(sys.ts, &sys.snes); > CHKERRQ(ierr); > > ierr = MatCreateSNESMF(sys.snes, &sys.J); > CHKERRQ(ierr); > > and in the function : > > limiter_implicit set in TSSetPostStep I just put the command: > puts("out\n"). But nothing prints on the the output at the end of each > step. Any suggestions? > > Kostas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sat Feb 11 07:46:22 2012 From: recrusader at gmail.com (recrusader) Date: Sat, 11 Feb 2012 07:46:22 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Dear Jed, I am sorry that I cannot generate the errors using PETSc by itself. However, it easy to generate the errors with libmesh. There is an example from libmesh, that is http://libmesh.sourceforge.net/ex4.php I can run it using the command "ibrun -n 2 -o 0 ./introduction_ex4-dbg -d 3 -pc_type none -ksp_type gmres -vec_type mpicusp -mat_type mpiaijcusp -ksp_view -ksp_monitor -log_summary -malloc_debug -cuda_show_devices" (You should replace ibrun -n 2 -o 0 using mpiexec -n 2) I get the same errors. However, when using "ibrun -n 2 -o 0 ./introduction_ex4-dbg -d 3 -pc_type none -ksp_type gmres -ksp_view -ksp_monitor -log_summary". I can run it successfully. Thank you very much. Best, Yujie On Sat, Feb 11, 2012 at 4:43 AM, Jed Brown wrote: > On Sat, Feb 11, 2012 at 00:00, recrusader wrote: > >> in >> http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex43.c.html >> >> Why are two matrices set to MATAIJ? if I set MATAIJCUSP, is them changed? >> >> 1468: DMCreateMatrix >> (da_Stokes,MATAIJ >> ,&A); >> >> 1469: DMCreateMatrix (da_Stokes,MATAIJ ,&B); >> >> > You have to follow the code down to where those matrices are used. > > 1478: AssembleA_Stokes(A,da_Stokes,da_prop,properties); > 1479: AssembleA_PCStokes(B,da_Stokes,da_prop,properties); > > This uses a "block preconditioner", the B matrix is intentionally > different from the A matrix in order to handle the saddle point. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Sat Feb 11 08:36:27 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Sat, 11 Feb 2012 08:36:27 -0600 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: Message-ID: Yes, that's right. There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. It is expected that they are generated from mesh subdomains. Each IS does carry the subdomains subcomm. There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, each having the indices with the same color and the subcomm that supports that color. It is largely untested, though. You could try using it and give us feedback on any problems you encounter. Dmitry. 
On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: > About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by > multiple processors, shall I always create the arguments 'is[s]' and > 'is_local[s]' > in a subcommunicator consisting of processors supporting the subdomain 's'? > > The source code of PCGASMCreateSubdomains2D() seemingly does so. > > Thanks, > Hui > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sat Feb 11 10:52:17 2012 From: recrusader at gmail.com (recrusader) Date: Sat, 11 Feb 2012 10:52:17 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: Dear Jed, When I removed 'if (NONEW == -2) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"New nonzero at (%D,%D) caused a malloc",ROW,COL); \' from the following function in src/mat/impls/aij/seq/aij.h. It works in GPU mode. Do you have any comments? Thanks a lot. #define MatSeqXAIJReallocateAIJ(Amat,AM,BS2,NROW,ROW,COL,RMAX,AA,AI,AJ,RP,AP,AIMAX,NONEW,datatype) \ if (NROW >= RMAX) {\ Mat_SeqAIJ *Ain = (Mat_SeqAIJ*)Amat->data;\ /* there is no extra room in row, therefore enlarge */ \ PetscInt CHUNKSIZE = 15,new_nz = AI[AM] + CHUNKSIZE,len,*new_i=0,*new_j=0; \ datatype *new_a; \ \ /*if (NONEW == -2) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"New nonzero at (%D,%D) caused a malloc",ROW,COL); \ */ \ /* malloc new storage space */ \ ierr = PetscMalloc3(BS2*new_nz,datatype,&new_a,new_nz,PetscInt,&new_j,AM+1,PetscInt,&new_i);CHKERRQ(ierr);\ \ /* copy over old data into new slots */ \ for (ii=0; iia,&Ain->j,&Ain->i);CHKERRQ(ierr);\ AA = new_a; \ Ain->a = (MatScalar*) new_a; \ AI = Ain->i = new_i; AJ = Ain->j = new_j; \ Ain->singlemalloc = PETSC_TRUE; \ \ RP = AJ + AI[ROW]; AP = AA + BS2*AI[ROW]; \ RMAX = AIMAX[ROW] = AIMAX[ROW] + CHUNKSIZE; \ Ain->maxnz += BS2*CHUNKSIZE; \ Ain->reallocs++; \ } \ Best, Yujie On 2/11/12, Jed Brown wrote: > On Sat, Feb 11, 2012 at 00:00, recrusader wrote: > >> in >> http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex43.c.html >> >> Why are two matrices set to MATAIJ? if I set MATAIJCUSP, is them changed? >> >> 1468: >> DMCreateMatrix >> (da_Stokes,MATAIJ >> ,&A); >> >> 1469: DMCreateMatrix >> (da_Stokes,MATAIJ >> ,&B); >> >> > You have to follow the code down to where those matrices are used. > > 1478: AssembleA_Stokes(A,da_Stokes,da_prop,properties); > 1479: AssembleA_PCStokes(B,da_Stokes,da_prop,properties); > > This uses a "block preconditioner", the B matrix is intentionally different > from the A matrix in order to handle the saddle point. > From jedbrown at mcs.anl.gov Sat Feb 11 10:56:19 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 11 Feb 2012 10:56:19 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Sat, Feb 11, 2012 at 10:52, recrusader wrote: > When I removed 'if (NONEW == -2) > SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"New nonzero at > (%D,%D) caused a malloc",ROW,COL); \' from the following function in > src/mat/impls/aij/seq/aij.h. It works in GPU mode. Do you have any > comments? Thanks a lot. > If you want that effect, you can MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); or -mat_new_nonzero_allocation_err 0. The more serious problem is that preallocation information seems to be getting lost. If you don't fix that, assembly will be horrendously slow. Are you sure the Mat type is being set *before* the call to MatMPIAIJSetPreallocation()? 
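(For reference, one way to check whether preallocation survived is to look at the malloc count after assembly; running with -info prints the same count from MatAssemblyEnd(). A rough sketch, CheckPreallocation is a placeholder name, not a PETSc routine:

#include <petscmat.h>

/* sketch: after MatAssemblyEnd(), info.mallocs should be 0 if the preallocation
   was respected; a large count means the preallocation information was lost */
PetscErrorCode CheckPreallocation(Mat A)
{
  MatInfo        info;
  MPI_Comm       comm;
  PetscMPIInt    rank;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscObjectGetComm((PetscObject)A,&comm);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(comm,&rank);CHKERRQ(ierr);
  ierr = MatGetInfo(A,MAT_LOCAL,&info);CHKERRQ(ierr);
  ierr = PetscSynchronizedPrintf(comm,"[%d] mallocs during MatSetValues(): %g\n",rank,info.mallocs);CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(comm);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
)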
-------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sat Feb 11 10:58:20 2012 From: recrusader at gmail.com (recrusader) Date: Sat, 11 Feb 2012 10:58:20 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: As you suggest, I added the print codes in libmesh after creating the matrix and do MatSetOption as follows: " ierr = MatCreateMPIAIJ (libMesh::COMM_WORLD, m_local, n_local, m_global, n_global, PETSC_NULL, (int*) &n_nz[0], PETSC_NULL, (int*) &n_oz[0], &_mat); CHKERRABORT(libMesh::COMM_WORLD,ierr); MatSetOption(_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); //by Yujie std::cout<<"MatSetOption"< wrote: > On Sat, Feb 11, 2012 at 10:52, recrusader wrote: > >> When I removed 'if (NONEW == -2) >> SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"New nonzero at >> (%D,%D) caused a malloc",ROW,COL); \' from the following function in >> src/mat/impls/aij/seq/aij.h. It works in GPU mode. Do you have any >> comments? Thanks a lot. >> > > If you want that effect, you can > > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); > > or -mat_new_nonzero_allocation_err 0. > > The more serious problem is that preallocation information seems to be > getting lost. If you don't fix that, assembly will be horrendously slow. > Are you sure the Mat type is being set *before* the call to > MatMPIAIJSetPreallocation()? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckontzialis at lycos.com Sat Feb 11 10:58:45 2012 From: ckontzialis at lycos.com (Konstantinos Kontzialis) Date: Sat, 11 Feb 2012 18:58:45 +0200 Subject: [petsc-users] TSSetPostStep In-Reply-To: References: Message-ID: <4F369E45.4030000@lycos.com> Dear Jed, I use TSSolve() and version 3.2-p5. Kostas From jedbrown at mcs.anl.gov Sat Feb 11 11:06:58 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 11 Feb 2012 11:06:58 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: On Sat, Feb 11, 2012 at 10:58, recrusader wrote: > " ierr = MatCreateMPIAIJ (libMesh::COMM_WORLD, > > m_local, n_local, > m_global, n_global, > PETSC_NULL, (int*) &n_nz[0], > PETSC_NULL, (int*) &n_oz[0], &_mat); > CHKERRABORT(libMesh::COMM_WORLD,ierr); > > MatSetOption(_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); //by > Yujie > std::cout<<"MatSetOption"< > I run the same codes in CPU and GPU modes (the same parameters except that > GPU uses '-vec_type mpicusp -mat_type mpiaijcusp'). I can find > "MatSetOption" output from both the modes. Does that mean that the codes > set the options for both the modes? > Libmesh is calling MatSetFromOptions() after MatCreateMPIAIJ() which means the preallocation information will be lost if the type is changed. The code should be written differently ierr = MatCreate(comm,A);CHKERRQ(ierr); ierr = MatSetSizes(*A,m,n,M,N);CHKERRQ(ierr); ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr); ierr = MatSetType(*A,MATAIJ);CHKERRQ(ierr); ierr = MatSetOptionsPrefix(*A,optional_prefix);CHKERRQ(ierr); ierr = MatSetFromOptions(*A);CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(*A,d_nz,d_nnz,o_nz,o_nnz);CHKERRQ(ierr); ierr = MatSeqAIJSetPreallocation(*A,d_nz,d_nnz);CHKERRQ(ierr); I can talk to the libmesh developers about making this change. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jedbrown at mcs.anl.gov Sat Feb 11 11:09:19 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 11 Feb 2012 11:09:19 -0600 Subject: [petsc-users] TSSetPostStep In-Reply-To: <4F369E45.4030000@lycos.com> References: <4F369E45.4030000@lycos.com> Message-ID: On Sat, Feb 11, 2012 at 10:58, Konstantinos Kontzialis < ckontzialis at lycos.com> wrote: > Dear Jed, > > I use TSSolve() and version 3.2-p5. > The loop in TSSolve() is simple. Maybe you can use a debugger to find out what happened to post-step function pointer. while (!ts->reason) { ierr = TSPreStep(ts);CHKERRQ(ierr); ierr = TSStep(ts);CHKERRQ(ierr); ierr = TSPostStep(ts);CHKERRQ(ierr); ierr = TSMonitor(ts,ts->steps,ts->ptime,ts->vec_sol);CHKERRQ(ierr); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Feb 11 14:03:48 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 11 Feb 2012 14:03:48 -0600 Subject: [petsc-users] access to matnest block (0,1) ? In-Reply-To: References: Message-ID: On Fri, Feb 3, 2012 at 08:07, Klaij, Christiaan wrote: > > I should add a MatNestGetISs() so you can get out the > automatically-created > > ISs. They will have the structure described below. If you want to work > with > > the released version, you should create the ISs yourself. > > As a user, having MatNestGetISs would be most convenient. > Pushed to petsc-dev, you have to allocate arrays to hold the returned ISs. > > > A has a contigous distribution, so the ISs must respect that. Did you try > > creating the index sets described above? Please explain "something's > wrong". > > Finally I understand the relation between the nesting and the > index sets. After replacing my incorrect ISs with the correct > ones, everything's fine (no more [0]PETSC ERROR: Arguments are > incompatible! [0]PETSC ERROR: Could not find index set!). > > Thanks a lot for your help, Jed, I really appreciate it. > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Sat Feb 11 18:40:21 2012 From: recrusader at gmail.com (recrusader) Date: Sat, 11 Feb 2012 18:40:21 -0600 Subject: [petsc-users] MatSetValues() for MATMPICUSP In-Reply-To: References: Message-ID: That's great. Thank you very much, Jed :). Best, Yujie On Sat, Feb 11, 2012 at 11:06 AM, Jed Brown wrote: > On Sat, Feb 11, 2012 at 10:58, recrusader wrote: > >> " ierr = MatCreateMPIAIJ (libMesh::COMM_WORLD, >> >> m_local, n_local, >> m_global, n_global, >> PETSC_NULL, (int*) &n_nz[0], >> PETSC_NULL, (int*) &n_oz[0], &_mat); >> CHKERRABORT(libMesh::COMM_WORLD,ierr); >> >> MatSetOption(_mat,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE); //by >> Yujie >> std::cout<<"MatSetOption"<> >> I run the same codes in CPU and GPU modes (the same parameters except >> that GPU uses '-vec_type mpicusp -mat_type mpiaijcusp'). I can find >> "MatSetOption" output from both the modes. Does that mean that the codes >> set the options for both the modes? >> > > Libmesh is calling MatSetFromOptions() after MatCreateMPIAIJ() which means > the preallocation information will be lost if the type is changed. 
The code > should be written differently > > ierr = MatCreate(comm,A);CHKERRQ(ierr); > ierr = MatSetSizes(*A,m,n,M,N);CHKERRQ(ierr); > ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr); > ierr = MatSetType(*A,MATAIJ);CHKERRQ(ierr); > ierr = MatSetOptionsPrefix(*A,optional_prefix);CHKERRQ(ierr); > ierr = MatSetFromOptions(*A);CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(*A,d_nz,d_nnz,o_nz,o_nnz);CHKERRQ(ierr); > ierr = MatSeqAIJSetPreallocation(*A,d_nz,d_nnz);CHKERRQ(ierr); > > I can talk to the libmesh developers about making this change. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Mon Feb 13 18:17:27 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 13 Feb 2012 16:17:27 -0800 Subject: [petsc-users] -mat_partitioning_view Message-ID: Hi, When using -mat_partitioning_view flag, PETSc reports number of edge cuts. Is this the number of edge cuts in the graph for the new partitioning (i.e. using new global numbering)? Also, does the partitioning favors minimizing the edge cuts over having equal number of points per processor? If so, how can this be altered? Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 13 18:22:20 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 13 Feb 2012 18:22:20 -0600 Subject: [petsc-users] -mat_partitioning_view In-Reply-To: References: Message-ID: On Mon, Feb 13, 2012 at 18:17, Mohammad Mirzadeh wrote: > When using -mat_partitioning_view flag, PETSc reports number of edge cuts. > Is this the number of edge cuts in the graph for the new partitioning (i.e. > using new global numbering)? > yes > > Also, does the partitioning favors minimizing the edge cuts over having > equal number of points per processor? If so, how can this be altered? > Metis and ParMetis balance the partitions "equally" (within 1 vertex), then tries to minimize the edge cut. Assuming that is the partitioner you are using, that is what you will get. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Mon Feb 13 18:27:43 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 13 Feb 2012 16:27:43 -0800 Subject: [petsc-users] -mat_partitioning_view In-Reply-To: References: Message-ID: Thanks Jed. Then I should be doing something wrong. I'm trying a simple graph and counting the number of edge cuts manually but the numbers don't match. I'll keep looking into it. Mohammad On Mon, Feb 13, 2012 at 4:22 PM, Jed Brown wrote: > On Mon, Feb 13, 2012 at 18:17, Mohammad Mirzadeh wrote: > >> When using -mat_partitioning_view flag, PETSc reports number of edge >> cuts. Is this the number of edge cuts in the graph for the new partitioning >> (i.e. using new global numbering)? >> > > yes > > >> >> Also, does the partitioning favors minimizing the edge cuts over having >> equal number of points per processor? If so, how can this be altered? >> > > Metis and ParMetis balance the partitions "equally" (within 1 vertex), > then tries to minimize the edge cut. Assuming that is the partitioner you > are using, that is what you will get. > -------------- next part -------------- An HTML attachment was scrubbed... 
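(For reference, a bare-bones partitioning driver that matches this discussion; a sketch, not from the thread, where Adj is assumed to be the adjacency/connectivity matrix built elsewhere:

#include <petscmat.h>

/* sketch: partition the graph of Adj and return, for each local row, the rank
   it is assigned to; -mat_partitioning_view reports the edge cut of this result */
PetscErrorCode PartitionGraph(Mat Adj,IS *newproc)
{
  MatPartitioning part;
  MPI_Comm        comm;
  PetscErrorCode  ierr;

  PetscFunctionBegin;
  ierr = PetscObjectGetComm((PetscObject)Adj,&comm);CHKERRQ(ierr);
  ierr = MatPartitioningCreate(comm,&part);CHKERRQ(ierr);
  ierr = MatPartitioningSetAdjacency(part,Adj);CHKERRQ(ierr);
  ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);   /* e.g. -mat_partitioning_type parmetis */
  ierr = MatPartitioningApply(part,newproc);CHKERRQ(ierr);
  ierr = MatPartitioningDestroy(&part);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

ISPartitioningCount() and ISPartitioningToNumbering() can then be used to turn the returned IS into the new global numbering.)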
URL: From thomas.witkowski at tu-dresden.de Tue Feb 14 09:20:23 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Tue, 14 Feb 2012 16:20:23 +0100 Subject: [petsc-users] Null space of discrete Laplace with periodic boundary conditions Message-ID: <4F3A7BB7.9060504@tu-dresden.de> I discretize the Laplace operator (using finite element) on the unit square equipped with periodic boundary conditions on all four edges. Is it correct that the null space is still constant? I wounder, because when I run the same code on a sphere (so a 2D surface embedded in 3D), the resulting matrix is non-singular. I thought, that both cases should be somehow equal with respect to the null space? Thomas From jedbrown at mcs.anl.gov Tue Feb 14 09:25:58 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 14 Feb 2012 09:25:58 -0600 Subject: [petsc-users] Null space of discrete Laplace with periodic boundary conditions In-Reply-To: <4F3A7BB7.9060504@tu-dresden.de> References: <4F3A7BB7.9060504@tu-dresden.de> Message-ID: On Tue, Feb 14, 2012 at 09:20, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > I discretize the Laplace operator (using finite element) on the unit > square equipped with periodic boundary conditions on all four edges. Is it > correct that the null space is still constant? I wounder, because when I > run the same code on a sphere (so a 2D surface embedded in 3D), the > resulting matrix is non-singular. I thought, that both cases should be > somehow equal with respect to the null space? > The continuum operators for both cases have a constant null space, so if either is nonsingular in your finite element code, it's a discretization problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Tue Feb 14 16:47:25 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 14 Feb 2012 14:47:25 -0800 Subject: [petsc-users] Null space of discrete Laplace with periodic boundary conditions In-Reply-To: References: <4F3A7BB7.9060504@tu-dresden.de> Message-ID: What do you set on the sphere? If you impose a Dirichlet BC that makes it nonsingular Mohammad On Feb 14, 2012 7:27 AM, "Jed Brown" wrote: > On Tue, Feb 14, 2012 at 09:20, Thomas Witkowski < > thomas.witkowski at tu-dresden.de> wrote: > >> I discretize the Laplace operator (using finite element) on the unit >> square equipped with periodic boundary conditions on all four edges. Is it >> correct that the null space is still constant? I wounder, because when I >> run the same code on a sphere (so a 2D surface embedded in 3D), the >> resulting matrix is non-singular. I thought, that both cases should be >> somehow equal with respect to the null space? >> > > The continuum operators for both cases have a constant null space, so if > either is nonsingular in your finite element code, it's a discretization > problem. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Feb 14 16:48:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 14 Feb 2012 16:48:35 -0600 Subject: [petsc-users] Null space of discrete Laplace with periodic boundary conditions In-Reply-To: References: <4F3A7BB7.9060504@tu-dresden.de> Message-ID: There is no boundary. On Feb 14, 2012 5:47 PM, "Mohammad Mirzadeh" wrote: > What do you set on the sphere? 
If you impose a Dirichlet BC that makes it > nonsingular > > Mohammad > On Feb 14, 2012 7:27 AM, "Jed Brown" wrote: > >> On Tue, Feb 14, 2012 at 09:20, Thomas Witkowski < >> thomas.witkowski at tu-dresden.de> wrote: >> >>> I discretize the Laplace operator (using finite element) on the unit >>> square equipped with periodic boundary conditions on all four edges. Is it >>> correct that the null space is still constant? I wounder, because when I >>> run the same code on a sphere (so a 2D surface embedded in 3D), the >>> resulting matrix is non-singular. I thought, that both cases should be >>> somehow equal with respect to the null space? >>> >> >> The continuum operators for both cases have a constant null space, so if >> either is nonsingular in your finite element code, it's a discretization >> problem. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Tue Feb 14 16:51:16 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 14 Feb 2012 14:51:16 -0800 Subject: [petsc-users] Null space of discrete Laplace with periodic boundary conditions In-Reply-To: References: <4F3A7BB7.9060504@tu-dresden.de> Message-ID: Oh sorry. This is surface Laplacian . My bad On Feb 14, 2012 2:48 PM, "Jed Brown" wrote: > > There is no boundary. > > On Feb 14, 2012 5:47 PM, "Mohammad Mirzadeh" wrote: >> >> What do you set on the sphere? If you impose a Dirichlet BC that makes it nonsingular >> >> Mohammad >> >> On Feb 14, 2012 7:27 AM, "Jed Brown" wrote: >>> >>> On Tue, Feb 14, 2012 at 09:20, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: >>>> >>>> I discretize the Laplace operator (using finite element) on the unit square equipped with periodic boundary conditions on all four edges. Is it correct that the null space is still constant? I wounder, because when I run the same code on a sphere (so a 2D surface embedded in 3D), the resulting matrix is non-singular. I thought, that both cases should be somehow equal with respect to the null space? >>> >>> >>> The continuum operators for both cases have a constant null space, so if either is nonsingular in your finite element code, it's a discretization problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Wed Feb 15 03:32:32 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 15 Feb 2012 01:32:32 -0800 Subject: [petsc-users] valgrind Message-ID: Hi, This might just not be important but I'm still curious about it. When I run valgrind on this: int main (int argc, char **argv){ PetscInitialize(&argc, &argv,(char *)0,help); PetscFinalize(); return 0; } I get, ==7262== HEAP SUMMARY: ==7262== in use at exit: 300 bytes in 11 blocks ==7262== total heap usage: 203 allocs, 192 frees, 643,759 bytes allocated ==7262== ==7262== LEAK SUMMARY: ==7262== definitely lost: 60 bytes in 1 blocks ==7262== indirectly lost: 240 bytes in 10 blocks ==7262== possibly lost: 0 bytes in 0 blocks ==7262== still reachable: 0 bytes in 0 blocks ==7262== suppressed: 0 bytes in 0 blocks ==7262== Rerun with --leak-check=full to see details of leaked memory Is this normal? Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... 
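(A related check on the PETSc side, as a sketch not from the thread: PETSc's own tracing allocator reports any PETSc-allocated memory still alive at PetscFinalize() when the program is run with -malloc_dump, which helps separate real leaks in user code from the small one-time allocations by MPI or the OS runtime that valgrind sees at exit:

#include <petscvec.h>

int main(int argc,char **argv)
{
  Vec            x;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,(char*)0,0);CHKERRQ(ierr);
  ierr = VecCreateSeq(PETSC_COMM_SELF,10,&x);CHKERRQ(ierr);
  /* VecDestroy(&x) is deliberately omitted: running with -malloc_dump
     (or -malloc_debug) reports this unfreed PETSc allocation at exit */
  ierr = PetscFinalize();
  return 0;
}
)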
URL: From bibrakc at gmail.com Wed Feb 15 04:12:32 2012 From: bibrakc at gmail.com (Bibrak Qamar) Date: Wed, 15 Feb 2012 14:12:32 +0400 Subject: [petsc-users] =?windows-1252?q?PETSc_should_not_be_used_to_attemp?= =?windows-1252?q?t_to_provide_a_=93parallel_linear_solver=94_=3F?= =?windows-1252?q?=3F?= Message-ID: Hello all, I am new to PETSc so here comes a newbie question. On the first page of the manual it says that "? PETSc should not be used to attempt to provide a ?parallel linear solver? in an otherwise sequential code. Certainly all parts of a previously sequential code need not be parallelized but the matrix generation portion must be parallelized to expect any kind of reasonable performance. Do not expect to generate your matrix sequentially and then ?use PETSc? to solve the linear system in parallel. " Please help me understand what is meant by it. What if I read the matrix from a file (sequential) and distribute it to other processes and then call any linear solver, how is it going to affect the performance of the code? because reading or generating the matrix is a one time cost for one timestep, or may be that CAUTION is for multiple timesteps? Thnaks Bibrak -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmnasr at gmail.com Wed Feb 15 04:11:00 2012 From: mmnasr at gmail.com (Mohamad M. Nasr-Azadani) Date: Wed, 15 Feb 2012 02:11:00 -0800 Subject: [petsc-users] valgrind In-Reply-To: References: Message-ID: That "300 bytes" sounds familiar. :-) I have also seen this before. According to this, http://lists.mcs.anl.gov/pipermail/petsc-users/2011-December/011690.html I think it has something to do with OS and not important. Best, Mohamad Mohamad On Wed, Feb 15, 2012 at 1:32 AM, Mohammad Mirzadeh wrote: > Hi, > > This might just not be important but I'm still curious about it. When I > run valgrind on this: > > int main (int argc, char **argv){ > > PetscInitialize(&argc, &argv,(char *)0,help); > PetscFinalize(); > > > return 0; > } > > I get, > > ==7262== HEAP SUMMARY: > ==7262== in use at exit: 300 bytes in 11 blocks > ==7262== total heap usage: 203 allocs, 192 frees, 643,759 bytes allocated > ==7262== > ==7262== LEAK SUMMARY: > ==7262== definitely lost: 60 bytes in 1 blocks > ==7262== indirectly lost: 240 bytes in 10 blocks > ==7262== possibly lost: 0 bytes in 0 blocks > ==7262== still reachable: 0 bytes in 0 blocks > ==7262== suppressed: 0 bytes in 0 blocks > ==7262== Rerun with --leak-check=full to see details of leaked memory > > > Is this normal? > > Thanks, > Mohammad > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Wed Feb 15 04:19:54 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Wed, 15 Feb 2012 11:19:54 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: Message-ID: Hi Dmitry, thanks a lot! Currently, I'm not using ISColoring. Just comes another question on PCGASMSetModifySubMatrices(). The user provided function has the prototype func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); I think the coloumns from the parameter 'col' are always the same as the rows from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts index sets but not rows and columns. Has I misunderstood something? thanks, Hui On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: > Yes, that's right. > There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. 
> It is expected that they are generated from mesh subdomains. > Each IS does carry the subdomains subcomm. > > There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, > each having the indices with the same color and the subcomm that supports that color. It is > largely untested, though. You could try using it and give us feedback on any problems you encounter. > > Dmitry. > > > On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: > About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by > multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' > in a subcommunicator consisting of processors supporting the subdomain 's'? > > The source code of PCGASMCreateSubdomains2D() seemingly does so. > > Thanks, > Hui > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From longmin.ran at gmail.com Wed Feb 15 04:22:36 2012 From: longmin.ran at gmail.com (Longmin RAN) Date: Wed, 15 Feb 2012 11:22:36 +0100 Subject: [petsc-users] MatMult doesn't give the correct result Message-ID: Dear all, I tried to use MatMult on a 2*2 matrix but could not get the right answer. With MatView I saw the Matrix was correctly defined as : row 0: (0, 0.707107) row 1: (1, 0.707107) The vector to be multiplied is 0 -0.0141421 But MatMult gave the result vector as 0 -0.01 Has anyone a clue to the problem? Thank you for you help in advance. Cheers, Longmin -------------- next part -------------- An HTML attachment was scrubbed... URL: From longmin.ran at gmail.com Wed Feb 15 04:43:35 2012 From: longmin.ran at gmail.com (Longmin RAN) Date: Wed, 15 Feb 2012 11:43:35 +0100 Subject: [petsc-users] MatMult doesn't give the correct result In-Reply-To: References: Message-ID: Sorry, the result is indeed correct. Please just forget my precedent mail, and this one too. On Wed, Feb 15, 2012 at 11:22 AM, Longmin RAN wrote: > Dear all, > > I tried to use MatMult on a 2*2 matrix but could not get the right answer. > With MatView I saw the Matrix was correctly defined as : > row 0: (0, 0.707107) > row 1: (1, 0.707107) > > The vector to be multiplied is > 0 > -0.0141421 > > But MatMult gave the result vector as > 0 > -0.01 > > Has anyone a clue to the problem? Thank you for you help in advance. > > > Cheers, > > Longmin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Feb 15 06:39:14 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 15 Feb 2012 07:39:14 -0500 Subject: [petsc-users] =?utf-8?q?PETSc_should_not_be_used_to_attempt_to_pr?= =?utf-8?b?b3ZpZGUgYSDigJxwYXJhbGxlbCBsaW5lYXIgc29sdmVy4oCdID8/?= In-Reply-To: References: Message-ID: On Wed, Feb 15, 2012 at 05:12, Bibrak Qamar wrote: > "? PETSc should not be used to attempt to provide a ?parallel linear > solver? in an otherwise sequential > code. Certainly all parts of a previously sequential code need not be > parallelized but the matrix > generation portion must be parallelized to expect any kind of reasonable > performance. Do not expect > to generate your matrix sequentially and then ?use PETSc? to solve the > linear system in parallel. > " > Please help me understand what is meant by it. What if I read the matrix > from a file (sequential) and distribute it to other processes and then call > any linear solver, how is it going to affect the performance of the code? > because reading or generating the matrix is a one time cost for one > timestep, or may be that CAUTION is for multiple timesteps? 
> You should work on the assumption that reading the matrix from a file is at least 100 times more expensive than solving a linear system (more if you're running on thousands of processors). If you are going to solve with the same matrix enough times, then it's okay. Note that if you save the matrix in binary format, you can read in parallel using MatLoad(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Wed Feb 15 10:07:57 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Wed, 15 Feb 2012 17:07:57 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: Message-ID: On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: > Hi Dmitry, > > thanks a lot! Currently, I'm not using ISColoring. Just comes another question > on PCGASMSetModifySubMatrices(). The user provided function has the prototype > > func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); > > I think the coloumns from the parameter 'col' are always the same as the rows > from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts > index sets but not rows and columns. Has I misunderstood something? As I tested, the row and col are always the same. I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's in the above func()? thanks, Hui > > thanks, > Hui > > > On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: > >> Yes, that's right. >> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >> It is expected that they are generated from mesh subdomains. >> Each IS does carry the subdomains subcomm. >> >> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >> each having the indices with the same color and the subcomm that supports that color. It is >> largely untested, though. You could try using it and give us feedback on any problems you encounter. >> >> Dmitry. >> >> >> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >> in a subcommunicator consisting of processors supporting the subdomain 's'? >> >> The source code of PCGASMCreateSubdomains2D() seemingly does so. >> >> Thanks, >> Hui >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Feb 15 10:18:48 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 15 Feb 2012 10:18:48 -0600 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: Message-ID: <47C4D9F5-860B-4D5C-B7AF-4DA138EED927@mcs.anl.gov> try it On Feb 15, 2012, at 10:07 AM, Hui Zhang wrote: > On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: > >> Hi Dmitry, >> >> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >> >> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >> >> I think the coloumns from the parameter 'col' are always the same as the rows >> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >> index sets but not rows and columns. Has I misunderstood something? > > As I tested, the row and col are always the same. > > I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's > in the above func()? 
> > thanks, > Hui > >> >> thanks, >> Hui >> >> >> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >> >>> Yes, that's right. >>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>> It is expected that they are generated from mesh subdomains. >>> Each IS does carry the subdomains subcomm. >>> >>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>> each having the indices with the same color and the subcomm that supports that color. It is >>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>> >>> Dmitry. >>> >>> >>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>> >>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>> >>> Thanks, >>> Hui >>> >>> >> > From mike.hui.zhang at hotmail.com Wed Feb 15 10:26:46 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Wed, 15 Feb 2012 17:26:46 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: <47C4D9F5-860B-4D5C-B7AF-4DA138EED927@mcs.anl.gov> References: <47C4D9F5-860B-4D5C-B7AF-4DA138EED927@mcs.anl.gov> Message-ID: On Feb 15, 2012, at 5:18 PM, Barry Smith wrote: > > try it Yes, I'm trying. Just one more question: why there is no *Get* LocalToGlobalMapping so that when a Mat is with some LocalToGlobalMapping we can temporarily change with another mapping and reset back to the original mapping. > > On Feb 15, 2012, at 10:07 AM, Hui Zhang wrote: > >> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >> >>> Hi Dmitry, >>> >>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>> >>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>> >>> I think the coloumns from the parameter 'col' are always the same as the rows >>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>> index sets but not rows and columns. Has I misunderstood something? >> >> As I tested, the row and col are always the same. >> >> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >> in the above func()? >> >> thanks, >> Hui >> >>> >>> thanks, >>> Hui >>> >>> >>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>> >>>> Yes, that's right. >>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>> It is expected that they are generated from mesh subdomains. >>>> Each IS does carry the subdomains subcomm. >>>> >>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>> each having the indices with the same color and the subcomm that supports that color. It is >>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>> >>>> Dmitry. >>>> >>>> >>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>> >>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. 
>>>> >>>> Thanks, >>>> Hui >>>> >>>> >>> >> > > From bsmith at mcs.anl.gov Wed Feb 15 10:31:16 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 15 Feb 2012 10:31:16 -0600 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <47C4D9F5-860B-4D5C-B7AF-4DA138EED927@mcs.anl.gov> Message-ID: <464DBE01-B998-4ECA-8A06-60ECE6796B13@mcs.anl.gov> On Feb 15, 2012, at 10:26 AM, Hui Zhang wrote: > > On Feb 15, 2012, at 5:18 PM, Barry Smith wrote: > >> >> try it > > Yes, I'm trying. Just one more question: why there is no *Get* LocalToGlobalMapping > so that when a Mat is with some LocalToGlobalMapping we can temporarily change with > another mapping and reset back to the original mapping. There is. Perhaps it is only in petsc-dev Barry > >> >> On Feb 15, 2012, at 10:07 AM, Hui Zhang wrote: >> >>> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >>> >>>> Hi Dmitry, >>>> >>>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>>> >>>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>>> >>>> I think the coloumns from the parameter 'col' are always the same as the rows >>>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>>> index sets but not rows and columns. Has I misunderstood something? >>> >>> As I tested, the row and col are always the same. >>> >>> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >>> in the above func()? >>> >>> thanks, >>> Hui >>> >>>> >>>> thanks, >>>> Hui >>>> >>>> >>>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>>> >>>>> Yes, that's right. >>>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>>> It is expected that they are generated from mesh subdomains. >>>>> Each IS does carry the subdomains subcomm. >>>>> >>>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>>> each having the indices with the same color and the subcomm that supports that color. It is >>>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>>> >>>>> Dmitry. >>>>> >>>>> >>>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>>> >>>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>>> >>>>> Thanks, >>>>> Hui >>>>> >>>>> >>>> >>> >> >> > From sylbar.vainbot at gmail.com Wed Feb 15 17:25:44 2012 From: sylbar.vainbot at gmail.com (Sylvain Barbot) Date: Wed, 15 Feb 2012 15:25:44 -0800 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel Message-ID: Hi All, I'm implementing an elasticity multi-grid solver with Petsc with matrix shells. I am using the Red-Black Gauss-Siedel smoother. In the middle of the Red-Black passes, I need to update the ghost points between processors. To do that effectively, I'd like to update only the ghost points between the different processors, instead of the whole array. The goal is to do the most possible operations in place in local array lx, and having to update global vector x only once. The working matrix multiply smoothing operation is shown below. I'm using Petsc 3.1. 
I read about http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html, but I'm not entirely clear whether this does what I want or not. SUBROUTINE matrixsmooth(A,b,omega,flag,shift,its,level,x,ierr) USE types USE heap USE petscsys USE petscda USE petscis USE petscvec USE petscmat USE petscpc USE petscksp IMPLICIT NONE Mat, INTENT(IN) :: A Vec, INTENT(IN) :: b Vec, INTENT(INOUT) :: x PetscReal, INTENT(IN) :: omega,shift MatSORType, INTENT(IN) :: flag PetscInt, INTENT(IN) :: its, level ! iterations, multi-grid level PetscErrorCode, INTENT(OUT) :: ierr Vec :: lx,lb PetscInt :: istart,iend PetscScalar, POINTER :: px(:) ! pointer to solution array PetscScalar, POINTER :: pb(:) ! pointer to body-force array INTEGER :: rank,isize,i,k INTEGER :: sw,off,p INTEGER :: i2,i3,i2i,i3i, & i000,i0p0,i00p, & i0m0,i00m TYPE(DALOCALINFOF90) :: info CALL MPI_COMM_RANK(PETSC_COMM_WORLD,rank,ierr) CALL MPI_COMM_SIZE(PETSC_COMM_WORLD,isize,ierr) CALL VecGetOwnershipRange(x,istart,iend,ierr) ! allocate memory for local vector with ghost points CALL DACreateLocalVector(c%daul(1+level),lx,ierr) CALL VecDuplicate(lx,lb,ierr) ! retrieve forcing term b with ghost points CALL DAGlobalToLocalBegin(c%daul(1+level),b,INSERT_VALUES,lb,ierr) CALL DAGlobalToLocalEnd(c%daul(1+level),b,INSERT_VALUES,lb,ierr) ! obtain a pointer to local data CALL VecGetArrayF90(lx,px,ierr); CALL VecGetArrayF90(lb,pb,ierr); ! geometry info about local vector and padding CALL DAGetLocalInfoF90(c%daul(1+level),info,ierr); ! retrieve stencil width (ghost-point padding) CALL DAGetInfo(c%daul(1+level),PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & sw,PETSC_NULL_INTEGER, & PETSC_NULL_INTEGER,ierr); ! offset due to node topology off=MOD(info%xm+info%ym,2) ! fast smoothing (relaxation) its=number of iteration DO k=1,its ! Red-Black Gauss-Siedel scheme DO p=0,1 ! retrieve initial guess x with ghost points CALL DAGlobalToLocalBegin(c%daul(1+level),x,INSERT_VALUES,lx,ierr) CALL DAGlobalToLocalEnd(c%daul(1+level),x,INSERT_VALUES,lx,ierr) ! smoothing (relaxation) DO i=istart+p+off,iend-1,info%dof*2 i3=(i-istart)/(info%xm*info%dof) i2=(i-istart-i3*info%xm*info%dof)/info%dof i3=i3+i3i ! i3 in ( 0 .. sx3-1 ) i2=i2+i2i ! i2 in ( 0 .. sx2-1 ) i000=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof i0p0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+1)*info%dof i00p=1+((sw+i3-i3i+1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof i0m0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i-1)*info%dof i00m=1+((sw+i3-i3i-1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof px(i000+IU1)=px(i000+IU1)+ & (pb(i000+IU1)-((+(px(i0p0+IU1)-px(i000+IU1) )-(px(i000+IU1)-px(i0m0+IU1) )) & +(+(px(i00p+IU1)-px(i000+IU1) )-(px(i000+IU1)-px(i00m+IU1) )))) / (-4._8) END DO ! publish new values of x for its global vector CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) END DO END DO ! dispose of the local vectors CALL VecRestoreArrayF90(lx,px,ierr) CALL VecRestoreArrayF90(lb,pb,ierr) CALL DARestoreLocalVector(c%daul(1+level),lx,ierr) CALL DARestoreLocalVector(c%daul(1+level),lb,ierr) END SUBROUTINE matrixsmooth Could this be changed to something like: ... INITIALIZE... CALL DAGlobalToLocalBegin(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) CALL DAGlobalToLocalEnd(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) ! fast smoothing (relaxation) its=number of iteration DO k=1,its ! Red-Black Gauss-Siedel scheme DO p=0,1 ! 
smoothing (relaxation) DO i=istart+p+off,iend-1,info%dof*2 ...STENCIL OPERATION... END DO ...UPDATE GHOST POINTS... END DO END DO ! publish new values of x for its global vector CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) ...CLEAN MEMORY... Any recommendation? Best wishes, Sylvain Barbot From knepley at gmail.com Wed Feb 15 17:30:23 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 15 Feb 2012 17:30:23 -0600 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel In-Reply-To: References: Message-ID: On Wed, Feb 15, 2012 at 5:25 PM, Sylvain Barbot wrote: > Hi All, > > I'm implementing an elasticity multi-grid solver with Petsc with > matrix shells. I am using the Red-Black Gauss-Siedel smoother. In the > middle of the Red-Black passes, I need to update the ghost points > between processors. To do that effectively, I'd like to update only > the ghost points between the different processors, instead of the > whole array. The goal is to do the most possible operations in place > What does that mean? Updating ghost dofs means taking the value of that dof on the process that owns it, and copying it to all the processes which hold that as a ghost dof. This operation has no meaning for interior points. Matt > in local array lx, and having to update global vector x only once. The > working matrix multiply smoothing operation is shown below. I'm using > Petsc 3.1. I read about > > http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html > , > but I'm not entirely clear whether this does what I want or not. > > > SUBROUTINE matrixsmooth(A,b,omega,flag,shift,its,level,x,ierr) > USE types > USE heap > USE petscsys > USE petscda > USE petscis > USE petscvec > USE petscmat > USE petscpc > USE petscksp > > IMPLICIT NONE > > Mat, INTENT(IN) :: A > Vec, INTENT(IN) :: b > Vec, INTENT(INOUT) :: x > PetscReal, INTENT(IN) :: omega,shift > MatSORType, INTENT(IN) :: flag > PetscInt, INTENT(IN) :: its, level ! iterations, multi-grid level > PetscErrorCode, INTENT(OUT) :: ierr > > Vec :: lx,lb > PetscInt :: istart,iend > PetscScalar, POINTER :: px(:) ! pointer to solution array > PetscScalar, POINTER :: pb(:) ! pointer to body-force array > INTEGER :: rank,isize,i,k > INTEGER :: sw,off,p > INTEGER :: i2,i3,i2i,i3i, & > i000,i0p0,i00p, & > i0m0,i00m > TYPE(DALOCALINFOF90) :: info > > CALL MPI_COMM_RANK(PETSC_COMM_WORLD,rank,ierr) > CALL MPI_COMM_SIZE(PETSC_COMM_WORLD,isize,ierr) > > CALL VecGetOwnershipRange(x,istart,iend,ierr) > > ! allocate memory for local vector with ghost points > CALL DACreateLocalVector(c%daul(1+level),lx,ierr) > CALL VecDuplicate(lx,lb,ierr) > > ! retrieve forcing term b with ghost points > CALL DAGlobalToLocalBegin(c%daul(1+level),b,INSERT_VALUES,lb,ierr) > CALL DAGlobalToLocalEnd(c%daul(1+level),b,INSERT_VALUES,lb,ierr) > > ! obtain a pointer to local data > CALL VecGetArrayF90(lx,px,ierr); > CALL VecGetArrayF90(lb,pb,ierr); > > ! geometry info about local vector and padding > CALL DAGetLocalInfoF90(c%daul(1+level),info,ierr); > > ! retrieve stencil width (ghost-point padding) > CALL DAGetInfo(c%daul(1+level),PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > > > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, > & > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > sw,PETSC_NULL_INTEGER, & > PETSC_NULL_INTEGER,ierr); > > ! offset due to node topology > off=MOD(info%xm+info%ym,2) > > ! fast smoothing (relaxation) its=number of iteration > DO k=1,its > ! Red-Black Gauss-Siedel scheme > DO p=0,1 > ! 
retrieve initial guess x with ghost points > CALL DAGlobalToLocalBegin(c%daul(1+level),x,INSERT_VALUES,lx,ierr) > CALL DAGlobalToLocalEnd(c%daul(1+level),x,INSERT_VALUES,lx,ierr) > ! smoothing (relaxation) > DO i=istart+p+off,iend-1,info%dof*2 > i3=(i-istart)/(info%xm*info%dof) > i2=(i-istart-i3*info%xm*info%dof)/info%dof > i3=i3+i3i ! i3 in ( 0 .. sx3-1 ) > i2=i2+i2i ! i2 in ( 0 .. sx2-1 ) > > i000=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof > > i0p0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+1)*info%dof > i00p=1+((sw+i3-i3i+1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof > i0m0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i-1)*info%dof > i00m=1+((sw+i3-i3i-1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof > > px(i000+IU1)=px(i000+IU1)+ & > (pb(i000+IU1)-((+(px(i0p0+IU1)-px(i000+IU1) > )-(px(i000+IU1)-px(i0m0+IU1) )) & > > +(+(px(i00p+IU1)-px(i000+IU1) )-(px(i000+IU1)-px(i00m+IU1) )))) / > (-4._8) > END DO > > ! publish new values of x for its global vector > CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) > > END DO > END DO > > ! dispose of the local vectors > CALL VecRestoreArrayF90(lx,px,ierr) > CALL VecRestoreArrayF90(lb,pb,ierr) > CALL DARestoreLocalVector(c%daul(1+level),lx,ierr) > CALL DARestoreLocalVector(c%daul(1+level),lb,ierr) > > END SUBROUTINE matrixsmooth > > > > > > Could this be changed to something like: > > > > ... INITIALIZE... > > CALL DAGlobalToLocalBegin(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) > CALL DAGlobalToLocalEnd(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) > > ! fast smoothing (relaxation) its=number of iteration > DO k=1,its > ! Red-Black Gauss-Siedel scheme > DO p=0,1 > ! smoothing (relaxation) > DO i=istart+p+off,iend-1,info%dof*2 > ...STENCIL OPERATION... > END DO > > ...UPDATE GHOST POINTS... > > END DO > END DO > > ! publish new values of x for its global vector > CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) > > ...CLEAN MEMORY... > > > > > > Any recommendation? > > Best wishes, > Sylvain Barbot > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Feb 15 17:32:04 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 15 Feb 2012 18:32:04 -0500 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel In-Reply-To: References: Message-ID: On Wed, Feb 15, 2012 at 18:25, Sylvain Barbot wrote: > Hi All, > > I'm implementing an elasticity multi-grid solver with Petsc with > matrix shells. I am using the Red-Black Gauss-Siedel smoother. In the > middle of the Red-Black passes, I need to update the ghost points > between processors. To do that effectively, I'd like to update only > the ghost points between the different processors, instead of the > whole array. The goal is to do the most possible operations in place > in local array lx, and having to update global vector x only once. The > working matrix multiply smoothing operation is shown below. I'm using > Petsc 3.1. I read about > > http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html > , > but I'm not entirely clear whether this does what I want or not. > VecGhost does not work with a DA (pretty obvious since the structured grid layout doesn't put those points near each other). 
Writing just the updated points into the local array will cost as much as updating the whole thing, so don't worry about it. Your red-black Gauss-Seidel will have horrible cache performance anyway, so no need to sweat about the part that isn't the bottleneck. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylbar.vainbot at gmail.com Wed Feb 15 20:53:14 2012 From: sylbar.vainbot at gmail.com (Sylvain Barbot) Date: Wed, 15 Feb 2012 18:53:14 -0800 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel In-Reply-To: References: Message-ID: Hi Matt, You're right, I'm not worried about the interior points, that why I'm looking for updating the ghost points only - the interior points unaffected by the process. Jed seems to indicate that it's not a bottle neck. Jed, Would you recommend a particular step to improve cache performance? Can you be a bit more specific? Sylvain 2012/2/15 Matthew Knepley : > On Wed, Feb 15, 2012 at 5:25 PM, Sylvain Barbot > wrote: >> >> Hi All, >> >> I'm implementing an elasticity multi-grid solver with Petsc with >> matrix shells. I am using the Red-Black Gauss-Siedel smoother. In the >> middle of the Red-Black passes, I need to update the ghost points >> between processors. To do that effectively, I'd like to update only >> the ghost points between the different processors, instead of the >> whole array. The goal is to do the most possible operations in place > > > What does that mean? Updating ghost dofs means taking the value of that > dof on the process that owns it, and copying it to all the processes which > hold > that as a ghost dof. This operation has no meaning for interior points. > > ? ? Matt > >> >> in local array lx, and having to update global vector x only once. The >> working matrix multiply smoothing operation is shown below. I'm using >> Petsc 3.1. I read about >> >> http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html, >> but I'm not entirely clear whether this does what I want or not. >> >> >> SUBROUTINE matrixsmooth(A,b,omega,flag,shift,its,level,x,ierr) >> ? ?USE types >> ? ?USE heap >> ? ?USE petscsys >> ? ?USE petscda >> ? ?USE petscis >> ? ?USE petscvec >> ? ?USE petscmat >> ? ?USE petscpc >> ? ?USE petscksp >> >> ? ?IMPLICIT NONE >> >> ? ?Mat, INTENT(IN) :: A >> ? ?Vec, INTENT(IN) :: b >> ? ?Vec, INTENT(INOUT) :: x >> ? ?PetscReal, INTENT(IN) :: omega,shift >> ? ?MatSORType, INTENT(IN) :: flag >> ? ?PetscInt, INTENT(IN) :: its, level ! iterations, multi-grid level >> ? ?PetscErrorCode, INTENT(OUT) :: ierr >> >> ? ?Vec :: lx,lb >> ? ?PetscInt :: istart,iend >> ? ?PetscScalar, POINTER :: px(:) ! pointer to solution array >> ? ?PetscScalar, POINTER :: pb(:) ! pointer to body-force array >> ? ?INTEGER :: rank,isize,i,k >> ? ?INTEGER :: sw,off,p >> ? ?INTEGER :: i2,i3,i2i,i3i, & >> ? ? ? ? ? ? ? i000,i0p0,i00p, & >> ? ? ? ? ? ? ? i0m0,i00m >> ? ?TYPE(DALOCALINFOF90) :: info >> >> ? ?CALL MPI_COMM_RANK(PETSC_COMM_WORLD,rank,ierr) >> ? ?CALL MPI_COMM_SIZE(PETSC_COMM_WORLD,isize,ierr) >> >> ? ?CALL VecGetOwnershipRange(x,istart,iend,ierr) >> >> ? ?! allocate memory for local vector with ghost points >> ? ?CALL DACreateLocalVector(c%daul(1+level),lx,ierr) >> ? ?CALL VecDuplicate(lx,lb,ierr) >> >> ? ?! retrieve forcing term b with ghost points >> ? ?CALL DAGlobalToLocalBegin(c%daul(1+level),b,INSERT_VALUES,lb,ierr) >> ? ?CALL DAGlobalToLocalEnd(c%daul(1+level),b,INSERT_VALUES,lb,ierr) >> >> ? ?! obtain a pointer to local data >> ? 
?CALL VecGetArrayF90(lx,px,ierr); >> ? ?CALL VecGetArrayF90(lb,pb,ierr); >> >> ? ?! geometry info about local vector and padding >> ? ?CALL DAGetLocalInfoF90(c%daul(1+level),info,ierr); >> >> ? ?! retrieve stencil width (ghost-point padding) >> ? ?CALL DAGetInfo(c%daul(1+level),PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & >> >> >> PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, >> & >> ? ? ? ? ? ? ? ? ? PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & >> ? ? ? ? ? ? ? ? ? sw,PETSC_NULL_INTEGER, & >> ? ? ? ? ? ? ? ? ? PETSC_NULL_INTEGER,ierr); >> >> ? ?! offset due to node topology >> ? ?off=MOD(info%xm+info%ym,2) >> >> ? ?! fast smoothing (relaxation) its=number of iteration >> ? ?DO k=1,its >> ? ? ? ! Red-Black Gauss-Siedel scheme >> ? ? ? DO p=0,1 >> ? ? ? ? ?! retrieve initial guess x with ghost points >> ? ? ? ? ?CALL >> DAGlobalToLocalBegin(c%daul(1+level),x,INSERT_VALUES,lx,ierr) >> ? ? ? ? ?CALL DAGlobalToLocalEnd(c%daul(1+level),x,INSERT_VALUES,lx,ierr) >> ? ? ? ? ?! smoothing (relaxation) >> ? ? ? ? ?DO i=istart+p+off,iend-1,info%dof*2 >> ? ? ? ? ? ? i3=(i-istart)/(info%xm*info%dof) >> ? ? ? ? ? ? i2=(i-istart-i3*info%xm*info%dof)/info%dof >> ? ? ? ? ? ? i3=i3+i3i ? ?! i3 in ( 0 .. sx3-1 ) >> ? ? ? ? ? ? i2=i2+i2i ? ?! i2 in ( 0 .. sx2-1 ) >> >> ? ? ? ? ? ? i000=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof >> >> ? ? ? ? ? ? i0p0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+1)*info%dof >> ? ? ? ? ? ? i00p=1+((sw+i3-i3i+1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof >> ? ? ? ? ? ? i0m0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i-1)*info%dof >> ? ? ? ? ? ? i00m=1+((sw+i3-i3i-1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof >> >> ? ? ? ? ? ? px(i000+IU1)=px(i000+IU1)+ & >> ? ? ? ? ? ? ? ? ? ? (pb(i000+IU1)-((+(px(i0p0+IU1)-px(i000+IU1) >> )-(px(i000+IU1)-px(i0m0+IU1) )) & >> >> +(+(px(i00p+IU1)-px(i000+IU1) )-(px(i000+IU1)-px(i00m+IU1) )))) / >> (-4._8) >> ? ? ? ? ?END DO >> >> ? ? ? ? ?! publish new values of x for its global vector >> ? ? ? ? ?CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) >> >> ? ? ? END DO >> ? ?END DO >> >> ? ?! dispose of the local vectors >> ? ?CALL VecRestoreArrayF90(lx,px,ierr) >> ? ?CALL VecRestoreArrayF90(lb,pb,ierr) >> ? ?CALL DARestoreLocalVector(c%daul(1+level),lx,ierr) >> ? ?CALL DARestoreLocalVector(c%daul(1+level),lb,ierr) >> >> ?END SUBROUTINE matrixsmooth >> >> >> >> >> >> Could this be changed to something like: >> >> >> >> ? ?... INITIALIZE... >> >> ? ?CALL DAGlobalToLocalBegin(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) >> ? ?CALL DAGlobalToLocalEnd(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) >> >> ? ! fast smoothing (relaxation) its=number of iteration >> ? ?DO k=1,its >> ? ? ? ! Red-Black Gauss-Siedel scheme >> ? ? ? DO p=0,1 >> ? ? ? ? ?! smoothing (relaxation) >> ? ? ? ? ?DO i=istart+p+off,iend-1,info%dof*2 >> ? ? ? ? ? ? ...STENCIL OPERATION... >> ? ? ? ? ?END DO >> >> ? ? ? ? ?...UPDATE GHOST POINTS... >> >> ? ? ? END DO >> ? ?END DO >> >> ? ! publish new values of x for its global vector >> ? ?CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) >> >> ? ...CLEAN MEMORY... >> >> >> >> >> >> Any recommendation? >> >> Best wishes, >> Sylvain Barbot > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
> -- Norbert Wiener From knepley at gmail.com Wed Feb 15 20:55:44 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 15 Feb 2012 20:55:44 -0600 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel In-Reply-To: References: Message-ID: On Wed, Feb 15, 2012 at 8:53 PM, Sylvain Barbot wrote: > Hi Matt, > > You're right, I'm not worried about the interior points, that why I'm > looking for updating the ghost points only - the interior points > unaffected by the process. Jed seems to indicate that it's not a > bottle neck. > > Jed, > > Would you recommend a particular step to improve cache performance? > Can you be a bit more specific? > R-B has crappy cache performance since you alternately pull data but ignore half of it. Why not just use the GMG in PETSc? Its very flexible, scalable, and efficient. Matt > Sylvain > > 2012/2/15 Matthew Knepley : > > On Wed, Feb 15, 2012 at 5:25 PM, Sylvain Barbot < > sylbar.vainbot at gmail.com> > > wrote: > >> > >> Hi All, > >> > >> I'm implementing an elasticity multi-grid solver with Petsc with > >> matrix shells. I am using the Red-Black Gauss-Siedel smoother. In the > >> middle of the Red-Black passes, I need to update the ghost points > >> between processors. To do that effectively, I'd like to update only > >> the ghost points between the different processors, instead of the > >> whole array. The goal is to do the most possible operations in place > > > > > > What does that mean? Updating ghost dofs means taking the value of that > > dof on the process that owns it, and copying it to all the processes > which > > hold > > that as a ghost dof. This operation has no meaning for interior points. > > > > Matt > > > >> > >> in local array lx, and having to update global vector x only once. The > >> working matrix multiply smoothing operation is shown below. I'm using > >> Petsc 3.1. I read about > >> > >> > http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html > , > >> but I'm not entirely clear whether this does what I want or not. > >> > >> > >> SUBROUTINE matrixsmooth(A,b,omega,flag,shift,its,level,x,ierr) > >> USE types > >> USE heap > >> USE petscsys > >> USE petscda > >> USE petscis > >> USE petscvec > >> USE petscmat > >> USE petscpc > >> USE petscksp > >> > >> IMPLICIT NONE > >> > >> Mat, INTENT(IN) :: A > >> Vec, INTENT(IN) :: b > >> Vec, INTENT(INOUT) :: x > >> PetscReal, INTENT(IN) :: omega,shift > >> MatSORType, INTENT(IN) :: flag > >> PetscInt, INTENT(IN) :: its, level ! iterations, multi-grid level > >> PetscErrorCode, INTENT(OUT) :: ierr > >> > >> Vec :: lx,lb > >> PetscInt :: istart,iend > >> PetscScalar, POINTER :: px(:) ! pointer to solution array > >> PetscScalar, POINTER :: pb(:) ! pointer to body-force array > >> INTEGER :: rank,isize,i,k > >> INTEGER :: sw,off,p > >> INTEGER :: i2,i3,i2i,i3i, & > >> i000,i0p0,i00p, & > >> i0m0,i00m > >> TYPE(DALOCALINFOF90) :: info > >> > >> CALL MPI_COMM_RANK(PETSC_COMM_WORLD,rank,ierr) > >> CALL MPI_COMM_SIZE(PETSC_COMM_WORLD,isize,ierr) > >> > >> CALL VecGetOwnershipRange(x,istart,iend,ierr) > >> > >> ! allocate memory for local vector with ghost points > >> CALL DACreateLocalVector(c%daul(1+level),lx,ierr) > >> CALL VecDuplicate(lx,lb,ierr) > >> > >> ! retrieve forcing term b with ghost points > >> CALL DAGlobalToLocalBegin(c%daul(1+level),b,INSERT_VALUES,lb,ierr) > >> CALL DAGlobalToLocalEnd(c%daul(1+level),b,INSERT_VALUES,lb,ierr) > >> > >> ! 
obtain a pointer to local data > >> CALL VecGetArrayF90(lx,px,ierr); > >> CALL VecGetArrayF90(lb,pb,ierr); > >> > >> ! geometry info about local vector and padding > >> CALL DAGetLocalInfoF90(c%daul(1+level),info,ierr); > >> > >> ! retrieve stencil width (ghost-point padding) > >> CALL > DAGetInfo(c%daul(1+level),PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > >> > >> > >> > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, > >> & > >> PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > >> sw,PETSC_NULL_INTEGER, & > >> PETSC_NULL_INTEGER,ierr); > >> > >> ! offset due to node topology > >> off=MOD(info%xm+info%ym,2) > >> > >> ! fast smoothing (relaxation) its=number of iteration > >> DO k=1,its > >> ! Red-Black Gauss-Siedel scheme > >> DO p=0,1 > >> ! retrieve initial guess x with ghost points > >> CALL > >> DAGlobalToLocalBegin(c%daul(1+level),x,INSERT_VALUES,lx,ierr) > >> CALL > DAGlobalToLocalEnd(c%daul(1+level),x,INSERT_VALUES,lx,ierr) > >> ! smoothing (relaxation) > >> DO i=istart+p+off,iend-1,info%dof*2 > >> i3=(i-istart)/(info%xm*info%dof) > >> i2=(i-istart-i3*info%xm*info%dof)/info%dof > >> i3=i3+i3i ! i3 in ( 0 .. sx3-1 ) > >> i2=i2+i2i ! i2 in ( 0 .. sx2-1 ) > >> > >> i000=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof > >> > >> i0p0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+1)*info%dof > >> i00p=1+((sw+i3-i3i+1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof > >> i0m0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i-1)*info%dof > >> i00m=1+((sw+i3-i3i-1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof > >> > >> px(i000+IU1)=px(i000+IU1)+ & > >> (pb(i000+IU1)-((+(px(i0p0+IU1)-px(i000+IU1) > >> )-(px(i000+IU1)-px(i0m0+IU1) )) & > >> > >> +(+(px(i00p+IU1)-px(i000+IU1) )-(px(i000+IU1)-px(i00m+IU1) )))) / > >> (-4._8) > >> END DO > >> > >> ! publish new values of x for its global vector > >> CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) > >> > >> END DO > >> END DO > >> > >> ! dispose of the local vectors > >> CALL VecRestoreArrayF90(lx,px,ierr) > >> CALL VecRestoreArrayF90(lb,pb,ierr) > >> CALL DARestoreLocalVector(c%daul(1+level),lx,ierr) > >> CALL DARestoreLocalVector(c%daul(1+level),lb,ierr) > >> > >> END SUBROUTINE matrixsmooth > >> > >> > >> > >> > >> > >> Could this be changed to something like: > >> > >> > >> > >> ... INITIALIZE... > >> > >> CALL DAGlobalToLocalBegin(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) > >> CALL DAGlobalToLocalEnd(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr) > >> > >> ! fast smoothing (relaxation) its=number of iteration > >> DO k=1,its > >> ! Red-Black Gauss-Siedel scheme > >> DO p=0,1 > >> ! smoothing (relaxation) > >> DO i=istart+p+off,iend-1,info%dof*2 > >> ...STENCIL OPERATION... > >> END DO > >> > >> ...UPDATE GHOST POINTS... > >> > >> END DO > >> END DO > >> > >> ! publish new values of x for its global vector > >> CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr) > >> > >> ...CLEAN MEMORY... > >> > >> > >> > >> > >> > >> Any recommendation? > >> > >> Best wishes, > >> Sylvain Barbot > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
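A sketch related to the ghost-update question above (not part of the original exchange): with a DA, the ghost region of a local vector can be refreshed directly, without going through the global vector, using DALocalToLocalBegin/End (the petsc-3.1 names; later versions call this DMLocalToLocalBegin/End). The fragment below is C, but the same routines are callable from Fortran. The DA 'da' and local vector 'lx' are stand-ins for the objects in the smoother above, and it is assumed that the source and target may be the same vector; if that does not hold on a given version, a VecDuplicate()'d local vector can be used as the target.

   #include "petscda.h"

   /* Refresh only the ghost slots of the local vector lx between the
      red and black sweeps; interior (owned) values are left untouched. */
   PetscErrorCode RefreshGhosts(DA da, Vec lx)
   {
     PetscErrorCode ierr;
     ierr = DALocalToLocalBegin(da, lx, INSERT_VALUES, lx);CHKERRQ(ierr);
     ierr = DALocalToLocalEnd(da, lx, INSERT_VALUES, lx);CHKERRQ(ierr);
     return 0;
   }

With this, the global vector x would only need to be assembled once, after the smoothing iterations, via DALocalToGlobal() as in the original code.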
URL: From sylbar.vainbot at gmail.com Wed Feb 15 21:14:09 2012 From: sylbar.vainbot at gmail.com (Sylvain Barbot) Date: Wed, 15 Feb 2012 19:14:09 -0800 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel In-Reply-To: References: Message-ID: > R-B has crappy cache performance since you alternately pull data but ignore > half of it. Understood. > Why not just use the GMG in PETSc? Its very flexible, scalable, > and efficient. I tried to use Petsc's implementation of multigrid, cf thread "V-cycle multigrid with matrix shells". My concern is performance. I'd like the whole procedure to use matrix shells, or "matrix-free" matrices, because my stencil can have up to 21 off-diagonal terms. Petsc 3.1 did not allow this functionality. I do not know what Petsc 3.2 has to offer in that regard. I have a working multi-grid method that uses the low functionality of Petsc. I would obviously prefer if I could use something more advanced. Cheers, Sylvain From knepley at gmail.com Wed Feb 15 21:37:13 2012 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 15 Feb 2012 21:37:13 -0600 Subject: [petsc-users] Ghost point communication for Red-Black Gauss-Siedel In-Reply-To: References: Message-ID: On Wed, Feb 15, 2012 at 9:14 PM, Sylvain Barbot wrote: > > R-B has crappy cache performance since you alternately pull data but > ignore > > half of it. > > Understood. > > > Why not just use the GMG in PETSc? Its very flexible, scalable, > > and efficient. > > I tried to use Petsc's implementation of multigrid, cf thread "V-cycle > multigrid with matrix shells". My concern is performance. I'd like the > whole procedure to use matrix shells, or "matrix-free" matrices, > because my stencil can have up to 21 off-diagonal terms. Petsc 3.1 did > not allow this functionality. I do not know what Petsc 3.2 has to > offer in that regard. > So you want to calculate the action to avoid the memory bandwidth limit? If so, you can still use MG by specifying the coarse operators as MatShells, rather than using Galerkin. You might want to tweak the interpolator later if it is inadequate. Matt > I have a working multi-grid method that uses the low functionality of > Petsc. I would obviously prefer if I could use something more > advanced. > > Cheers, > Sylvain > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Feb 16 03:38:45 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 16 Feb 2012 01:38:45 -0800 Subject: [petsc-users] ParMETIS_V3_PartGeomKway Message-ID: Hi guys, I'm wondering if there is any implementation for ParMETIS_V3_PartGeomKway()? All I can find is the implementation for ParMETIS_V3_PartKway and I'm wondering if including the vertex positions could help me get a better partitioning? Also I have a general question. Is minimizing number of edge cuts essentially the same as minimizing communication? I understand that they are related, but communication is proportional to the number of ghost points which is not exactly equal (its actually less than) to the number of edge cuts. So then is it possible that this could actually result in larger number of ghost points, at least for some of processors? Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... 
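A side note on the partitioning question above: one way to experiment with ParMETIS (and other partitioners) on the same adjacency graph from within PETSc is the MatPartitioning interface. The sketch below is C, uses the petsc-3.1 calling convention, and assumes a parallel adjacency matrix 'adj' has already been built (for example with MatCreateMPIAdj()); error checking is omitted.

   MatPartitioning part;
   Mat             adj;      /* adjacency graph of the grid, one row per local vertex */
   IS              newproc;  /* newproc[i] = destination rank of local vertex i */

   MatPartitioningCreate(PETSC_COMM_WORLD, &part);
   MatPartitioningSetAdjacency(part, adj);
   MatPartitioningSetType(part, MATPARTITIONINGPARMETIS);
   MatPartitioningSetFromOptions(part);   /* allows -mat_partitioning_type ... at run time */
   MatPartitioningApply(part, &newproc);
   ISView(newproc, PETSC_VIEWER_STDOUT_WORLD);
   MatPartitioningDestroy(part);          /* petsc-3.2 and later take &part */

Whether a given partition is actually better can then be judged by recomputing the ghost counts, as is done later in this thread.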
URL: From mike.hui.zhang at hotmail.com Thu Feb 16 04:35:19 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Thu, 16 Feb 2012 11:35:19 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: <464DBE01-B998-4ECA-8A06-60ECE6796B13@mcs.anl.gov> References: <47C4D9F5-860B-4D5C-B7AF-4DA138EED927@mcs.anl.gov> <464DBE01-B998-4ECA-8A06-60ECE6796B13@mcs.anl.gov> Message-ID: > > On Feb 15, 2012, at 10:26 AM, Hui Zhang wrote: > >> >> On Feb 15, 2012, at 5:18 PM, Barry Smith wrote: >> >>> >>> try it >> >> Yes, I'm trying. Just one more question: why there is no *Get* LocalToGlobalMapping >> so that when a Mat is with some LocalToGlobalMapping we can temporarily change with >> another mapping and reset back to the original mapping. > > There is. Perhaps it is only in petsc-dev > > > Barry I found it. Thank you very much! By GetLocalToGlobalMapping, I also find that the submat's from PCGASM has not been SetLocalToGlobalMapping() yet when ModifySubMatrices was called by PCGASM. Everything is ok and clear now. Thanks again! > >> >>> >>> On Feb 15, 2012, at 10:07 AM, Hui Zhang wrote: >>> >>>> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >>>> >>>>> Hi Dmitry, >>>>> >>>>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>>>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>>>> >>>>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>>>> >>>>> I think the coloumns from the parameter 'col' are always the same as the rows >>>>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>>>> index sets but not rows and columns. Has I misunderstood something? >>>> >>>> As I tested, the row and col are always the same. >>>> >>>> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >>>> in the above func()? >>>> >>>> thanks, >>>> Hui >>>> >>>>> >>>>> thanks, >>>>> Hui >>>>> >>>>> >>>>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>>>> >>>>>> Yes, that's right. >>>>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>>>> It is expected that they are generated from mesh subdomains. >>>>>> Each IS does carry the subdomains subcomm. >>>>>> >>>>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>>>> each having the indices with the same color and the subcomm that supports that color. It is >>>>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>>>> >>>>>> Dmitry. >>>>>> >>>>>> >>>>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>>>> >>>>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>>>> >>>>>> Thanks, >>>>>> Hui >>>>>> >>>>>> >>>>> >>>> >>> >>> >> > > From thomas.witkowski at tu-dresden.de Thu Feb 16 05:49:28 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 16 Feb 2012 12:49:28 +0100 Subject: [petsc-users] What do with singular blocks in block matrix preconditioning? Message-ID: <4F3CED48.1050605@tu-dresden.de> I consider a 2x2 block matrix (saddle point) with the left upper block being singular due to Neumann boundary conditions. The whole block matrix is still non-singular. 
I worked on some ideas for block preconditioning, but there is always some problem with the singular block. All publications I know assume the block to be definite. There is also some work on highly singular blocks, but this is here not the case. Does some of you know papers about block preconditioners for some class of 2x2 saddle point problems, where the left upper block is assumed to be positive semi-definite? From a more practical point of view, I have the problem that, independently of a special kind of block preconditioner, one has always to solve (or to approximate the solution) a system with the singular block with an arbitrary right hand side. But in general the right hand side does not fulfill the compatibility condition of having zero mean. Is there a way out of this problem? Thomas From knepley at gmail.com Thu Feb 16 07:29:03 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Feb 2012 07:29:03 -0600 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: Message-ID: On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh wrote: > Hi guys, > > I'm wondering if there is any implementation > for ParMETIS_V3_PartGeomKway()? All I can find is the implementation for > ParMETIS_V3_PartKway and I'm wondering if including the vertex positions > could help me get a better partitioning? > As far as communication goes, it will not help. > Also I have a general question. Is minimizing number of edge cuts > essentially the same as minimizing communication? I understand that they > are related, but communication is proportional to the number of ghost > points which is not exactly equal (its actually less than) to the number of > edge cuts. So then is it possible that this could actually result in larger > number of ghost points, at least for some of processors? > It of course depends on your problem and the graph you draw, but you can always draw a graph where the edge cut is exactly your communication. Matt > Thanks, > Mohammad > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.gorman at imperial.ac.uk Thu Feb 16 07:43:47 2012 From: g.gorman at imperial.ac.uk (Gerard Gorman) Date: Thu, 16 Feb 2012 13:43:47 +0000 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: Message-ID: <4F3D0813.50304@imperial.ac.uk> Matthew Knepley emailed the following on 16/02/12 13:29: > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh > wrote: > > Hi guys, > > I'm wondering if there is any implementation > for ParMETIS_V3_PartGeomKway()? All I can find is the > implementation for ParMETIS_V3_PartKway and I'm wondering if > including the vertex positions could help me get a better > partitioning? > > > As far as communication goes, it will not help. > > > Also I have a general question. Is minimizing number of edge cuts > essentially the same as minimizing communication? I understand > that they are related, but communication is proportional to the > number of ghost points which is not exactly equal (its actually > less than) to the number of edge cuts. So then is it possible that > this could actually result in larger number of ghost points, at > least for some of processors? > > > It of course depends on your problem and the graph you draw, but you > can always draw a graph > where the edge cut is exactly your communication. 
> > It might also be interesting to look at Zoltan's hypergraph partitioning which can better balance communications - http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html Cheers Gerard From knepley at gmail.com Thu Feb 16 07:46:17 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Feb 2012 07:46:17 -0600 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: <4F3D0813.50304@imperial.ac.uk> References: <4F3D0813.50304@imperial.ac.uk> Message-ID: On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman wrote: > Matthew Knepley emailed the following on 16/02/12 13:29: > > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh > > wrote: > > > > Hi guys, > > > > I'm wondering if there is any implementation > > for ParMETIS_V3_PartGeomKway()? All I can find is the > > implementation for ParMETIS_V3_PartKway and I'm wondering if > > including the vertex positions could help me get a better > > partitioning? > > > > > > As far as communication goes, it will not help. > > > > > > Also I have a general question. Is minimizing number of edge cuts > > essentially the same as minimizing communication? I understand > > that they are related, but communication is proportional to the > > number of ghost points which is not exactly equal (its actually > > less than) to the number of edge cuts. So then is it possible that > > this could actually result in larger number of ghost points, at > > least for some of processors? > > > > > > It of course depends on your problem and the graph you draw, but you > > can always draw a graph > > where the edge cut is exactly your communication. > > > > > > > It might also be interesting to look at Zoltan's hypergraph partitioning > which can better balance communications - > http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html I would caution you that one thing people are rarely careful about is quantifying the effect of latency vs communication volume. I would bet you $10 that the lion's share of slow down is due to load imbalance/latency rather than communication volume (since comm links are so big). Matt > > Cheers > Gerard > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.gorman at imperial.ac.uk Thu Feb 16 08:07:16 2012 From: g.gorman at imperial.ac.uk (Gerard Gorman) Date: Thu, 16 Feb 2012 14:07:16 +0000 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: <4F3D0813.50304@imperial.ac.uk> Message-ID: <4F3D0D94.80202@imperial.ac.uk> Matthew Knepley emailed the following on 16/02/12 13:46: > On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman > > wrote: > > Matthew Knepley emailed the following on 16/02/12 13:29: >> On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh >> >> >> wrote: >> >> Hi guys, >> >> I'm wondering if there is any implementation for >> ParMETIS_V3_PartGeomKway()? All I can find is the implementation >> for ParMETIS_V3_PartKway and I'm wondering if including the vertex >> positions could help me get a better partitioning? >> >> >> As far as communication goes, it will not help. >> >> >> Also I have a general question. Is minimizing number of edge cuts >> essentially the same as minimizing communication? I understand that >> they are related, but communication is proportional to the number >> of ghost points which is not exactly equal (its actually less than) >> to the number of edge cuts. 
So then is it possible that this could >> actually result in larger number of ghost points, at least for some >> of processors? >> >> >> It of course depends on your problem and the graph you draw, but >> you can always draw a graph where the edge cut is exactly your >> communication. >> >> > > > It might also be interesting to look at Zoltan's hypergraph > partitioning which can better balance communications - > http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html > > > I would caution you that one thing people are rarely careful about is > quantifying the effect of latency vs communication volume. I would > bet you $10 that the lion's share of slow down is due to load > imbalance/latency rather than communication volume (since comm links > are so big). > > Matt > I believe you'd be correct in the vast majority of cases. Anyhow, as Daniel Bernstein says, "Profile. Don't speculate." Cheers Gerard From mark.adams at columbia.edu Thu Feb 16 08:21:21 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Thu, 16 Feb 2012 09:21:21 -0500 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: <4F3D0813.50304@imperial.ac.uk> Message-ID: On Feb 16, 2012, at 8:46 AM, Matthew Knepley wrote: > On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman wrote: > Matthew Knepley emailed the following on 16/02/12 13:29: > > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh > > wrote: > > > > Hi guys, > > > > I'm wondering if there is any implementation > > for ParMETIS_V3_PartGeomKway()? All I can find is the > > implementation for ParMETIS_V3_PartKway and I'm wondering if > > including the vertex positions could help me get a better > > partitioning? > > > > > > As far as communication goes, it will not help. > > > > > > Also I have a general question. Is minimizing number of edge cuts > > essentially the same as minimizing communication? I understand > > that they are related, but communication is proportional to the > > number of ghost points which is not exactly equal (its actually > > less than) to the number of edge cuts. So then is it possible that > > this could actually result in larger number of ghost points, at > > least for some of processors? > > > > > > It of course depends on your problem and the graph you draw, but you > > can always draw a graph > > where the edge cut is exactly your communication. > > > > > > > It might also be interesting to look at Zoltan's hypergraph partitioning > which can better balance communications - > http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html > > I would caution you that one thing people are rarely careful about is quantifying the effect of > latency vs communication volume. I would bet you $10 that the lion's share of slow down is > due to load imbalance/latency rather than communication volume (since comm links are so > big). > Just to add, edge cuts are a good proxy for communication volume on PDE graphs, IMO, but they are not the same and I applaud Zoltan for optimizing this directly. Both edge cuts and #ghosts are decent metrics to indirectly minimize number of partition neighbors (latency). As Matt says this is probably what you want to minimize, assuming good load balance. Furthermore you don't really want to minimize any of these. What you want to minimize is the maximum cuts/neighbors/load of an any partition -- that is load balance (max/optimal). 
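To make the max/optimal measure concrete, here is a small sketch (not from the thread); n_owned and n_ghost are hypothetical per-rank counts of owned and ghost nodes:

   #include <mpi.h>
   #include <stdio.h>

   /* Report the load-balance ratio max/optimal (1.0 is perfect) and the
      worst ghost count over all ranks. */
   void ReportBalance(MPI_Comm comm, int n_owned, int n_ghost)
   {
     int rank, size, max_owned, sum_owned, max_ghost;
     MPI_Comm_rank(comm, &rank);
     MPI_Comm_size(comm, &size);
     MPI_Allreduce(&n_owned, &max_owned, 1, MPI_INT, MPI_MAX, comm);
     MPI_Allreduce(&n_owned, &sum_owned, 1, MPI_INT, MPI_SUM, comm);
     MPI_Allreduce(&n_ghost, &max_ghost, 1, MPI_INT, MPI_MAX, comm);
     if (rank == 0)
       printf("load imbalance (max/optimal) = %g, worst ghost count = %d\n",
              (double)max_owned * size / (double)sum_owned, max_ghost);
   }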
Mark > Matt > > > Cheers > Gerard > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at gmail.com Wed Feb 15 11:46:48 2012 From: karpeev at gmail.com (Dmitry Karpeev) Date: Wed, 15 Feb 2012 09:46:48 -0800 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: Message-ID: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> You should be able to. This behavior is the same as in PCASM, except in GASM the matrices live on subcommunicators. I am in transit right now, but I can take a closer look in Friday. Dmitry On Feb 15, 2012, at 8:07, Hui Zhang wrote: > On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: > >> Hi Dmitry, >> >> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >> >> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >> >> I think the coloumns from the parameter 'col' are always the same as the rows >> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >> index sets but not rows and columns. Has I misunderstood something? > > As I tested, the row and col are always the same. > > I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's > in the above func()? > > thanks, > Hui > >> >> thanks, >> Hui >> >> >> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >> >>> Yes, that's right. >>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>> It is expected that they are generated from mesh subdomains. >>> Each IS does carry the subdomains subcomm. >>> >>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>> each having the indices with the same color and the subcomm that supports that color. It is >>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>> >>> Dmitry. >>> >>> >>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>> >>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>> >>> Thanks, >>> Hui >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vkuhlem at gmail.com Thu Feb 16 09:41:15 2012 From: vkuhlem at gmail.com (Verena Kuhlemann) Date: Thu, 16 Feb 2012 07:41:15 -0800 Subject: [petsc-users] preconditioned eigensolver in slepc Message-ID: Hi, I am trying to use LOBPCG via Slepc and I am a little confused on how to set the preconditioner and operator matrices. I have a shell matrix A and a matrix P that should be used for the preconditioning. Here is the relevant part of my code: EPSCreate(PETSC_COMM_WORLD,&eps); EPSSetOperators(eps, A, PETSC_NULL); EPSSetProblemType(eps, EPS_HEP); EPSSetType(eps, EPSBLOPEX); EPSGetST(eps, &st); STPrecondSetMatForPC(st, P); STGetKSP(st,&ksp); KSPGetPC(ksp, &pc); KSPSetOperators(ksp,A, P, DIFFERENT_NONZERO_PATTERN); //do I have to set this?? 
If I don't the program complains that a matrix is not set PCSetType(pc, PCJACOBI); EPSSetUp(eps); Is that correct or do I need to do something else? Thanks, Verena -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Feb 16 09:54:00 2012 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 16 Feb 2012 16:54:00 +0100 Subject: [petsc-users] preconditioned eigensolver in slepc In-Reply-To: References: Message-ID: <83422F1F-8632-4CF2-A262-722295AC12DD@dsic.upv.es> El 16/02/2012, a las 16:41, Verena Kuhlemann escribi?: > Hi, > > I am trying to use LOBPCG via Slepc and I am a little confused on how to set the preconditioner and operator matrices. I have a shell matrix A and a matrix P that should be used for the preconditioning. Here is the relevant part of my code: > > EPSCreate(PETSC_COMM_WORLD,&eps); > EPSSetOperators(eps, A, PETSC_NULL); > EPSSetProblemType(eps, EPS_HEP); > EPSSetType(eps, EPSBLOPEX); > EPSGetST(eps, &st); > STPrecondSetMatForPC(st, P); > STGetKSP(st,&ksp); > KSPGetPC(ksp, &pc); > KSPSetOperators(ksp,A, P, DIFFERENT_NONZERO_PATTERN); //do I have to set this?? If I don't the program complains that a matrix is not set > PCSetType(pc, PCJACOBI); > EPSSetUp(eps); > > > Is that correct or do I need to do something else? > > Thanks, > Verena It works for me like this: ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); ierr = EPSSetOperators(eps,A,PETSC_NULL);CHKERRQ(ierr); ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); ierr = EPSSetType(eps,EPSBLOPEX);CHKERRQ(ierr); ierr = EPSGetST(eps,&st);CHKERRQ(ierr); ierr = STPrecondSetMatForPC(st,P);CHKERRQ(ierr); ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); You don't need to manipulate the KSP or PC objects. If you want to check whether your matrix P is being used in Blopex, you can do ierr = PetscObjectSetName((PetscObject)P,"My precond");CHKERRQ(ierr); then run with -eps_view and you will see "Matrix Object: My precond 1 MPI processes" in the PC section. Jose From vkuhlem at gmail.com Thu Feb 16 10:16:12 2012 From: vkuhlem at gmail.com (Verena Kuhlemann) Date: Thu, 16 Feb 2012 08:16:12 -0800 Subject: [petsc-users] preconditioned eigensolver in slepc In-Reply-To: <83422F1F-8632-4CF2-A262-722295AC12DD@dsic.upv.es> References: <83422F1F-8632-4CF2-A262-722295AC12DD@dsic.upv.es> Message-ID: Thanks for the answer. When I want to set a preconditioner then I will need to do that via STGetKSP and the KSPGetPC ? In the end I want to use a shell preconditioner. On Thu, Feb 16, 2012 at 7:54 AM, Jose E. Roman wrote: > > El 16/02/2012, a las 16:41, Verena Kuhlemann escribi?: > > > Hi, > > > > I am trying to use LOBPCG via Slepc and I am a little confused on how to > set the preconditioner and operator matrices. I have a shell matrix A and a > matrix P that should be used for the preconditioning. Here is the relevant > part of my code: > > > > EPSCreate(PETSC_COMM_WORLD,&eps); > > EPSSetOperators(eps, A, PETSC_NULL); > > EPSSetProblemType(eps, EPS_HEP); > > EPSSetType(eps, EPSBLOPEX); > > EPSGetST(eps, &st); > > STPrecondSetMatForPC(st, P); > > STGetKSP(st,&ksp); > > KSPGetPC(ksp, &pc); > > KSPSetOperators(ksp,A, P, DIFFERENT_NONZERO_PATTERN); //do I have to set > this?? If I don't the program complains that a matrix is not set > > PCSetType(pc, PCJACOBI); > > EPSSetUp(eps); > > > > > > Is that correct or do I need to do something else? 
> > > > Thanks, > > Verena > > It works for me like this: > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); > ierr = EPSSetOperators(eps,A,PETSC_NULL);CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > ierr = EPSSetType(eps,EPSBLOPEX);CHKERRQ(ierr); > ierr = EPSGetST(eps,&st);CHKERRQ(ierr); > ierr = STPrecondSetMatForPC(st,P);CHKERRQ(ierr); > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > You don't need to manipulate the KSP or PC objects. > > If you want to check whether your matrix P is being used in Blopex, you > can do > ierr = PetscObjectSetName((PetscObject)P,"My precond");CHKERRQ(ierr); > > then run with -eps_view and you will see "Matrix Object: My precond > 1 MPI processes" in the PC section. > > Jose > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From coco at dmi.unict.it Thu Feb 16 12:54:21 2012 From: coco at dmi.unict.it (coco at dmi.unict.it) Date: Thu, 16 Feb 2012 19:54:21 +0100 Subject: [petsc-users] Multigrid as a preconditioner Message-ID: <20120216195421.Horde.yuFYAuph4B9PPVDdocqVeGA@mbox.dmi.unict.it> Dear list, I would like to parallelize a multigrid code by Petsc. I do not want to use the DMMG infrastructure, since it will be replaced in the next PETSc release. Therefore I preferred to use the multigrid as a preconditioner. In practice, I use the Richardson iteration, choosing the same matrix of the linear system as a preconditioner, so that I think the Richardson iteration should converge in only one iteration, and effectively it is like solving the whole linear system by the multigrid. As a first test, I tried to use a one-grid multigrid (then not a truly multigrid). I just set the coarse solver (by a standard KSPSolve), and it should be enough, because the multigrid starts already from the coarsest grid and then it does not need the smoother and the transfer operators. Unfortunately, the iteration scheme (which should be the Richardson scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong solution. On the other hand, if I solve the whole problem with the standard KSPSolve (then withouth setting the multigrid as a preconditioner ...), it converges to the right solution with a reason=2. I thought that the two methods should be the exactly the same method, and I do not understand why they provide different convergence results. 
Here is the relevant code: // Set the matrix of the linear system Mat Mcc; ierr=MatCreate(PETSC_COMM_WORLD,&Mcc); CHKERRQ(ierr); ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); ierr=MatSetSizes(Mcc,PETSC_DECIDE,PETSC_DECIDE,1000,1000); CHKERRQ(ierr); ierr=setMatrix(Mcc); //It is a routine that set the values of the matrix Mcc // Set the ksp solver with the multigrid as a preconditioner KSP ksp, KspSolver; ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetType(ksp,KSPRICHARDSON); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PCSetType(pc,PCMG);CHKERRQ(ierr); ierr = PCMGSetLevels(pc,1,&PETSC_COMM_WORLD);CHKERRQ(ierr); ierr = PCMGSetType(pc,PC_MG_MULTIPLICATIVE);CHKERRQ(ierr); ierr = PCMGGetCoarseSolve(pc,&kspCoarseSolve);CHKERRQ(ierr); ierr = KSPSetOperators(kspCoarseSolve,Mcc,Mcc,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(kspCoarseSolve,1.e-12,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,Mcc,Mcc,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-12,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSolve(ksp,RHS,U);CHKERRQ(ierr); // Solve with the standard KSPSolve KSP ksp1; ierr = KSPCreate(PETSC_COMM_WORLD,&ksp1);CHKERRQ(ierr); ierr = KSPSetOperators(ksp1,Mcc,Mcc,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp1,1.e-12/(2*nn123),PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(ierr); At the end, the Vector U and U1 are different. Thank you. Best regards, Armando From jroman at dsic.upv.es Thu Feb 16 13:02:50 2012 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 16 Feb 2012 20:02:50 +0100 Subject: [petsc-users] preconditioned eigensolver in slepc In-Reply-To: References: <83422F1F-8632-4CF2-A262-722295AC12DD@dsic.upv.es> Message-ID: <46DCD1F8-8C3F-45A6-8196-37AD0FFD06CF@dsic.upv.es> El 16/02/2012, a las 17:16, Verena Kuhlemann escribi?: > Thanks for the answer. > > When I want to set a preconditioner then I will need to do that via STGetKSP and the KSPGetPC ? > In the end I want to use a shell preconditioner. Yes, you can do that. Jose From knepley at gmail.com Thu Feb 16 13:39:11 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Feb 2012 13:39:11 -0600 Subject: [petsc-users] Multigrid as a preconditioner In-Reply-To: <20120216195421.Horde.yuFYAuph4B9PPVDdocqVeGA@mbox.dmi.unict.it> References: <20120216195421.Horde.yuFYAuph4B9PPVDdocqVeGA@mbox.dmi.unict.it> Message-ID: On Thu, Feb 16, 2012 at 12:54 PM, wrote: > Dear list, > > I would like to parallelize a multigrid code by Petsc. I do not want to > use the DMMG infrastructure, since it will be replaced in the next PETSc > release. Therefore I preferred to use the multigrid as a preconditioner. In > practice, I use the Richardson iteration, choosing the same matrix of the > linear system as a preconditioner, so that I think the Richardson iteration > should converge in only one iteration, and effectively it is like solving > the whole linear system by the multigrid. > Your understanding of the Richardson iteration is flawed. You can consult Yousef Saad's book for the standard definition and anaysis. > As a first test, I tried to use a one-grid multigrid (then not a truly > multigrid). I just set the coarse solver (by a standard KSPSolve), and it > should be enough, because the multigrid starts already from the coarsest > grid and then it does not need the smoother and the transfer operators. 
> Unfortunately, the iteration scheme (which should be the Richardson > scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong solution. > On the other hand, if I solve the whole problem with the standard KSPSolve > (then withouth setting the multigrid as a preconditioner ...), it converges > to the right solution with a reason=2. > Yes, give Richardson many more iterations, -ksp_max_it. Matt > I thought that the two methods should be the exactly the same method, and > I do not understand why they provide different convergence results. > > Here is the relevant code: > > // Set the matrix of the linear system > Mat Mcc; > ierr=MatCreate(PETSC_COMM_**WORLD,&Mcc); CHKERRQ(ierr); > ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); > ierr=MatSetSizes(Mcc,PETSC_**DECIDE,PETSC_DECIDE,1000,1000)**; > CHKERRQ(ierr); > ierr=setMatrix(Mcc); //It is a routine that set the values of the matrix > Mcc > > // Set the ksp solver with the multigrid as a preconditioner > KSP ksp, KspSolver; > ierr = KSPCreate(PETSC_COMM_WORLD,&**ksp);CHKERRQ(ierr); > ierr = KSPSetType(ksp,KSPRICHARDSON); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(**ierr); > ierr = PCSetType(pc,PCMG);CHKERRQ(**ierr); > ierr = PCMGSetLevels(pc,1,&PETSC_**COMM_WORLD);CHKERRQ(ierr); > ierr = PCMGSetType(pc,PC_MG_**MULTIPLICATIVE);CHKERRQ(ierr); > ierr = PCMGGetCoarseSolve(pc,&**kspCoarseSolve);CHKERRQ(ierr); > ierr = KSPSetOperators(**kspCoarseSolve,Mcc,Mcc,** > DIFFERENT_NONZERO_PATTERN);**CHKERRQ(ierr); > ierr = KSPSetTolerances(**kspCoarseSolve,1.e-12,PETSC_** > DEFAULT,PETSC_DEFAULT,PETSC_**DEFAULT);CHKERRQ(ierr); > ierr = KSPSetOperators(ksp,Mcc,Mcc,**DIFFERENT_NONZERO_PATTERN);** > CHKERRQ(ierr); > ierr = KSPSetTolerances(ksp,1.e-12,**PETSC_DEFAULT,PETSC_DEFAULT,** > PETSC_DEFAULT);CHKERRQ(ierr); > ierr = KSPSolve(ksp,RHS,U);CHKERRQ(**ierr); > > // Solve with the standard KSPSolve > KSP ksp1; > ierr = KSPCreate(PETSC_COMM_WORLD,&**ksp1);CHKERRQ(ierr); > ierr = KSPSetOperators(ksp1,Mcc,Mcc,**DIFFERENT_NONZERO_PATTERN);** > CHKERRQ(ierr); > ierr = KSPSetTolerances(ksp1,1.e-12/(**2*nn123),PETSC_DEFAULT,PETSC_** > DEFAULT,PETSC_DEFAULT);**CHKERRQ(ierr); > ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(**ierr); > > > At the end, the Vector U and U1 are different. > Thank you. > > Best regards, > Armando > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Feb 16 16:06:39 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 16 Feb 2012 14:06:39 -0800 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: <4F3D0813.50304@imperial.ac.uk> Message-ID: Thanks everyone for participating in the discussion. I really appreciate it. I guess I'm gonna describe the problem more in details here. So basically I use parmetis to partition my quadtree grid (we have previously talked about why i'm not using p4est right now). Please see figures for a sample amr quadtree grid. For most cases, the partition quality that I get is reasonable like the one for the smaller grid (maximum level of amr = 4) where I have color-coded the grid points according to their proc_id (0, 1, 2 and 3) , both before and after the partitioning. 
In this case, the program tells me that: [0] # of edge cuts: 35 [1] # of edge cuts: 35 [2] # of edge cuts: 35 [3] # of edge cuts: 35 [0] Before: # of ghost Nodes: 35 [1] Before: # of ghost Nodes: 29 [2] Before: # of ghost Nodes: 35 [3] Before: # of ghost Nodes: 23 [0] After: # of ghost Nodes: 17 [1] After: # of ghost Nodes: 20 [2] After: # of ghost Nodes: 17 [3] After: # of ghost Nodes: 16 Now, the partitioning is quite good and number of ghost points required have been dropped after partitioning is applied. However, if i run the code for a larger grid (maximum amr level = 8), the partitioning that I get is actually worse than the initial numbering i started with (well at least for proc 0, 1, 2) [0] # of edge cuts: 67 [1] # of edge cuts: 67 [2] # of edge cuts: 67 [3] # of edge cuts: 67 [0] Before: # of ghost Nodes: 58 [1] Before: # of ghost Nodes: 67 [2] Before: # of ghost Nodes: 62 [3] Before: # of ghost Nodes: 44 [0] After: # of ghost Nodes: 103 [1] After: # of ghost Nodes: 115 [2] After: # of ghost Nodes: 81 [3] After: # of ghost Nodes: 28 as you can see for proc 0, 1, and 2 the number of required ghost points have been doubled which means more communication. To me, this should not happen (maybe i'm wrong here?). That's why i asked if minimizing edge cuts does not translate to having fewer ghost points. Does minimizing latency favors creating 'clustered chunks' of points for each processor and if so, could that create a partitioning with more ghost points? One final point, I'm not sure how parmetis is calculating the edge cuts, but if you simply count the number of cuts in the smaller graphs, you don't get the reported numbers. why is that? (for instance proc 3, cuts 20 edges, including extra 'diagonal' ones at the coarse-fine interfaces but parmetis reports 35-- i can explain this more in details if needed) thanks, Mohammad On Thu, Feb 16, 2012 at 6:21 AM, Mark F. Adams wrote: > > On Feb 16, 2012, at 8:46 AM, Matthew Knepley wrote: > > On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman wrote: > >> Matthew Knepley emailed the following on 16/02/12 13:29: >> > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh > > > wrote: >> > >> > Hi guys, >> > >> > I'm wondering if there is any implementation >> > for ParMETIS_V3_PartGeomKway()? All I can find is the >> > implementation for ParMETIS_V3_PartKway and I'm wondering if >> > including the vertex positions could help me get a better >> > partitioning? >> > >> > >> > As far as communication goes, it will not help. >> > >> > >> > Also I have a general question. Is minimizing number of edge cuts >> > essentially the same as minimizing communication? I understand >> > that they are related, but communication is proportional to the >> > number of ghost points which is not exactly equal (its actually >> > less than) to the number of edge cuts. So then is it possible that >> > this could actually result in larger number of ghost points, at >> > least for some of processors? >> > >> > >> > It of course depends on your problem and the graph you draw, but you >> > can always draw a graph >> > where the edge cut is exactly your communication. >> > >> > >> >> >> It might also be interesting to look at Zoltan's hypergraph partitioning >> which can better balance communications - >> http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html > > > I would caution you that one thing people are rarely careful about is > quantifying the effect of > latency vs communication volume. 
I would bet you $10 that the lion's share > of slow down is > due to load imbalance/latency rather than communication volume (since comm > links are so > big). > > > Just to add, edge cuts are a good proxy for communication volume on PDE > graphs, IMO, but they are not the same and I applaud Zoltan for optimizing > this directly. Both edge cuts and #ghosts are decent metrics to indirectly > minimize number of partition neighbors (latency). As Matt says this is > probably what you want to minimize, assuming good load balance. > > Furthermore you don't really want to minimize any of these. What you want > to minimize is the maximum cuts/neighbors/load of an any partition -- that > is load balance (max/optimal). > > Mark > > Matt > > >> >> Cheers >> Gerard >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: new_4_1.jpg Type: image/jpeg Size: 132607 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: old_4_1.jpg Type: image/jpeg Size: 137391 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: old_8_1.jpg Type: image/jpeg Size: 217432 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: new_8_1.jpg Type: image/jpeg Size: 217620 bytes Desc: not available URL: From mirzadeh at gmail.com Thu Feb 16 16:15:12 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 16 Feb 2012 14:15:12 -0800 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: <4F3D0813.50304@imperial.ac.uk> Message-ID: My apologies, the email I send had large attachments and didn't get through. Here is the email and the link to the figures: http://imgur.com/a/yRTHf ------------------------------------------------- Thanks everyone for participating in the discussion. I really appreciate it. I guess I'm gonna describe the problem more in details here. So basically I use parmetis to partition my quadtree grid (we have previously talked about why i'm not using p4est right now). Please see figures for a sample amr quadtree grid. For most cases, the partition quality that I get is reasonable like the one for the smaller grid (maximum level of amr = 4) where I have color-coded the grid points according to their proc_id (0, 1, 2 and 3) , both before and after the partitioning. In this case, the program tells me that: [0] # of edge cuts: 35 [1] # of edge cuts: 35 [2] # of edge cuts: 35 [3] # of edge cuts: 35 [0] Before: # of ghost Nodes: 35 [1] Before: # of ghost Nodes: 29 [2] Before: # of ghost Nodes: 35 [3] Before: # of ghost Nodes: 23 [0] After: # of ghost Nodes: 17 [1] After: # of ghost Nodes: 20 [2] After: # of ghost Nodes: 17 [3] After: # of ghost Nodes: 16 Now, the partitioning is quite good and number of ghost points required have been dropped after partitioning is applied. 
However, if i run the code for a larger grid (maximum amr level = 8), the partitioning that I get is actually worse than the initial numbering i started with (well at least for proc 0, 1, 2) [0] # of edge cuts: 67 [1] # of edge cuts: 67 [2] # of edge cuts: 67 [3] # of edge cuts: 67 [0] Before: # of ghost Nodes: 58 [1] Before: # of ghost Nodes: 67 [2] Before: # of ghost Nodes: 62 [3] Before: # of ghost Nodes: 44 [0] After: # of ghost Nodes: 103 [1] After: # of ghost Nodes: 115 [2] After: # of ghost Nodes: 81 [3] After: # of ghost Nodes: 28 as you can see for proc 0, 1, and 2 the number of required ghost points have been doubled which means more communication. To me, this should not happen (maybe i'm wrong here?). That's why i asked if minimizing edge cuts does not translate to having fewer ghost points. Does minimizing latency favors creating 'clustered chunks' of points for each processor and if so, could that create a partitioning with more ghost points? One final point, I'm not sure how parmetis is calculating the edge cuts, but if you simply count the number of cuts in the smaller graphs, you don't get the reported numbers. why is that? (for instance proc 3, cuts 20 edges, including extra 'diagonal' ones at the coarse-fine interfaces but parmetis reports 35-- i can explain this more in details if needed) thanks, Mohammad On Thu, Feb 16, 2012 at 6:21 AM, Mark F. Adams wrote: > > On Feb 16, 2012, at 8:46 AM, Matthew Knepley wrote: > > On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman wrote: > >> Matthew Knepley emailed the following on 16/02/12 13:29: >> > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh > > > wrote: >> > >> > Hi guys, >> > >> > I'm wondering if there is any implementation >> > for ParMETIS_V3_PartGeomKway()? All I can find is the >> > implementation for ParMETIS_V3_PartKway and I'm wondering if >> > including the vertex positions could help me get a better >> > partitioning? >> > >> > >> > As far as communication goes, it will not help. >> > >> > >> > Also I have a general question. Is minimizing number of edge cuts >> > essentially the same as minimizing communication? I understand >> > that they are related, but communication is proportional to the >> > number of ghost points which is not exactly equal (its actually >> > less than) to the number of edge cuts. So then is it possible that >> > this could actually result in larger number of ghost points, at >> > least for some of processors? >> > >> > >> > It of course depends on your problem and the graph you draw, but you >> > can always draw a graph >> > where the edge cut is exactly your communication. >> > >> > >> >> >> It might also be interesting to look at Zoltan's hypergraph partitioning >> which can better balance communications - >> http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html > > > I would caution you that one thing people are rarely careful about is > quantifying the effect of > latency vs communication volume. I would bet you $10 that the lion's share > of slow down is > due to load imbalance/latency rather than communication volume (since comm > links are so > big). > > > Just to add, edge cuts are a good proxy for communication volume on PDE > graphs, IMO, but they are not the same and I applaud Zoltan for optimizing > this directly. Both edge cuts and #ghosts are decent metrics to indirectly > minimize number of partition neighbors (latency). As Matt says this is > probably what you want to minimize, assuming good load balance. > > Furthermore you don't really want to minimize any of these. 
What you want > to minimize is the maximum cuts/neighbors/load of an any partition -- that > is load balance (max/optimal). > > Mark > > Matt > > >> >> Cheers >> Gerard >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.adams at columbia.edu Thu Feb 16 18:03:41 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Thu, 16 Feb 2012 19:03:41 -0500 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: <4F3D0813.50304@imperial.ac.uk> Message-ID: On Feb 16, 2012, at 5:06 PM, Mohammad Mirzadeh wrote: > Thanks everyone for participating in the discussion. I really appreciate it. I guess I'm gonna describe the problem more in details here. So basically I use parmetis to partition my quadtree grid (we have previously talked about why i'm not using p4est right now). Please see figures for a sample amr quadtree grid. > > For most cases, the partition quality that I get is reasonable like the one for the smaller grid (maximum level of amr = 4) where I have color-coded the grid points according to their proc_id (0, 1, 2 and 3) , both before and after the partitioning. In this case, the program tells me that: > > [0] # of edge cuts: 35 > [1] # of edge cuts: 35 > [2] # of edge cuts: 35 > [3] # of edge cuts: 35 > [0] Before: # of ghost Nodes: 35 > [1] Before: # of ghost Nodes: 29 > [2] Before: # of ghost Nodes: 35 > [3] Before: # of ghost Nodes: 23 > [0] After: # of ghost Nodes: 17 > [1] After: # of ghost Nodes: 20 > [2] After: # of ghost Nodes: 17 > [3] After: # of ghost Nodes: 16 > > Now, the partitioning is quite good and number of ghost points required have been dropped after partitioning is applied. However, if i run the code for a larger grid (maximum amr level = 8), the partitioning that I get is actually worse than the initial numbering i started with (well at least for proc 0, 1, 2) > > [0] # of edge cuts: 67 > [1] # of edge cuts: 67 > [2] # of edge cuts: 67 > [3] # of edge cuts: 67 > [0] Before: # of ghost Nodes: 58 > [1] Before: # of ghost Nodes: 67 > [2] Before: # of ghost Nodes: 62 > [3] Before: # of ghost Nodes: 44 > [0] After: # of ghost Nodes: 103 > [1] After: # of ghost Nodes: 115 > [2] After: # of ghost Nodes: 81 > [3] After: # of ghost Nodes: 28 > > as you can see for proc 0, 1, and 2 the number of required ghost points have been doubled which means more communication. To me, this should not happen (maybe i'm wrong here?). Partitioning is an NP-something problem, ie, you do not get the optimal solution in reasonable time. If you give ParMetis the optimal initial partitioning it will ignore it and just run its algorithm and give you, in general, a sub-optimal solution. It does not check that the initial partition was better than what it produced. So you _can_ get a worse partitioning out of ParMetis then what you gave it. Mark > That's why i asked if minimizing edge cuts does not translate to having fewer ghost points. > > Does minimizing latency favors creating 'clustered chunks' of points for each processor and if so, could that create a partitioning with more ghost points? > > One final point, I'm not sure how parmetis is calculating the edge cuts, but if you simply count the number of cuts in the smaller graphs, you don't get the reported numbers. why is that? 
(for instance proc 3, cuts 20 edges, including extra 'diagonal' ones at the coarse-fine interfaces but parmetis reports 35-- i can explain this more in details if needed) > > thanks, > Mohammad > > > > > > On Thu, Feb 16, 2012 at 6:21 AM, Mark F. Adams wrote: > > On Feb 16, 2012, at 8:46 AM, Matthew Knepley wrote: > >> On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman wrote: >> Matthew Knepley emailed the following on 16/02/12 13:29: >> > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh > > > wrote: >> > >> > Hi guys, >> > >> > I'm wondering if there is any implementation >> > for ParMETIS_V3_PartGeomKway()? All I can find is the >> > implementation for ParMETIS_V3_PartKway and I'm wondering if >> > including the vertex positions could help me get a better >> > partitioning? >> > >> > >> > As far as communication goes, it will not help. >> > >> > >> > Also I have a general question. Is minimizing number of edge cuts >> > essentially the same as minimizing communication? I understand >> > that they are related, but communication is proportional to the >> > number of ghost points which is not exactly equal (its actually >> > less than) to the number of edge cuts. So then is it possible that >> > this could actually result in larger number of ghost points, at >> > least for some of processors? >> > >> > >> > It of course depends on your problem and the graph you draw, but you >> > can always draw a graph >> > where the edge cut is exactly your communication. >> > >> > >> >> >> It might also be interesting to look at Zoltan's hypergraph partitioning >> which can better balance communications - >> http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html >> >> I would caution you that one thing people are rarely careful about is quantifying the effect of >> latency vs communication volume. I would bet you $10 that the lion's share of slow down is >> due to load imbalance/latency rather than communication volume (since comm links are so >> big). >> > > Just to add, edge cuts are a good proxy for communication volume on PDE graphs, IMO, but they are not the same and I applaud Zoltan for optimizing this directly. Both edge cuts and #ghosts are decent metrics to indirectly minimize number of partition neighbors (latency). As Matt says this is probably what you want to minimize, assuming good load balance. > > Furthermore you don't really want to minimize any of these. What you want to minimize is the maximum cuts/neighbors/load of an any partition -- that is load balance (max/optimal). > > Mark > >> Matt >> >> >> Cheers >> Gerard >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Feb 16 18:14:19 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 16 Feb 2012 16:14:19 -0800 Subject: [petsc-users] ParMETIS_V3_PartGeomKway In-Reply-To: References: <4F3D0813.50304@imperial.ac.uk> Message-ID: Thanks Mark. It is true that partitioning is NP-complete but its a bit strange to get result worse than initial numbering. I mean you could at the very least always compare against initial numbering and if its worse just use that one! I think that's what I'm gonna add to my own code then. Anyways, as long as its a not a bug in my code, i'm happy. Thanks for the help On Thu, Feb 16, 2012 at 4:03 PM, Mark F. 
Adams wrote: > > On Feb 16, 2012, at 5:06 PM, Mohammad Mirzadeh wrote: > > Thanks everyone for participating in the discussion. I really appreciate > it. I guess I'm gonna describe the problem more in details here. So > basically I use parmetis to partition my quadtree grid (we have previously > talked about why i'm not using p4est right now). Please see figures for a > sample amr quadtree grid. > > For most cases, the partition quality that I get is reasonable like the > one for the smaller grid (maximum level of amr = 4) where I have > color-coded the grid points according to their proc_id (0, 1, 2 and 3) , > both before and after the partitioning. In this case, the program tells me > that: > > [0] # of edge cuts: 35 > [1] # of edge cuts: 35 > [2] # of edge cuts: 35 > [3] # of edge cuts: 35 > [0] Before: # of ghost Nodes: 35 > [1] Before: # of ghost Nodes: 29 > [2] Before: # of ghost Nodes: 35 > [3] Before: # of ghost Nodes: 23 > [0] After: # of ghost Nodes: 17 > [1] After: # of ghost Nodes: 20 > [2] After: # of ghost Nodes: 17 > [3] After: # of ghost Nodes: 16 > > Now, the partitioning is quite good and number of ghost points required > have been dropped after partitioning is applied. However, if i run the code > for a larger grid (maximum amr level = 8), the partitioning that I get is > actually worse than the initial numbering i started with (well at least for > proc 0, 1, 2) > > [0] # of edge cuts: 67 > [1] # of edge cuts: 67 > [2] # of edge cuts: 67 > [3] # of edge cuts: 67 > [0] Before: # of ghost Nodes: 58 > [1] Before: # of ghost Nodes: 67 > [2] Before: # of ghost Nodes: 62 > [3] Before: # of ghost Nodes: 44 > [0] After: # of ghost Nodes: 103 > [1] After: # of ghost Nodes: 115 > [2] After: # of ghost Nodes: 81 > [3] After: # of ghost Nodes: 28 > > as you can see for proc 0, 1, and 2 the number of required ghost points > have been doubled which means more communication. To me, this should not > happen (maybe i'm wrong here?). > > > Partitioning is an NP-something problem, ie, you do not get the optimal > solution in reasonable time. If you give ParMetis the optimal initial > partitioning it will ignore it and just run its algorithm and give you, in > general, a sub-optimal solution. It does not check that the initial > partition was better than what it produced. So you _can_ get a worse > partitioning out of ParMetis then what you gave it. > > Mark > > That's why i asked if minimizing edge cuts does not translate to having > fewer ghost points. > > Does minimizing latency favors creating 'clustered chunks' of points for > each processor and if so, could that create a partitioning with more ghost > points? > > One final point, I'm not sure how parmetis is calculating the edge cuts, > but if you simply count the number of cuts in the smaller graphs, you don't > get the reported numbers. why is that? (for instance proc 3, cuts 20 edges, > including extra 'diagonal' ones at the coarse-fine interfaces but parmetis > reports 35-- i can explain this more in details if needed) > > thanks, > Mohammad > > > > > > On Thu, Feb 16, 2012 at 6:21 AM, Mark F. Adams wrote: > >> >> On Feb 16, 2012, at 8:46 AM, Matthew Knepley wrote: >> >> On Thu, Feb 16, 2012 at 7:43 AM, Gerard Gorman wrote: >> >>> Matthew Knepley emailed the following on 16/02/12 13:29: >>> > On Thu, Feb 16, 2012 at 3:38 AM, Mohammad Mirzadeh >> > > wrote: >>> > >>> > Hi guys, >>> > >>> > I'm wondering if there is any implementation >>> > for ParMETIS_V3_PartGeomKway()? 
All I can find is the >>> > implementation for ParMETIS_V3_PartKway and I'm wondering if >>> > including the vertex positions could help me get a better >>> > partitioning? >>> > >>> > >>> > As far as communication goes, it will not help. >>> > >>> > >>> > Also I have a general question. Is minimizing number of edge cuts >>> > essentially the same as minimizing communication? I understand >>> > that they are related, but communication is proportional to the >>> > number of ghost points which is not exactly equal (its actually >>> > less than) to the number of edge cuts. So then is it possible that >>> > this could actually result in larger number of ghost points, at >>> > least for some of processors? >>> > >>> > >>> > It of course depends on your problem and the graph you draw, but you >>> > can always draw a graph >>> > where the edge cut is exactly your communication. >>> > >>> > >>> >>> >>> It might also be interesting to look at Zoltan's hypergraph partitioning >>> which can better balance communications - >>> http://www.cs.sandia.gov/zoltan/dev_html/dev_phg.html >> >> >> I would caution you that one thing people are rarely careful about is >> quantifying the effect of >> latency vs communication volume. I would bet you $10 that the lion's >> share of slow down is >> due to load imbalance/latency rather than communication volume (since >> comm links are so >> big). >> >> >> Just to add, edge cuts are a good proxy for communication volume on PDE >> graphs, IMO, but they are not the same and I applaud Zoltan for optimizing >> this directly. Both edge cuts and #ghosts are decent metrics to indirectly >> minimize number of partition neighbors (latency). As Matt says this is >> probably what you want to minimize, assuming good load balance. >> >> Furthermore you don't really want to minimize any of these. What you >> want to minimize is the maximum cuts/neighbors/load of an any partition -- >> that is load balance (max/optimal). >> >> Mark >> >> Matt >> >> >>> >>> Cheers >>> Gerard >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bibrakc at gmail.com Fri Feb 17 02:27:11 2012 From: bibrakc at gmail.com (Bibrak Qamar) Date: Fri, 17 Feb 2012 12:27:11 +0400 Subject: [petsc-users] KSP_PCApply Documentation? Message-ID: Hello, KSP_PCApply I found the the above function used in KSPSolve_CG but couldn't find its documentation. I will appreciate any help in this direction. thanks Bibrak -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Fri Feb 17 05:12:25 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 17 Feb 2012 12:12:25 +0100 Subject: [petsc-users] Unhandled exception at 0x00000001408cca07 ... Message-ID: <4F3E3619.8090705@gmail.com> Hi, I'm runing my CFD code in windows visual studio 2008 with ifort 64bit mpich2 64bit. I managed to build the PETSc library after doing some modifications - [petsc-maint #105754] Error : Cannot determine Fortran module include flag. There is no error building my CFD code. However, when running my code, I got the error at: call PetscInitialize(PETSC_NULL_CHARACTER,ierr) It jumps to : *ierr = PetscMemzero(name,256); if (*ierr) return; May I know what's wrong? It was working when I run it in compaq visual fortran under windows xp 32bit. 
-- Yours sincerely, TAY wee-beng From thomas.witkowski at tu-dresden.de Fri Feb 17 05:54:17 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 17 Feb 2012 12:54:17 +0100 Subject: [petsc-users] What do with singular blocks in block matrix preconditioning? In-Reply-To: <4F3CED48.1050605@tu-dresden.de> References: <4F3CED48.1050605@tu-dresden.de> Message-ID: <4F3E3FE9.7090801@tu-dresden.de> Maybe some related question: Most textbooks write that the compatibility condition to solve a system with constant null space is that the right hand side has zero mean value. Today I read part of the Multigrid-book written by Trottenberg, and there the condition is written in a different form (eq 5.6.22 on page 185): the integral of the right hand side must be equal on the whole domain and on the boundary. Does any of you have an explanation for this condition? Is there a book/paper that considers the compatibility condition in more details? Thomas Am 16.02.2012 12:49, schrieb Thomas Witkowski: > I consider a 2x2 block matrix (saddle point) with the left upper block > being singular due to Neumann boundary conditions. The whole block > matrix is still non-singular. I worked on some ideas for block > preconditioning, but there is always some problem with the singular > block. All publications I know assume the block to be definite. There > is also some work on highly singular blocks, but this is here not the > case. Does some of you know papers about block preconditioners for > some class of 2x2 saddle point problems, where the left upper block is > assumed to be positive semi-definite? > > From a more practical point of view, I have the problem that, > independently of a special kind of block preconditioner, one has > always to solve (or to approximate the solution) a system with the > singular block with an arbitrary right hand side. But in general the > right hand side does not fulfill the compatibility condition of having > zero mean. Is there a way out of this problem? > > Thomas From coco at dmi.unict.it Fri Feb 17 06:38:24 2012 From: coco at dmi.unict.it (coco at dmi.unict.it) Date: Fri, 17 Feb 2012 13:38:24 +0100 Subject: [petsc-users] Multigrid as a preconditioner Message-ID: <20120217133824.Horde.W2oCYuph4B9PPkpAxyJxE0A@mbox.dmi.unict.it> Thank you very much for the answer, but some other doubts remain. > Date: Thu, 16 Feb 2012 13:39:11 -0600 > From: Matthew Knepley > Subject: Re: [petsc-users] Multigrid as a preconditioner > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Thu, Feb 16, 2012 at 12:54 PM, wrote: > >> Dear list, >> >> I would like to parallelize a multigrid code by Petsc. I do not want to >> use the DMMG infrastructure, since it will be replaced in the next PETSc >> release. Therefore I preferred to use the multigrid as a preconditioner. In >> practice, I use the Richardson iteration, choosing the same matrix of the >> linear system as a preconditioner, so that I think the Richardson iteration >> should converge in only one iteration, and effectively it is like solving >> the whole linear system by the multigrid. >> > > Your understanding of the Richardson iteration is flawed. You can consult > Yousef Saad's book for the standard definition and anaysis. > I think my explanation was not so clear. What I would like to do is to use a preconditioned Richardson iteration: x^(n+1) = x^n + P^(-1) (f-A x^n) Choosing P=A, I should expect to obtain the exact solution at the first iteration. 
Then, the whole linear system is solved by the preconditioner method that I chose. Is it what Petsc would do? > >> As a first test, I tried to use a one-grid multigrid (then not a truly >> multigrid). I just set the coarse solver (by a standard KSPSolve), and it >> should be enough, because the multigrid starts already from the coarsest >> grid and then it does not need the smoother and the transfer operators. >> Unfortunately, the iteration scheme (which should be the Richardson >> scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong solution. >> On the other hand, if I solve the whole problem with the standard KSPSolve >> (then withouth setting the multigrid as a preconditioner ...), it converges >> to the right solution with a reason=2. >> > > Yes, give Richardson many more iterations, -ksp_max_it. > > Matt > I tried, but unfortunately nothing changed. Another strange phenomen is that, even with the standard KSP solver (which provides the right solution), if I use the flag -ksp_monitor, nothing is displayed. Is that an indicator of some troubles? Thank you in advance. Armando > >> I thought that the two methods should be the exactly the same method, and >> I do not understand why they provide different convergence results. >> >> Here is the relevant code: >> >> // Set the matrix of the linear system >> Mat Mcc; >> ierr=MatCreate(PETSC_COMM_**WORLD,&Mcc); CHKERRQ(ierr); >> ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); >> ierr=MatSetSizes(Mcc,PETSC_**DECIDE,PETSC_DECIDE,1000,1000)**; >> CHKERRQ(ierr); >> ierr=setMatrix(Mcc); //It is a routine that set the values of the matrix >> Mcc >> >> // Set the ksp solver with the multigrid as a preconditioner >> KSP ksp, KspSolver; >> ierr = KSPCreate(PETSC_COMM_WORLD,&**ksp);CHKERRQ(ierr); >> ierr = KSPSetType(ksp,KSPRICHARDSON); >> ierr = KSPGetPC(ksp,&pc);CHKERRQ(**ierr); >> ierr = PCSetType(pc,PCMG);CHKERRQ(**ierr); >> ierr = PCMGSetLevels(pc,1,&PETSC_**COMM_WORLD);CHKERRQ(ierr); >> ierr = PCMGSetType(pc,PC_MG_**MULTIPLICATIVE);CHKERRQ(ierr); >> ierr = PCMGGetCoarseSolve(pc,&**kspCoarseSolve);CHKERRQ(ierr); >> ierr = KSPSetOperators(**kspCoarseSolve,Mcc,Mcc,** >> DIFFERENT_NONZERO_PATTERN);**CHKERRQ(ierr); >> ierr = KSPSetTolerances(**kspCoarseSolve,1.e-12,PETSC_** >> DEFAULT,PETSC_DEFAULT,PETSC_**DEFAULT);CHKERRQ(ierr); >> ierr = KSPSetOperators(ksp,Mcc,Mcc,**DIFFERENT_NONZERO_PATTERN);** >> CHKERRQ(ierr); >> ierr = KSPSetTolerances(ksp,1.e-12,**PETSC_DEFAULT,PETSC_DEFAULT,** >> PETSC_DEFAULT);CHKERRQ(ierr); >> ierr = KSPSolve(ksp,RHS,U);CHKERRQ(**ierr); >> >> // Solve with the standard KSPSolve >> KSP ksp1; >> ierr = KSPCreate(PETSC_COMM_WORLD,&**ksp1);CHKERRQ(ierr); >> ierr = KSPSetOperators(ksp1,Mcc,Mcc,**DIFFERENT_NONZERO_PATTERN);** >> CHKERRQ(ierr); >> ierr = KSPSetTolerances(ksp1,1.e-12/(**2*nn123),PETSC_DEFAULT,PETSC_** >> DEFAULT,PETSC_DEFAULT);**CHKERRQ(ierr); >> ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(**ierr); >> >> >> At the end, the Vector U and U1 are different. >> Thank you. >> >> Best regards, >> Armando >> >> From jedbrown at mcs.anl.gov Fri Feb 17 07:27:26 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 17 Feb 2012 08:27:26 -0500 Subject: [petsc-users] KSP_PCApply Documentation? In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 03:27, Bibrak Qamar wrote: > I found the the above function used in KSPSolve_CG but couldn't find its > documentation. I will appreciate any help in this direction. > It is not a public function, so if you want to look at it, you'll need to read the source. 
You should follow the instructions in the user's manual to set up tags with your editor so you can easily jump to the definitions. #define KSP_PCApply(ksp,x,y) (!ksp->transpose_solve) ? (PCApply(ksp->pc,x,y) || KSP_RemoveNullSpace(ksp,y)) : PCApplyTranspose(ksp->pc,x,y) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Fri Feb 17 07:53:44 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Fri, 17 Feb 2012 14:53:44 +0100 Subject: [petsc-users] What do with singular blocks in block matrix preconditioning? In-Reply-To: <4F3E3FE9.7090801@tu-dresden.de> References: <4F3CED48.1050605@tu-dresden.de> <4F3E3FE9.7090801@tu-dresden.de> Message-ID: On Feb 17, 2012, at 12:54 PM, Thomas Witkowski wrote: > Maybe some related question: Most textbooks write that the compatibility condition to solve a system with constant null space is that the right hand side has zero mean value. Today I read part of the Multigrid-book written by Trottenberg, and there the condition is written in a different form (eq 5.6.22 on page 185): the integral of the right hand side must be equal on the whole domain and on the boundary. Does any of you have an explanation for this condition? Is there a book/paper that considers the compatibility condition in more details? You might find these things in "Mixed and Hybrid Finite Element Methods" by F. Brezzi and M. Fortin. By the way, it seems your question is more suitably posed on http://scicomp.stackexchange.com > > Thomas > > Am 16.02.2012 12:49, schrieb Thomas Witkowski: >> I consider a 2x2 block matrix (saddle point) with the left upper block being singular due to Neumann boundary conditions. The whole block matrix is still non-singular. I worked on some ideas for block preconditioning, but there is always some problem with the singular block. All publications I know assume the block to be definite. There is also some work on highly singular blocks, but this is here not the case. Does some of you know papers about block preconditioners for some class of 2x2 saddle point problems, where the left upper block is assumed to be positive semi-definite? >> >> From a more practical point of view, I have the problem that, independently of a special kind of block preconditioner, one has always to solve (or to approximate the solution) a system with the singular block with an arbitrary right hand side. But in general the right hand side does not fulfill the compatibility condition of having zero mean. Is there a way out of this problem? >> >> Thomas > From jedbrown at mcs.anl.gov Fri Feb 17 09:44:20 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 17 Feb 2012 10:44:20 -0500 Subject: [petsc-users] What do with singular blocks in block matrix preconditioning? In-Reply-To: <4F3E3FE9.7090801@tu-dresden.de> References: <4F3CED48.1050605@tu-dresden.de> <4F3E3FE9.7090801@tu-dresden.de> Message-ID: On Fri, Feb 17, 2012 at 06:54, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Maybe some related question: Most textbooks write that the compatibility > condition to solve a system with constant null space is that the right hand > side has zero mean value. Today I read part of the Multigrid-book written > by Trottenberg, and there the condition is written in a different form (eq > 5.6.22 on page 185): the integral of the right hand side must be equal on > the whole domain and on the boundary. Does any of you have an explanation > for this condition? 
Is there a book/paper that considers the compatibility > condition in more details? It's common to project the null space out of the right hand side (yielding a "consistent" right hand side). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 17 09:53:21 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 17 Feb 2012 10:53:21 -0500 Subject: [petsc-users] What do with singular blocks in block matrix preconditioning? In-Reply-To: <4F3CED48.1050605@tu-dresden.de> References: <4F3CED48.1050605@tu-dresden.de> Message-ID: On Thu, Feb 16, 2012 at 06:49, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > I consider a 2x2 block matrix (saddle point) with the left upper block > being singular due to Neumann boundary conditions. The whole block matrix > is still non-singular. I worked on some ideas for block preconditioning, > but there is always some problem with the singular block. All publications > I know assume the block to be definite. There is also some work on highly > singular blocks, but this is here not the case. Does some of you know > papers about block preconditioners for some class of 2x2 saddle point > problems, where the left upper block is assumed to be positive > semi-definite? > I could search, but I don't recall a paper specifically addressing this issue. In practice, you should remove the constant null space and use a preconditioner that is stable even on the singular operator (as with any singular operator). > > From a more practical point of view, I have the problem that, > independently of a special kind of block preconditioner, one has always to > solve (or to approximate the solution) a system with the singular block > with an arbitrary right hand side. But in general the right hand side does > not fulfill the compatibility condition of having zero mean. Is there a way > out of this problem? > Make the right hand side consistent by removing the null space. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 17 10:11:51 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Feb 2012 10:11:51 -0600 Subject: [petsc-users] Multigrid as a preconditioner In-Reply-To: <20120217133824.Horde.W2oCYuph4B9PPkpAxyJxE0A@mbox.dmi.unict.it> References: <20120217133824.Horde.W2oCYuph4B9PPkpAxyJxE0A@mbox.dmi.unict.it> Message-ID: On Fri, Feb 17, 2012 at 6:38 AM, wrote: > Thank you very much for the answer, but some other doubts remain. > > Date: Thu, 16 Feb 2012 13:39:11 -0600 >> From: Matthew Knepley >> Subject: Re: [petsc-users] Multigrid as a preconditioner >> To: PETSc users list >> Message-ID: >> > RPAq+0Zg at mail.gmail.com >> > >> Content-Type: text/plain; charset="iso-8859-1" >> >> On Thu, Feb 16, 2012 at 12:54 PM, wrote: >> >> Dear list, >>> >>> I would like to parallelize a multigrid code by Petsc. I do not want to >>> use the DMMG infrastructure, since it will be replaced in the next PETSc >>> release. Therefore I preferred to use the multigrid as a preconditioner. >>> In >>> practice, I use the Richardson iteration, choosing the same matrix of the >>> linear system as a preconditioner, so that I think the Richardson >>> iteration >>> should converge in only one iteration, and effectively it is like solving >>> the whole linear system by the multigrid. >>> >>> >> Your understanding of the Richardson iteration is flawed. You can consult >> Yousef Saad's book for the standard definition and anaysis. 
>> >> > I think my explanation was not so clear. What I would like to do is to use > a preconditioned Richardson iteration: > > x^(n+1) = x^n + P^(-1) (f-A x^n) > > Choosing P=A, I should expect to obtain the exact solution at the first > iteration. Then, the whole linear system is solved by the preconditioner > method that I chose. Is it what Petsc would do? > I am not sure what you mean by "Is it what Petsc would do?". PETSc does what you tell it to do. If you want it to solve in one iteration, tell it to use LU, -ksp_type richardson -pc_type lu. > >> As a first test, I tried to use a one-grid multigrid (then not a truly >>> multigrid). I just set the coarse solver (by a standard KSPSolve), and it >>> should be enough, because the multigrid starts already from the coarsest >>> grid and then it does not need the smoother and the transfer operators. >>> Unfortunately, the iteration scheme (which should be the Richardson >>> scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong >>> solution. >>> On the other hand, if I solve the whole problem with the standard >>> KSPSolve >>> (then withouth setting the multigrid as a preconditioner ...), it >>> converges >>> to the right solution with a reason=2. >>> >>> >> Yes, give Richardson many more iterations, -ksp_max_it. >> >> Matt >> >> > I tried, but unfortunately nothing changed. > Another strange phenomen is that, even with the standard KSP solver (which > provides the right solution), if I use the flag -ksp_monitor, nothing is > displayed. Is that an indicator of some troubles? > 1) You must call KSPSetFromOptions() to use command line options 2) Run with -ksp_monitor -ksp_view and send the results, or there is no way we can know what is happening. Matt > Thank you in advance. > Armando > > >> I thought that the two methods should be the exactly the same method, and >>> I do not understand why they provide different convergence results. 
>>> >>> Here is the relevant code: >>> >>> // Set the matrix of the linear system >>> Mat Mcc; >>> ierr=MatCreate(PETSC_COMM_****WORLD,&Mcc); CHKERRQ(ierr); >>> ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); >>> ierr=MatSetSizes(Mcc,PETSC_****DECIDE,PETSC_DECIDE,1000,1000)****; >>> CHKERRQ(ierr); >>> ierr=setMatrix(Mcc); //It is a routine that set the values of the matrix >>> Mcc >>> >>> // Set the ksp solver with the multigrid as a preconditioner >>> KSP ksp, KspSolver; >>> ierr = KSPCreate(PETSC_COMM_WORLD,&****ksp);CHKERRQ(ierr); >>> ierr = KSPSetType(ksp,KSPRICHARDSON); >>> ierr = KSPGetPC(ksp,&pc);CHKERRQ(****ierr); >>> ierr = PCSetType(pc,PCMG);CHKERRQ(****ierr); >>> ierr = PCMGSetLevels(pc,1,&PETSC_****COMM_WORLD);CHKERRQ(ierr); >>> ierr = PCMGSetType(pc,PC_MG_****MULTIPLICATIVE);CHKERRQ(ierr); >>> ierr = PCMGGetCoarseSolve(pc,&****kspCoarseSolve);CHKERRQ(ierr); >>> ierr = KSPSetOperators(****kspCoarseSolve,Mcc,Mcc,** >>> DIFFERENT_NONZERO_PATTERN);****CHKERRQ(ierr); >>> ierr = KSPSetTolerances(****kspCoarseSolve,1.e-12,PETSC_** >>> DEFAULT,PETSC_DEFAULT,PETSC_****DEFAULT);CHKERRQ(ierr); >>> ierr = KSPSetOperators(ksp,Mcc,Mcc,****DIFFERENT_NONZERO_PATTERN);** >>> CHKERRQ(ierr); >>> ierr = KSPSetTolerances(ksp,1.e-12,****PETSC_DEFAULT,PETSC_DEFAULT,** >>> PETSC_DEFAULT);CHKERRQ(ierr); >>> ierr = KSPSolve(ksp,RHS,U);CHKERRQ(****ierr); >>> >>> // Solve with the standard KSPSolve >>> KSP ksp1; >>> ierr = KSPCreate(PETSC_COMM_WORLD,&****ksp1);CHKERRQ(ierr); >>> ierr = KSPSetOperators(ksp1,Mcc,Mcc,****DIFFERENT_NONZERO_PATTERN);** >>> CHKERRQ(ierr); >>> ierr = KSPSetTolerances(ksp1,1.e-12/(****2*nn123),PETSC_DEFAULT,** >>> PETSC_** >>> DEFAULT,PETSC_DEFAULT);****CHKERRQ(ierr); >>> ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(****ierr); >>> >>> >>> At the end, the Vector U and U1 are different. >>> Thank you. >>> >>> Best regards, >>> Armando >>> >>> >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 17 11:27:45 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Feb 2012 11:27:45 -0600 Subject: [petsc-users] Unhandled exception at 0x00000001408cca07 ... In-Reply-To: <4F3E3619.8090705@gmail.com> References: <4F3E3619.8090705@gmail.com> Message-ID: On Fri, Feb 17, 2012 at 5:12 AM, TAY wee-beng wrote: > Hi, > > I'm runing my CFD code in windows visual studio 2008 with ifort 64bit > mpich2 64bit. > > I managed to build the PETSc library after doing some modifications - > [petsc-maint #105754] Error : Cannot determine Fortran module include flag. > > There is no error building my CFD code. > > However, when running my code, I got the error at: > > call PetscInitialize(PETSC_NULL_**CHARACTER,ierr) > > It jumps to : > > *ierr = PetscMemzero(name,256); if (*ierr) return; > > May I know what's wrong? It was working when I run it in compaq visual > fortran under windows xp 32bit. Can you get a stack trace? Without that, we really cannot figure out what is going on. This does not happen on our test machine. Matt > > -- > Yours sincerely, > > TAY wee-beng > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
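Picking up Jed's point from the singular-block thread above about making the right-hand side consistent: a minimal sketch of removing the constant null-space component, in PETSc 3.2-style calls. The ksp and rhs objects are assumed to already exist, the plain mean subtraction treats all unknowns with equal weight, and MatNullSpaceRemove() could be used instead of doing it by hand.

    Vec          rhs;     /* assembled right-hand side of the singular (Neumann) block */
    MatNullSpace nsp;
    PetscScalar  sum;
    PetscInt     N;

    /* subtract the mean so the discrete compatibility condition sum(rhs) = 0 holds */
    ierr = VecSum(rhs,&sum);CHKERRQ(ierr);
    ierr = VecGetSize(rhs,&N);CHKERRQ(ierr);
    ierr = VecShift(rhs,-sum/(PetscScalar)N);CHKERRQ(ierr);

    /* and tell the solver about the constant null space so the iterates are
       projected as well */
    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&nsp);CHKERRQ(ierr);
    ierr = KSPSetNullSpace(ksp,nsp);CHKERRQ(ierr);
    ierr = MatNullSpaceDestroy(&nsp);CHKERRQ(ierr);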
URL: From coco at dmi.unict.it Fri Feb 17 13:09:59 2012 From: coco at dmi.unict.it (coco at dmi.unict.it) Date: Fri, 17 Feb 2012 20:09:59 +0100 Subject: [petsc-users] petsc-users Digest, Vol 38, Issue 41 Message-ID: <20120217200959.Horde.XZWFF_ph4B9PPqYHhDo3MeA@mbox.dmi.unict.it> Thank you for the answer. > Date: Fri, 17 Feb 2012 10:11:51 -0600 > From: Matthew Knepley > Subject: Re: [petsc-users] Multigrid as a preconditioner > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Fri, Feb 17, 2012 at 6:38 AM, wrote: > >> Thank you very much for the answer, but some other doubts remain. >> >> Date: Thu, 16 Feb 2012 13:39:11 -0600 >>> From: Matthew Knepley >>> Subject: Re: [petsc-users] Multigrid as a preconditioner >>> To: PETSc users list >>> Message-ID: >>> >> RPAq+0Zg at mail.gmail.com >>>> >>> Content-Type: text/plain; charset="iso-8859-1" >>> >>> On Thu, Feb 16, 2012 at 12:54 PM, wrote: >>> >>> Dear list, >>>> >>>> I would like to parallelize a multigrid code by Petsc. I do not want to >>>> use the DMMG infrastructure, since it will be replaced in the next PETSc >>>> release. Therefore I preferred to use the multigrid as a preconditioner. >>>> In >>>> practice, I use the Richardson iteration, choosing the same matrix of the >>>> linear system as a preconditioner, so that I think the Richardson >>>> iteration >>>> should converge in only one iteration, and effectively it is like solving >>>> the whole linear system by the multigrid. >>>> >>>> >>> Your understanding of the Richardson iteration is flawed. You can consult >>> Yousef Saad's book for the standard definition and anaysis. >>> >>> >> I think my explanation was not so clear. What I would like to do is to use >> a preconditioned Richardson iteration: >> >> x^(n+1) = x^n + P^(-1) (f-A x^n) >> >> Choosing P=A, I should expect to obtain the exact solution at the first >> iteration. Then, the whole linear system is solved by the preconditioner >> method that I chose. Is it what Petsc would do? >> > > I am not sure what you mean by "Is it what Petsc would do?". PETSc does > what you tell it to do. If you want it > to solve in one iteration, tell it to use LU, -ksp_type richardson -pc_type > lu. > Indeed I would like to solve the whole linear system by a multigrid approach and not by a lu factorization. Therefore I would like to use -ksp_type richardson -pc_type mg. In this case, the preconditioned problem P^(-1) (f-A x^n) is solved exactly or it performs just a V-cycle iteration? In both cases, since I am using a one-grid multigrid (just for debugging), it should anyway provide the exact solution at the first iteration, but it is not so. > >> >>> As a first test, I tried to use a one-grid multigrid (then not a truly >>>> multigrid). I just set the coarse solver (by a standard KSPSolve), and it >>>> should be enough, because the multigrid starts already from the coarsest >>>> grid and then it does not need the smoother and the transfer operators. >>>> Unfortunately, the iteration scheme (which should be the Richardson >>>> scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong >>>> solution. >>>> On the other hand, if I solve the whole problem with the standard >>>> KSPSolve >>>> (then withouth setting the multigrid as a preconditioner ...), it >>>> converges >>>> to the right solution with a reason=2. >>>> >>>> >>> Yes, give Richardson many more iterations, -ksp_max_it. >>> >>> Matt >>> >>> >> I tried, but unfortunately nothing changed. 
>> Another strange phenomen is that, even with the standard KSP solver (which >> provides the right solution), if I use the flag -ksp_monitor, nothing is >> displayed. Is that an indicator of some troubles? >> > > 1) You must call KSPSetFromOptions() to use command line options > > 2) Run with -ksp_monitor -ksp_view and send the results, or there is no way > we can know what is happening. > > Matt > Thank you. I plotted the residual and it decreases so much slowly. It seems like it is using the non-preconditioned Richardson iteration x^(n+1) = x^n + P^(-1) (f-A x^n) with P=I instead of P=A. Thank you. Armando > >> Thank you in advance. >> Armando >> >> >>> I thought that the two methods should be the exactly the same method, and >>>> I do not understand why they provide different convergence results. >>>> >>>> Here is the relevant code: >>>> >>>> // Set the matrix of the linear system >>>> Mat Mcc; >>>> ierr=MatCreate(PETSC_COMM_****WORLD,&Mcc); CHKERRQ(ierr); >>>> ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); >>>> ierr=MatSetSizes(Mcc,PETSC_****DECIDE,PETSC_DECIDE,1000,1000)****; >>>> CHKERRQ(ierr); >>>> ierr=setMatrix(Mcc); //It is a routine that set the values of the matrix >>>> Mcc >>>> >>>> // Set the ksp solver with the multigrid as a preconditioner >>>> KSP ksp, KspSolver; >>>> ierr = KSPCreate(PETSC_COMM_WORLD,&****ksp);CHKERRQ(ierr); >>>> ierr = KSPSetType(ksp,KSPRICHARDSON); >>>> ierr = KSPGetPC(ksp,&pc);CHKERRQ(****ierr); >>>> ierr = PCSetType(pc,PCMG);CHKERRQ(****ierr); >>>> ierr = PCMGSetLevels(pc,1,&PETSC_****COMM_WORLD);CHKERRQ(ierr); >>>> ierr = PCMGSetType(pc,PC_MG_****MULTIPLICATIVE);CHKERRQ(ierr); >>>> ierr = PCMGGetCoarseSolve(pc,&****kspCoarseSolve);CHKERRQ(ierr); >>>> ierr = KSPSetOperators(****kspCoarseSolve,Mcc,Mcc,** >>>> DIFFERENT_NONZERO_PATTERN);****CHKERRQ(ierr); >>>> ierr = KSPSetTolerances(****kspCoarseSolve,1.e-12,PETSC_** >>>> DEFAULT,PETSC_DEFAULT,PETSC_****DEFAULT);CHKERRQ(ierr); >>>> ierr = KSPSetOperators(ksp,Mcc,Mcc,****DIFFERENT_NONZERO_PATTERN);** >>>> CHKERRQ(ierr); >>>> ierr = KSPSetTolerances(ksp,1.e-12,****PETSC_DEFAULT,PETSC_DEFAULT,** >>>> PETSC_DEFAULT);CHKERRQ(ierr); >>>> ierr = KSPSolve(ksp,RHS,U);CHKERRQ(****ierr); >>>> >>>> // Solve with the standard KSPSolve >>>> KSP ksp1; >>>> ierr = KSPCreate(PETSC_COMM_WORLD,&****ksp1);CHKERRQ(ierr); >>>> ierr = KSPSetOperators(ksp1,Mcc,Mcc,****DIFFERENT_NONZERO_PATTERN);** >>>> CHKERRQ(ierr); >>>> ierr = KSPSetTolerances(ksp1,1.e-12/(****2*nn123),PETSC_DEFAULT,** >>>> PETSC_** >>>> DEFAULT,PETSC_DEFAULT);****CHKERRQ(ierr); >>>> ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(****ierr); >>>> >>>> >>>> At the end, the Vector U and U1 are different. >>>> Thank you. >>>> >>>> Best regards, >>>> Armando >>>> >>>> >>>> From knepley at gmail.com Fri Feb 17 13:29:16 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Feb 2012 13:29:16 -0600 Subject: [petsc-users] petsc-users Digest, Vol 38, Issue 41 In-Reply-To: <20120217200959.Horde.XZWFF_ph4B9PPqYHhDo3MeA@mbox.dmi.unict.it> References: <20120217200959.Horde.XZWFF_ph4B9PPqYHhDo3MeA@mbox.dmi.unict.it> Message-ID: On Fri, Feb 17, 2012 at 1:09 PM, wrote: > > Thank you for the answer. 
> > Date: Fri, 17 Feb 2012 10:11:51 -0600 >> From: Matthew Knepley >> Subject: Re: [petsc-users] Multigrid as a preconditioner >> To: PETSc users list >> Message-ID: >> <**CAMYG4GkKE6doSQ1FgHCSr1deZRYS0**kpvLQF8daqv0au==tOgyw at mail.** >> gmail.com > >> Content-Type: text/plain; charset="iso-8859-1" >> >> On Fri, Feb 17, 2012 at 6:38 AM, wrote: >> >> Thank you very much for the answer, but some other doubts remain. >>> >>> Date: Thu, 16 Feb 2012 13:39:11 -0600 >>> >>>> From: Matthew Knepley >>>> Subject: Re: [petsc-users] Multigrid as a preconditioner >>>> To: PETSc users list >>>> Message-ID: >>>> >>> RPAq+0Zg at mail.gmail.com>>> **RPAq%2B0Zg at mail.gmail.com >>>> > >>>> >>>>> >>>>> Content-Type: text/plain; charset="iso-8859-1" >>>> >>>> On Thu, Feb 16, 2012 at 12:54 PM, wrote: >>>> >>>> Dear list, >>>> >>>>> >>>>> I would like to parallelize a multigrid code by Petsc. I do not want to >>>>> use the DMMG infrastructure, since it will be replaced in the next >>>>> PETSc >>>>> release. Therefore I preferred to use the multigrid as a >>>>> preconditioner. >>>>> In >>>>> practice, I use the Richardson iteration, choosing the same matrix of >>>>> the >>>>> linear system as a preconditioner, so that I think the Richardson >>>>> iteration >>>>> should converge in only one iteration, and effectively it is like >>>>> solving >>>>> the whole linear system by the multigrid. >>>>> >>>>> >>>>> Your understanding of the Richardson iteration is flawed. You can >>>> consult >>>> Yousef Saad's book for the standard definition and anaysis. >>>> >>>> >>>> I think my explanation was not so clear. What I would like to do is to >>> use >>> a preconditioned Richardson iteration: >>> >>> x^(n+1) = x^n + P^(-1) (f-A x^n) >>> >>> Choosing P=A, I should expect to obtain the exact solution at the first >>> iteration. Then, the whole linear system is solved by the preconditioner >>> method that I chose. Is it what Petsc would do? >>> >>> >> I am not sure what you mean by "Is it what Petsc would do?". PETSc does >> what you tell it to do. If you want it >> to solve in one iteration, tell it to use LU, -ksp_type richardson >> -pc_type >> lu. >> >> > Indeed I would like to solve the whole linear system by a multigrid > approach and not by a lu factorization. Therefore I would like to use > -ksp_type richardson -pc_type mg. > In this case, the preconditioned problem P^(-1) (f-A x^n) is solved > exactly or it performs just a V-cycle iteration? In both cases, since I am > using a one-grid multigrid (just for debugging), it should anyway provide > the exact solution at the first iteration, but it is not so. > Great, however I have no idea what you are actually using if you do not send what I asked for in the last message. Matt > >> >>> As a first test, I tried to use a one-grid multigrid (then not a truly >>>> >>>>> multigrid). I just set the coarse solver (by a standard KSPSolve), and >>>>> it >>>>> should be enough, because the multigrid starts already from the >>>>> coarsest >>>>> grid and then it does not need the smoother and the transfer operators. >>>>> Unfortunately, the iteration scheme (which should be the Richardson >>>>> scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong >>>>> solution. >>>>> On the other hand, if I solve the whole problem with the standard >>>>> KSPSolve >>>>> (then withouth setting the multigrid as a preconditioner ...), it >>>>> converges >>>>> to the right solution with a reason=2. >>>>> >>>>> >>>>> Yes, give Richardson many more iterations, -ksp_max_it. 
>>>> >>>> Matt >>>> >>>> >>>> I tried, but unfortunately nothing changed. >>> Another strange phenomen is that, even with the standard KSP solver >>> (which >>> provides the right solution), if I use the flag -ksp_monitor, nothing is >>> displayed. Is that an indicator of some troubles? >>> >>> >> 1) You must call KSPSetFromOptions() to use command line options >> >> 2) Run with -ksp_monitor -ksp_view and send the results, or there is no >> way >> we can know what is happening. >> >> Matt >> >> > Thank you. I plotted the residual and it decreases so much slowly. It > seems like it is using the non-preconditioned Richardson iteration x^(n+1) > = x^n + P^(-1) (f-A x^n) with P=I instead of P=A. > > Thank you. > Armando > > >> Thank you in advance. >>> Armando >>> >>> >>> I thought that the two methods should be the exactly the same method, >>>> and >>>> >>>>> I do not understand why they provide different convergence results. >>>>> >>>>> Here is the relevant code: >>>>> >>>>> // Set the matrix of the linear system >>>>> Mat Mcc; >>>>> ierr=MatCreate(PETSC_COMM_******WORLD,&Mcc); CHKERRQ(ierr); >>>>> ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); >>>>> ierr=MatSetSizes(Mcc,PETSC_******DECIDE,PETSC_DECIDE,1000,**1000)****; >>>>> CHKERRQ(ierr); >>>>> ierr=setMatrix(Mcc); //It is a routine that set the values of the >>>>> matrix >>>>> Mcc >>>>> >>>>> // Set the ksp solver with the multigrid as a preconditioner >>>>> KSP ksp, KspSolver; >>>>> ierr = KSPCreate(PETSC_COMM_WORLD,&******ksp);CHKERRQ(ierr); >>>>> ierr = KSPSetType(ksp,KSPRICHARDSON); >>>>> ierr = KSPGetPC(ksp,&pc);CHKERRQ(******ierr); >>>>> ierr = PCSetType(pc,PCMG);CHKERRQ(******ierr); >>>>> ierr = PCMGSetLevels(pc,1,&PETSC_******COMM_WORLD);CHKERRQ(ierr); >>>>> ierr = PCMGSetType(pc,PC_MG_******MULTIPLICATIVE);CHKERRQ(ierr); >>>>> ierr = PCMGGetCoarseSolve(pc,&******kspCoarseSolve);CHKERRQ(ierr); >>>>> ierr = KSPSetOperators(******kspCoarseSolve,Mcc,Mcc,** >>>>> DIFFERENT_NONZERO_PATTERN);******CHKERRQ(ierr); >>>>> ierr = KSPSetTolerances(******kspCoarseSolve,1.e-12,PETSC_** >>>>> DEFAULT,PETSC_DEFAULT,PETSC_******DEFAULT);CHKERRQ(ierr); >>>>> ierr = KSPSetOperators(ksp,Mcc,Mcc,******DIFFERENT_NONZERO_PATTERN);** >>>>> ** >>>>> CHKERRQ(ierr); >>>>> ierr = KSPSetTolerances(ksp,1.e-12,******PETSC_DEFAULT,PETSC_DEFAULT,* >>>>> *** >>>>> PETSC_DEFAULT);CHKERRQ(ierr); >>>>> ierr = KSPSolve(ksp,RHS,U);CHKERRQ(******ierr); >>>>> >>>>> // Solve with the standard KSPSolve >>>>> KSP ksp1; >>>>> ierr = KSPCreate(PETSC_COMM_WORLD,&******ksp1);CHKERRQ(ierr); >>>>> ierr = KSPSetOperators(ksp1,Mcc,Mcc,******DIFFERENT_NONZERO_PATTERN);* >>>>> *** >>>>> CHKERRQ(ierr); >>>>> ierr = KSPSetTolerances(ksp1,1.e-12/(******2*nn123),PETSC_DEFAULT,** >>>>> PETSC_** >>>>> DEFAULT,PETSC_DEFAULT);******CHKERRQ(ierr); >>>>> ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(******ierr); >>>>> >>>>> >>>>> At the end, the Vector U and U1 are different. >>>>> Thank you. >>>>> >>>>> Best regards, >>>>> Armando >>>>> >>>>> >>>>> >>>>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
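For reference, a cleaned-up sketch of the setup being debugged in this thread, folding in the KSPSetFromOptions() call so that -ksp_monitor and -ksp_view actually take effect, and giving the single level-0 KSP a real solver. The PCREDUNDANT choice is only an illustration; Mcc, RHS and U are the objects from the code quoted above, with the same ierr/CHKERRQ error handling.

    KSP ksp,kspc;
    PC  pc,pcc;

    ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,Mcc,Mcc,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSetType(ksp,KSPRICHARDSON);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCMG);CHKERRQ(ierr);
    ierr = PCMGSetLevels(pc,1,PETSC_NULL);CHKERRQ(ierr);

    /* with a single level, PCMG simply applies the level-0 KSP; its default here is
       one GMRES iteration with no preconditioner (see the -ksp_view output further
       down), so configure it explicitly if an exact level-0 solve is intended */
    ierr = PCMGGetCoarseSolve(pc,&kspc);CHKERRQ(ierr);
    ierr = KSPSetOperators(kspc,Mcc,Mcc,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSetType(kspc,KSPPREONLY);CHKERRQ(ierr);
    ierr = KSPGetPC(kspc,&pcc);CHKERRQ(ierr);
    ierr = PCSetType(pcc,PCREDUNDANT);CHKERRQ(ierr);  /* redundant direct solve in parallel */

    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);      /* enables -ksp_monitor, -ksp_view, ... */
    ierr = KSPSolve(ksp,RHS,U);CHKERRQ(ierr);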
URL: From bourdin at lsu.edu Fri Feb 17 13:46:13 2012 From: bourdin at lsu.edu (Blaise Bourdin) Date: Fri, 17 Feb 2012 13:46:13 -0600 Subject: [petsc-users] ISAllGather withoutduplicates Message-ID: Hi, Is there an easy way to gather all values of an IS across all processes in a communicator while removing duplicates? Basically, I want to go from [0] Number of indices in set 2 [0] 0 1 [0] 1 2 [1] Number of indices in set 2 [1] 0 2 [1] 1 3 to [0] Number of indices in set 3 [0] 0 1 [0] 1 2 [0] 2 3 [1] Number of indices in set 3 [1] 0 1 [1] 1 2 [1] 2 3 The way I do it right now is ierr = ISGetTotalIndices(csIS,&labels);CHKERRQ(ierr); ierr = ISGetSize(csIS,&num_cs_global);CHKERRQ(ierr); ierr = PetscMalloc(num_cs_global * sizeof(PetscInt),&labels2); for (i = 0; i < num_cs_global; i++) { labels2[i] = labels[i]; } ierr = PetscSortRemoveDupsInt(&num_cs_global,labels2);CHKERRQ(ierr); ierr = ISCreateGeneral(comm,num_cs_global,labels2,PETSC_COPY_VALUES,&csIS_global);CHKERRQ(ierr); ierr = ISRestoreTotalIndices(csIS,&labels);CHKERRQ(ierr); ierr = PetscFree(labels2);CHKERRQ(ierr); but there has to be a better way (or at least one that does not involve copying from const PetscInt *labels to PetscInt *labels2, and then uses again PETSC_COPY_VALUES). Blaise -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin From knepley at gmail.com Fri Feb 17 13:54:29 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Feb 2012 13:54:29 -0600 Subject: [petsc-users] ISAllGather withoutduplicates In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 1:46 PM, Blaise Bourdin wrote: > Hi, > > Is there an easy way to gather all values of an IS across all processes in > a communicator while removing duplicates? > > Basically, I want to go from > [0] Number of indices in set 2 > [0] 0 1 > [0] 1 2 > [1] Number of indices in set 2 > [1] 0 2 > [1] 1 3 > > to > [0] Number of indices in set 3 > [0] 0 1 > [0] 1 2 > [0] 2 3 > [1] Number of indices in set 3 > [1] 0 1 > [1] 1 2 > [1] 2 3 > > The way I do it right now is > ierr = ISGetTotalIndices(csIS,&labels);CHKERRQ(ierr); > ierr = ISGetSize(csIS,&num_cs_global);CHKERRQ(ierr); > I would violate PETSc semantics here since you are going to destroy csIS anyway: PetscSortRemoveDupsInt(&num_cs_global, labels); ISCreateGeneral(comm, num_cs_global, labels, PETSC_COPY_VALUES, &csIS_global); > ierr = ISRestoreTotalIndices(csIS,&labels);CHKERRQ(ierr); > Matt > > Blaise > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 > http://www.math.lsu.edu/~bourdin > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
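One way to package the gather-and-remove-duplicates operation discussed here is the sketch below, which uses ISAllGather() and a scratch copy so the const index array is never cast away; the function name ISAllGatherUnique and the PETSC_OWN_POINTER hand-off are just one possible arrangement, not an existing PETSc routine. Since the result is sorted anyway, the rank-ordered concatenation produced by ISAllGather() is fine here.

    #include <petscis.h>

    PetscErrorCode ISAllGatherUnique(IS is,IS *isuniq)
    {
      IS              isall;
      MPI_Comm        comm;
      const PetscInt *idx;
      PetscInt       *work,n,i;
      PetscErrorCode  ierr;

      PetscFunctionBegin;
      ierr = ISAllGather(is,&isall);CHKERRQ(ierr);          /* every rank gets all indices */
      ierr = ISGetLocalSize(isall,&n);CHKERRQ(ierr);
      ierr = ISGetIndices(isall,&idx);CHKERRQ(ierr);
      ierr = PetscMalloc(n*sizeof(PetscInt),&work);CHKERRQ(ierr);
      for (i = 0; i < n; i++) work[i] = idx[i];             /* copy so idx stays const */
      ierr = ISRestoreIndices(isall,&idx);CHKERRQ(ierr);
      ierr = ISDestroy(&isall);CHKERRQ(ierr);
      ierr = PetscSortRemoveDupsInt(&n,work);CHKERRQ(ierr); /* sort and drop duplicates */
      ierr = PetscObjectGetComm((PetscObject)is,&comm);CHKERRQ(ierr);
      ierr = ISCreateGeneral(comm,n,work,PETSC_OWN_POINTER,isuniq);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }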
URL: From jedbrown at mcs.anl.gov Fri Feb 17 14:38:55 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 17 Feb 2012 15:38:55 -0500 Subject: [petsc-users] petsc-users Digest, Vol 38, Issue 41 In-Reply-To: <20120217200959.Horde.XZWFF_ph4B9PPqYHhDo3MeA@mbox.dmi.unict.it> References: <20120217200959.Horde.XZWFF_ph4B9PPqYHhDo3MeA@mbox.dmi.unict.it> Message-ID: On Fri, Feb 17, 2012 at 14:09, wrote: > Indeed I would like to solve the whole linear system by a multigrid > approach and not by a lu factorization. Therefore I would like to use > -ksp_type richardson -pc_type mg. > In this case, the preconditioned problem P^(-1) (f-A x^n) is solved > exactly or it performs just a V-cycle iteration? In both cases, since I am > using a one-grid multigrid (just for debugging), it should anyway provide > the exact solution at the first iteration, but it is not so. > -pc_type mg with one level just applies a normal smoother. I've sometimes thought it should do a coarse-level solve instead, but I haven't messed with it. Barry, why doesn't it do a direct solve? In general -pc_type mg does one multigrid cycle (usually a V or W cycle). If you want to use multiple iterations, you can -pc_type ksp -ksp_pc_type mg which would use the default KSP (GMRES) as an iteration, preconditioned by multigrid. The "outer" problem will see the result of this converged iterative solve. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at lsu.edu Fri Feb 17 14:39:21 2012 From: bourdin at lsu.edu (Blaise Bourdin) Date: Fri, 17 Feb 2012 14:39:21 -0600 Subject: [petsc-users] ISAllGather withoutduplicates In-Reply-To: References: Message-ID: > The way I do it right now is > ierr = ISGetTotalIndices(csIS,&labels);CHKERRQ(ierr); > ierr = ISGetSize(csIS,&num_cs_global);CHKERRQ(ierr); > > I would violate PETSc semantics here since you are going to destroy csIS anyway: > > PetscSortRemoveDupsInt(&num_cs_global, labels); That's what I was thinking too, but PetscSortRemoveDupsInt expect a PetscInt* not a const PetscInt* It's not a big deal, I can live with the 2 copies, considering that the local size of the IS is going to be quite small. Blaise > ISCreateGeneral(comm, num_cs_global, labels, PETSC_COPY_VALUES, &csIS_global); > > ierr = ISRestoreTotalIndices(csIS,&labels);CHKERRQ(ierr); > > Matt > > > Blaise > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... 
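Spelled out as command-line options, the nested variant suggested above might look like the following; the inner KSP created by -pc_type ksp picks up the ksp_ prefix, so its own solver and tolerance are reached through -ksp_ksp_*, and the tolerance shown is only an example:

    -ksp_type richardson -pc_type ksp \
        -ksp_ksp_type gmres -ksp_ksp_rtol 1.0e-8 \
        -ksp_pc_type mg \
        -ksp_monitor -ksp_view

With this nesting the outer Richardson iteration sees the result of the converged inner GMRES+MG solve each time, whereas plain -ksp_type richardson -pc_type mg applies a single MG cycle (or, on one level, a single smoother application) per outer iteration.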
URL: From knepley at gmail.com Fri Feb 17 14:45:07 2012 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 17 Feb 2012 14:45:07 -0600 Subject: [petsc-users] ISAllGather withoutduplicates In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 2:39 PM, Blaise Bourdin wrote: > The way I do it right now is >> ierr = ISGetTotalIndices(csIS,&labels);CHKERRQ(ierr); >> ierr = ISGetSize(csIS,&num_cs_global);CHKERRQ(ierr); >> > > I would violate PETSc semantics here since you are going to destroy csIS > anyway: > > > PetscSortRemoveDupsInt(&num_cs_global, labels); > > PetscSortRemoveDupsInt(&num_cs_global, (PetscInt *) labels); Matt > That's what I was thinking too, but PetscSortRemoveDupsInt expect a > PetscInt* not a const PetscInt* > > It's not a big deal, I can live with the 2 copies, considering that the > local size of the IS is going to be quite small. > > Blaise > > > > > ISCreateGeneral(comm, num_cs_global, labels, PETSC_COPY_VALUES, > &csIS_global); > > >> ierr = ISRestoreTotalIndices(csIS,&labels);CHKERRQ(ierr); >> > > Matt > > >> >> Blaise >> -- >> Department of Mathematics and Center for Computation & Technology >> Louisiana State University, Baton Rouge, LA 70803, USA >> Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 >> http://www.math.lsu.edu/~bourdin >> >> >> >> >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 > http://www.math.lsu.edu/~bourdin > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at lsu.edu Fri Feb 17 14:47:00 2012 From: bourdin at lsu.edu (Blaise Bourdin) Date: Fri, 17 Feb 2012 14:47:00 -0600 Subject: [petsc-users] ISAllGather withoutduplicates In-Reply-To: References: Message-ID: <806A967F-7EAF-4568-B833-27D773A76B71@lsu.edu> On Feb 17, 2012, at 2:45 PM, Matthew Knepley wrote: > On Fri, Feb 17, 2012 at 2:39 PM, Blaise Bourdin wrote: >> The way I do it right now is >> ierr = ISGetTotalIndices(csIS,&labels);CHKERRQ(ierr); >> ierr = ISGetSize(csIS,&num_cs_global);CHKERRQ(ierr); >> >> I would violate PETSc semantics here since you are going to destroy csIS anyway: >> >> PetscSortRemoveDupsInt(&num_cs_global, labels); > > PetscSortRemoveDupsInt(&num_cs_global, (PetscInt *) labels); Facepalm... Blaise > > Matt > > That's what I was thinking too, but PetscSortRemoveDupsInt expect a PetscInt* not a const PetscInt* > > It's not a big deal, I can live with the 2 copies, considering that the local size of the IS is going to be quite small. > > Blaise > > > > >> ISCreateGeneral(comm, num_cs_global, labels, PETSC_COPY_VALUES, &csIS_global); >> >> ierr = ISRestoreTotalIndices(csIS,&labels);CHKERRQ(ierr); >> >> Matt >> >> >> Blaise >> -- >> Department of Mathematics and Center for Computation & Technology >> Louisiana State University, Baton Rouge, LA 70803, USA >> Tel. 
+1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin >> >> >> >> >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > -- > Department of Mathematics and Center for Computation & Technology > Louisiana State University, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -- Department of Mathematics and Center for Computation & Technology Louisiana State University, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... URL: From coco at dmi.unict.it Fri Feb 17 18:49:56 2012 From: coco at dmi.unict.it (coco at dmi.unict.it) Date: Sat, 18 Feb 2012 01:49:56 +0100 Subject: [petsc-users] Multigrid as a preconditioner In-Reply-To: References: Message-ID: <20120218014956.Horde.hOgkeeph4B9PPvW0UfjRhYA@mbox.dmi.unict.it> > Date: Fri, 17 Feb 2012 13:29:16 -0600 > From: Matthew Knepley > Subject: Re: [petsc-users] petsc-users Digest, Vol 38, Issue 41 > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Fri, Feb 17, 2012 at 1:09 PM, wrote: > >> >> Thank you for the answer. >> >> Date: Fri, 17 Feb 2012 10:11:51 -0600 >>> From: Matthew Knepley >>> Subject: Re: [petsc-users] Multigrid as a preconditioner >>> To: PETSc users list >>> Message-ID: >>> <**CAMYG4GkKE6doSQ1FgHCSr1deZRYS0**kpvLQF8daqv0au==tOgyw at mail.** >>> gmail.com > >>> Content-Type: text/plain; charset="iso-8859-1" >>> >>> On Fri, Feb 17, 2012 at 6:38 AM, wrote: >>> >>> Thank you very much for the answer, but some other doubts remain. >>>> >>>> Date: Thu, 16 Feb 2012 13:39:11 -0600 >>>> >>>>> From: Matthew Knepley >>>>> Subject: Re: [petsc-users] Multigrid as a preconditioner >>>>> To: PETSc users list >>>>> Message-ID: >>>>> >>>> RPAq+0Zg at mail.gmail.com>>>> **RPAq%2B0Zg at mail.gmail.com >>>>> > >>>>> >>>>>> >>>>>> Content-Type: text/plain; charset="iso-8859-1" >>>>> >>>>> On Thu, Feb 16, 2012 at 12:54 PM, wrote: >>>>> >>>>> Dear list, >>>>> >>>>>> >>>>>> I would like to parallelize a multigrid code by Petsc. I do not want to >>>>>> use the DMMG infrastructure, since it will be replaced in the next >>>>>> PETSc >>>>>> release. Therefore I preferred to use the multigrid as a >>>>>> preconditioner. >>>>>> In >>>>>> practice, I use the Richardson iteration, choosing the same matrix of >>>>>> the >>>>>> linear system as a preconditioner, so that I think the Richardson >>>>>> iteration >>>>>> should converge in only one iteration, and effectively it is like >>>>>> solving >>>>>> the whole linear system by the multigrid. >>>>>> >>>>>> >>>>>> Your understanding of the Richardson iteration is flawed. You can >>>>> consult >>>>> Yousef Saad's book for the standard definition and anaysis. >>>>> >>>>> >>>>> I think my explanation was not so clear. What I would like to do is to >>>> use >>>> a preconditioned Richardson iteration: >>>> >>>> x^(n+1) = x^n + P^(-1) (f-A x^n) >>>> >>>> Choosing P=A, I should expect to obtain the exact solution at the first >>>> iteration. 
Then, the whole linear system is solved by the preconditioner >>>> method that I chose. Is it what Petsc would do? >>>> >>>> >>> I am not sure what you mean by "Is it what Petsc would do?". PETSc does >>> what you tell it to do. If you want it >>> to solve in one iteration, tell it to use LU, -ksp_type richardson >>> -pc_type >>> lu. >>> >>> >> Indeed I would like to solve the whole linear system by a multigrid >> approach and not by a lu factorization. Therefore I would like to use >> -ksp_type richardson -pc_type mg. >> In this case, the preconditioned problem P^(-1) (f-A x^n) is solved >> exactly or it performs just a V-cycle iteration? In both cases, since I am >> using a one-grid multigrid (just for debugging), it should anyway provide >> the exact solution at the first iteration, but it is not so. >> > > Great, however I have no idea what you are actually using if you do not > send what I asked for in the last message. > > Matt > > This is the output of the options: -ksp_monitor -ksp_view 0 KSP Residual norm 1.393931159942e+00 [...] 10000 KSP Residual norm 1.000095856678e+00 KSP Object: type: richardson Richardson: damping factor=1 maximum iterations=10000, initial guess is zero tolerances: relative=3.75657e-16, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: type: mg MG: type is MULTIPLICATIVE, levels=1 cycles=v Cycles per PCApply=1 Coarse grid solver -- level 0 presmooths=1 postsmooths=1 ----- KSP Object:(mg_levels_0_) type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=3.75657e-16, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object:(mg_levels_0_) type: none linear system matrix = precond matrix: Matrix Object: type=shell, rows=2662, cols=2662 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=2662, cols=2662 total: nonzeros=12160, allocated nonzeros=31324 not using I-node (on process 0) routines Thank you. Armando >> >>> >>>> As a first test, I tried to use a one-grid multigrid (then not a truly >>>>> >>>>>> multigrid). I just set the coarse solver (by a standard KSPSolve), and >>>>>> it >>>>>> should be enough, because the multigrid starts already from the >>>>>> coarsest >>>>>> grid and then it does not need the smoother and the transfer operators. >>>>>> Unfortunately, the iteration scheme (which should be the Richardson >>>>>> scheme) converges with a reason=4 (KSP_CONVERGED_ITS) to a wrong >>>>>> solution. >>>>>> On the other hand, if I solve the whole problem with the standard >>>>>> KSPSolve >>>>>> (then withouth setting the multigrid as a preconditioner ...), it >>>>>> converges >>>>>> to the right solution with a reason=2. >>>>>> >>>>>> >>>>>> Yes, give Richardson many more iterations, -ksp_max_it. >>>>> >>>>> Matt >>>>> >>>>> >>>>> I tried, but unfortunately nothing changed. >>>> Another strange phenomen is that, even with the standard KSP solver >>>> (which >>>> provides the right solution), if I use the flag -ksp_monitor, nothing is >>>> displayed. Is that an indicator of some troubles? >>>> >>>> >>> 1) You must call KSPSetFromOptions() to use command line options >>> >>> 2) Run with -ksp_monitor -ksp_view and send the results, or there is no >>> way >>> we can know what is happening. >>> >>> Matt >>> >>> >> Thank you. 
I plotted the residual and it decreases so much slowly. It >> seems like it is using the non-preconditioned Richardson iteration x^(n+1) >> = x^n + P^(-1) (f-A x^n) with P=I instead of P=A. >> >> Thank you. >> Armando >> >> >>> Thank you in advance. >>>> Armando >>>> >>>> >>>> I thought that the two methods should be the exactly the same method, >>>>> and >>>>> >>>>>> I do not understand why they provide different convergence results. >>>>>> >>>>>> Here is the relevant code: >>>>>> >>>>>> // Set the matrix of the linear system >>>>>> Mat Mcc; >>>>>> ierr=MatCreate(PETSC_COMM_******WORLD,&Mcc); CHKERRQ(ierr); >>>>>> ierr=MatSetType(Mcc, MATMPIAIJ); CHKERRQ(ierr); >>>>>> ierr=MatSetSizes(Mcc,PETSC_******DECIDE,PETSC_DECIDE,1000,**1000)****; >>>>>> CHKERRQ(ierr); >>>>>> ierr=setMatrix(Mcc); //It is a routine that set the values of the >>>>>> matrix >>>>>> Mcc >>>>>> >>>>>> // Set the ksp solver with the multigrid as a preconditioner >>>>>> KSP ksp, KspSolver; >>>>>> ierr = KSPCreate(PETSC_COMM_WORLD,&******ksp);CHKERRQ(ierr); >>>>>> ierr = KSPSetType(ksp,KSPRICHARDSON); >>>>>> ierr = KSPGetPC(ksp,&pc);CHKERRQ(******ierr); >>>>>> ierr = PCSetType(pc,PCMG);CHKERRQ(******ierr); >>>>>> ierr = PCMGSetLevels(pc,1,&PETSC_******COMM_WORLD);CHKERRQ(ierr); >>>>>> ierr = PCMGSetType(pc,PC_MG_******MULTIPLICATIVE);CHKERRQ(ierr); >>>>>> ierr = PCMGGetCoarseSolve(pc,&******kspCoarseSolve);CHKERRQ(ierr); >>>>>> ierr = KSPSetOperators(******kspCoarseSolve,Mcc,Mcc,** >>>>>> DIFFERENT_NONZERO_PATTERN);******CHKERRQ(ierr); >>>>>> ierr = KSPSetTolerances(******kspCoarseSolve,1.e-12,PETSC_** >>>>>> DEFAULT,PETSC_DEFAULT,PETSC_******DEFAULT);CHKERRQ(ierr); >>>>>> ierr = KSPSetOperators(ksp,Mcc,Mcc,******DIFFERENT_NONZERO_PATTERN);** >>>>>> ** >>>>>> CHKERRQ(ierr); >>>>>> ierr = KSPSetTolerances(ksp,1.e-12,******PETSC_DEFAULT,PETSC_DEFAULT,* >>>>>> *** >>>>>> PETSC_DEFAULT);CHKERRQ(ierr); >>>>>> ierr = KSPSolve(ksp,RHS,U);CHKERRQ(******ierr); >>>>>> >>>>>> // Solve with the standard KSPSolve >>>>>> KSP ksp1; >>>>>> ierr = KSPCreate(PETSC_COMM_WORLD,&******ksp1);CHKERRQ(ierr); >>>>>> ierr = KSPSetOperators(ksp1,Mcc,Mcc,******DIFFERENT_NONZERO_PATTERN);* >>>>>> *** >>>>>> CHKERRQ(ierr); >>>>>> ierr = KSPSetTolerances(ksp1,1.e-12/(******2*nn123),PETSC_DEFAULT,** >>>>>> PETSC_** >>>>>> DEFAULT,PETSC_DEFAULT);******CHKERRQ(ierr); >>>>>> ierr = KSPSolve(ksp1,RHS,U1);CHKERRQ(******ierr); >>>>>> >>>>>> >>>>>> At the end, the Vector U and U1 are different. >>>>>> Thank you. >>>>>> >>>>>> Best regards, >>>>>> Armando >>>>>> >>>>>> >>>>>> >>>>>> From coco at dmi.unict.it Fri Feb 17 18:53:28 2012 From: coco at dmi.unict.it (coco at dmi.unict.it) Date: Sat, 18 Feb 2012 01:53:28 +0100 Subject: [petsc-users] Multigrid as a preconditioner In-Reply-To: References: Message-ID: <20120218015328.Horde.t0YUAOph4B9PPvaIsm_hhsA@mbox.dmi.unict.it> > Date: Fri, 17 Feb 2012 15:38:55 -0500 > From: Jed Brown > Subject: Re: [petsc-users] petsc-users Digest, Vol 38, Issue 41 > To: PETSc users list > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Fri, Feb 17, 2012 at 14:09, wrote: > >> Indeed I would like to solve the whole linear system by a multigrid >> approach and not by a lu factorization. Therefore I would like to use >> -ksp_type richardson -pc_type mg. >> In this case, the preconditioned problem P^(-1) (f-A x^n) is solved >> exactly or it performs just a V-cycle iteration? 
In both cases, since I am >> using a one-grid multigrid (just for debugging), it should anyway provide >> the exact solution at the first iteration, but it is not so. >> > > -pc_type mg with one level just applies a normal smoother. I've sometimes > thought it should do a coarse-level solve instead, but I haven't messed > with it. Barry, why doesn't it do a direct solve? > This explains why the residual decreases so slowly: because it applies the smoother instead of the coarse solver. > In general -pc_type mg does one multigrid cycle (usually a V or W cycle). > If you want to use multiple iterations, you can > > -pc_type ksp -ksp_pc_type mg > > which would use the default KSP (GMRES) as an iteration, preconditioned by > multigrid. The "outer" problem will see the result of this converged > iterative solve. I've perfectly understood that. Thank you very much. Armando From bsmith at mcs.anl.gov Sat Feb 18 12:59:58 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 18 Feb 2012 12:59:58 -0600 Subject: [petsc-users] petsc-users Digest, Vol 38, Issue 41 In-Reply-To: References: <20120217200959.Horde.XZWFF_ph4B9PPqYHhDo3MeA@mbox.dmi.unict.it> Message-ID: <08ED7087-6596-43D6-97AF-0B4D3BB806DD@mcs.anl.gov> On Feb 17, 2012, at 2:38 PM, Jed Brown wrote: > On Fri, Feb 17, 2012 at 14:09, wrote: > Indeed I would like to solve the whole linear system by a multigrid approach and not by a lu factorization. Therefore I would like to use -ksp_type richardson -pc_type mg. > In this case, the preconditioned problem P^(-1) (f-A x^n) is solved exactly or it performs just a V-cycle iteration? In both cases, since I am using a one-grid multigrid (just for debugging), it should anyway provide the exact solution at the first iteration, but it is not so. > > -pc_type mg with one level just applies a normal smoother. I've sometimes thought it should do a coarse-level solve instead, but I haven't messed with it. Barry, why doesn't it do a direct solve? 1) Because MG is an accelerator of the basic smoother, MG is not a deccelerator of a direct solver. That is the action of adding a coarser level is suppose to improve the convergence of the solver. 2) Because if you used a direct solver and the user switched from one to two levels they would be dismayed at the worsening of the convergence. If the user ran a large problem on one level it would run out of memory. 3) I don't think there is really a "correct" abstract or practical answer to which it should be (hence my two snide answers above) I am happy with the current default > > In general -pc_type mg does one multigrid cycle (usually a V or W cycle). If you want to use multiple iterations, you can > > -pc_type ksp -ksp_pc_type mg > > which would use the default KSP (GMRES) as an iteration, preconditioned by multigrid. The "outer" problem will see the result of this converged iterative solve. You can use -pc_mg_multiplicative_cycles 2 to use 2 V or W cycles as the preconditioner etc. Barry From xyuan at lbl.gov Sat Feb 18 18:52:00 2012 From: xyuan at lbl.gov (Xuefei Yuan (Rebecca)) Date: Sat, 18 Feb 2012 16:52:00 -0800 Subject: [petsc-users] Any updates in /src/mat/examples/tests/ex81.c Message-ID: <0BF6BA2C-B14F-4CDD-BF52-8234782FF124@lbl.gov> Hello, all, This routine is to convert the matrix to the HB format, but it seems this code is not right. Any update on it? Thanks, Rebecca From bibrakc at gmail.com Sat Feb 18 23:58:09 2012 From: bibrakc at gmail.com (Bibrak Qamar) Date: Sun, 19 Feb 2012 09:58:09 +0400 Subject: [petsc-users] KSP_PCApply Documentation? 
In-Reply-To: References: Message-ID: Thank you very much. Bibrak On Fri, Feb 17, 2012 at 5:27 PM, Jed Brown wrote: > On Fri, Feb 17, 2012 at 03:27, Bibrak Qamar wrote: > >> I found the the above function used in KSPSolve_CG but couldn't find its >> documentation. I will appreciate any help in this direction. >> > > It is not a public function, so if you want to look at it, you'll need to > read the source. You should follow the instructions in the user's manual to > set up tags with your editor so you can easily jump to the definitions. > > #define KSP_PCApply(ksp,x,y) (!ksp->transpose_solve) ? > (PCApply(ksp->pc,x,y) || KSP_RemoveNullSpace(ksp,y)) : > PCApplyTranspose(ksp->pc,x,y) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rudolph at berkeley.edu Sun Feb 19 02:32:15 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Sun, 19 Feb 2012 00:32:15 -0800 Subject: [petsc-users] Specifying different ksp_types for multiple linear systems Message-ID: <63015095-91E7-4D69-871F-62C3B22F5238@berkeley.edu> In my thermomechanical convection code, I set up and solve two linear systems, the first for the Stokes system and the second for an energy equation. Currently, these are separate matrices and during each timestep I first create a KSP object for each system, then solve, then destroy the KSP contexts. I would like to try out the -pc_type fieldsplit for the Stokes system, but is it possible to use -pc_type fieldsplit for the Stokes system and different pc_type and ksp_type for the energy equation? Is there perhaps a way to name the KSP associated with the Stokes system and then refer to this label from the command line (e.g. -pc_type_0 fieldsplit -pc_type_1 lu) ? Thanks very much for your help. Max Rudolph From agrayver at gfz-potsdam.de Sun Feb 19 04:27:21 2012 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Sun, 19 Feb 2012 11:27:21 +0100 Subject: [petsc-users] Specifying different ksp_types for multiple linear systems In-Reply-To: <63015095-91E7-4D69-871F-62C3B22F5238@berkeley.edu> References: <63015095-91E7-4D69-871F-62C3B22F5238@berkeley.edu> Message-ID: <4F40CE89.5090406@gfz-potsdam.de> KSPSetOptionsPrefix/PCSetOptionsPrefix? On 19.02.2012 09:32, Max Rudolph wrote: > In my thermomechanical convection code, I set up and solve two linear systems, the first for the Stokes system and the second for an energy equation. Currently, these are separate matrices and during each timestep I first create a KSP object for each system, then solve, then destroy the KSP contexts. I would like to try out the -pc_type fieldsplit for the Stokes system, but is it possible to use -pc_type fieldsplit for the Stokes system and different pc_type and ksp_type for the energy equation? Is there perhaps a way to name the KSP associated with the Stokes system and then refer to this label from the command line (e.g. -pc_type_0 fieldsplit -pc_type_1 lu) ? Thanks very much for your help. > > Max Rudolph -- Regards, Alexander From m.skates82 at gmail.com Sun Feb 19 11:20:54 2012 From: m.skates82 at gmail.com (Nun ion) Date: Sun, 19 Feb 2012 12:20:54 -0500 Subject: [petsc-users] Multiple Sparse Matrix Vector products Message-ID: Hello i have a conceptual idea of a sparse matvec implementation where i have multiple matrices, how would i go about implementing something such as for i = ... for k = ... w_{ik} = K_i * u_k end end Where each of the K_i are sparse matrices... the K_i are various stiffness matrices whose size can range (although they are all the same size). 
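A minimal PETSc sketch of that loop (untested; it assumes the matrices K[i], the input vectors u[k] and the output vectors w[i][k] have already been created and assembled, and the array names are only illustrative):

  PetscInt       i,k;
  PetscErrorCode ierr;
  for (i = 0; i < nK; i++) {      /* nK stiffness matrices K_i, all the same size */
    for (k = 0; k < nu; k++) {    /* nu vectors u_k, reused for every K_i */
      ierr = MatMult(K[i],u[k],w[i][k]);CHKERRQ(ierr); /* w_{ik} = K_i * u_k */
    }
  }

The reply below discusses whether reusing the u_k this way actually pays off in terms of cache behaviour.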
The u_k are reused Thanks! Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Feb 19 11:33:23 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 19 Feb 2012 11:33:23 -0600 Subject: [petsc-users] Multiple Sparse Matrix Vector products In-Reply-To: References: Message-ID: On Sun, Feb 19, 2012 at 11:20 AM, Nun ion wrote: > Hello i have a conceptual idea of a sparse matvec implementation where i > have multiple matrices, how would i go about implementing something such as > > for i = ... > for k = ... > w_{ik} = K_i * u_k > end > end > > Where each of the K_i are sparse matrices... the K_i are various stiffness > matrices whose size can range (although they are all the same size). The > u_k are reused > I suspect that this reuse does not matter. You can do a back of the envelope calculation for your matrices, using the analysis method in http://www.mcs.anl.gov/~kaushik/Papers/pcfd99_gkks.pdf. K_i is much bigger than u_k, and will generally blow u_k right out of the cache. In fact, this is the optimization that PETSc currently makes (see Prefetch code in MatMult). Matt > Thanks! > > Mark > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Sun Feb 19 15:08:58 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Sun, 19 Feb 2012 22:08:58 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: I have a new problem: the results from ASM and GASM are different and it seems GASM has something wrong with SetModifySubMatrices. Numerical tests are with each subdomain supported only by one subdomain. There are no problems when I did not modify submatrices. But when I modify submatrices, there are problems with GASM but no problems with ASM. For example, I use two subdomains. In the first case each subdomain is supported by one processor and there seems no problem with GASM. But when I use run my program with only one proc. so that it supports both of the two subdomains, the iteration number is different from the first case and is much larger. On the other hand ASM has no such problem. On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: > You should be able to. > This behavior is the same as in PCASM, > except in GASM the matrices live on subcommunicators. > I am in transit right now, but I can take a closer look in Friday. > > Dmitry > > > > On Feb 15, 2012, at 8:07, Hui Zhang wrote: > >> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >> >>> Hi Dmitry, >>> >>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>> >>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>> >>> I think the coloumns from the parameter 'col' are always the same as the rows >>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>> index sets but not rows and columns. Has I misunderstood something? >> >> As I tested, the row and col are always the same. >> >> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >> in the above func()? 
>> >> thanks, >> Hui >> >>> >>> thanks, >>> Hui >>> >>> >>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>> >>>> Yes, that's right. >>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>> It is expected that they are generated from mesh subdomains. >>>> Each IS does carry the subdomains subcomm. >>>> >>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>> each having the indices with the same color and the subcomm that supports that color. It is >>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>> >>>> Dmitry. >>>> >>>> >>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>> >>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>> >>>> Thanks, >>>> Hui >>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Sun Feb 19 17:41:30 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Sun, 19 Feb 2012 17:41:30 -0600 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: On Sun, Feb 19, 2012 at 3:08 PM, Hui Zhang wrote: > I have a new problem: the results from ASM and GASM are different and it > seems > GASM has something wrong with SetModifySubMatrices. Numerical tests are > with > each subdomain supported only by one subdomain. There are no problems when > I did not modify submatrices. But when I modify submatrices, there are > problems > with GASM but no problems with ASM. > > For example, I use two subdomains. In the first case each subdomain is > supported by > one processor and there seems no problem with GASM. But when I use run my > program > with only one proc. so that it supports both of the two subdomains, the > iteration > number is different from the first case and is much larger. On the other > hand > ASM has no such problem. > Are the solutions the same? What problem are you solving? Dmitry. > > > On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: > > You should be able to. > This behavior is the same as in PCASM, > except in GASM the matrices live on subcommunicators. > I am in transit right now, but I can take a closer look in Friday. > > Dmitry > > > > On Feb 15, 2012, at 8:07, Hui Zhang wrote: > > On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: > > Hi Dmitry, > > thanks a lot! Currently, I'm not using ISColoring. Just comes another > question > on PCGASMSetModifySubMatrices(). The user provided function has the > prototype > > func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); > > I think the coloumns from the parameter 'col' are always the same as the > rows > from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts > index sets but not rows and columns. Has I misunderstood something? > > > As I tested, the row and col are always the same. > > I have a new question. Am I allowed to SetLocalToGlobalMapping() for the > submat's > in the above func()? > > thanks, > Hui > > > thanks, > Hui > > > On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: > > Yes, that's right. > There is no good way to help the user assemble the subdomains at the > moment beyond the 2D stuff. 
> It is expected that they are generated from mesh subdomains. > Each IS does carry the subdomains subcomm. > > There is ISColoringToList() that is supposed to convert a "coloring" of > indices to an array of ISs, > each having the indices with the same color and the subcomm that supports > that color. It is > largely untested, though. You could try using it and give us feedback on > any problems you encounter. > > Dmitry. > > > On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang < > mike.hui.zhang at hotmail.com> wrote: > >> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported >> by >> multiple processors, shall I always create the arguments 'is[s]' and >> 'is_local[s]' >> in a subcommunicator consisting of processors supporting the subdomain >> 's'? >> >> The source code of PCGASMCreateSubdomains2D() seemingly does so. >> >> Thanks, >> Hui >> >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Mon Feb 20 00:59:11 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Mon, 20 Feb 2012 07:59:11 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: On Feb 20, 2012, at 12:41 AM, Dmitry Karpeev wrote: > > > On Sun, Feb 19, 2012 at 3:08 PM, Hui Zhang wrote: > I have a new problem: the results from ASM and GASM are different and it seems > GASM has something wrong with SetModifySubMatrices. Numerical tests are with > each subdomain supported only by one subdomain. There are no problems when > I did not modify submatrices. But when I modify submatrices, there are problems > with GASM but no problems with ASM. > > For example, I use two subdomains. In the first case each subdomain is supported by > one processor and there seems no problem with GASM. But when I use run my program > with only one proc. so that it supports both of the two subdomains, the iteration > number is different from the first case and is much larger. On the other hand > ASM has no such problem. > > Are the solutions the same? > What problem are you solving? Yes, the solutions are the same. That's why ASM gives the same results with one or two processors. But GASM did not. I'm solving the Helmholtz equation. Maybe I can prepare a simpler example to show this difference. > > Dmitry. > > > On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: > >> You should be able to. >> This behavior is the same as in PCASM, >> except in GASM the matrices live on subcommunicators. >> I am in transit right now, but I can take a closer look in Friday. >> >> Dmitry >> >> >> >> On Feb 15, 2012, at 8:07, Hui Zhang wrote: >> >>> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >>> >>>> Hi Dmitry, >>>> >>>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>>> >>>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>>> >>>> I think the coloumns from the parameter 'col' are always the same as the rows >>>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>>> index sets but not rows and columns. Has I misunderstood something? >>> >>> As I tested, the row and col are always the same. >>> >>> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >>> in the above func()? 
>>> >>> thanks, >>> Hui >>> >>>> >>>> thanks, >>>> Hui >>>> >>>> >>>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>>> >>>>> Yes, that's right. >>>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>>> It is expected that they are generated from mesh subdomains. >>>>> Each IS does carry the subdomains subcomm. >>>>> >>>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>>> each having the indices with the same color and the subcomm that supports that color. It is >>>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>>> >>>>> Dmitry. >>>>> >>>>> >>>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>>> >>>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>>> >>>>> Thanks, >>>>> Hui >>>>> >>>>> >>>> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rudolph at berkeley.edu Mon Feb 20 01:04:18 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Sun, 19 Feb 2012 23:04:18 -0800 Subject: [petsc-users] Starting point for Stokes fieldsplit Message-ID: <09B73032-FED5-4F96-91DE-A61068F23332@berkeley.edu> I am solving a 2D Stokes flow problem using a finite volume discretization, velocity (dof 0,1)-pressure (dof 2) formulation. Until now I have mostly used MUMPS. I am trying to use the PCFieldSplit interface now, with the goal of trying out the multigrid functionality provided through ML. I looked at these (http://www.mcs.anl.gov/petsc/documentation/tutorials/Speedup10.pdf) talk slides for some guidance on command line options and tried. When assembling the linear system, I obtain the global degree-of-freedom indices using DMDAGetGlobalIndices. The solver does not appear to be converging and I was wondering if someone could explain to me why this might be happening. Thanks for your help. Max petscmpiexec -n $1 $2 $3 \ -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type additive \ -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly \ -stokes_fieldsplit_1_pc_type jacobi -stokes_fieldsplit_1_ksp_type preonly \ -stokes_ksp_view \ -stokes_ksp_monitor_true_residual \ Residual norms for stokes_ solve. 
0 KSP preconditioned resid norm 2.156200561011e-07 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.156200561011e-07 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 5.176186579848e-08 true resid norm 5.747484453195e+06 ||r(i)||/||b|| 4.932990239085e+00 3 KSP preconditioned resid norm 5.054067588022e-08 true resid norm 5.763708451168e+06 ||r(i)||/||b|| 4.946915082953e+00 4 KSP preconditioned resid norm 3.556841873413e-08 true resid norm 4.649778784249e+06 ||r(i)||/||b|| 3.990843915003e+00 5 KSP preconditioned resid norm 3.177972840516e-08 true resid norm 4.677248322326e+06 ||r(i)||/||b|| 4.014420657890e+00 6 KSP preconditioned resid norm 3.100188857346e-08 true resid norm 4.857618195959e+06 ||r(i)||/||b|| 4.169229745814e+00 7 KSP preconditioned resid norm 3.045495907075e-08 true resid norm 5.030091724356e+06 ||r(i)||/||b|| 4.317261504580e+00 8 KSP preconditioned resid norm 2.993896937859e-08 true resid norm 5.213745290794e+06 ||r(i)||/||b|| 4.474888942808e+00 9 KSP preconditioned resid norm 2.944838631679e-08 true resid norm 5.403745734510e+06 ||r(i)||/||b|| 4.637963822245e+00 10 KSP preconditioned resid norm 2.898115557667e-08 true resid norm 5.596205405394e+06 ||r(i)||/||b|| 4.803149423985e+00 11 KSP preconditioned resid norm 2.853548106431e-08 true resid norm 5.788329048801e+06 ||r(i)||/||b|| 4.968046617765e+00 12 KSP preconditioned resid norm 2.810975464206e-08 true resid norm 5.978285004145e+06 ||r(i)||/||b|| 5.131083313418e+00 13 KSP preconditioned resid norm 2.770253127208e-08 true resid norm 6.164551408716e+06 ||r(i)||/||b|| 5.290953316217e+00 14 KSP preconditioned resid norm 2.731250834421e-08 true resid norm 6.346469882864e+06 ||r(i)||/||b|| 5.447091547575e+00 15 KSP preconditioned resid norm 2.693850812065e-08 true resid norm 6.523343466838e+06 ||r(i)||/||b|| 5.598899816114e+00 From dave.mayhem23 at gmail.com Mon Feb 20 03:30:27 2012 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 20 Feb 2012 10:30:27 +0100 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: <09B73032-FED5-4F96-91DE-A61068F23332@berkeley.edu> References: <09B73032-FED5-4F96-91DE-A61068F23332@berkeley.edu> Message-ID: Hey Max, Without knowing anything about the specific application related to your Stokes problem, or information about the mesh you are using, I have a couple of questions and suggestions which might help. 1) If A, is your stokes operator A = ( K,B ; B^T, 0 ), what is your precondition operator? Specifically, what is in the (2,2) slot in the precondioner? - i.e. what matrix are you you applying -stokes_fieldsplit_1_pc_type jacobi -stokes_fieldsplit_1_ksp_type preonly to? Is it the identity as in the SpeedUp notes? 2) This choice -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly may simply not be a very effective and degrade the performance of the outer solver. I'd make the solver for the operator in the (1,1) slot much stronger, for example -stokes_fieldsplit_0_ksp_type gmres -stokes_fieldsplit_0_ksp_rtol 1.0e-4 -stokes_fieldsplit_0_mg_levels_ksp_type gmres -stokes_fieldsplit_0_mg_levels_pc_type bjacobi -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 Add a monitor on this solver (-stokes_fieldsplit_0_ksp_XXX) to see how ML is doing. 3) Using -stokes_pc_fieldsplit_type MULTIPLICATIVE should reduce the number of outer iterations by a factor of two, but it will use more memory. 
4) You should use a flexible Krylov method on the outer most solve (-stokes_ksp_XXX) as the preconditioner is varying between each outer iteration. Use -stokes_ksp_type fgmres or -stokes_ksp_type gcr 5) Depending on how the physical problem is scaled (non-dimensionalised), the size of the residuals associated with the momentum and continuity equation make be quite different. You are currently use the entire residual from (u,p) to determine when to stop iterating. You might want to consider writing a monitor which examines the these residuals independently. Cheers, Dave On 20 February 2012 08:04, Max Rudolph wrote: > I am solving a 2D Stokes flow problem using a finite volume discretization, velocity (dof 0,1)-pressure (dof 2) formulation. Until now I have mostly used MUMPS. I am trying to use the PCFieldSplit interface now, with the goal of trying out the multigrid functionality provided through ML. I looked at these (http://www.mcs.anl.gov/petsc/documentation/tutorials/Speedup10.pdf) talk slides for some guidance on command line options and tried. When assembling the linear system, I obtain the global degree-of-freedom indices using DMDAGetGlobalIndices. The solver does not appear to be converging and I was wondering if someone could explain to me why this might be happening. Thanks for your help. > > Max > > petscmpiexec -n $1 $2 $3 \ > -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ > -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type additive \ > -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly \ > -stokes_fieldsplit_1_pc_type jacobi -stokes_fieldsplit_1_ksp_type preonly \ > -stokes_ksp_view \ > -stokes_ksp_monitor_true_residual \ > > ?Residual norms for stokes_ solve. > ?0 KSP preconditioned resid norm 2.156200561011e-07 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 > ?1 KSP preconditioned resid norm 2.156200561011e-07 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 > ?2 KSP preconditioned resid norm 5.176186579848e-08 true resid norm 5.747484453195e+06 ||r(i)||/||b|| 4.932990239085e+00 > ?3 KSP preconditioned resid norm 5.054067588022e-08 true resid norm 5.763708451168e+06 ||r(i)||/||b|| 4.946915082953e+00 > ?4 KSP preconditioned resid norm 3.556841873413e-08 true resid norm 4.649778784249e+06 ||r(i)||/||b|| 3.990843915003e+00 > ?5 KSP preconditioned resid norm 3.177972840516e-08 true resid norm 4.677248322326e+06 ||r(i)||/||b|| 4.014420657890e+00 > ?6 KSP preconditioned resid norm 3.100188857346e-08 true resid norm 4.857618195959e+06 ||r(i)||/||b|| 4.169229745814e+00 > ?7 KSP preconditioned resid norm 3.045495907075e-08 true resid norm 5.030091724356e+06 ||r(i)||/||b|| 4.317261504580e+00 > ?8 KSP preconditioned resid norm 2.993896937859e-08 true resid norm 5.213745290794e+06 ||r(i)||/||b|| 4.474888942808e+00 > ?9 KSP preconditioned resid norm 2.944838631679e-08 true resid norm 5.403745734510e+06 ||r(i)||/||b|| 4.637963822245e+00 > ?10 KSP preconditioned resid norm 2.898115557667e-08 true resid norm 5.596205405394e+06 ||r(i)||/||b|| 4.803149423985e+00 > ?11 KSP preconditioned resid norm 2.853548106431e-08 true resid norm 5.788329048801e+06 ||r(i)||/||b|| 4.968046617765e+00 > ?12 KSP preconditioned resid norm 2.810975464206e-08 true resid norm 5.978285004145e+06 ||r(i)||/||b|| 5.131083313418e+00 > ?13 KSP preconditioned resid norm 2.770253127208e-08 true resid norm 6.164551408716e+06 ||r(i)||/||b|| 5.290953316217e+00 > ?14 KSP preconditioned resid norm 2.731250834421e-08 true resid norm 
6.346469882864e+06 ||r(i)||/||b|| 5.447091547575e+00 > ?15 KSP preconditioned resid norm 2.693850812065e-08 true resid norm 6.523343466838e+06 ||r(i)||/||b|| 5.598899816114e+00 > From bibrakc at gmail.com Mon Feb 20 05:14:27 2012 From: bibrakc at gmail.com (Bibrak Qamar) Date: Mon, 20 Feb 2012 15:14:27 +0400 Subject: [petsc-users] PETSc Matrix Partitioning and SPMVM Message-ID: Hello all, The way PETSc MatCreateMPIAIJ distributes an N*N square matrix is row wise. And internally every processor stores its local matrix into two sub matrices one Diagonal and other part is off-Diagonal. (more here --> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJ.html ). So when PETSc MatMul (v = A.x) is called it basically tries to hide communication of vector (x) by overlapping computation. As for as I understand every process first initiates a kind of non-Blocking Bcast of vector (x) and continues to do MVM of diagonal submatrix and then waits till communication is done and finally does MVM for the off-Diagonal matrix. My question is this (since I am new) what was the historical reason PETSc opted for this Diagonal and off-Diagonal storage? And what if I want to change the overlapping strategy of MVM by lets say introducing a ring based communication of vector (x), then I have to partition the local matrix into not two sub-matrices but P sub-matrices (here P = number of processors). Does PETSc provide this facility or one has to go from scratch to implement different techniques to store local matrix? Thanks Bibrak -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Mon Feb 20 08:06:43 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Mon, 20 Feb 2012 08:06:43 -0600 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: On Mon, Feb 20, 2012 at 12:59 AM, Hui Zhang wrote: > > On Feb 20, 2012, at 12:41 AM, Dmitry Karpeev wrote: > > > > On Sun, Feb 19, 2012 at 3:08 PM, Hui Zhang wrote: > >> I have a new problem: the results from ASM and GASM are different and it >> seems >> GASM has something wrong with SetModifySubMatrices. Numerical tests are >> with >> each subdomain supported only by one subdomain. There are no problems when >> I did not modify submatrices. But when I modify submatrices, there are >> problems >> with GASM but no problems with ASM. >> >> For example, I use two subdomains. In the first case each subdomain is >> supported by >> one processor and there seems no problem with GASM. But when I use run my >> program >> with only one proc. so that it supports both of the two subdomains, the >> iteration >> number is different from the first case and is much larger. On the other >> hand >> ASM has no such problem. >> > > Are the solutions the same? > What problem are you solving? > > > Yes, the solutions are the same. That's why ASM gives the same results > with one or > two processors. But GASM did not. > Sorry, I wasn't clear: ASM and GASM produced different solutions in the case of two domains per processor? > I'm solving the Helmholtz equation. Maybe > I can prepare a simpler example to show this difference. > That would be helpful. Thanks. Dmitry. > > > Dmitry. > >> >> >> On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: >> >> You should be able to. >> This behavior is the same as in PCASM, >> except in GASM the matrices live on subcommunicators. >> I am in transit right now, but I can take a closer look in Friday. 
>> >> Dmitry >> >> >> >> On Feb 15, 2012, at 8:07, Hui Zhang wrote: >> >> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >> >> Hi Dmitry, >> >> thanks a lot! Currently, I'm not using ISColoring. Just comes another >> question >> on PCGASMSetModifySubMatrices(). The user provided function has the >> prototype >> >> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >> >> I think the coloumns from the parameter 'col' are always the same as the >> rows >> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >> index sets but not rows and columns. Has I misunderstood something? >> >> >> As I tested, the row and col are always the same. >> >> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the >> submat's >> in the above func()? >> >> thanks, >> Hui >> >> >> thanks, >> Hui >> >> >> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >> >> Yes, that's right. >> There is no good way to help the user assemble the subdomains at the >> moment beyond the 2D stuff. >> It is expected that they are generated from mesh subdomains. >> Each IS does carry the subdomains subcomm. >> >> There is ISColoringToList() that is supposed to convert a "coloring" of >> indices to an array of ISs, >> each having the indices with the same color and the subcomm that supports >> that color. It is >> largely untested, though. You could try using it and give us feedback on >> any problems you encounter. >> >> Dmitry. >> >> >> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang < >> mike.hui.zhang at hotmail.com> wrote: >> >>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported >>> by >>> multiple processors, shall I always create the arguments 'is[s]' and >>> 'is_local[s]' >>> in a subcommunicator consisting of processors supporting the subdomain >>> 's'? >>> >>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>> >>> Thanks, >>> Hui >>> >>> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 20 09:01:10 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 09:01:10 -0600 Subject: [petsc-users] PETSc Matrix Partitioning and SPMVM In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 5:14 AM, Bibrak Qamar wrote: > Hello all, > > The way PETSc MatCreateMPIAIJ distributes an N*N square matrix is row > wise. And internally every processor stores its local matrix into two sub > matrices one Diagonal and other part is off-Diagonal. (more here --> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJ.html > ). > > > So when PETSc MatMul (v = A.x) is called it basically tries to hide > communication of vector (x) by overlapping computation. As for as I > understand every process first initiates a kind of non-Blocking Bcast of > vector (x) and continues to do MVM of diagonal submatrix and then waits > till communication is done and finally does MVM for the off-Diagonal matrix. > > My question is this (since I am new) what was the historical reason PETSc > opted for this Diagonal and off-Diagonal storage? > Overlapping communication and computation. > And what if I want to change the overlapping strategy of MVM by lets say > introducing a ring based communication of vector (x), then I have to > partition the local matrix into not two sub-matrices but P sub-matrices > (here P = number of processors). Does PETSc provide this facility or one > has to go from scratch to implement different techniques to store local > matrix? 
> 1) Its unclear what that would accomplish 2) You can get that style of communication by altering the VecScatter that sends the data, rather than the matrix 3) You can always implement another matrix type. We have a lot (see src/mat/impl) Thanks, Matt > Thanks > Bibrak > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Mon Feb 20 11:13:18 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Mon, 20 Feb 2012 18:13:18 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: Hi, attached is an example modified from ksp/ex11.c. I found even I do not modify matrices there are problems. Just make and run with mpirun -np 1 gasm_test -ksp_monitor -n 64 and mpirun -np 2 gasm_test -ksp_monitor -n 64 You will see how the ASM gives the same result in the two runs. Then, do a replacement for ASM->GASM, make and run again like before. I saw different results from two runs. thank you, Hui On Feb 20, 2012, at 3:06 PM, Dmitry Karpeev wrote: > > > On Mon, Feb 20, 2012 at 12:59 AM, Hui Zhang wrote: > > On Feb 20, 2012, at 12:41 AM, Dmitry Karpeev wrote: > >> >> >> On Sun, Feb 19, 2012 at 3:08 PM, Hui Zhang wrote: >> I have a new problem: the results from ASM and GASM are different and it seems >> GASM has something wrong with SetModifySubMatrices. Numerical tests are with >> each subdomain supported only by one subdomain. There are no problems when >> I did not modify submatrices. But when I modify submatrices, there are problems >> with GASM but no problems with ASM. >> >> For example, I use two subdomains. In the first case each subdomain is supported by >> one processor and there seems no problem with GASM. But when I use run my program >> with only one proc. so that it supports both of the two subdomains, the iteration >> number is different from the first case and is much larger. On the other hand >> ASM has no such problem. >> >> Are the solutions the same? >> What problem are you solving? > > Yes, the solutions are the same. That's why ASM gives the same results with one or > two processors. But GASM did not. > Sorry, I wasn't clear: ASM and GASM produced different solutions in the case of two domains per processor? > I'm solving the Helmholtz equation. Maybe > I can prepare a simpler example to show this difference. > That would be helpful. > Thanks. > > Dmitry. > >> >> Dmitry. >> >> >> On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: >> >>> You should be able to. >>> This behavior is the same as in PCASM, >>> except in GASM the matrices live on subcommunicators. >>> I am in transit right now, but I can take a closer look in Friday. >>> >>> Dmitry >>> >>> >>> >>> On Feb 15, 2012, at 8:07, Hui Zhang wrote: >>> >>>> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >>>> >>>>> Hi Dmitry, >>>>> >>>>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>>>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>>>> >>>>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>>>> >>>>> I think the coloumns from the parameter 'col' are always the same as the rows >>>>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>>>> index sets but not rows and columns. Has I misunderstood something? 
>>>> >>>> As I tested, the row and col are always the same. >>>> >>>> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >>>> in the above func()? >>>> >>>> thanks, >>>> Hui >>>> >>>>> >>>>> thanks, >>>>> Hui >>>>> >>>>> >>>>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>>>> >>>>>> Yes, that's right. >>>>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>>>> It is expected that they are generated from mesh subdomains. >>>>>> Each IS does carry the subdomains subcomm. >>>>>> >>>>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>>>> each having the indices with the same color and the subcomm that supports that color. It is >>>>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>>>> >>>>>> Dmitry. >>>>>> >>>>>> >>>>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>>>> >>>>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>>>> >>>>>> Thanks, >>>>>> Hui >>>>>> >>>>>> >>>>> >>>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: gasm_test.c Type: application/octet-stream Size: 10102 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: application/octet-stream Size: 162 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Mon Feb 20 11:30:08 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Mon, 20 Feb 2012 18:30:08 +0100 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: For reference, my results are attached. asm1.txt for asm with 1 process, asm2.txt for asm with 2 processes, gasm1.txt for gasm with 1 process, (with the iteration numbers different from others) gasm2.txt for gasm with 2 processes thank you, Hui On Feb 20, 2012, at 3:06 PM, Dmitry Karpeev wrote: > > > On Mon, Feb 20, 2012 at 12:59 AM, Hui Zhang wrote: > > On Feb 20, 2012, at 12:41 AM, Dmitry Karpeev wrote: > >> >> >> On Sun, Feb 19, 2012 at 3:08 PM, Hui Zhang wrote: >> I have a new problem: the results from ASM and GASM are different and it seems >> GASM has something wrong with SetModifySubMatrices. Numerical tests are with >> each subdomain supported only by one subdomain. There are no problems when >> I did not modify submatrices. But when I modify submatrices, there are problems >> with GASM but no problems with ASM. >> >> For example, I use two subdomains. In the first case each subdomain is supported by >> one processor and there seems no problem with GASM. But when I use run my program >> with only one proc. so that it supports both of the two subdomains, the iteration >> number is different from the first case and is much larger. On the other hand >> ASM has no such problem. >> >> Are the solutions the same? >> What problem are you solving? > > Yes, the solutions are the same. 
That's why ASM gives the same results with one or > two processors. But GASM did not. > Sorry, I wasn't clear: ASM and GASM produced different solutions in the case of two domains per processor? > I'm solving the Helmholtz equation. Maybe > I can prepare a simpler example to show this difference. > That would be helpful. > Thanks. > > Dmitry. > >> >> Dmitry. >> >> >> On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: >> >>> You should be able to. >>> This behavior is the same as in PCASM, >>> except in GASM the matrices live on subcommunicators. >>> I am in transit right now, but I can take a closer look in Friday. >>> >>> Dmitry >>> >>> >>> >>> On Feb 15, 2012, at 8:07, Hui Zhang wrote: >>> >>>> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >>>> >>>>> Hi Dmitry, >>>>> >>>>> thanks a lot! Currently, I'm not using ISColoring. Just comes another question >>>>> on PCGASMSetModifySubMatrices(). The user provided function has the prototype >>>>> >>>>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>>>> >>>>> I think the coloumns from the parameter 'col' are always the same as the rows >>>>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only accepts >>>>> index sets but not rows and columns. Has I misunderstood something? >>>> >>>> As I tested, the row and col are always the same. >>>> >>>> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the submat's >>>> in the above func()? >>>> >>>> thanks, >>>> Hui >>>> >>>>> >>>>> thanks, >>>>> Hui >>>>> >>>>> >>>>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>>>> >>>>>> Yes, that's right. >>>>>> There is no good way to help the user assemble the subdomains at the moment beyond the 2D stuff. >>>>>> It is expected that they are generated from mesh subdomains. >>>>>> Each IS does carry the subdomains subcomm. >>>>>> >>>>>> There is ISColoringToList() that is supposed to convert a "coloring" of indices to an array of ISs, >>>>>> each having the indices with the same color and the subcomm that supports that color. It is >>>>>> largely untested, though. You could try using it and give us feedback on any problems you encounter. >>>>>> >>>>>> Dmitry. >>>>>> >>>>>> >>>>>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang wrote: >>>>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain supported by >>>>>> multiple processors, shall I always create the arguments 'is[s]' and 'is_local[s]' >>>>>> in a subcommunicator consisting of processors supporting the subdomain 's'? >>>>>> >>>>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>>>> >>>>>> Thanks, >>>>>> Hui >>>>>> >>>>>> >>>>> >>>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: asm1.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: asm2.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: gasm1.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: gasm2.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From karpeev at mcs.anl.gov Mon Feb 20 11:38:20 2012 From: karpeev at mcs.anl.gov (Dmitry Karpeev) Date: Mon, 20 Feb 2012 11:38:20 -0600 Subject: [petsc-users] PCGASMSetLocalSubdomains In-Reply-To: References: <7E09CD1C-EE42-41B3-9107-20B3C887CA8A@gmail.com> Message-ID: Okay, thanks. I'll take a look. Dmitry. On Mon, Feb 20, 2012 at 11:30 AM, Hui Zhang wrote: > For reference, my results are attached. > > asm1.txt for asm with 1 process, > asm2.txt for asm with 2 processes, > gasm1.txt for gasm with 1 process, (with the iteration numbers different > from others) > gasm2.txt for gasm with 2 processes > > > > > > > thank you, > Hui > > On Feb 20, 2012, at 3:06 PM, Dmitry Karpeev wrote: > > > > On Mon, Feb 20, 2012 at 12:59 AM, Hui Zhang wrote: > >> >> On Feb 20, 2012, at 12:41 AM, Dmitry Karpeev wrote: >> >> >> >> On Sun, Feb 19, 2012 at 3:08 PM, Hui Zhang wrote: >> >>> I have a new problem: the results from ASM and GASM are different and it >>> seems >>> GASM has something wrong with SetModifySubMatrices. Numerical tests are >>> with >>> each subdomain supported only by one subdomain. There are no problems >>> when >>> I did not modify submatrices. But when I modify submatrices, there are >>> problems >>> with GASM but no problems with ASM. >>> >>> For example, I use two subdomains. In the first case each subdomain is >>> supported by >>> one processor and there seems no problem with GASM. But when I use run >>> my program >>> with only one proc. so that it supports both of the two subdomains, the >>> iteration >>> number is different from the first case and is much larger. On the >>> other hand >>> ASM has no such problem. >>> >> >> Are the solutions the same? >> What problem are you solving? >> >> >> Yes, the solutions are the same. That's why ASM gives the same results >> with one or >> two processors. But GASM did not. >> > Sorry, I wasn't clear: ASM and GASM produced different solutions in the > case of two domains per processor? > >> I'm solving the Helmholtz equation. Maybe >> I can prepare a simpler example to show this difference. >> > That would be helpful. > Thanks. > > Dmitry. > >> >> >> Dmitry. >> >>> >>> >>> On Feb 15, 2012, at 6:46 PM, Dmitry Karpeev wrote: >>> >>> You should be able to. >>> This behavior is the same as in PCASM, >>> except in GASM the matrices live on subcommunicators. >>> I am in transit right now, but I can take a closer look in Friday. >>> >>> Dmitry >>> >>> >>> >>> On Feb 15, 2012, at 8:07, Hui Zhang wrote: >>> >>> On Feb 15, 2012, at 11:19 AM, Hui Zhang wrote: >>> >>> Hi Dmitry, >>> >>> thanks a lot! Currently, I'm not using ISColoring. Just comes another >>> question >>> on PCGASMSetModifySubMatrices(). The user provided function has the >>> prototype >>> >>> func (PC pc,PetscInt nsub,IS *row,IS *col,Mat *submat,void *ctx); >>> >>> I think the coloumns from the parameter 'col' are always the same as >>> the rows >>> from the parameter 'row'. Because PCGASMSetLocalSubdomains() only >>> accepts >>> index sets but not rows and columns. Has I misunderstood something? >>> >>> >>> As I tested, the row and col are always the same. >>> >>> I have a new question. Am I allowed to SetLocalToGlobalMapping() for the >>> submat's >>> in the above func()? >>> >>> thanks, >>> Hui >>> >>> >>> thanks, >>> Hui >>> >>> >>> On Feb 11, 2012, at 3:36 PM, Dmitry Karpeev wrote: >>> >>> Yes, that's right. >>> There is no good way to help the user assemble the subdomains at the >>> moment beyond the 2D stuff. >>> It is expected that they are generated from mesh subdomains. 
>>> Each IS does carry the subdomains subcomm. >>> >>> There is ISColoringToList() that is supposed to convert a "coloring" of >>> indices to an array of ISs, >>> each having the indices with the same color and the subcomm that >>> supports that color. It is >>> largely untested, though. You could try using it and give us feedback >>> on any problems you encounter. >>> >>> Dmitry. >>> >>> >>> On Sat, Feb 11, 2012 at 6:06 AM, Hui Zhang < >>> mike.hui.zhang at hotmail.com> wrote: >>> >>>> About PCGASMSetLocalSubdomains(), in the case of one subdomain >>>> supported by >>>> multiple processors, shall I always create the arguments 'is[s]' and >>>> 'is_local[s]' >>>> in a subcommunicator consisting of processors supporting the subdomain >>>> 's'? >>>> >>>> The source code of PCGASMCreateSubdomains2D() seemingly does so. >>>> >>>> Thanks, >>>> Hui >>>> >>>> >>> >>> >>> >>> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Mon Feb 20 12:35:44 2012 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 20 Feb 2012 19:35:44 +0100 Subject: [petsc-users] Unhandled exception at 0x00000001408cca07 ... In-Reply-To: References: <4F3E3619.8090705@gmail.com> Message-ID: <4F429280.3050407@gmail.com> Hi, I realised that it is because I did not remove the /cvf flag after converting the project from Compaq visual fortran to Visual studio. Yours sincerely, TAY wee-beng On 17/2/2012 6:27 PM, Matthew Knepley wrote: > On Fri, Feb 17, 2012 at 5:12 AM, TAY wee-beng > wrote: > > Hi, > > I'm runing my CFD code in windows visual studio 2008 with ifort > 64bit mpich2 64bit. > > I managed to build the PETSc library after doing some > modifications - [petsc-maint #105754] Error : Cannot determine > Fortran module include flag. > > There is no error building my CFD code. > > However, when running my code, I got the error at: > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > It jumps to : > > *ierr = PetscMemzero(name,256); if (*ierr) return; > > May I know what's wrong? It was working when I run it in compaq > visual fortran under windows xp 32bit. > > > Can you get a stack trace? Without that, we really cannot figure out > what is going on. This does not happen > on our test machine. > > Matt > > > -- > Yours sincerely, > > TAY wee-beng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rudolph at berkeley.edu Mon Feb 20 14:05:17 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Mon, 20 Feb 2012 12:05:17 -0800 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: Hi Dave, Thanks for your help. Max > Hey Max, > > Without knowing anything about the specific application related to > your Stokes problem, or information about the mesh you are using, I > have a couple of questions and suggestions which might help. The test case that I am working with is isoviscous convection, benchmark case 1a from Blankenbach 1989. > 1) If A, is your stokes operator A = ( K,B ; B^T, 0 ), what is your > precondition operator? > Specifically, what is in the (2,2) slot in the precondioner? - i.e. > what matrix are you you applying -stokes_fieldsplit_1_pc_type jacobi > -stokes_fieldsplit_1_ksp_type preonly to? > Is it the identity as in the SpeedUp notes? I think that this is the problem. 
The (2,2) slot in the LHS matrix is all zero (pressure does not appear in the continuity equation), so I think that the preconditioner is meaningless. I am still confused as to why this choice of preconditioner was suggested in the tutorial, and what is a better choice of preconditioner for this block? Should I be using one of the Schur complement methods instead of the additive or multiplicative field split? > 2) This choice > -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly > may simply not be a very effective and degrade the performance of the > outer solver. > I'd make the solver for the operator in the (1,1) slot much stronger, > for example > -stokes_fieldsplit_0_ksp_type gmres > -stokes_fieldsplit_0_ksp_rtol 1.0e-4 > -stokes_fieldsplit_0_mg_levels_ksp_type gmres > -stokes_fieldsplit_0_mg_levels_pc_type bjacobi > -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 > > Add a monitor on this solver (-stokes_fieldsplit_0_ksp_XXX) to see how > ML is doing. > > 3) Using -stokes_pc_fieldsplit_type MULTIPLICATIVE should reduce the > number of outer iterations by a factor of two, but it will use more > memory. > 4) You should use a flexible Krylov method on the outer most solve > (-stokes_ksp_XXX) as the preconditioner is varying between each outer > iteration. Use -stokes_ksp_type fgmres or -stokes_ksp_type gcr Thanks for pointing this out. I made that change. > 5) Depending on how the physical problem is scaled > (non-dimensionalised), the size of the residuals associated with the > momentum and continuity equation make be quite different. You are > currently use the entire residual from (u,p) to determine when to stop > iterating. You might want to consider writing a monitor which examines > the these residuals independently. I think that I have scaled the problem correctly. I (slowly) obtain a sufficiently accurate solution using as options only: -stokes_ksp_atol 1e-5 -stokes_ksp_rtol 1e-5 -stokes_ksp_monitor_true_residual -stokes_ksp_norm_type UNPRECONDITIONED > > > Cheers, > Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 20 14:23:59 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 14:23:59 -0600 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 2:05 PM, Max Rudolph wrote: > Hi Dave, > Thanks for your help. > > Max > > Hey Max, > > Without knowing anything about the specific application related to > your Stokes problem, or information about the mesh you are using, I > have a couple of questions and suggestions which might help. > > > The test case that I am working with is isoviscous convection, benchmark > case 1a from Blankenbach 1989. > > 1) If A, is your stokes operator A = ( K,B ; B^T, 0 ), what is your > precondition operator? > Specifically, what is in the (2,2) slot in the precondioner? - i.e. > what matrix are you you applying -stokes_fieldsplit_1_pc_type jacobi > -stokes_fieldsplit_1_ksp_type preonly to? > Is it the identity as in the SpeedUp notes? > > > I think that this is the problem. The (2,2) slot in the LHS matrix is all > zero (pressure does not appear in the continuity equation), so I think that > the preconditioner is meaningless. I am still confused as to why this > choice of preconditioner was suggested in the tutorial, and what is a > better choice of preconditioner for this block? 
Should I be using one of > the Schur complement methods instead of the additive or multiplicative > field split? > Its not suggested, it is demonstrated. Its the first logical choice, since Jacobi gives the identity for a 0 block (see http://www.jstor.org/pss/2158202). Its not meaningless. All the better preconditioners involve either a Schur complement (also shown in the tutorial), or an auxiliary operator which is more difficult to setup and thus not shown. > 2) This choice > -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly > may simply not be a very effective and degrade the performance of the > outer solver. > I'd make the solver for the operator in the (1,1) slot much stronger, > for example > -stokes_fieldsplit_0_ksp_type gmres > -stokes_fieldsplit_0_ksp_rtol 1.0e-4 > -stokes_fieldsplit_0_mg_levels_ksp_type gmres > -stokes_fieldsplit_0_mg_levels_pc_type bjacobi > -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 > > Add a monitor on this solver (-stokes_fieldsplit_0_ksp_XXX) to see how > ML is doing. > > 3) Using -stokes_pc_fieldsplit_type MULTIPLICATIVE should reduce the > number of outer iterations by a factor of two, but it will use more > memory. > > 4) You should use a flexible Krylov method on the outer most solve > (-stokes_ksp_XXX) as the preconditioner is varying between each outer > iteration. Use -stokes_ksp_type fgmres or -stokes_ksp_type gcr > > > Thanks for pointing this out. I made that change. > > 5) Depending on how the physical problem is scaled > (non-dimensionalised), the size of the residuals associated with the > momentum and continuity equation make be quite different. You are > currently use the entire residual from (u,p) to determine when to stop > iterating. You might want to consider writing a monitor which examines > the these residuals independently. > > > I think that I have scaled the problem correctly. I (slowly) obtain a > sufficiently accurate solution using as options only: > -stokes_ksp_atol 1e-5 -stokes_ksp_rtol 1e-5 > -stokes_ksp_monitor_true_residual -stokes_ksp_norm_type UNPRECONDITIONED > How do you know the problem is scaled correctly? Have you looked at norms of the residuals for the two systems? Thanks, Matt > Cheers, > Dave > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Feb 20 16:18:46 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 20 Feb 2012 16:18:46 -0600 Subject: [petsc-users] partition an MPIAIJ matrix Message-ID: Dear PETSc Developers, I have an MPIAIJ matrix, I want to use MatPartitioning functions to repartition the matrix. Before doing this, I need to do MatCreateMPIAdj to generate an Adj Mat. Could you give me some hints about how to get "i" and "j" ( http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateMPIAdj.html) from existing MPIAIJ matrix? Thank you very much, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 20 16:20:49 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 20 Feb 2012 16:20:49 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 16:18, recrusader wrote: > Dear PETSc Developers, > > I have an MPIAIJ matrix, I want to use MatPartitioning functions to > repartition the matrix. 
> Before doing this, I need to do MatCreateMPIAdj to generate an Adj Mat. > Could you give me some hints about how to get "i" and "j" ( > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateMPIAdj.html) > from existing MPIAIJ matrix? MatConvert(), though some of the partitioning routines can do the conversion transparently on their own. -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Feb 20 16:27:51 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 20 Feb 2012 16:27:51 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: Message-ID: Dear Jed, Thank you very much for your prompt reply. Actually, I want to reorder MPIAIJ matrix using Parmetis. The reordering function needs the same information with the partitioning, that is "i" and "j". I need to directly get them and then use the function in Parmetis. Thanks a lot, Yujie On Mon, Feb 20, 2012 at 4:20 PM, Jed Brown wrote: > On Mon, Feb 20, 2012 at 16:18, recrusader wrote: > >> Dear PETSc Developers, >> >> I have an MPIAIJ matrix, I want to use MatPartitioning functions to >> repartition the matrix. >> Before doing this, I need to do MatCreateMPIAdj to generate an Adj Mat. >> Could you give me some hints about how to get "i" and "j" ( >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateMPIAdj.html) >> from existing MPIAIJ matrix? > > > MatConvert(), though some of the partitioning routines can do the > conversion transparently on their own. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bishtg at ornl.gov Mon Feb 20 16:25:59 2012 From: bishtg at ornl.gov (Bisht, Gautam) Date: Mon, 20 Feb 2012 17:25:59 -0500 Subject: [petsc-users] Help with MatTranspose Message-ID: <48ED016D-AC4D-438C-8716-F588B5038512@ornl.gov> Hi, I have a MPIAIJ matrix A as listed below: row 0: (36, 1) (37, 1) (42, 1) (43, 1) row 1: (37, 1) (38, 1) (43, 1) (44, 1) row 2: (38, 1) (39, 1) (44, 1) (45, 1) row 3: (39, 1) (40, 1) (45, 1) (46, 1) row 4: (40, 1) (41, 1) (46, 1) (47, 1) When I try to transpose the matrix, while running on 2 processors, the code crashes. Attached below is the F90 code. I would appreciate any help in figuring out why the code crashes. Thanks, -Gautam. -------------- next part -------------- A non-text attachment was scrubbed... Name: mattrans.F90 Type: application/octet-stream Size: 2426 bytes Desc: mattrans.F90 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ATT00001.txt URL: From jedbrown at mcs.anl.gov Mon Feb 20 16:31:51 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 20 Feb 2012 16:31:51 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 16:27, recrusader wrote: > Thank you very much for your prompt reply. > Actually, I want to reorder MPIAIJ matrix using Parmetis. The reordering > function needs the same information with the partitioning, that is "i" and > "j". > Are you just redistributing the matrix? Or other data? > I need to directly get them and then use the function in Parmetis. > MatConvert() -------------- next part -------------- An HTML attachment was scrubbed... 
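For readers following this exchange, a minimal sketch of driving MatPartitioning from an ordinary assembled AIJ matrix, which is the path being discussed here; A is assumed to be an existing MPIAIJ matrix and the surrounding function body is omitted:

  #include <petscmat.h>

  MatPartitioning part;
  IS              rowPart, newNumbering;
  PetscErrorCode  ierr;

  ierr = MatPartitioningCreate(PETSC_COMM_WORLD, &part);    CHKERRQ(ierr);
  ierr = MatPartitioningSetAdjacency(part, A);              CHKERRQ(ierr); /* an AIJ matrix is accepted directly */
  ierr = MatPartitioningSetFromOptions(part);               CHKERRQ(ierr); /* e.g. -mat_partitioning_type parmetis */
  ierr = MatPartitioningApply(part, &rowPart);              CHKERRQ(ierr); /* target rank of each local row */
  ierr = ISPartitioningToNumbering(rowPart, &newNumbering); CHKERRQ(ierr); /* contiguous new global numbering */
  ierr = ISDestroy(&rowPart);            CHKERRQ(ierr);
  ierr = ISDestroy(&newNumbering);       CHKERRQ(ierr);
  ierr = MatPartitioningDestroy(&part);  CHKERRQ(ierr);

Note that a partitioning is not the same thing as a fill-reducing ordering; for the latter, MatGetOrdering() (or the ordering routines of ParMETIS applied to the adjacency structure) is the relevant entry point.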
URL: From rudolph at berkeley.edu Mon Feb 20 16:36:40 2012 From: rudolph at berkeley.edu (Max Rudolph) Date: Mon, 20 Feb 2012 14:36:40 -0800 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: Matt, Thank you for your help. > On Mon, Feb 20, 2012 at 2:05 PM, Max Rudolph wrote: > > > Hi Dave, > > Thanks for your help. > > > > Max > > > > Hey Max, > > > > Without knowing anything about the specific application related to > > your Stokes problem, or information about the mesh you are using, I > > have a couple of questions and suggestions which might help. > > > > > > The test case that I am working with is isoviscous convection, benchmark > > case 1a from Blankenbach 1989. > > > > 1) If A, is your stokes operator A = ( K,B ; B^T, 0 ), what is your > > precondition operator? > > Specifically, what is in the (2,2) slot in the precondioner? - i.e. > > what matrix are you you applying -stokes_fieldsplit_1_pc_type jacobi > > -stokes_fieldsplit_1_ksp_type preonly to? > > Is it the identity as in the SpeedUp notes? > > > > > > I think that this is the problem. The (2,2) slot in the LHS matrix is all > > zero (pressure does not appear in the continuity equation), so I think that > > the preconditioner is meaningless. I am still confused as to why this > > choice of preconditioner was suggested in the tutorial, and what is a > > better choice of preconditioner for this block? Should I be using one of > > the Schur complement methods instead of the additive or multiplicative > > field split? > > > > Its not suggested, it is demonstrated. Its the first logical choice, since > Jacobi gives the identity for a 0 block (see > http://www.jstor.org/pss/2158202). Its > not meaningless. All the better preconditioners involve either a Schur > complement (also shown in the tutorial), or an auxiliary operator which is > more > difficult to setup and thus not shown. Thank you for clarifying this. > > > 2) This choice > > -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly > > may simply not be a very effective and degrade the performance of the > > outer solver. > > I'd make the solver for the operator in the (1,1) slot much stronger, > > for example > > -stokes_fieldsplit_0_ksp_type gmres > > -stokes_fieldsplit_0_ksp_rtol 1.0e-4 > > -stokes_fieldsplit_0_mg_levels_ksp_type gmres > > -stokes_fieldsplit_0_mg_levels_pc_type bjacobi > > -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 > > > > Add a monitor on this solver (-stokes_fieldsplit_0_ksp_XXX) to see how > > ML is doing. > > > > 3) Using -stokes_pc_fieldsplit_type MULTIPLICATIVE should reduce the > > number of outer iterations by a factor of two, but it will use more > > memory. > > > > 4) You should use a flexible Krylov method on the outer most solve > > (-stokes_ksp_XXX) as the preconditioner is varying between each outer > > iteration. Use -stokes_ksp_type fgmres or -stokes_ksp_type gcr > > > > > > Thanks for pointing this out. I made that change. > > > > 5) Depending on how the physical problem is scaled > > (non-dimensionalised), the size of the residuals associated with the > > momentum and continuity equation make be quite different. You are > > currently use the entire residual from (u,p) to determine when to stop > > iterating. You might want to consider writing a monitor which examines > > the these residuals independently. > > > > > > I think that I have scaled the problem correctly. 
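A sketch of the kind of monitor suggested in item 5 above, which is presumably how the separate residual norms quoted below were obtained. It assumes the unknowns are stored interlaced with block size 3, components (u, v, p); for a different layout the VecStrideNorm() calls would be replaced by norms taken over the appropriate index sets:

  #include <petscksp.h>

  /* Print the residual norm of each field separately at every Krylov iteration. */
  PetscErrorCode SplitResidualMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
  {
    Mat            A;
    Vec            b, x, r;
    PetscReal      nu, nv, np;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = KSPGetOperators(ksp, &A, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);
    ierr = KSPGetRhs(ksp, &b); CHKERRQ(ierr);
    ierr = KSPBuildSolution(ksp, PETSC_NULL, &x); CHKERRQ(ierr); /* current iterate */
    ierr = VecDuplicate(b, &r); CHKERRQ(ierr);
    ierr = MatMult(A, x, r); CHKERRQ(ierr);
    ierr = VecAYPX(r, -1.0, b); CHKERRQ(ierr);                   /* r = b - A*x */
    ierr = VecStrideNorm(r, 0, NORM_2, &nu); CHKERRQ(ierr);
    ierr = VecStrideNorm(r, 1, NORM_2, &nv); CHKERRQ(ierr);
    ierr = VecStrideNorm(r, 2, NORM_2, &np); CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "%D KSP X, Y, P residual norms %e, %e, %e\n",
                       it, (double)nu, (double)nv, (double)np); CHKERRQ(ierr);
    ierr = VecDestroy(&r); CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  /* registered once before KSPSolve(), e.g.
     ierr = KSPMonitorSet(ksp, SplitResidualMonitor, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr); */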
I (slowly) obtain a > > sufficiently accurate solution using as options only: > > -stokes_ksp_atol 1e-5 -stokes_ksp_rtol 1e-5 > > -stokes_ksp_monitor_true_residual -stokes_ksp_norm_type UNPRECONDITIONED > > > > How do you know the problem is scaled correctly? Have you looked at norms > of the residuals for the two systems > Thanks, > > Matt > > > > Cheers, > > Dave Yes, here are the norms computed for the P, X, and Y components, following the last residual that ksp_monitor_true_residual returned: 383 KSP unpreconditioned resid norm 1.121628211019e-03 true resid norm 1.121628224178e-03 ||r(i)||/||b|| 9.626787321554e-10 P, X, Y residual norms 5.340336e-02, 4.463404e-02, 2.509621e-02 -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Feb 20 16:45:01 2012 From: recrusader at gmail.com (Recrusader) Date: Mon, 20 Feb 2012 16:45:01 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: Message-ID: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Dear Jed, It is not only redistribute the matrix. I want to get the reorder information and use matpermute to reorder the matrix. Thanks Yujie On Feb 20, 2012, at 4:31 PM, Jed Brown wrote: > On Mon, Feb 20, 2012 at 16:27, recrusader wrote: > Thank you very much for your prompt reply. > Actually, I want to reorder MPIAIJ matrix using Parmetis. The reordering function needs the same information with the partitioning, that is "i" and "j". > > Are you just redistributing the matrix? Or other data? > > I need to directly get them and then use the function in Parmetis. > > MatConvert() -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 20 16:45:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 20 Feb 2012 16:45:35 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> References: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Message-ID: On Mon, Feb 20, 2012 at 16:45, Recrusader wrote: > It is not only redistribute the matrix. I want to get the reorder > information and use matpermute to reorder the matrix. MatConvert(). ISPartitioningToNumbering() might also be useful for you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.alken at Colorado.EDU Mon Feb 20 16:52:29 2012 From: patrick.alken at Colorado.EDU (Patrick Alken) Date: Mon, 20 Feb 2012 15:52:29 -0700 Subject: [petsc-users] convergence problem in spherical coordinates Message-ID: <4F42CEAD.2050501@colorado.edu> Hello all, I am having great difficulty solving a 3D finite difference equation in spherical coordinates. I am solving the equation in a spherical shell region S(a,b), with the boundary conditions being that the function is 0 on both boundaries (r = a and r = b). I haven't imposed any boundary conditions on theta or phi which may be a reason its not converging. The phi boundary condition would be that the function is periodic in phi, but I don't know if this needs to be put into the matrix somehow? I nondimensionalized the equation before solving which helped a little bit. I've also scaled the matrix and RHS vectors by their maximum element to make all entries <= 1. I've tried both direct and iterative solvers. The direct solvers give a fairly accurate solution for small grids but seem unstable for larger grids. The PETSc iterative solvers converge for very small grids but for medium to large grids don't converge at all. 
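Regarding the null-space question raised near the end of this message, a common first check is MatNullSpaceTest(); the sketch below uses the constant vector purely as an illustration, with A and ksp assumed to be the assembled matrix and its solver, and the calls spelled as in the PETSc 3.2-era interface:

  #include <petscksp.h>

  MatNullSpace   nullsp;
  PetscBool      isNull;
  PetscErrorCode ierr;

  ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp); CHKERRQ(ierr);
  ierr = MatNullSpaceTest(nullsp, A, &isNull); CHKERRQ(ierr);  /* does A annihilate the constant vector? */
  if (isNull) { ierr = KSPSetNullSpace(ksp, nullsp); CHKERRQ(ierr); }
  ierr = MatNullSpaceDestroy(&nullsp); CHKERRQ(ierr);

A suspected non-constant null-space vector can be tested the same way by passing it in the vecs argument of MatNullSpaceCreate().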
When running with the command (for a small grid): *> ./main -ksp_converged_reason -ksp_monitor_true_residual -pc_type svd -pc_svd_monitor* I get the output: SVD: condition number 5.929088512946e+03, 0 of 1440 singular values are (nearly) zero SVD: smallest singular values: 2.742809162118e-04 2.807446554985e-04 1.548488288425e-03 1.852332719983e-03 2.782708934678e-03 SVD: largest singular values : 1.590835571953e+00 1.593368145758e+00 1.595771695877e+00 1.623691828398e+00 1.626235829632e+00 0 KSP preconditioned resid norm 2.154365616645e+03 true resid norm 8.365589263063e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.832753933427e-10 true resid norm 4.587845792963e-12 ||r(i)||/||b|| 5.484187244549e-13 Linear solve converged due to CONVERGED_RTOL iterations 1 When plotting the output of this SVD solution, it looks pretty good, but svd isn't practical for larger grids. Using the command (on the same grid): *> ./main -ksp_converged_reason -ksp_monitor_true_residual -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none* The output is attached. There do not appear to be any 0 eigenvalues. The solution here is much less accurate than the SVD case since it didn't converge. I've also tried the -ksp_diagonal_scale -ksp_diagonal_scale_fix options which don't help very much. Any advice on how to trouble shoot this would be greatly appreciated. Some things I've checked already: 1) there aren't any 0 rows in the matrix 2) using direct solvers on very small grids seems to give decent solutions 3) there don't appear to be any 0 singular values or eigenvalues Perhaps the matrix has a null space, but I don't know how I would find out what the null space is? Is there a tutorial on how to do this? Thanks in advance! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: output.txt URL: From knepley at gmail.com Mon Feb 20 17:16:07 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 17:16:07 -0600 Subject: [petsc-users] Help with MatTranspose In-Reply-To: <48ED016D-AC4D-438C-8716-F588B5038512@ornl.gov> References: <48ED016D-AC4D-438C-8716-F588B5038512@ornl.gov> Message-ID: On Mon, Feb 20, 2012 at 4:25 PM, Bisht, Gautam wrote: > Hi, > > I have a MPIAIJ matrix A as listed below: > > row 0: (36, 1) (37, 1) (42, 1) (43, 1) > row 1: (37, 1) (38, 1) (43, 1) (44, 1) > row 2: (38, 1) (39, 1) (44, 1) (45, 1) > row 3: (39, 1) (40, 1) (45, 1) (46, 1) > row 4: (40, 1) (41, 1) (46, 1) (47, 1) > > When I try to transpose the matrix, while running on 2 processors, the > code crashes. Attached below is the F90 code. I would appreciate any help > in figuring out why the code crashes. 
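The mattrans.F90 attachment is not preserved in this archive, so purely as a stand-in, the operation under discussion reduces to the following call sequence (shown here in C, with A assumed to be the assembled rectangular MPIAIJ matrix listed above):

  #include <petscmat.h>

  Mat            At;
  PetscErrorCode ierr;

  ierr = MatTranspose(A, MAT_INITIAL_MATRIX, &At); CHKERRQ(ierr); /* out-of-place transpose */
  ierr = MatView(At, PETSC_VIEWER_STDOUT_WORLD);   CHKERRQ(ierr);
  ierr = MatDestroy(&At);                          CHKERRQ(ierr);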
> Runs fine for me: knepley:/PETSc3/petsc/petsc-dev$ /PETSc3/petsc/petsc-dev/arch-c-f90-debug/bin/mpiexec -n 2 /PETSc3/petsc/petsc-dev/arch-c-f90-debug/lib/mattrans-obj/mattrans -mat_view Matrix Object: 1 MPI processes type: mpiaij row 0: (36, 1) (37, 1) (42, 1) (43, 1) row 1: (37, 1) (38, 1) (43, 1) (44, 1) row 2: (38, 1) (39, 1) (44, 1) (45, 1) row 3: (39, 1) (40, 1) (45, 1) (46, 1) row 4: (40, 1) (41, 1) (46, 1) (47, 1) Matrix Object: 1 MPI processes type: mpiaij row 0: (36, 1) (37, 1) (42, 1) (43, 1) row 1: (37, 1) (38, 1) (43, 1) (44, 1) row 2: (38, 1) (39, 1) (44, 1) (45, 1) row 3: (39, 1) (40, 1) (45, 1) (46, 1) row 4: (40, 1) (41, 1) (46, 1) (47, 1) Matrix Object: 1 MPI processes type: mpiaij row 0: row 1: row 2: row 3: row 4: row 5: row 6: row 7: row 8: row 9: row 10: row 11: row 12: row 13: row 14: row 15: row 16: row 17: row 18: row 19: row 20: row 21: row 22: row 23: row 24: row 25: row 26: row 27: row 28: row 29: row 30: row 31: row 32: row 33: row 34: row 35: row 36: (0, 1) row 37: (0, 1) (1, 1) row 38: (1, 1) (2, 1) row 39: (2, 1) (3, 1) row 40: (3, 1) (4, 1) row 41: (4, 1) row 42: (0, 1) row 43: (0, 1) (1, 1) row 44: (1, 1) (2, 1) row 45: (2, 1) (3, 1) row 46: (3, 1) (4, 1) row 47: (4, 1) Matrix Object: 1 MPI processes type: mpiaij row 0: row 1: row 2: row 3: row 4: row 5: row 6: row 7: row 8: row 9: row 10: row 11: row 12: row 13: row 14: row 15: row 16: row 17: row 18: row 19: row 20: row 21: row 22: row 23: row 24: row 25: row 26: row 27: row 28: row 29: row 30: row 31: row 32: row 33: row 34: row 35: row 36: (0, 1) row 37: (0, 1) (1, 1) row 38: (1, 1) (2, 1) row 39: (2, 1) (3, 1) row 40: (3, 1) (4, 1) row 41: (4, 1) row 42: (0, 1) row 43: (0, 1) (1, 1) row 44: (1, 1) (2, 1) row 45: (2, 1) (3, 1) row 46: (3, 1) (4, 1) row 47: (4, 1) Matt > Thanks, > -Gautam. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 20 17:17:52 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 17:17:52 -0600 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 4:36 PM, Max Rudolph wrote: > Matt, > > Thank you for your help. > > On Mon, Feb 20, 2012 at 2:05 PM, Max Rudolph > wrote: > > >* Hi Dave,*>* Thanks for your help.*>**>* Max*>**>* Hey Max,*>**>* Without knowing anything about the specific application related to*>* your Stokes problem, or information about the mesh you are using, I*>* have a couple of questions and suggestions which might help.*>**>**>* The test case that I am working with is isoviscous convection, benchmark*>* case 1a from Blankenbach 1989.*>**>* 1) If A, is your stokes operator A = ( K,B ; B^T, 0 ), what is your*>* precondition operator?*>* Specifically, what is in the (2,2) slot in the precondioner? - i.e.*>* what matrix are you you applying -stokes_fieldsplit_1_pc_type jacobi*>* -stokes_fieldsplit_1_ksp_type preonly to?*>* Is it the identity as in the SpeedUp notes?*>**>**>* I think that this is the problem. The (2,2) slot in the LHS matrix is all*>* zero (pressure does not appear in the continuity equation), so I think that*>* the preconditioner is meaningless. I am still confused as to why this*>* choice of preconditioner was suggested in the tutorial, and what is a*>* better choice of preconditioner for this block? 
Should I be using one of*>* the Schur complement methods instead of the additive or multiplicative*>* field split?*>** > Its not suggested, it is demonstrated. Its the first logical choice, since > Jacobi gives the identity for a 0 block (seehttp://www.jstor.org/pss/2158202). Its > not meaningless. All the better preconditioners involve either a Schur > complement (also shown in the tutorial), or an auxiliary operator which is > more > difficult to setup and thus not shown. > > > Thank you for clarifying this. > > > >* 2) This choice*>* -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly*>* may simply not be a very effective and degrade the performance of the*>* outer solver.*>* I'd make the solver for the operator in the (1,1) slot much stronger,*>* for example*>* -stokes_fieldsplit_0_ksp_type gmres*>* -stokes_fieldsplit_0_ksp_rtol 1.0e-4*>* -stokes_fieldsplit_0_mg_levels_ksp_type gmres*>* -stokes_fieldsplit_0_mg_levels_pc_type bjacobi*>* -stokes_fieldsplit_0_mg_levels_ksp_max_it 4*>**>* Add a monitor on this solver (-stokes_fieldsplit_0_ksp_XXX) to see how*>* ML is doing.*>**>* 3) Using -stokes_pc_fieldsplit_type MULTIPLICATIVE should reduce the*>* number of outer iterations by a factor of two, but it will use more*>* memory.*>**>* 4) You should use a flexible Krylov method on the outer most solve*>* (-stokes_ksp_XXX) as the preconditioner is varying between each outer*>* iteration. Use -stokes_ksp_type fgmres or -stokes_ksp_type gcr*>**>**>* Thanks for pointing this out. I made that change.*>**>* 5) Depending on how the physical problem is scaled*>* (non-dimensionalised), the size of the residuals associated with the*>* momentum and continuity equation make be quite different. You are*>* currently use the entire residual from (u,p) to determine when to stop*>* iterating. You might want to consider writing a monitor which examines*>* the these residuals independently.*>**>**>* I think that I have scaled the problem correctly. I (slowly) obtain a*>* sufficiently accurate solution using as options only:*>* -stokes_ksp_atol 1e-5 -stokes_ksp_rtol 1e-5*>* -stokes_ksp_monitor_true_residual -stokes_ksp_norm_type UNPRECONDITIONED*>** > How do you know the problem is scaled correctly? Have you looked at norms > of the residuals for the two systems > > Thanks, > > Matt > > > >* Cheers,*>* Dave* > > > Yes, here are the norms computed for the P, X, and Y components, following > the last residual that ksp_monitor_true_residual returned: > > 383 KSP unpreconditioned resid norm 1.121628211019e-03 true resid norm > 1.121628224178e-03 ||r(i)||/||b|| 9.626787321554e-10 > P, X, Y residual norms 5.340336e-02, 4.463404e-02, 2.509621e-02 > I am more interested in the initial residuals. Thanks Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Mon Feb 20 17:25:53 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 20 Feb 2012 15:25:53 -0800 Subject: [petsc-users] check for NULL pointer in VecCreateGhost Message-ID: Hi, Is there an internal check to see if the 'ghost' pointer in VecCreateGhost() is NULL ? This can happen, for example, if the code is run in serial and the there are no ghost points (hence n_ghost = 0 and ghost = NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to crash if the corresponding pointer is NULL. 
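For context, a minimal sketch of the corner case being described; nlocal, nghost, ghosts and ao stand in for the application's own data and are not taken from the code discussed later in this thread:

  #include <petscvec.h>
  #include <petscao.h>

  PetscInt       nlocal = 0;            /* locally owned size (placeholder) */
  PetscInt       nghost = 0;            /* no ghost points on this process, e.g. a serial run */
  PetscInt      *ghosts = PETSC_NULL;   /* consequently the ghost array may be NULL */
  AO             ao;                    /* assumed to exist */
  Vec            v;
  PetscErrorCode ierr;

  /* Skip the AO translation when the list is empty, since the argument
     checking rejects a NULL array even if the count is zero. */
  if (nghost) { ierr = AOApplicationToPetsc(ao, nghost, ghosts); CHKERRQ(ierr); }
  ierr = VecCreateGhost(PETSC_COMM_WORLD, nlocal, PETSC_DECIDE, nghost, ghosts, &v); CHKERRQ(ierr);
  if (nghost) { ierr = AOPetscToApplication(ao, nghost, ghosts); CHKERRQ(ierr); }

As the rest of the thread shows, the guard on the AO calls is the part that matters; the behaviour for a zero-length NULL argument was later relaxed in petsc-dev.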
Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From maxwellr at gmail.com Mon Feb 20 17:34:00 2012 From: maxwellr at gmail.com (Max Rudolph) Date: Mon, 20 Feb 2012 15:34:00 -0800 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 3:17 PM, Matthew Knepley wrote: > On Mon, Feb 20, 2012 at 4:36 PM, Max Rudolph wrote: > >> Matt, >> >> Thank you for your help. >> >> >> On Mon, Feb 20, 2012 at 2:05 PM, Max Rudolph > wrote: >> >> >* Hi Dave,*>* Thanks for your help.*>**>* Max*>**>* Hey Max,*>**>* Without knowing anything about the specific application related to*>* your Stokes problem, or information about the mesh you are using, I*>* have a couple of questions and suggestions which might help.*>**>**>* The test case that I am working with is isoviscous convection, benchmark*>* case 1a from Blankenbach 1989.*>**>* 1) If A, is your stokes operator A = ( K,B ; B^T, 0 ), what is your*>* precondition operator?*>* Specifically, what is in the (2,2) slot in the precondioner? - i.e.*>* what matrix are you you applying -stokes_fieldsplit_1_pc_type jacobi*>* -stokes_fieldsplit_1_ksp_type preonly to?*>* Is it the identity as in the SpeedUp notes?*>**>**>* I think that this is the problem. The (2,2) slot in the LHS matrix is all*>* zero (pressure does not appear in the continuity equation), so I think that*>* the preconditioner is meaningless. I am still confused as to why this*>* choice of preconditioner was suggested in the tutorial, and what is a*>* better choice of preconditioner for this block? Should I be using one of*>* the Schur complement methods instead of the additive or multiplicative*>* field split?*>** >> Its not suggested, it is demonstrated. Its the first logical choice, since >> Jacobi gives the identity for a 0 block (seehttp://www.jstor.org/pss/2158202). Its >> not meaningless. All the better preconditioners involve either a Schur >> complement (also shown in the tutorial), or an auxiliary operator which is >> more >> difficult to setup and thus not shown. >> >> >> Thank you for clarifying this. >> >> >> >* 2) This choice*>* -stokes_fieldsplit_0_pc_type ml -stokes_fieldsplit_0_ksp_type preonly*>* may simply not be a very effective and degrade the performance of the*>* outer solver.*>* I'd make the solver for the operator in the (1,1) slot much stronger,*>* for example*>* -stokes_fieldsplit_0_ksp_type gmres*>* -stokes_fieldsplit_0_ksp_rtol 1.0e-4*>* -stokes_fieldsplit_0_mg_levels_ksp_type gmres*>* -stokes_fieldsplit_0_mg_levels_pc_type bjacobi*>* -stokes_fieldsplit_0_mg_levels_ksp_max_it 4*>**>* Add a monitor on this solver (-stokes_fieldsplit_0_ksp_XXX) to see how*>* ML is doing.*>**>* 3) Using -stokes_pc_fieldsplit_type MULTIPLICATIVE should reduce the*>* number of outer iterations by a factor of two, but it will use more*>* memory.*>**>* 4) You should use a flexible Krylov method on the outer most solve*>* (-stokes_ksp_XXX) as the preconditioner is varying between each outer*>* iteration. Use -stokes_ksp_type fgmres or -stokes_ksp_type gcr*>**>**>* Thanks for pointing this out. I made that change.*>**>* 5) Depending on how the physical problem is scaled*>* (non-dimensionalised), the size of the residuals associated with the*>* momentum and continuity equation make be quite different. You are*>* currently use the entire residual from (u,p) to determine when to stop*>* iterating. 
You might want to consider writing a monitor which examines*>* the these residuals independently.*>**>**>* I think that I have scaled the problem correctly. I (slowly) obtain a*>* sufficiently accurate solution using as options only:*>* -stokes_ksp_atol 1e-5 -stokes_ksp_rtol 1e-5*>* -stokes_ksp_monitor_true_residual -stokes_ksp_norm_type UNPRECONDITIONED*>** >> How do you know the problem is scaled correctly? Have you looked at norms >> of the residuals for the two systems >> >> Thanks, >> >> Matt >> >> >> >* Cheers,*>* Dave* >> >> >> Yes, here are the norms computed for the P, X, and Y components, >> following the last residual that ksp_monitor_true_residual returned: >> >> 383 KSP unpreconditioned resid norm 1.121628211019e-03 true resid norm >> 1.121628224178e-03 ||r(i)||/||b|| 9.626787321554e-10 >> P, X, Y residual norms 5.340336e-02, 4.463404e-02, 2.509621e-02 >> > > I am more interested in the initial residuals. > Is this the information that you're looking for? I ran with ksp_max_it -1. The Y-residual below is much larger than the X-or P-residuals, presumably because of the initial density perturbation and body force. Thanks again for helping me. Residual norms for stokes_ solve. 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 KSP Object:(stokes_) 2 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=-1, initial guess is zero tolerances: relative=1e-09, absolute=1e-09, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(stokes_) 2 MPI processes type: none linear system matrix = precond matrix: Matrix Object: CmechLHS 2 MPI processes type: mpiaij rows=2883, cols=2883 total: nonzeros=199809, allocated nonzeros=199809 total number of mallocs used during MatSetValues calls =0 Matrix Object: CmechLHS 2 MPI processes type: mpiaij rows=2883, cols=2883 total: nonzeros=199809, allocated nonzeros=199809 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 496 nodes, limit used is 5 converged reason -3 stokes residual... converged reason -3 stokes residual... X, Y, P residual norms 0.000000e+00, 1.165112e+06, 0.000000e+00 > > Thanks > > Matt > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bishtg at ornl.gov Mon Feb 20 17:37:07 2012 From: bishtg at ornl.gov (Bisht, Gautam) Date: Mon, 20 Feb 2012 18:37:07 -0500 Subject: [petsc-users] Help with MatTranspose In-Reply-To: References: <48ED016D-AC4D-438C-8716-F588B5038512@ornl.gov> Message-ID: <1C428AA7-8D1E-48B4-9358-38184BEE774C@ornl.gov> I updated my petsc-dev copy and it works now. Thanks, -Gautam. On Feb 20, 2012, at 6:16 PM, Matthew Knepley wrote: On Mon, Feb 20, 2012 at 4:25 PM, Bisht, Gautam > wrote: Hi, I have a MPIAIJ matrix A as listed below: row 0: (36, 1) (37, 1) (42, 1) (43, 1) row 1: (37, 1) (38, 1) (43, 1) (44, 1) row 2: (38, 1) (39, 1) (44, 1) (45, 1) row 3: (39, 1) (40, 1) (45, 1) (46, 1) row 4: (40, 1) (41, 1) (46, 1) (47, 1) When I try to transpose the matrix, while running on 2 processors, the code crashes. Attached below is the F90 code. 
I would appreciate any help in figuring out why the code crashes. Runs fine for me: knepley:/PETSc3/petsc/petsc-dev$ /PETSc3/petsc/petsc-dev/arch-c-f90-debug/bin/mpiexec -n 2 /PETSc3/petsc/petsc-dev/arch-c-f90-debug/lib/mattrans-obj/mattrans -mat_view Matrix Object: 1 MPI processes type: mpiaij row 0: (36, 1) (37, 1) (42, 1) (43, 1) row 1: (37, 1) (38, 1) (43, 1) (44, 1) row 2: (38, 1) (39, 1) (44, 1) (45, 1) row 3: (39, 1) (40, 1) (45, 1) (46, 1) row 4: (40, 1) (41, 1) (46, 1) (47, 1) Matrix Object: 1 MPI processes type: mpiaij row 0: (36, 1) (37, 1) (42, 1) (43, 1) row 1: (37, 1) (38, 1) (43, 1) (44, 1) row 2: (38, 1) (39, 1) (44, 1) (45, 1) row 3: (39, 1) (40, 1) (45, 1) (46, 1) row 4: (40, 1) (41, 1) (46, 1) (47, 1) Matrix Object: 1 MPI processes type: mpiaij row 0: row 1: row 2: row 3: row 4: row 5: row 6: row 7: row 8: row 9: row 10: row 11: row 12: row 13: row 14: row 15: row 16: row 17: row 18: row 19: row 20: row 21: row 22: row 23: row 24: row 25: row 26: row 27: row 28: row 29: row 30: row 31: row 32: row 33: row 34: row 35: row 36: (0, 1) row 37: (0, 1) (1, 1) row 38: (1, 1) (2, 1) row 39: (2, 1) (3, 1) row 40: (3, 1) (4, 1) row 41: (4, 1) row 42: (0, 1) row 43: (0, 1) (1, 1) row 44: (1, 1) (2, 1) row 45: (2, 1) (3, 1) row 46: (3, 1) (4, 1) row 47: (4, 1) Matrix Object: 1 MPI processes type: mpiaij row 0: row 1: row 2: row 3: row 4: row 5: row 6: row 7: row 8: row 9: row 10: row 11: row 12: row 13: row 14: row 15: row 16: row 17: row 18: row 19: row 20: row 21: row 22: row 23: row 24: row 25: row 26: row 27: row 28: row 29: row 30: row 31: row 32: row 33: row 34: row 35: row 36: (0, 1) row 37: (0, 1) (1, 1) row 38: (1, 1) (2, 1) row 39: (2, 1) (3, 1) row 40: (3, 1) (4, 1) row 41: (4, 1) row 42: (0, 1) row 43: (0, 1) (1, 1) row 44: (1, 1) (2, 1) row 45: (2, 1) (3, 1) row 46: (3, 1) (4, 1) row 47: (4, 1) Matt Thanks, -Gautam. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Mon Feb 20 17:37:29 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 17:37:29 -0600 Subject: [petsc-users] convergence problem in spherical coordinates In-Reply-To: <4F42CEAD.2050501@colorado.edu> References: <4F42CEAD.2050501@colorado.edu> Message-ID: On Mon, Feb 20, 2012 at 4:52 PM, Patrick Alken wrote: > Hello all, > > I am having great difficulty solving a 3D finite difference equation in > spherical coordinates. I am solving the equation in a spherical shell > region S(a,b), with the boundary conditions being that the function is 0 on > both boundaries (r = a and r = b). I haven't imposed any boundary > conditions on theta or phi which may be a reason its not converging. The > phi boundary condition would be that the function is periodic in phi, but I > don't know if this needs to be put into the matrix somehow? > 1) The periodicity appears in the definition of the FD derivative in phi. Since this is Cartesian, you can use a DA in 3D, and make one direction periodic. 2) Don't you have a coordinate singularity at the pole? This is why every code I know of uses something like a Ying-Yang grid. Matt > I nondimensionalized the equation before solving which helped a little > bit. I've also scaled the matrix and RHS vectors by their maximum element > to make all entries <= 1. > > I've tried both direct and iterative solvers. The direct solvers give a > fairly accurate solution for small grids but seem unstable for larger > grids. 
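A sketch of the periodic-in-phi grid suggested in point 1 above, written against the DMDA interface as spelled in PETSc 3.2 (the creation routine and the boundary-type constants have been renamed across versions, so treat the exact names as assumptions); nr, ntheta and nphi are placeholder grid sizes:

  #include <petscdmda.h>

  DM             da;
  PetscErrorCode ierr;

  ierr = DMDACreate3d(PETSC_COMM_WORLD,
                      DMDA_BOUNDARY_NONE,      /* r: Dirichlet rows at r=a,b handled explicitly */
                      DMDA_BOUNDARY_NONE,      /* theta */
                      DMDA_BOUNDARY_PERIODIC,  /* phi wraps around automatically */
                      DMDA_STENCIL_STAR,
                      nr, ntheta, nphi,
                      PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                      1 /* dof */, 1 /* stencil width */,
                      PETSC_NULL, PETSC_NULL, PETSC_NULL, &da); CHKERRQ(ierr);

With the periodic direction declared this way, the wrap-around couplings show up in the ghost points and in the columns produced by MatSetValuesStencil(), so no extra matrix rows are needed for the phi boundary; the coordinate singularity at the poles raised in point 2 is a separate issue.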
The PETSc iterative solvers converge for very small grids but for > medium to large grids don't converge at all. > > When running with the command (for a small grid): > > *> ./main -ksp_converged_reason -ksp_monitor_true_residual -pc_type svd > -pc_svd_monitor* > > I get the output: > > SVD: condition number 5.929088512946e+03, 0 of 1440 singular values > are (nearly) zero > SVD: smallest singular values: 2.742809162118e-04 2.807446554985e-04 > 1.548488288425e-03 1.852332719983e-03 2.782708934678e-03 > SVD: largest singular values : 1.590835571953e+00 1.593368145758e+00 > 1.595771695877e+00 1.623691828398e+00 1.626235829632e+00 > 0 KSP preconditioned resid norm 2.154365616645e+03 true resid norm > 8.365589263063e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 4.832753933427e-10 true resid norm > 4.587845792963e-12 ||r(i)||/||b|| 5.484187244549e-13 > Linear solve converged due to CONVERGED_RTOL iterations 1 > > When plotting the output of this SVD solution, it looks pretty good, but > svd isn't practical for larger grids. > > Using the command (on the same grid): > > *> ./main -ksp_converged_reason -ksp_monitor_true_residual > -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none* > > The output is attached. There do not appear to be any 0 eigenvalues. The > solution here is much less accurate than the SVD case since it didn't > converge. > > I've also tried the -ksp_diagonal_scale -ksp_diagonal_scale_fix options > which don't help very much. > > Any advice on how to trouble shoot this would be greatly appreciated. > > Some things I've checked already: > > 1) there aren't any 0 rows in the matrix > 2) using direct solvers on very small grids seems to give decent solutions > 3) there don't appear to be any 0 singular values or eigenvalues > > Perhaps the matrix has a null space, but I don't know how I would find out > what the null space is? Is there a tutorial on how to do this? > > Thanks in advance! > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 20 17:47:50 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 20 Feb 2012 17:47:50 -0600 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: > Hi, > > Is there an internal check to see if the 'ghost' pointer in > VecCreateGhost() is NULL ? This can happen, for example, if the code is run > in serial and the there are no ghost points (hence n_ghost = 0 and ghost = > NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to > crash if the corresponding pointer is NULL. > Always send the whole error message. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Mon Feb 20 17:54:31 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 20 Feb 2012 15:54:31 -0800 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: Sure, Jed. My bad. Here's the whole message when the code is run in _serial_: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid pointer! [0]PETSC ERROR: Null Pointer: Parameter # 3! 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad Mon Feb 20 15:52:25 2012 [0]PETSC ERROR: Libraries linked from /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 --download-mpich=1 --download-hypre=1 --download-ml=1 --with-parmetis-include=/home/mohammad/soft/parmetis/include --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" --download-superlu_dist=1 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: AOApplicationToPetsc() line 249 in /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid pointer! [0]PETSC ERROR: Null Pointer: Parameter # 3! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad Mon Feb 20 15:52:25 2012 [0]PETSC ERROR: Libraries linked from /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 --download-mpich=1 --download-hypre=1 --download-ml=1 --with-parmetis-include=/home/mohammad/soft/parmetis/include --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" --download-superlu_dist=1 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: AOPetscToApplication() line 210 in /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c It seems to complain about the 3rd parameter of AOApplicationToPetsc() and AOPetscToApplication(). VecCreateGhost, on the other hand, is fine. On Mon, Feb 20, 2012 at 3:47 PM, Jed Brown wrote: > On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: > >> Hi, >> >> Is there an internal check to see if the 'ghost' pointer in >> VecCreateGhost() is NULL ? This can happen, for example, if the code is run >> in serial and the there are no ghost points (hence n_ghost = 0 and ghost = >> NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to >> crash if the corresponding pointer is NULL. >> > > Always send the whole error message. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mirzadeh at gmail.com Mon Feb 20 17:58:32 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 20 Feb 2012 15:58:32 -0800 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: Just if wondering what the actual code is, here's the bit causing the trouble: AOApplicationToPetsc(ao, ghostNodes.size(), (int*)ghostNodes); VecCreateGhost(comm, localNodes.size(), PETSC_DECIDE, ghostNodes.size(), (int*)ghostNodes, &v); AOPetscToApplication(ao, ghostNodes.size(), (int*)ghostNodes); when the code is in serial, ghostNodes.size() = 0 and (int*)ghostNodes = NULL. On Mon, Feb 20, 2012 at 3:54 PM, Mohammad Mirzadeh wrote: > Sure, Jed. My bad. Here's the whole message when the code is run in > _serial_: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid pointer! > [0]PETSC ERROR: Null Pointer: Parameter # 3! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 > CST 2012 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad > Mon Feb 20 15:52:25 2012 > [0]PETSC ERROR: Libraries linked from > /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib > [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 > --download-mpich=1 --download-hypre=1 --download-ml=1 > --with-parmetis-include=/home/mohammad/soft/parmetis/include > --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" > --download-superlu_dist=1 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: AOApplicationToPetsc() line 249 in > /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid pointer! > [0]PETSC ERROR: Null Pointer: Parameter # 3! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 > CST 2012 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad > Mon Feb 20 15:52:25 2012 > [0]PETSC ERROR: Libraries linked from > /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib > [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 > --download-mpich=1 --download-hypre=1 --download-ml=1 > --with-parmetis-include=/home/mohammad/soft/parmetis/include > --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" > --download-superlu_dist=1 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: AOPetscToApplication() line 210 in > /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c > > > It seems to complain about the 3rd parameter of AOApplicationToPetsc() > and AOPetscToApplication(). VecCreateGhost, on the other hand, is fine. > > On Mon, Feb 20, 2012 at 3:47 PM, Jed Brown wrote: > >> On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: >> >>> Hi, >>> >>> Is there an internal check to see if the 'ghost' pointer in >>> VecCreateGhost() is NULL ? This can happen, for example, if the code is run >>> in serial and the there are no ghost points (hence n_ghost = 0 and ghost = >>> NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to >>> crash if the corresponding pointer is NULL. >>> >> >> Always send the whole error message. >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 20 18:10:09 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 18:10:09 -0600 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 5:58 PM, Mohammad Mirzadeh wrote: > Just if wondering what the actual code is, here's the bit causing the > trouble: > > AOApplicationToPetsc(ao, ghostNodes.size(), (int*)ghostNodes); > VecCreateGhost(comm, localNodes.size(), PETSC_DECIDE, ghostNodes.size(), > (int*)ghostNodes, &v); > AOPetscToApplication(ao, ghostNodes.size(), (int*)ghostNodes); > > when the code is in serial, ghostNodes.size() = 0 and (int*)ghostNodes = > NULL. > So you are asking us to give up checking for NULL if the size is 0? Matt > On Mon, Feb 20, 2012 at 3:54 PM, Mohammad Mirzadeh wrote: > >> Sure, Jed. My bad. Here's the whole message when the code is run in >> _serial_: >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Invalid pointer! >> [0]PETSC ERROR: Null Pointer: Parameter # 3! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 >> CST 2012 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad >> Mon Feb 20 15:52:25 2012 >> [0]PETSC ERROR: Libraries linked from >> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >> --download-mpich=1 --download-hypre=1 --download-ml=1 >> --with-parmetis-include=/home/mohammad/soft/parmetis/include >> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >> --download-superlu_dist=1 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: AOApplicationToPetsc() line 249 in >> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Invalid pointer! >> [0]PETSC ERROR: Null Pointer: Parameter # 3! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 >> CST 2012 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad >> Mon Feb 20 15:52:25 2012 >> [0]PETSC ERROR: Libraries linked from >> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >> --download-mpich=1 --download-hypre=1 --download-ml=1 >> --with-parmetis-include=/home/mohammad/soft/parmetis/include >> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >> --download-superlu_dist=1 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: AOPetscToApplication() line 210 in >> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >> >> >> It seems to complain about the 3rd parameter of AOApplicationToPetsc() >> and AOPetscToApplication(). VecCreateGhost, on the other hand, is fine. >> >> On Mon, Feb 20, 2012 at 3:47 PM, Jed Brown wrote: >> >>> On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: >>> >>>> Hi, >>>> >>>> Is there an internal check to see if the 'ghost' pointer in >>>> VecCreateGhost() is NULL ? This can happen, for example, if the code is run >>>> in serial and the there are no ghost points (hence n_ghost = 0 and ghost = >>>> NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to >>>> crash if the corresponding pointer is NULL. >>>> >>> >>> Always send the whole error message. >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mirzadeh at gmail.com Mon Feb 20 18:30:41 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 20 Feb 2012 16:30:41 -0800 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: Matt, I'm just trying to understand what is going on and why the things do what they do -- I'm not asking for anything in particular :). I just thought that the situation for VecCreateGhost and AOXXXXToYYYY are somewhat similar here and was surprised to see one crashes on NULL while the other one does not. I have included a check on the pointer and size for AO calls and just wanted to check if I need the same thing with VecCreateGhost if there is such an internal check. Mohammad On Mon, Feb 20, 2012 at 4:10 PM, Matthew Knepley wrote: > On Mon, Feb 20, 2012 at 5:58 PM, Mohammad Mirzadeh wrote: > >> Just if wondering what the actual code is, here's the bit causing the >> trouble: >> >> AOApplicationToPetsc(ao, ghostNodes.size(), (int*)ghostNodes); >> VecCreateGhost(comm, localNodes.size(), PETSC_DECIDE, ghostNodes.size(), >> (int*)ghostNodes, &v); >> AOPetscToApplication(ao, ghostNodes.size(), (int*)ghostNodes); >> >> when the code is in serial, ghostNodes.size() = 0 and (int*)ghostNodes = >> NULL. >> > > So you are asking us to give up checking for NULL if the size is 0? > > Matt > > >> On Mon, Feb 20, 2012 at 3:54 PM, Mohammad Mirzadeh wrote: >> >>> Sure, Jed. My bad. Here's the whole message when the code is run in >>> _serial_: >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Invalid pointer! >>> [0]PETSC ERROR: Null Pointer: Parameter # 3! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>> 09:28:45 CST 2012 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad >>> Mon Feb 20 15:52:25 2012 >>> [0]PETSC ERROR: Libraries linked from >>> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >>> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >>> --download-mpich=1 --download-hypre=1 --download-ml=1 >>> --with-parmetis-include=/home/mohammad/soft/parmetis/include >>> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >>> --download-superlu_dist=1 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: AOApplicationToPetsc() line 249 in >>> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Invalid pointer! >>> [0]PETSC ERROR: Null Pointer: Parameter # 3! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>> 09:28:45 CST 2012 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. 
>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by mohammad >>> Mon Feb 20 15:52:25 2012 >>> [0]PETSC ERROR: Libraries linked from >>> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >>> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >>> --download-mpich=1 --download-hypre=1 --download-ml=1 >>> --with-parmetis-include=/home/mohammad/soft/parmetis/include >>> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >>> --download-superlu_dist=1 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: AOPetscToApplication() line 210 in >>> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >>> >>> >>> It seems to complain about the 3rd parameter of AOApplicationToPetsc() >>> and AOPetscToApplication(). VecCreateGhost, on the other hand, is fine. >>> >>> On Mon, Feb 20, 2012 at 3:47 PM, Jed Brown wrote: >>> >>>> On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: >>>> >>>>> Hi, >>>>> >>>>> Is there an internal check to see if the 'ghost' pointer in >>>>> VecCreateGhost() is NULL ? This can happen, for example, if the code is run >>>>> in serial and the there are no ghost points (hence n_ghost = 0 and ghost = >>>>> NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to >>>>> crash if the corresponding pointer is NULL. >>>>> >>>> >>>> Always send the whole error message. >>>> >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Feb 20 20:09:56 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 20 Feb 2012 20:09:56 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Message-ID: Dear Jed, I am taking a look at ex15 ( http://www.mcs.anl.gov/petsc/petsc-current/src/mat/examples/tutorials/ex15.c.html ). I am very confused with this. In Section 3.5 Partitioning of the manual, one special Mat, that is 'adj', need to be established. However, in ex15, the common matrix is created. Does the matrix automatically contain the connection information? Thanks a lot, Yujie On Mon, Feb 20, 2012 at 4:45 PM, Jed Brown wrote: > On Mon, Feb 20, 2012 at 16:45, Recrusader wrote: > >> It is not only redistribute the matrix. I want to get the reorder >> information and use matpermute to reorder the matrix. > > > MatConvert(). > > ISPartitioningToNumbering() might also be useful for you. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 20 20:13:11 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 20 Feb 2012 20:13:11 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Message-ID: On Mon, Feb 20, 2012 at 20:09, recrusader wrote: > I am taking a look at ex15 ( > http://www.mcs.anl.gov/petsc/petsc-current/src/mat/examples/tutorials/ex15.c.html > ). > I am very confused with this. 
In Section 3.5 Partitioning of the manual, > one special Mat, that is 'adj', need to be established. > However, in ex15, the common matrix is created. Does the matrix > automatically contain the connection information? > Most (perhaps all?) of the partitioning implementations work with normal AIJ matrices. Internally, they usually have to convert to the MPIAdj because that is the interface to ParMETIS and others. Your code doesn't have to think about it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Feb 20 20:22:26 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 20 Feb 2012 20:22:26 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Message-ID: What types of AIJ matrices can directly use partitioning implementation? The matrix I am using doesn't work? I have to use ParMETIS since it can provide fill-reducing matrix ordering. >From my understanding, MatConvert can only reorder the matrix and doesn't provide the reordering information. Thanks, Yujie On Mon, Feb 20, 2012 at 8:13 PM, Jed Brown wrote: > On Mon, Feb 20, 2012 at 20:09, recrusader wrote: > >> I am taking a look at ex15 ( >> http://www.mcs.anl.gov/petsc/petsc-current/src/mat/examples/tutorials/ex15.c.html >> ). >> I am very confused with this. In Section 3.5 Partitioning of the manual, >> one special Mat, that is 'adj', need to be established. >> However, in ex15, the common matrix is created. Does the matrix >> automatically contain the connection information? >> > > Most (perhaps all?) of the partitioning implementations work with normal > AIJ matrices. Internally, they usually have to convert to the MPIAdj > because that is the interface to ParMETIS and others. Your code doesn't > have to think about it. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Mon Feb 20 20:26:25 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Mon, 20 Feb 2012 20:26:25 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Message-ID: On Mon, Feb 20, 2012 at 20:22, recrusader wrote: > What types of AIJ matrices can directly use partitioning implementation? > The matrix I am using doesn't work? > I have to use ParMETIS since it can provide fill-reducing matrix ordering. > From my understanding, MatConvert can only reorder the matrix and doesn't > provide the reordering information. > We've exchanged a lot of emails now, but I'm still having trouble extracting context from your confusion. You can 1. Make a normal matrix (e.g. AIJ), create a MatPartitioning object and call MatPartitioningApply(). 2. Create an MPIAdj matrix (perhaps by creating an AIJ matrix and using MatConvert()), then use either MatPartitioning or (if you insist) calling ParMETIS directly (you'll write the same code as is in src/mat/partition/impls/pmetis/pmetis.c). -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Mon Feb 20 20:31:19 2012 From: recrusader at gmail.com (recrusader) Date: Mon, 20 Feb 2012 20:31:19 -0600 Subject: [petsc-users] partition an MPIAIJ matrix In-Reply-To: References: <6C50E190-4367-4581-98F1-1978CE5E2A31@gmail.com> Message-ID: I am sorry for these emails. I will further try them. Thanks a lot. 
Best, Yujie On Mon, Feb 20, 2012 at 8:26 PM, Jed Brown wrote: > On Mon, Feb 20, 2012 at 20:22, recrusader wrote: > >> What types of AIJ matrices can directly use partitioning implementation? >> The matrix I am using doesn't work? >> I have to use ParMETIS since it can provide fill-reducing matrix >> ordering. From my understanding, MatConvert can only reorder the matrix and >> doesn't provide the reordering information. >> > > We've exchanged a lot of emails now, but I'm still having trouble > extracting context from your confusion. You can > > 1. Make a normal matrix (e.g. AIJ), create a MatPartitioning object and > call MatPartitioningApply(). > > 2. Create an MPIAdj matrix (perhaps by creating an AIJ matrix and using > MatConvert()), then use either MatPartitioning or (if you insist) calling > ParMETIS directly (you'll write the same code as is in > src/mat/partition/impls/pmetis/pmetis.c). > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 20 21:30:31 2012 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Feb 2012 21:30:31 -0600 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 6:30 PM, Mohammad Mirzadeh wrote: > Matt, I'm just trying to understand what is going on and why the things do > what they do -- I'm not asking for anything in particular :). I just > thought that the situation for VecCreateGhost and AOXXXXToYYYY are somewhat > similar here and was surprised to see one crashes on NULL while the other > one does not. > > I have included a check on the pointer and size for AO calls and just > wanted to check if I need the same thing with VecCreateGhost if there is > such an internal check. > I have no problem making that change. It is pushed to petsc-dev. Matt > Mohammad > > > On Mon, Feb 20, 2012 at 4:10 PM, Matthew Knepley wrote: > >> On Mon, Feb 20, 2012 at 5:58 PM, Mohammad Mirzadeh wrote: >> >>> Just if wondering what the actual code is, here's the bit causing the >>> trouble: >>> >>> AOApplicationToPetsc(ao, ghostNodes.size(), (int*)ghostNodes); >>> VecCreateGhost(comm, localNodes.size(), PETSC_DECIDE, ghostNodes.size(), >>> (int*)ghostNodes, &v); >>> AOPetscToApplication(ao, ghostNodes.size(), (int*)ghostNodes); >>> >>> when the code is in serial, ghostNodes.size() = 0 and (int*)ghostNodes = >>> NULL. >>> >> >> So you are asking us to give up checking for NULL if the size is 0? >> >> Matt >> >> >>> On Mon, Feb 20, 2012 at 3:54 PM, Mohammad Mirzadeh wrote: >>> >>>> Sure, Jed. My bad. Here's the whole message when the code is run in >>>> _serial_: >>>> >>>> [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> [0]PETSC ERROR: Invalid pointer! >>>> [0]PETSC ERROR: Null Pointer: Parameter # 3! >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>>> 09:28:45 CST 2012 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by >>>> mohammad Mon Feb 20 15:52:25 2012 >>>> [0]PETSC ERROR: Libraries linked from >>>> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >>>> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >>>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>>> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >>>> --download-mpich=1 --download-hypre=1 --download-ml=1 >>>> --with-parmetis-include=/home/mohammad/soft/parmetis/include >>>> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >>>> --download-superlu_dist=1 >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: AOApplicationToPetsc() line 249 in >>>> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >>>> [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> [0]PETSC ERROR: Invalid pointer! >>>> [0]PETSC ERROR: Null Pointer: Parameter # 3! >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>>> 09:28:45 CST 2012 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by >>>> mohammad Mon Feb 20 15:52:25 2012 >>>> [0]PETSC ERROR: Libraries linked from >>>> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >>>> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >>>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>>> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >>>> --download-mpich=1 --download-hypre=1 --download-ml=1 >>>> --with-parmetis-include=/home/mohammad/soft/parmetis/include >>>> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >>>> --download-superlu_dist=1 >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: AOPetscToApplication() line 210 in >>>> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >>>> >>>> >>>> It seems to complain about the 3rd parameter of AOApplicationToPetsc() >>>> and AOPetscToApplication(). VecCreateGhost, on the other hand, is fine. >>>> >>>> On Mon, Feb 20, 2012 at 3:47 PM, Jed Brown wrote: >>>> >>>>> On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Is there an internal check to see if the 'ghost' pointer in >>>>>> VecCreateGhost() is NULL ? This can happen, for example, if the code is run >>>>>> in serial and the there are no ghost points (hence n_ghost = 0 and ghost = >>>>>> NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to >>>>>> crash if the corresponding pointer is NULL. >>>>>> >>>>> >>>>> Always send the whole error message. >>>>> >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Tue Feb 21 01:25:46 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Tue, 21 Feb 2012 08:25:46 +0100 Subject: [petsc-users] Add a matrix to MatNest twice Message-ID: <4F4346FA.1090600@tu-dresden.de> Is it possible to have the same matrix of type MatMPIAIJ twice in a MatNest? Thomas From dave.mayhem23 at gmail.com Tue Feb 21 05:36:54 2012 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 21 Feb 2012 12:36:54 +0100 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: Max, > > The test case that I am working with is isoviscous convection, benchmark > case 1a from Blankenbach 1989. > Okay, I know this problem. An iso viscous problem, solved on a uniform grid using dx=dy=dz discretised via FV should be super easy to precondition. > > > I think that this is the problem. The (2,2) slot in the LHS matrix is all > zero (pressure does not appear in the continuity equation), so I think that > the preconditioner is meaningless. I am still confused as to why this choice > of preconditioner was suggested in the tutorial, and what is a better choice > of preconditioner for this block? Should I be using one of the Schur > complement methods instead of the additive or multiplicative field split? > No, you need to define an appropriate stokes preconditioner You should assemble this matrix B = ( K,B ; B^T, -1/eta* I ) as the preconditioner for stokes. Here eta* is a measure of the local viscosity within each pressure control volume. Unless you specify to use the real diagonal Pass this into the third argument in KSPSetOperators() (i.e. the Pmat variable) http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetOperators.html Not sure how you represent A and B, but if you really want to run just additive with fieldsplit, you don't need the off diagonal blocks, so B = ( K,0 ; 0, -1/eta* I ) would yield the same result. Depending on your matrix representation, this may save you some memory. PCFieldsplit will use the B(1,1) and B(2,2) to build the stokes preconditioner unless you ask for it to use the real diagonal - but for the stokes operator A, this makes no sense. This is the right thing to do (as Matt states). Try it out, and let us know how it goes. Cheers, Dave From jedbrown at mcs.anl.gov Tue Feb 21 06:52:51 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 21 Feb 2012 06:52:51 -0600 Subject: [petsc-users] Add a matrix to MatNest twice In-Reply-To: <4F4346FA.1090600@tu-dresden.de> References: <4F4346FA.1090600@tu-dresden.de> Message-ID: On Tue, Feb 21, 2012 at 01:25, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Is it possible to have the same matrix of type MatMPIAIJ twice in a > MatNest? > That should be fine, let us know if something doesn't work. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thomas.witkowski at tu-dresden.de Tue Feb 21 07:43:36 2012 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Tue, 21 Feb 2012 14:43:36 +0100 Subject: [petsc-users] Add a matrix to MatNest twice In-Reply-To: References: <4F4346FA.1090600@tu-dresden.de> Message-ID: <4F439F88.50604@tu-dresden.de> Am 21.02.2012 13:52, schrieb Jed Brown: > On Tue, Feb 21, 2012 at 01:25, Thomas Witkowski > > wrote: > > Is it possible to have the same matrix of type MatMPIAIJ twice in > a MatNest? > > > That should be fine, let us know if something doesn't work. Seems not to work correctly. I made some small changes in src/ksp/ksp/examples/tests/ex22.c: 21c21 < np = 2; --- > np = 3; 62c62 < tmp[0][0] = A11; --- > tmp[0][0] = A12; When running with 2 threads, it ends with a segmentation violation. The stack is as follows: [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 109 src/mat/utils/matstash.c [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 641 src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatAssemblyEnd line 4934 src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatAssemblyEnd_Nest line 228 src/mat/impls/nest/matnest.c [0]PETSC ERROR: [0] MatAssemblyEnd line 4934 src/mat/interface/matrix.c [0]PETSC ERROR: [0] test_solve line 17 src/ksp/ksp/examples/tests/ex22.c Can you also reproduce this problem? Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Tue Feb 21 08:09:03 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 21 Feb 2012 08:09:03 -0600 Subject: [petsc-users] Add a matrix to MatNest twice In-Reply-To: <4F439F88.50604@tu-dresden.de> References: <4F4346FA.1090600@tu-dresden.de> <4F439F88.50604@tu-dresden.de> Message-ID: On Tue, Feb 21, 2012 at 07:43, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Seems not to work correctly. I made some small changes in > src/ksp/ksp/examples/tests/ex22.c: > > 21c21 > < np = 2; > --- > > np = 3; > 62c62 > < tmp[0][0] = A11; > --- > > tmp[0][0] = A12; > > When running with 2 threads, it ends with a segmentation violation. The > stack is as follows: > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 109 > src/mat/utils/matstash.c > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 641 > src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [0] MatAssemblyEnd line 4934 src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatAssemblyEnd_Nest line 228 > src/mat/impls/nest/matnest.c > [0]PETSC ERROR: [0] MatAssemblyEnd line 4934 src/mat/interface/matrix.c > [0]PETSC ERROR: [0] test_solve line 17 src/ksp/ksp/examples/tests/ex22.c > Okay, the problem is that MatAssemblyBegin and MatAssemblyEnd are called twice on the same matrix. There are two ways to fix this: 1. Add a flag to Mat so we can tell when a matrix is being assembled. This makes it so we can't error if an external user leaves a MatAssemblyBegin() open with no closing MatAssemblyEnd(). 2. Uniquify a list of non-empty mats so we only call it once. This doesn't work because you could have nested MatNests: [[A 0;0 B] 0; 0 A] I'm in the midst of something right now, but I'll get back to this if nobody beats me to it. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From patrick.alken at Colorado.EDU Tue Feb 21 11:41:41 2012 From: patrick.alken at Colorado.EDU (Patrick Alken) Date: Tue, 21 Feb 2012 10:41:41 -0700 Subject: [petsc-users] convergence problem in spherical coordinates In-Reply-To: References: <4F42CEAD.2050501@colorado.edu> Message-ID: <4F43D755.7020205@colorado.edu> On 02/20/2012 04:37 PM, Matthew Knepley wrote: > On Mon, Feb 20, 2012 at 4:52 PM, Patrick Alken > > wrote: > > Hello all, > > I am having great difficulty solving a 3D finite difference > equation in spherical coordinates. I am solving the equation in a > spherical shell region S(a,b), with the boundary conditions being > that the function is 0 on both boundaries (r = a and r = b). I > haven't imposed any boundary conditions on theta or phi which may > be a reason its not converging. The phi boundary condition would > be that the function is periodic in phi, but I don't know if this > needs to be put into the matrix somehow? > > > 1) The periodicity appears in the definition of the FD derivative in > phi. Since this is Cartesian, you can use a DA in 3D, and make one > direction periodic. I've made this change in the FD derivatives of phi. Unfortunately PETSc still does not converge properly. In the meantime I've tried two different direct solver libraries, both of which find correct solutions to the matrix problem, so I don't think I can use petsc for this problem. > > 2) Don't you have a coordinate singularity at the pole? This is why > every code I know of uses something like a Ying-Yang grid. Yes but my grid points start a little away from the poles to avoid this. > > Matt > > I nondimensionalized the equation before solving which helped a > little bit. I've also scaled the matrix and RHS vectors by their > maximum element to make all entries <= 1. > > I've tried both direct and iterative solvers. The direct solvers > give a fairly accurate solution for small grids but seem unstable > for larger grids. The PETSc iterative solvers converge for very > small grids but for medium to large grids don't converge at all. > > When running with the command (for a small grid): > > *> ./main -ksp_converged_reason -ksp_monitor_true_residual > -pc_type svd -pc_svd_monitor* > > I get the output: > > SVD: condition number 5.929088512946e+03, 0 of 1440 singular > values are (nearly) zero > SVD: smallest singular values: 2.742809162118e-04 > 2.807446554985e-04 1.548488288425e-03 1.852332719983e-03 > 2.782708934678e-03 > SVD: largest singular values : 1.590835571953e+00 > 1.593368145758e+00 1.595771695877e+00 1.623691828398e+00 > 1.626235829632e+00 > 0 KSP preconditioned resid norm 2.154365616645e+03 true resid > norm 8.365589263063e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 4.832753933427e-10 true resid > norm 4.587845792963e-12 ||r(i)||/||b|| 5.484187244549e-13 > Linear solve converged due to CONVERGED_RTOL iterations 1 > > When plotting the output of this SVD solution, it looks pretty > good, but svd isn't practical for larger grids. > > Using the command (on the same grid): > > *> ./main -ksp_converged_reason -ksp_monitor_true_residual > -ksp_compute_eigenvalues -ksp_gmres_restart 1000 -pc_type none* > > The output is attached. There do not appear to be any 0 > eigenvalues. The solution here is much less accurate than the SVD > case since it didn't converge. > > I've also tried the -ksp_diagonal_scale -ksp_diagonal_scale_fix > options which don't help very much. > > Any advice on how to trouble shoot this would be greatly appreciated. 
> > Some things I've checked already: > > 1) there aren't any 0 rows in the matrix > 2) using direct solvers on very small grids seems to give decent > solutions > 3) there don't appear to be any 0 singular values or eigenvalues > > Perhaps the matrix has a null space, but I don't know how I would > find out what the null space is? Is there a tutorial on how to do > this? > > Thanks in advance! > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Tue Feb 21 13:08:05 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 21 Feb 2012 11:08:05 -0800 Subject: [petsc-users] check for NULL pointer in VecCreateGhost In-Reply-To: References: Message-ID: Thanks Matt. If that is not against PETSc coding policies, it sounds reasonable to me. Mohammad On Mon, Feb 20, 2012 at 7:30 PM, Matthew Knepley wrote: > On Mon, Feb 20, 2012 at 6:30 PM, Mohammad Mirzadeh wrote: > >> Matt, I'm just trying to understand what is going on and why the things >> do what they do -- I'm not asking for anything in particular :). I just >> thought that the situation for VecCreateGhost and AOXXXXToYYYY are somewhat >> similar here and was surprised to see one crashes on NULL while the other >> one does not. >> >> I have included a check on the pointer and size for AO calls and just >> wanted to check if I need the same thing with VecCreateGhost if there is >> such an internal check. >> > > I have no problem making that change. It is pushed to petsc-dev. > > Matt > > >> Mohammad >> >> >> On Mon, Feb 20, 2012 at 4:10 PM, Matthew Knepley wrote: >> >>> On Mon, Feb 20, 2012 at 5:58 PM, Mohammad Mirzadeh wrote: >>> >>>> Just if wondering what the actual code is, here's the bit causing the >>>> trouble: >>>> >>>> AOApplicationToPetsc(ao, ghostNodes.size(), (int*)ghostNodes); >>>> VecCreateGhost(comm, localNodes.size(), PETSC_DECIDE, >>>> ghostNodes.size(), (int*)ghostNodes, &v); >>>> AOPetscToApplication(ao, ghostNodes.size(), (int*)ghostNodes); >>>> >>>> when the code is in serial, ghostNodes.size() = 0 and (int*)ghostNodes >>>> = NULL. >>>> >>> >>> So you are asking us to give up checking for NULL if the size is 0? >>> >>> Matt >>> >>> >>>> On Mon, Feb 20, 2012 at 3:54 PM, Mohammad Mirzadeh wrote: >>>> >>>>> Sure, Jed. My bad. Here's the whole message when the code is run in >>>>> _serial_: >>>>> >>>>> [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> [0]PETSC ERROR: Invalid pointer! >>>>> [0]PETSC ERROR: Null Pointer: Parameter # 3! >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>>>> 09:28:45 CST 2012 >>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by >>>>> mohammad Mon Feb 20 15:52:25 2012 >>>>> [0]PETSC ERROR: Libraries linked from >>>>> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >>>>> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >>>>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>>>> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >>>>> --download-mpich=1 --download-hypre=1 --download-ml=1 >>>>> --with-parmetis-include=/home/mohammad/soft/parmetis/include >>>>> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >>>>> --download-superlu_dist=1 >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: AOApplicationToPetsc() line 249 in >>>>> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >>>>> [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> [0]PETSC ERROR: Invalid pointer! >>>>> [0]PETSC ERROR: Null Pointer: Parameter # 3! >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>>>> 09:28:45 CST 2012 >>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: ./petsc on a arch-linu named mohammad-laptop by >>>>> mohammad Mon Feb 20 15:52:25 2012 >>>>> [0]PETSC ERROR: Libraries linked from >>>>> /home/mohammad/soft/petsc-3.2-p6/arch-linux2-cxx-debug/lib >>>>> [0]PETSC ERROR: Configure run at Thu Feb 16 02:16:40 2012 >>>>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>>>> --with-fc=gfortran --with-clanguage=cxx --download-f-blas-lapack=1 >>>>> --download-mpich=1 --download-hypre=1 --download-ml=1 >>>>> --with-parmetis-include=/home/mohammad/soft/parmetis/include >>>>> --with-parmetis-lib="-L/home/mohammad/soft/parmetis/lib -lparmetis -lmetis" >>>>> --download-superlu_dist=1 >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: AOPetscToApplication() line 210 in >>>>> /home/mohammad/soft/petsc-3.2-p6/src/dm/ao/interface/ao.c >>>>> >>>>> >>>>> It seems to complain about the 3rd parameter of AOApplicationToPetsc() >>>>> and AOPetscToApplication(). VecCreateGhost, on the other hand, is fine. >>>>> >>>>> On Mon, Feb 20, 2012 at 3:47 PM, Jed Brown wrote: >>>>> >>>>>> On Mon, Feb 20, 2012 at 17:25, Mohammad Mirzadeh wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Is there an internal check to see if the 'ghost' pointer in >>>>>>> VecCreateGhost() is NULL ? This can happen, for example, if the code is run >>>>>>> in serial and the there are no ghost points (hence n_ghost = 0 and ghost = >>>>>>> NULL(implementation dependent of course) ). AOXXXXToYYYY functions seem to >>>>>>> crash if the corresponding pointer is NULL. >>>>>>> >>>>>> >>>>>> Always send the whole error message. >>>>>> >>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Feb 21 15:37:57 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 21 Feb 2012 15:37:57 -0600 Subject: [petsc-users] Add a matrix to MatNest twice In-Reply-To: References: <4F4346FA.1090600@tu-dresden.de> <4F439F88.50604@tu-dresden.de> Message-ID: <628B343F-69CE-4093-9169-09D2BD01D32F@mcs.anl.gov> Jed, Why not just have MatAssemblyBegin_Nest() call the inner MatAssemblyBegin/End() together and stop the charade that there is any overlap of communication and computations etc anyway? Barry On Feb 21, 2012, at 8:09 AM, Jed Brown wrote: > On Tue, Feb 21, 2012 at 07:43, Thomas Witkowski wrote: > Seems not to work correctly. I made some small changes in src/ksp/ksp/examples/tests/ex22.c: > > 21c21 > < np = 2; > --- > > np = 3; > 62c62 > < tmp[0][0] = A11; > --- > > tmp[0][0] = A12; > > When running with 2 threads, it ends with a segmentation violation. The stack is as follows: > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 109 src/mat/utils/matstash.c > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 641 src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [0] MatAssemblyEnd line 4934 src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatAssemblyEnd_Nest line 228 src/mat/impls/nest/matnest.c > [0]PETSC ERROR: [0] MatAssemblyEnd line 4934 src/mat/interface/matrix.c > [0]PETSC ERROR: [0] test_solve line 17 src/ksp/ksp/examples/tests/ex22.c > > Okay, the problem is that MatAssemblyBegin and MatAssemblyEnd are called twice on the same matrix. There are two ways to fix this: > > 1. Add a flag to Mat so we can tell when a matrix is being assembled. This makes it so we can't error if an external user leaves a MatAssemblyBegin() open with no closing MatAssemblyEnd(). > > 2. Uniquify a list of non-empty mats so we only call it once. This doesn't work because you could have nested MatNests: > > [[A 0;0 B] 0; 0 A] > > > I'm in the midst of something right now, but I'll get back to this if nobody beats me to it. From jedbrown at mcs.anl.gov Tue Feb 21 16:30:55 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 21 Feb 2012 16:30:55 -0600 Subject: [petsc-users] Add a matrix to MatNest twice In-Reply-To: <628B343F-69CE-4093-9169-09D2BD01D32F@mcs.anl.gov> References: <4F4346FA.1090600@tu-dresden.de> <4F439F88.50604@tu-dresden.de> <628B343F-69CE-4093-9169-09D2BD01D32F@mcs.anl.gov> Message-ID: On Tue, Feb 21, 2012 at 15:37, Barry Smith wrote: > Why not just have MatAssemblyBegin_Nest() call the inner > MatAssemblyBegin/End() together and stop the charade that there is any > overlap of communication and computations etc anyway? (I pushed this.) So there is a very real latency issue in matrix assembly that comes from the reduction to determine how many receives are necessary. Due to MPI limitations, that code (PetscGatherNumberOfMessages() and PetscGatherMessageLengths()) is synchronizing, but MPI-3 will offer non-blocking collectives that we could use for those operations. Now the two entrance points (MatAssemblyBegin() and MatAssemblyEnd()) are not sufficient to make progress on this task of assembly without also having an internal request system (where either a comm thread or callbacks from other library functions poked the progress along). 
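Concretely, the two entrance points referred to here are the usual split-phase pattern (sketch only; the window between the calls is where overlap could in principle happen):

  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* unrelated local computation could in principle be placed here, but as
     discussed above the current implementation makes no communication
     progress during this window */
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);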
There are also signs that sometime soon it will be common to have a comm thread that manages packing, in which case communication could actually start happening concurrently with computation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vkuhlem at emory.edu Wed Feb 22 08:52:38 2012 From: vkuhlem at emory.edu (Kuhlemann, Verena) Date: Wed, 22 Feb 2012 14:52:38 +0000 Subject: [petsc-users] MatStructure flag in KSPSetOperators Message-ID: Hi, I am not sure that I understand the MatStructure flag in the call of KSPSetOperators correctly. If I use KSPSetOperators(ksp, A, P, DIFFERENT_NONZERO_PATTERN) and then solve two linear systems with the same matrix operators but different rhs is the preconditioner reassembled in every solve. I.e., if I call KSPSolve(ksp,b1,x1); KSPSolve(ksp,b2,x2); And if I use KSPSetOperators(ksp, A, P, SAME_PRECONDITIONER) the preconditioner is only setup one time and reused later. Or is the flag only important if I change the matrix operators in between. Thank you, Verena ________________________________ This e-mail message (including any attachments) is for the sole use of the intended recipient(s) and may contain confidential and privileged information. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message (including any attachments) is strictly prohibited. If you have received this message in error, please contact the sender by reply e-mail message and destroy all copies of the original message (including attachments). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Feb 22 09:25:13 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 22 Feb 2012 09:25:13 -0600 Subject: [petsc-users] MatStructure flag in KSPSetOperators In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 08:52, Kuhlemann, Verena wrote: > I am not sure that I understand the MatStructure flag in the call of > KSPSetOperators > correctly. > > If I use KSPSetOperators(ksp, A, P, DIFFERENT_NONZERO_PATTERN) and then > solve two linear systems with the same matrix operators but different rhs > is > the preconditioner reassembled in every solve. I.e., if I call > KSPSolve(ksp,b1,x1); > KSPSolve(ksp,b2,x2); > If you don't put anything between these calls, the preconditioner will be reused. > > And if I use KSPSetOperators(ksp, A, P, SAME_PRECONDITIONER) the > preconditioner is > only setup one time and reused later. > > Or is the flag only important if I change the matrix operators in > between. > You only need to call KSPSetOperators() when the matrix changes. -------------- next part -------------- An HTML attachment was scrubbed... URL: From maxwellr at gmail.com Wed Feb 22 15:29:41 2012 From: maxwellr at gmail.com (Max Rudolph) Date: Wed, 22 Feb 2012 13:29:41 -0800 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 3:36 AM, Dave May wrote: > Max, > > > > > The test case that I am working with is isoviscous convection, benchmark > > case 1a from Blankenbach 1989. > > > > Okay, I know this problem. > An iso viscous problem, solved on a uniform grid using dx=dy=dz > discretised > via FV should be super easy to precondition. > > > > > > > I think that this is the problem. The (2,2) slot in the LHS matrix is all > > zero (pressure does not appear in the continuity equation), so I think > that > > the preconditioner is meaningless. 
I am still confused as to why this > choice > > of preconditioner was suggested in the tutorial, and what is a better > choice > > of preconditioner for this block? Should I be using one of the Schur > > complement methods instead of the additive or multiplicative field split? > > > > No, you need to define an appropriate stokes preconditioner > You should assemble this matrix > B = ( K,B ; B^T, -1/eta* I ) > as the preconditioner for stokes. > Here eta* is a measure of the local viscosity within each pressure > control volume. > Unless you specify to use the real diagonal > > Pass this into the third argument in KSPSetOperators() (i.e. the Pmat > variable) > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetOperators.html > > Not sure how you represent A and B, but if you really want to run just > additive with fieldsplit, you don't need the off diagonal blocks, so > B = ( K,0 ; 0, -1/eta* I ) > would yield the same result. Depending on your matrix representation, > this may save you some memory. > > PCFieldsplit will use the B(1,1) and B(2,2) to build the stokes > preconditioner unless you ask for it to use the real diagonal - but > for the stokes operator A, this makes no sense. > > This is the right thing to do (as Matt states). > Try it out, and let us know how it goes. > > > Cheers, > Dave > Dave and Matt, Thanks for your help. I had some time to work on this a little more. I now have a stokes operator A that looks like this: A=(K B; B^T 0) and a matrix from which the preconditioner is generated P=(K B; B^T -1/eta*I) I verified that I can solve this system using the default ksp and pc settings in 77 iterations for the first timestep (initial guess zero) and in 31 iterations for the second timestep (nonzero initial guess). I adopted your suggestion to use the multiplicative field split as a starting point. My reading of the PETSc manual suggests to me that the preconditioner formed should then look like: B = (ksp(K,K) 0;-B^T*ksp(K,K)*ksp(0,-1/eta*I) ksp(0,-1/eta*I)) My interpretation of the output suggests that the solvers within each fieldsplit are converging nicely, but the global residual is not decreasing after the first few iterations. Given the disparity in residual sizes, I think that there might be a problem with the scaling of the pressure variable (I scaled the continuity equation by eta/dx where dx is my grid spacing). I also scaled the (1,1) block in the preconditioner by this scale factor. Thanks again for all of your help. 
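In outline, that setup amounts to the sketch below (illustrative variable names only, not the actual code): P carries the same K and B blocks as A plus the extra -1/eta* entries on the pressure diagonal, and P is passed to the KSP purely as the matrix from which the preconditioner is built.

  PetscInt c;
  KSP      ksp;
  /* add -1/eta* to the pressure diagonal of P; eta_c[c] is the viscosity in
     pressure cell c and p_row[c] its global row (assumed names) */
  for (c = 0; c < ncell_local; c++) {
    ierr = MatSetValue(P,p_row[c],p_row[c],-1.0/eta_c[c],INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOptionsPrefix(ksp,"stokes_");CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,P,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); /* P only builds the PC */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* picks up the -stokes_* options listed below */
  ierr = KSPSolve(ksp,rhs,solution);CHKERRQ(ierr);

The fieldsplit then builds its pressure-block solver from the -1/eta* entries of P instead of from the zero pressure block of A.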
Max Options used: -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type multiplicative \ -stokes_fieldsplit_0_pc_type ml \ -stokes_fieldsplit_0_ksp_type gmres \ -stokes_fieldsplit_0_ksp_monitor_true_residual \ -stokes_fieldsplit_0_ksp_norm_type UNPRECONDITIONED \ -stokes_fieldsplit_0_ksp_max_it 3 \ -stokes_fieldsplit_0_ksp_type gmres \ -stokes_fieldsplit_0_ksp_rtol 1.0e-4 \ -stokes_fieldsplit_0_mg_levels_ksp_type gmres \ -stokes_fieldsplit_0_mg_levels_pc_type bjacobi \ -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 \ -stokes_fieldsplit_1_pc_type jacobi \ -stokes_fieldsplit_1_ksp_type preonly \ -stokes_fieldsplit_1_ksp_max_it 3 \ -stokes_fieldsplit_1_ksp_monitor_true_residual \ -stokes_ksp_type gcr \ -stokes_ksp_monitor_blocks \ -stokes_ksp_monitor_draw \ -stokes_ksp_view \ -stokes_ksp_atol 1e-6 \ -stokes_ksp_rtol 1e-6 \ -stokes_ksp_max_it 100 \ -stokes_ksp_norm_type UNPRECONDITIONED \ -stokes_ksp_monitor_true_residual \ Output: 0 KSP Component U,V,P residual norm [ 0.000000000000e+00, 1.165111661413e+06, 0.000000000000e+00 ] Residual norms for stokes_ solve. 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 3.173622513625e+05 true resid norm 3.173622513625e+05 ||r(i)||/||b|| 2.723878421898e-01 2 KSP unpreconditioned resid norm 5.634119635158e+04 true resid norm 1.725996376799e+05 ||r(i)||/||b|| 1.481399967026e-01 3 KSP unpreconditioned resid norm 1.218418968344e+03 true resid norm 1.559727441168e+05 ||r(i)||/||b|| 1.338693528546e-01 1 KSP Component U,V,P residual norm [ 5.763380362961e+04, 1.154490085631e+05, 3.370358145704e-12 ] 1 KSP unpreconditioned resid norm 1.290353784783e+05 true resid norm 1.290353784783e+05 ||r(i)||/||b|| 1.107493665644e-01 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.290353784783e+05 true resid norm 1.290353784783e+05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.655137188235e+04 true resid norm 1.655137188235e+04 ||r(i)||/||b|| 1.282700301076e-01 2 KSP unpreconditioned resid norm 1.195941831181e+03 true resid norm 4.554417355181e+03 ||r(i)||/||b|| 3.529588093508e-02 3 KSP unpreconditioned resid norm 8.479547025398e+01 true resid norm 3.817072778396e+03 ||r(i)||/||b|| 2.958159865466e-02 2 KSP Component U,V,P residual norm [ 2.026983725663e+03, 2.531521226429e+03, 3.419060873106e-12 ] 2 KSP unpreconditioned resid norm 3.243032954498e+03 true resid norm 3.243032954498e+03 ||r(i)||/||b|| 2.783452489493e-03 Residual norms for stokes_fieldsplit_0_ solve. 
0 KSP unpreconditioned resid norm 3.243032954498e+03 true resid norm 3.243032954498e+03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.170090628031e+02 true resid norm 1.170090628031e+02 ||r(i)||/||b|| 3.608013376517e-02 2 KSP unpreconditioned resid norm 9.782830529900e+00 true resid norm 1.741722174777e+01 ||r(i)||/||b|| 5.370658267168e-03 3 KSP unpreconditioned resid norm 6.886950142735e-01 true resid norm 1.636749336722e+01 ||r(i)||/||b|| 5.046971028932e-03 3 KSP Component U,V,P residual norm [ 7.515013854917e+01, 7.515663601801e+01, 3.418919176066e-12 ] 3 KSP unpreconditioned resid norm 1.062829396540e+02 true resid norm 1.062829396540e+02 ||r(i)||/||b|| 9.122124786317e-05 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.062829396540e+02 true resid norm 1.062829396540e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.373578062042e+01 true resid norm 5.373578062042e+01 ||r(i)||/||b|| 5.055917797846e-01 2 KSP unpreconditioned resid norm 1.199305097134e+00 true resid norm 3.492111756827e+01 ||r(i)||/||b|| 3.285674792393e-01 3 KSP unpreconditioned resid norm 9.508597255523e-02 true resid norm 3.452079362567e+01 ||r(i)||/||b|| 3.248008922038e-01 4 KSP Component U,V,P residual norm [ 7.495897679790e+01, 7.527868410560e+01, 3.418919160091e-12 ] 4 KSP unpreconditioned resid norm 1.062343093509e+02 true resid norm 1.062343093509e+02 ||r(i)||/||b|| 9.117950911420e-05 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.062343093509e+02 true resid norm 1.062343093509e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.419252803207e+01 true resid norm 5.419252803207e+01 ||r(i)||/||b|| 5.101226558840e-01 2 KSP unpreconditioned resid norm 1.431134174522e+00 true resid norm 3.339055236737e+01 ||r(i)||/||b|| 3.143104386088e-01 3 KSP unpreconditioned resid norm 9.760479467902e-02 true resid norm 3.304522520358e+01 ||r(i)||/||b|| 3.110598205561e-01 5 KSP Component U,V,P residual norm [ 7.491128585963e+01, 7.523275560552e+01, 3.418919008441e-12 ] 5 KSP unpreconditioned resid norm 1.061681132221e+02 true resid norm 1.061681132221e+02 ||r(i)||/||b|| 9.112269384837e-05 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.061681132221e+02 true resid norm 1.061681132221e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.343215079492e+01 true resid norm 5.343215079492e+01 ||r(i)||/||b|| 5.032787074508e-01 2 KSP unpreconditioned resid norm 1.288069736759e+00 true resid norm 3.308925591301e+01 ||r(i)||/||b|| 3.116684935691e-01 3 KSP unpreconditioned resid norm 9.505248953960e-02 true resid norm 3.281875055845e+01 ||r(i)||/||b|| 3.091205971589e-01 6 KSP Component U,V,P residual norm [ 7.481188568118e+01, 7.527346267608e+01, 3.418918860626e-12 ] 6 KSP unpreconditioned resid norm 1.061268694649e+02 true resid norm 1.061268694649e+02 ||r(i)||/||b|| 9.108729487455e-05 Residual norms for stokes_fieldsplit_0_ solve. 
0 KSP unpreconditioned resid norm 1.061268694649e+02 true resid norm 1.061268694649e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.300383444945e+01 true resid norm 5.300383444945e+01 ||r(i)||/||b|| 4.994384053416e-01 2 KSP unpreconditioned resid norm 1.118785004087e+00 true resid norm 3.282090953364e+01 ||r(i)||/||b|| 3.092610730828e-01 3 KSP unpreconditioned resid norm 9.758015489979e-02 true resid norm 3.259718081014e+01 ||r(i)||/||b|| 3.071529479244e-01 7 KSP Component U,V,P residual norm [ 7.475024970669e+01, 7.530858268154e+01, 3.418918784089e-12 ] 7 KSP unpreconditioned resid norm 1.061083524362e+02 true resid norm 1.061083524362e+02 ||r(i)||/||b|| 9.107140195255e-05 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.061083524362e+02 true resid norm 1.061083524362e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.296981668051e+01 true resid norm 5.296981668051e+01 ||r(i)||/||b|| 4.992049679820e-01 2 KSP unpreconditioned resid norm 9.379451887610e-01 true resid norm 3.378466967056e+01 ||r(i)||/||b|| 3.183978348066e-01 3 KSP unpreconditioned resid norm 9.102580142867e-02 true resid norm 3.360853440947e+01 ||r(i)||/||b|| 3.167378781957e-01 8 KSP Component U,V,P residual norm [ 7.464535615814e+01, 7.537007679541e+01, 3.418918790515e-12 ] 8 KSP unpreconditioned resid norm 1.060781677449e+02 true resid norm 1.060781677449e+02 ||r(i)||/||b|| 9.104549482946e-05 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.060781677449e+02 true resid norm 1.060781677449e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.281972737642e+01 true resid norm 5.281972737642e+01 ||r(i)||/||b|| 4.979321240109e-01 2 KSP unpreconditioned resid norm 9.224594814880e-01 true resid norm 3.351285171891e+01 ||r(i)||/||b|| 3.159260046751e-01 3 KSP unpreconditioned resid norm 9.143100662935e-02 true resid norm 3.329269756083e+01 ||r(i)||/||b|| 3.138506091177e-01 9 KSP Component U,V,P residual norm [ 7.451688471900e+01, 7.544516987344e+01, 3.418918860847e-12 ] 9 KSP unpreconditioned resid norm 1.060412172952e+02 true resid norm 1.060412172952e+02 ||r(i)||/||b|| 9.101378074496e-05 Residual norms for stokes_fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.060412172952e+02 true resid norm 1.060412172952e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.275132249899e+01 true resid norm 5.275132249899e+01 ||r(i)||/||b|| 4.974605520805e-01 2 KSP unpreconditioned resid norm 7.755381284769e-01 true resid norm 3.453933285011e+01 ||r(i)||/||b|| 3.257161105001e-01 3 KSP unpreconditioned resid norm 7.298768665179e-02 true resid norm 3.435179316160e+01 ||r(i)||/||b|| 3.239475558447e-01 10 KSP Component U,V,P residual norm [ 7.451431102619e+01, 7.544762349626e+01, 3.418918857322e-12 ] 10 KSP unpreconditioned resid norm 1.060411544587e+02 true resid norm 1.060411544587e+02 ||r(i)||/||b|| 9.101372681321e-05 Residual norms for stokes_fieldsplit_0_ solve. 
0 KSP unpreconditioned resid norm 1.060411544587e+02 true resid norm 1.060411544587e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.276103518337e+01 true resid norm 5.276103518337e+01 ||r(i)||/||b|| 4.975524403961e-01 2 KSP unpreconditioned resid norm 7.777079890360e-01 true resid norm 3.454373663425e+01 ||r(i)||/||b|| 3.257578325186e-01 3 KSP unpreconditioned resid norm 7.356028471071e-02 true resid norm 3.435584054266e+01 ||r(i)||/||b|| 3.239859158269e-01 11 KSP Component U,V,P residual norm [ 7.438335197779e+01, 7.553731959735e+01, 3.418918856471e-12 ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Wed Feb 22 16:52:58 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 22 Feb 2012 14:52:58 -0800 Subject: [petsc-users] AO Message-ID: Hi, Should I be worried about function call overhead if make calls like AOXXXXToYYYY(ao, 1, &node) for each node when setting up the linear system? If so, is there an array where AO internally saves the mapping information that I can access or do I need to first save the mapping once and reuse it? Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Feb 22 16:59:41 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 22 Feb 2012 16:59:41 -0600 Subject: [petsc-users] AO In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 16:52, Mohammad Mirzadeh wrote: > Should I be worried about function call overhead if make calls like > AOXXXXToYYYY(ao, 1, &node) for each node when setting up the linear system? > If so, is there an array where AO internally saves the mapping information > that I can access or do I need to first save the mapping once and reuse it? If you only need to do it once, the function call overhead should not be a big deal. If you need to apply the mapping frequently, it's better to do it in batches. -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.alken at Colorado.EDU Wed Feb 22 18:05:14 2012 From: patrick.alken at Colorado.EDU (Patrick Alken) Date: Wed, 22 Feb 2012 17:05:14 -0700 Subject: [petsc-users] Oscillations in finite difference solution Message-ID: <4F4582BA.70804@colorado.edu> Hi all, I have been trying to track down a problem for a few days with solving a linear system arising from a finite differenced PDE in spherical coordinates. I found that PETSc managed to converge to a nice solution for my matrix at small grid sizes and everything looks pretty good. But when I try larger more realistic grid sizes, PETSc fails to converge. After trying with another direct solver library, I found that the direct solver found a solution which exactly solves the matrix equation, but when plotting the solution, I see that it oscillates rapidly between the grid points and therefore isn't a satisfactory solution. (At smaller grids the solution is nice and smooth) I was wondering if this phenomenon is common in PDEs? and if there is any way to correct for it? I am currently using 2nd order centered differences for interior grid points, and 1st order forward/backward differences for edge points. Would it be worthwhile to try moving to 4th order differences instead? Or would that make the problem worse? I've even tried smoothing the parameters which go into the matrix entries using moving averages...which doesn't seem to help too much. Any advice from those who have experience with this phenomenon would be greatly appreciated! 
Thanks, Patrick From jedbrown at mcs.anl.gov Wed Feb 22 18:13:13 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 22 Feb 2012 18:13:13 -0600 Subject: [petsc-users] Oscillations in finite difference solution In-Reply-To: <4F4582BA.70804@colorado.edu> References: <4F4582BA.70804@colorado.edu> Message-ID: On Wed, Feb 22, 2012 at 18:05, Patrick Alken wrote: > Hi all, > > I have been trying to track down a problem for a few days with solving a > linear system arising from a finite differenced PDE in spherical > coordinates. I found that PETSc managed to converge to a nice solution for > my matrix at small grid sizes and everything looks pretty good. > > But when I try larger more realistic grid sizes, PETSc fails to converge. > After trying with another direct solver library, I found that the direct > solver found a solution which exactly solves the matrix equation, This never happens, so what do you mean? You compute the residual and it's similar to what you expect the rounding error to be? > but when plotting the solution, I see that it oscillates rapidly between > the grid points and therefore isn't a satisfactory solution. (At smaller > grids the solution is nice and smooth) > What sort of PDE are you solving? > > I was wondering if this phenomenon is common in PDEs? and if there is any > way to correct for it? > > I am currently using 2nd order centered differences for interior grid > points, and 1st order forward/backward differences for edge points. Would > it be worthwhile to try moving to 4th order differences instead? Or would > that make the problem worse? > > I've even tried smoothing the parameters which go into the matrix entries > using moving averages...which doesn't seem to help too much. > > Any advice from those who have experience with this phenomenon would be > greatly appreciated! > > Thanks, > Patrick > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.alken at Colorado.EDU Wed Feb 22 18:15:56 2012 From: patrick.alken at Colorado.EDU (Patrick Alken) Date: Wed, 22 Feb 2012 17:15:56 -0700 Subject: [petsc-users] Oscillations in finite difference solution In-Reply-To: References: <4F4582BA.70804@colorado.edu> Message-ID: <4F45853C.80407@colorado.edu> On 02/22/2012 05:13 PM, Jed Brown wrote: > On Wed, Feb 22, 2012 at 18:05, Patrick Alken > > wrote: > > Hi all, > > I have been trying to track down a problem for a few days with > solving a linear system arising from a finite differenced PDE in > spherical coordinates. I found that PETSc managed to converge to a > nice solution for my matrix at small grid sizes and everything > looks pretty good. > > But when I try larger more realistic grid sizes, PETSc fails to > converge. After trying with another direct solver library, I found > that the direct solver found a solution which exactly solves the > matrix equation, > > > This never happens, so what do you mean? You compute the residual and > it's similar to what you expect the rounding error to be? Yes I mean the direct solver residual is around 10e-15. The PETSc residual is 4e00 > but when plotting the solution, I see that it oscillates rapidly > between the grid points and therefore isn't a satisfactory > solution. (At smaller grids the solution is nice and smooth) > > > What sort of PDE are you solving? The PDE is: grad(f) . 
B = g where B is a known vector field, g is a known scalar function, and f is the unknown scalar function to be determined (I am discretizing this equation for f in spherical coords) > > I was wondering if this phenomenon is common in PDEs? and if > there is any way to correct for it? > > I am currently using 2nd order centered differences for interior > grid points, and 1st order forward/backward differences for edge > points. Would it be worthwhile to try moving to 4th order > differences instead? Or would that make the problem worse? > > I've even tried smoothing the parameters which go into the matrix > entries using moving averages...which doesn't seem to help too much. > > Any advice from those who have experience with this phenomenon > would be greatly appreciated! > > Thanks, > Patrick > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Wed Feb 22 18:24:09 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 22 Feb 2012 18:24:09 -0600 Subject: [petsc-users] Oscillations in finite difference solution In-Reply-To: <4F45853C.80407@colorado.edu> References: <4F4582BA.70804@colorado.edu> <4F45853C.80407@colorado.edu> Message-ID: On Wed, Feb 22, 2012 at 18:15, Patrick Alken wrote: > Yes I mean the direct solver residual is around 10e-15. The PETSc residual > is 4e00 > Did you try -pc_type lu? > What sort of PDE are you solving? > > > The PDE is: > > grad(f) . B = g > > where B is a known vector field, g is a known scalar function, and f is > the unknown scalar function to be determined (I am discretizing this > equation for f in spherical coords) > Also known as steady-state advection. This is the most famous test problem in which centered differences fails miserably. If you use an upwind method, it will be stable, but will also be first order. To get higher than first order accuracy, you need a nonlinear spatial discretization (see Godunov's Theorem, "total variation diminishing", or "total variation bounded"). -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Wed Feb 22 18:43:24 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 22 Feb 2012 16:43:24 -0800 Subject: [petsc-users] AO In-Reply-To: References: Message-ID: Thanks Jed. On Wed, Feb 22, 2012 at 2:59 PM, Jed Brown wrote: > On Wed, Feb 22, 2012 at 16:52, Mohammad Mirzadeh wrote: > >> Should I be worried about function call overhead if make calls like >> AOXXXXToYYYY(ao, 1, &node) for each node when setting up the linear system? >> If so, is there an array where AO internally saves the mapping information >> that I can access or do I need to first save the mapping once and reuse it? > > > If you only need to do it once, the function call overhead should not be a > big deal. If you need to apply the mapping frequently, it's better to do it > in batches. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Wed Feb 22 18:58:36 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 22 Feb 2012 16:58:36 -0800 Subject: [petsc-users] Oscillations in finite difference solution In-Reply-To: References: <4F4582BA.70804@colorado.edu> <4F45853C.80407@colorado.edu> Message-ID: Along the same lines, also have a look at Essentially Non-Oscillatory (ENO) and Weighted ENO schemes which basically constructed oscillation-free, high-order, 'upwind' approximations to grad(f). 
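As a concrete starting point, the plain first-order upwind choice just picks the one-sided difference that looks into the wind; schematically, for the radial term of grad(f).B at node i (illustrative names, 1-D slice only):

  /* b_r = B_r at node i, dr = radial spacing; the weights go into row i of the matrix */
  if (b_r > 0.0) {   /* wind blows in +r: backward difference */
    w_im1 = -b_r/dr;  w_i =  b_r/dr;  w_ip1 = 0.0;
  } else {           /* wind blows in -r: forward difference  */
    w_im1 =  0.0;     w_i = -b_r/dr;  w_ip1 = b_r/dr;
  }

The theta and phi terms are handled the same way with their own components of B. This is only first-order accurate, which is exactly where the ENO/WENO reconstructions come in.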
Any standard text on FD (such as the one by LeVeque or the one by Strikwerda) have chapters on why central finite differences fail and why you would need an upwind-based method. Mohammad On Wed, Feb 22, 2012 at 4:24 PM, Jed Brown wrote: > On Wed, Feb 22, 2012 at 18:15, Patrick Alken wrote: > >> Yes I mean the direct solver residual is around 10e-15. The PETSc >> residual is 4e00 >> > > Did you try -pc_type lu? > >> What sort of PDE are you solving? >> >> >> The PDE is: >> >> grad(f) . B = g >> >> where B is a known vector field, g is a known scalar function, and f is >> the unknown scalar function to be determined (I am discretizing this >> equation for f in spherical coords) >> > > Also known as steady-state advection. This is the most famous test problem > in which centered differences fails miserably. If you use an upwind method, > it will be stable, but will also be first order. To get higher than first > order accuracy, you need a nonlinear spatial discretization (see Godunov's > Theorem, "total variation diminishing", or "total variation bounded"). > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.hui.zhang at hotmail.com Thu Feb 23 03:31:51 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Thu, 23 Feb 2012 10:31:51 +0100 Subject: [petsc-users] how to pass complex scalar in command line Message-ID: I know PetscOptionsGetScalar(), but do not know how to write the complex number in command line. I tried to write it in some forms but all did not work. Say, the options is -acomplex. I tried -acomplex 1.0+1.0i, or -acomplex 1.0+1.0*i, or -acomplex 1.0+1.0*I ... no one works. thank you, Hui From jroman at dsic.upv.es Thu Feb 23 03:36:00 2012 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 23 Feb 2012 10:36:00 +0100 Subject: [petsc-users] how to pass complex scalar in command line In-Reply-To: References: Message-ID: <32DB6DF9-6516-4000-AEB6-C9F60FF40B29@dsic.upv.es> El 23/02/2012, a las 10:31, Hui Zhang escribi?: > I know PetscOptionsGetScalar(), but do not know how to write the complex number > in command line. I tried to write it in some forms but all did not work. > > Say, the options is -acomplex. I tried > -acomplex 1.0+1.0i, or > -acomplex 1.0+1.0*i, or > -acomplex 1.0+1.0*I > ... > no one works. > > thank you, > Hui -acomplex 1,1 Read the manpage http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsGetScalar.html Jose From mike.hui.zhang at hotmail.com Thu Feb 23 03:38:02 2012 From: mike.hui.zhang at hotmail.com (Hui Zhang) Date: Thu, 23 Feb 2012 10:38:02 +0100 Subject: [petsc-users] how to pass complex scalar in command line In-Reply-To: <32DB6DF9-6516-4000-AEB6-C9F60FF40B29@dsic.upv.es> References: <32DB6DF9-6516-4000-AEB6-C9F60FF40B29@dsic.upv.es> Message-ID: Ah.. thank you! I read it before but I forgot. On Feb 23, 2012, at 10:36 AM, Jose E. Roman wrote: > > El 23/02/2012, a las 10:31, Hui Zhang escribi?: > >> I know PetscOptionsGetScalar(), but do not know how to write the complex number >> in command line. I tried to write it in some forms but all did not work. >> >> Say, the options is -acomplex. I tried >> -acomplex 1.0+1.0i, or >> -acomplex 1.0+1.0*i, or >> -acomplex 1.0+1.0*I >> ... >> no one works. 
>> >> thank you, >> Hui > > -acomplex 1,1 > > Read the manpage > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsGetScalar.html > > Jose > From bojan.niceno at psi.ch Thu Feb 23 10:46:16 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 17:46:16 +0100 Subject: [petsc-users] Accessing Vector's ghost values Message-ID: <4F466D58.5020506@psi.ch> Hi all, I've never used a mailing list before, so I hope this message will reach PETSc users and experts and someone might be willing to help me. I am also novice in PETSc. I have developed an unstructured finite volume solver on top of PETSc libraries. In sequential, it works like a charm. For the parallel version, I do domain decomposition externally with Metis, and work out local and global numberings, as well as communication patterns between processor. (The latter don't seem to be needed for PETSc, though.) When I run my program in parallel, it also works, but I miss values in vectors' ghost points. I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); Is it possible to get the ghost values if a vector is created like this? I have tried to use VecCreateGhost, but for some reason which is beyond my comprehension, PETSc goes berserk when it reaches the command: VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) Can anyone help me? Either how to reach ghost values for vector created by VecCreate, or how to use VecCreateGhost properly? Kind regards, Bojan From knepley at gmail.com Thu Feb 23 10:53:03 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Feb 2012 10:53:03 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F466D58.5020506@psi.ch> References: <4F466D58.5020506@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno wrote: > Hi all, > > I've never used a mailing list before, so I hope this message will reach > PETSc users and experts and someone might be willing to help me. I am also > novice in PETSc. > > I have developed an unstructured finite volume solver on top of PETSc > libraries. In sequential, it works like a charm. For the parallel > version, I do domain decomposition externally with Metis, and work out > local and global numberings, as well as communication patterns between > processor. (The latter don't seem to be needed for PETSc, though.) When I > run my program in parallel, it also works, but I miss values in vectors' > ghost points. > > I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); > > Is it possible to get the ghost values if a vector is created like this? > I do not understand this question. By definition, "ghost values" are those not stored in the global vector. > I have tried to use VecCreateGhost, but for some reason which is beyond my > comprehension, PETSc goes berserk when it reaches the command: > VecCreateGhost(PETSC_COMM_**WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) > I think you can understand that "berserk" tells me absolutely nothing. Error message? Stack trace? Did you try to run an example which uses VecGhost? Thanks, Matt > Can anyone help me? Either how to reach ghost values for vector created > by VecCreate, or how to use VecCreateGhost properly? > > > Kind regards, > > Bojan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From liujuy at gmail.com Thu Feb 23 11:46:36 2012 From: liujuy at gmail.com (Ju LIU) Date: Thu, 23 Feb 2012 11:46:36 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F466D58.5020506@psi.ch> References: <4F466D58.5020506@psi.ch> Message-ID: 2012/2/23 Bojan Niceno > Hi all, > > I've never used a mailing list before, so I hope this message will reach > PETSc users and experts and someone might be willing to help me. I am also > novice in PETSc. > > I have developed an unstructured finite volume solver on top of PETSc > libraries. In sequential, it works like a charm. For the parallel > version, I do domain decomposition externally with Metis, and work out > local and global numberings, as well as communication patterns between > processor. (The latter don't seem to be needed for PETSc, though.) When I > run my program in parallel, it also works, but I miss values in vectors' > ghost points. > > I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); > > Is it possible to get the ghost values if a vector is created like this? > > I have tried to use VecCreateGhost, but for some reason which is beyond my > comprehension, PETSc goes berserk when it reaches the command: > VecCreateGhost(PETSC_COMM_**WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) > > Can anyone help me? Either how to reach ghost values for vector created > by VecCreate, or how to use VecCreateGhost properly? > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/vec/examples/tutorials/ex9.c.html could be helpful. > Bojan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno at psi.ch Thu Feb 23 12:05:26 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 19:05:26 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> Message-ID: <4F467FE6.40901@psi.ch> Dear Matthew, thank you for your response. When I use VecCreateGhost, I get the following: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 src/sys/objects/inherit.c [0]PETSC ERROR: [0] VecCreate line 32 src/vec/vec/interface/veccreate.c [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: [0] VecCreateGhost line 647 src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 [0]PETSC ERROR: See docs/changes/index.html for recent updates. 
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno Thu Feb 23 19:02:45 2012 [0]PETSC ERROR: Libraries linked from /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 [0]PETSC ERROR: Configure options [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file I don't understand what could be causing it. I took very good care to match the global numbers of ghost cells when calling VecCreateGhost Kind regards, Bojan On 2/23/2012 5:53 PM, Matthew Knepley wrote: > On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno > wrote: > > Hi all, > > I've never used a mailing list before, so I hope this message will > reach PETSc users and experts and someone might be willing to help > me. I am also novice in PETSc. > > I have developed an unstructured finite volume solver on top of > PETSc libraries. In sequential, it works like a charm. For the > parallel version, I do domain decomposition externally with Metis, > and work out local and global numberings, as well as communication > patterns between processor. (The latter don't seem to be needed > for PETSc, though.) When I run my program in parallel, it also > works, but I miss values in vectors' ghost points. > > I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); > > Is it possible to get the ghost values if a vector is created like > this? > > > I do not understand this question. By definition, "ghost values" are > those not stored in the global vector. > > I have tried to use VecCreateGhost, but for some reason which is > beyond my comprehension, PETSc goes berserk when it reaches the > command: VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, > ifrom, &x) > > > I think you can understand that "berserk" tells me absolutely nothing. > Error message? Stack trace? Did you try to run an > example which uses VecGhost? > > Thanks, > > Matt > > Can anyone help me? Either how to reach ghost values for vector > created by VecCreate, or how to use VecCreateGhost properly? > > > Kind regards, > > Bojan > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 12:07:44 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 19:07:44 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> Message-ID: <4F468070.9050903@psi.ch> Thanks Ju. I studied this case carefully, and it seems clear to me. When I apply the same techniques in my code, I get error messages I sent to my reply to Matthew. Cheers, Bojan On 2/23/2012 6:46 PM, Ju LIU wrote: > > > 2012/2/23 Bojan Niceno > > > Hi all, > > I've never used a mailing list before, so I hope this message will > reach PETSc users and experts and someone might be willing to help > me. I am also novice in PETSc. 
> > I have developed an unstructured finite volume solver on top of > PETSc libraries. In sequential, it works like a charm. For the > parallel version, I do domain decomposition externally with Metis, > and work out local and global numberings, as well as communication > patterns between processor. (The latter don't seem to be needed > for PETSc, though.) When I run my program in parallel, it also > works, but I miss values in vectors' ghost points. > > I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); > > Is it possible to get the ghost values if a vector is created like > this? > > I have tried to use VecCreateGhost, but for some reason which is > beyond my comprehension, PETSc goes berserk when it reaches the > command: VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, > ifrom, &x) > > Can anyone help me? Either how to reach ghost values for vector > created by VecCreate, or how to use VecCreateGhost properly? > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/vec/examples/tutorials/ex9.c.html > could be helpful. > > > Bojan > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From maxwellr at gmail.com Thu Feb 23 12:22:08 2012 From: maxwellr at gmail.com (Max Rudolph) Date: Thu, 23 Feb 2012 10:22:08 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F468070.9050903@psi.ch> References: <4F466D58.5020506@psi.ch> <4F468070.9050903@psi.ch> Message-ID: Did you try running with -on_error_attach_debugger and and using your debugger to try to figure out where your code is segfaulting? Max On Thu, Feb 23, 2012 at 10:07 AM, Bojan Niceno wrote: > Thanks Ju. I studied this case carefully, and it seems clear to me. > When I apply the same techniques in my code, I get error messages I sent to > my reply to Matthew. > > > Cheers, > > > Bojan > > > > On 2/23/2012 6:46 PM, Ju LIU wrote: > > > > 2012/2/23 Bojan Niceno > >> Hi all, >> >> I've never used a mailing list before, so I hope this message will reach >> PETSc users and experts and someone might be willing to help me. I am also >> novice in PETSc. >> >> I have developed an unstructured finite volume solver on top of PETSc >> libraries. In sequential, it works like a charm. For the parallel >> version, I do domain decomposition externally with Metis, and work out >> local and global numberings, as well as communication patterns between >> processor. (The latter don't seem to be needed for PETSc, though.) When I >> run my program in parallel, it also works, but I miss values in vectors' >> ghost points. >> >> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >> >> Is it possible to get the ghost values if a vector is created like this? >> >> I have tried to use VecCreateGhost, but for some reason which is beyond >> my comprehension, PETSc goes berserk when it reaches the command: >> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) >> >> Can anyone help me? Either how to reach ghost values for vector created >> by VecCreate, or how to use VecCreateGhost properly? >> >> > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/vec/examples/tutorials/ex9.c.html > could be helpful. > > >> Bojan >> > > > > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From knepley at gmail.com Thu Feb 23 12:24:45 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Feb 2012 12:24:45 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F467FE6.40901@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno wrote: > Dear Matthew, > > > thank you for your response. When I use VecCreateGhost, I get the > following: > It appears that you passed a bad communicator. Did you not initialize a 'comm' variable? Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 > src/sys/objects/inherit.c > [0]PETSC ERROR: [0] VecCreate line 32 src/vec/vec/interface/veccreate.c > [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 > src/vec/vec/impls/mpi/pbvec.c > [0]PETSC ERROR: [0] VecCreateGhost line 647 src/vec/vec/impls/mpi/pbvec.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 > CST 2012 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno Thu Feb > 23 19:02:45 2012 > [0]PETSC ERROR: Libraries linked from > /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 > [0]PETSC ERROR: Configure options > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > I don't understand what could be causing it. I took very good care to > match the global numbers of ghost cells when calling VecCreateGhost > > > Kind regards, > > > Bojan > > > On 2/23/2012 5:53 PM, Matthew Knepley wrote: > > On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno wrote: > >> Hi all, >> >> I've never used a mailing list before, so I hope this message will reach >> PETSc users and experts and someone might be willing to help me. I am also >> novice in PETSc. >> >> I have developed an unstructured finite volume solver on top of PETSc >> libraries. In sequential, it works like a charm. 
For the parallel >> version, I do domain decomposition externally with Metis, and work out >> local and global numberings, as well as communication patterns between >> processor. (The latter don't seem to be needed for PETSc, though.) When I >> run my program in parallel, it also works, but I miss values in vectors' >> ghost points. >> >> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >> >> Is it possible to get the ghost values if a vector is created like this? >> > > I do not understand this question. By definition, "ghost values" are > those not stored in the global vector. > > >> I have tried to use VecCreateGhost, but for some reason which is beyond >> my comprehension, PETSc goes berserk when it reaches the command: >> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) >> > > I think you can understand that "berserk" tells me absolutely nothing. > Error message? Stack trace? Did you try to run an > example which uses VecGhost? > > Thanks, > > Matt > > >> Can anyone help me? Either how to reach ghost values for vector created >> by VecCreate, or how to use VecCreateGhost properly? >> >> >> Kind regards, >> >> Bojan >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 12:28:10 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 19:28:10 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> Message-ID: <4F46853A.7070409@psi.ch> On 2/23/2012 7:24 PM, Matthew Knepley wrote: > On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno > wrote: > > Dear Matthew, > > > thank you for your response. When I use VecCreateGhost, I get the > following: > > > It appears that you passed a bad communicator. Did you not initialize > a 'comm' variable? I pass PETSC_COMM_WORLD to VecCreateGhost. I don't know what you mean by 'comm' variable :-( I called all the routines to initialize PETSc. Cheers, Bojan > > Matt > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X > to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 > src/sys/objects/inherit.c > [0]PETSC ERROR: [0] VecCreate line 32 > src/vec/vec/interface/veccreate.c > [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 > src/vec/vec/impls/mpi/pbvec.c > [0]PETSC ERROR: [0] VecCreateGhost line 647 > src/vec/vec/impls/mpi/pbvec.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 > 09:28:45 CST 2012 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno > Thu Feb 23 19:02:45 2012 > [0]PETSC ERROR: Libraries linked from > /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 > [0]PETSC ERROR: Configure options > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown > directory unknown file > > I don't understand what could be causing it. I took very good > care to match the global numbers of ghost cells when calling > VecCreateGhost > > > Kind regards, > > > Bojan > > > On 2/23/2012 5:53 PM, Matthew Knepley wrote: >> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno >> > wrote: >> >> Hi all, >> >> I've never used a mailing list before, so I hope this message >> will reach PETSc users and experts and someone might be >> willing to help me. I am also novice in PETSc. >> >> I have developed an unstructured finite volume solver on top >> of PETSc libraries. In sequential, it works like a charm. >> For the parallel version, I do domain decomposition >> externally with Metis, and work out local and global >> numberings, as well as communication patterns between >> processor. (The latter don't seem to be needed for PETSc, >> though.) When I run my program in parallel, it also works, >> but I miss values in vectors' ghost points. >> >> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >> >> Is it possible to get the ghost values if a vector is created >> like this? >> >> >> I do not understand this question. By definition, "ghost values" >> are those not stored in the global vector. >> >> I have tried to use VecCreateGhost, but for some reason which >> is beyond my comprehension, PETSc goes berserk when it >> reaches the command: VecCreateGhost(PETSC_COMM_WORLD, n, >> PETSC_DECIDE, nghost, ifrom, &x) >> >> >> I think you can understand that "berserk" tells me absolutely >> nothing. Error message? Stack trace? Did you try to run an >> example which uses VecGhost? >> >> Thanks, >> >> Matt >> >> Can anyone help me? Either how to reach ghost values for >> vector created by VecCreate, or how to use VecCreateGhost >> properly? >> >> >> Kind regards, >> >> Bojan >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. 
>> -- Norbert Wiener > > > -- > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 12:28:31 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 19:28:31 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F468070.9050903@psi.ch> Message-ID: <4F46854F.9010809@psi.ch> On 2/23/2012 7:22 PM, Max Rudolph wrote: > Did you try running with -on_error_attach_debugger and and using your > debugger to try to figure out where your code is segfaulting? Well, it's a bit embarrassing, but I should admit that I never use debuggers. I use "exit(0)" to find the place where program breaks. It breaks at the very call to VecCreateGhost. Kind regards, Bojan From knepley at gmail.com Thu Feb 23 12:44:18 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Feb 2012 12:44:18 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46853A.7070409@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno wrote: > On 2/23/2012 7:24 PM, Matthew Knepley wrote: > > On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno wrote: > >> Dear Matthew, >> >> >> thank you for your response. When I use VecCreateGhost, I get the >> following: >> > > It appears that you passed a bad communicator. Did you not initialize a > 'comm' variable? > > > I pass PETSC_COMM_WORLD to VecCreateGhost. > > I don't know what you mean by 'comm' variable :-( I called all the > routines to initialize PETSc. > Send your code to petsc-maint at mcs.anl.gov. Matt > > Cheers, > > > Bojan > > > Matt > > >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. 
>> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c >> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 >> src/sys/objects/inherit.c >> [0]PETSC ERROR: [0] VecCreate line 32 src/vec/vec/interface/veccreate.c >> [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 >> src/vec/vec/impls/mpi/pbvec.c >> [0]PETSC ERROR: [0] VecCreateGhost line 647 src/vec/vec/impls/mpi/pbvec.c >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 >> CST 2012 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno Thu Feb >> 23 19:02:45 2012 >> [0]PETSC ERROR: Libraries linked from >> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >> [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 >> [0]PETSC ERROR: Configure options >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> >> I don't understand what could be causing it. I took very good care to >> match the global numbers of ghost cells when calling VecCreateGhost >> >> >> Kind regards, >> >> >> Bojan >> >> >> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >> >> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno wrote: >> >>> Hi all, >>> >>> I've never used a mailing list before, so I hope this message will reach >>> PETSc users and experts and someone might be willing to help me. I am also >>> novice in PETSc. >>> >>> I have developed an unstructured finite volume solver on top of PETSc >>> libraries. In sequential, it works like a charm. For the parallel >>> version, I do domain decomposition externally with Metis, and work out >>> local and global numberings, as well as communication patterns between >>> processor. (The latter don't seem to be needed for PETSc, though.) When I >>> run my program in parallel, it also works, but I miss values in vectors' >>> ghost points. >>> >>> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >>> >>> Is it possible to get the ghost values if a vector is created like this? >>> >> >> I do not understand this question. By definition, "ghost values" are >> those not stored in the global vector. >> >> >>> I have tried to use VecCreateGhost, but for some reason which is beyond >>> my comprehension, PETSc goes berserk when it reaches the command: >>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) >>> >> >> I think you can understand that "berserk" tells me absolutely nothing. >> Error message? Stack trace? Did you try to run an >> example which uses VecGhost? >> >> Thanks, >> >> Matt >> >> >>> Can anyone help me? Either how to reach ghost values for vector created >>> by VecCreate, or how to use VecCreateGhost properly? >>> >>> >>> Kind regards, >>> >>> Bojan >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> >> >> -- >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 12:51:04 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 19:51:04 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> Message-ID: <4F468A98.2010905@psi.ch> Dear Matt, are you sure? It is almost 4000 lines long! Shall I send only the function which bother me? If the entire code is what you need, shall I make a tarball and attach it? Kind regards, Bojan On 2/23/2012 7:44 PM, Matthew Knepley wrote: > On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno > wrote: > > On 2/23/2012 7:24 PM, Matthew Knepley wrote: >> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno >> > wrote: >> >> Dear Matthew, >> >> >> thank you for your response. When I use VecCreateGhost, I >> get the following: >> >> >> It appears that you passed a bad communicator. Did you not >> initialize a 'comm' variable? > > I pass PETSC_COMM_WORLD to VecCreateGhost. > > I don't know what you mean by 'comm' variable :-( I called all > the routines to initialize PETSc. > > > Send your code to petsc-maint at mcs.anl.gov > . > > Matt > > > Cheers, > > > Bojan > >> >> Matt >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> Violation, probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are >> not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of >> the function >> [0]PETSC ERROR: is given. >> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 >> src/sys/objects/tagm.c >> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 >> src/sys/objects/inherit.c >> [0]PETSC ERROR: [0] VecCreate line 32 >> src/vec/vec/interface/veccreate.c >> [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 >> src/vec/vec/impls/mpi/pbvec.c >> [0]PETSC ERROR: [0] VecCreateGhost line 647 >> src/vec/vec/impls/mpi/pbvec.c >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan >> 11 09:28:45 CST 2012 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble >> shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by >> niceno Thu Feb 23 19:02:45 2012 >> [0]PETSC ERROR: Libraries linked from >> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >> [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 >> [0]PETSC ERROR: Configure options >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown >> directory unknown file >> >> I don't understand what could be causing it. I took very >> good care to match the global numbers of ghost cells when >> calling VecCreateGhost >> >> >> Kind regards, >> >> >> Bojan >> >> >> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno >>> > wrote: >>> >>> Hi all, >>> >>> I've never used a mailing list before, so I hope this >>> message will reach PETSc users and experts and someone >>> might be willing to help me. I am also novice in PETSc. >>> >>> I have developed an unstructured finite volume solver on >>> top of PETSc libraries. In sequential, it works like a >>> charm. For the parallel version, I do domain >>> decomposition externally with Metis, and work out local >>> and global numberings, as well as communication patterns >>> between processor. (The latter don't seem to be needed >>> for PETSc, though.) When I run my program in parallel, >>> it also works, but I miss values in vectors' ghost points. >>> >>> I create vectors with command: >>> VecCreate(PETSC_COMM_WORLD, &x); >>> >>> Is it possible to get the ghost values if a vector is >>> created like this? >>> >>> >>> I do not understand this question. By definition, "ghost >>> values" are those not stored in the global vector. >>> >>> I have tried to use VecCreateGhost, but for some reason >>> which is beyond my comprehension, PETSc goes berserk >>> when it reaches the command: >>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, >>> nghost, ifrom, &x) >>> >>> >>> I think you can understand that "berserk" tells me >>> absolutely nothing. Error message? Stack trace? Did you try >>> to run an >>> example which uses VecGhost? >>> >>> Thanks, >>> >>> Matt >>> >>> Can anyone help me? Either how to reach ghost values >>> for vector created by VecCreate, or how to use >>> VecCreateGhost properly? >>> >>> >>> Kind regards, >>> >>> Bojan >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >> >> >> -- >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > -- > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From knepley at gmail.com Thu Feb 23 13:04:45 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Feb 2012 13:04:45 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F468A98.2010905@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 12:51 PM, Bojan Niceno wrote: > Dear Matt, > > > are you sure? It is almost 4000 lines long! Shall I send only the > function which bother me? > > If the entire code is what you need, shall I make a tarball and attach it? > Send something the builds and runs. Don't care how long it is. Matt > Kind regards, > > > Bojan > > On 2/23/2012 7:44 PM, Matthew Knepley wrote: > > On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno wrote: > >> On 2/23/2012 7:24 PM, Matthew Knepley wrote: >> >> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno wrote: >> >>> Dear Matthew, >>> >>> >>> thank you for your response. When I use VecCreateGhost, I get the >>> following: >>> >> >> It appears that you passed a bad communicator. Did you not initialize a >> 'comm' variable? >> >> >> I pass PETSC_COMM_WORLD to VecCreateGhost. >> >> I don't know what you mean by 'comm' variable :-( I called all the >> routines to initialize PETSc. >> > > Send your code to petsc-maint at mcs.anl.gov. > > Matt > > >> >> Cheers, >> >> >> Bojan >> >> >> Matt >> >> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> [0]PETSC ERROR: likely location of problem given in stack below >>> [0]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>> available, >>> [0]PETSC ERROR: INSTEAD the line number of the start of the >>> function >>> [0]PETSC ERROR: is given. >>> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c >>> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 >>> src/sys/objects/inherit.c >>> [0]PETSC ERROR: [0] VecCreate line 32 src/vec/vec/interface/veccreate.c >>> [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 >>> src/vec/vec/impls/mpi/pbvec.c >>> [0]PETSC ERROR: [0] VecCreateGhost line 647 src/vec/vec/impls/mpi/pbvec.c >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>> 09:28:45 CST 2012 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. 
>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno Thu >>> Feb 23 19:02:45 2012 >>> [0]PETSC ERROR: Libraries linked from >>> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >>> [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 >>> [0]PETSC ERROR: Configure options >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> >>> I don't understand what could be causing it. I took very good care to >>> match the global numbers of ghost cells when calling VecCreateGhost >>> >>> >>> Kind regards, >>> >>> >>> Bojan >>> >>> >>> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>> >>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno wrote: >>> >>>> Hi all, >>>> >>>> I've never used a mailing list before, so I hope this message will >>>> reach PETSc users and experts and someone might be willing to help me. I >>>> am also novice in PETSc. >>>> >>>> I have developed an unstructured finite volume solver on top of PETSc >>>> libraries. In sequential, it works like a charm. For the parallel >>>> version, I do domain decomposition externally with Metis, and work out >>>> local and global numberings, as well as communication patterns between >>>> processor. (The latter don't seem to be needed for PETSc, though.) When I >>>> run my program in parallel, it also works, but I miss values in vectors' >>>> ghost points. >>>> >>>> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >>>> >>>> Is it possible to get the ghost values if a vector is created like this? >>>> >>> >>> I do not understand this question. By definition, "ghost values" are >>> those not stored in the global vector. >>> >>> >>>> I have tried to use VecCreateGhost, but for some reason which is beyond >>>> my comprehension, PETSc goes berserk when it reaches the command: >>>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) >>>> >>> >>> I think you can understand that "berserk" tells me absolutely nothing. >>> Error message? Stack trace? Did you try to run an >>> example which uses VecGhost? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Can anyone help me? Either how to reach ghost values for vector >>>> created by VecCreate, or how to use VecCreateGhost properly? >>>> >>>> >>>> Kind regards, >>>> >>>> Bojan >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> -- >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> -- >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 13:33:19 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 20:33:19 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> Message-ID: <4F46947F.9090200@psi.ch> Dear Matt, I sent the code as an attached tarball. I sent it with case I run, so is 2 MB big. It is now in the cue for moderator's approval. Thanks. Kind regards, Bojan On 2/23/2012 8:04 PM, Matthew Knepley wrote: > On Thu, Feb 23, 2012 at 12:51 PM, Bojan Niceno > wrote: > > Dear Matt, > > > are you sure? It is almost 4000 lines long! Shall I send only > the function which bother me? > > If the entire code is what you need, shall I make a tarball and > attach it? > > > Send something the builds and runs. Don't care how long it is. > > Matt > > Kind regards, > > > Bojan > > On 2/23/2012 7:44 PM, Matthew Knepley wrote: >> On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno >> > wrote: >> >> On 2/23/2012 7:24 PM, Matthew Knepley wrote: >>> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno >>> > wrote: >>> >>> Dear Matthew, >>> >>> >>> thank you for your response. When I use VecCreateGhost, >>> I get the following: >>> >>> >>> It appears that you passed a bad communicator. Did you not >>> initialize a 'comm' variable? >> >> I pass PETSC_COMM_WORLD to VecCreateGhost. >> >> I don't know what you mean by 'comm' variable :-( I called >> all the routines to initialize PETSc. >> >> >> Send your code to petsc-maint at mcs.anl.gov >> . >> >> Matt >> >> >> Cheers, >> >> >> Bojan >> >>> >>> Matt >>> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: >>> Segmentation Violation, probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple >>> Mac OS X to find memory corruption errors >>> [0]PETSC ERROR: likely location of problem given in >>> stack below >>> [0]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> [0]PETSC ERROR: Note: The EXACT line numbers in the >>> stack are not available, >>> [0]PETSC ERROR: INSTEAD the line number of the >>> start of the function >>> [0]PETSC ERROR: is given. >>> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 >>> src/sys/objects/tagm.c >>> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 >>> src/sys/objects/inherit.c >>> [0]PETSC ERROR: [0] VecCreate line 32 >>> src/vec/vec/interface/veccreate.c >>> [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 >>> src/vec/vec/impls/mpi/pbvec.c >>> [0]PETSC ERROR: [0] VecCreateGhost line 647 >>> src/vec/vec/impls/mpi/pbvec.c >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! 
>>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, >>> Wed Jan 11 09:28:45 CST 2012 >>> [0]PETSC ERROR: See docs/changes/index.html for recent >>> updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about >>> trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 >>> by niceno Thu Feb 23 19:02:45 2012 >>> [0]PETSC ERROR: Libraries linked from >>> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >>> [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 >>> [0]PETSC ERROR: Configure options >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: User provided function() line 0 in >>> unknown directory unknown file >>> >>> I don't understand what could be causing it. I took >>> very good care to match the global numbers of ghost >>> cells when calling VecCreateGhost >>> >>> >>> Kind regards, >>> >>> >>> Bojan >>> >>> >>> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno >>>> > wrote: >>>> >>>> Hi all, >>>> >>>> I've never used a mailing list before, so I hope >>>> this message will reach PETSc users and experts and >>>> someone might be willing to help me. I am also >>>> novice in PETSc. >>>> >>>> I have developed an unstructured finite volume >>>> solver on top of PETSc libraries. In sequential, >>>> it works like a charm. For the parallel version, I >>>> do domain decomposition externally with Metis, and >>>> work out local and global numberings, as well as >>>> communication patterns between processor. (The >>>> latter don't seem to be needed for PETSc, though.) >>>> When I run my program in parallel, it also works, >>>> but I miss values in vectors' ghost points. >>>> >>>> I create vectors with command: >>>> VecCreate(PETSC_COMM_WORLD, &x); >>>> >>>> Is it possible to get the ghost values if a vector >>>> is created like this? >>>> >>>> >>>> I do not understand this question. By definition, >>>> "ghost values" are those not stored in the global vector. >>>> >>>> I have tried to use VecCreateGhost, but for some >>>> reason which is beyond my comprehension, PETSc goes >>>> berserk when it reaches the command: >>>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, >>>> nghost, ifrom, &x) >>>> >>>> >>>> I think you can understand that "berserk" tells me >>>> absolutely nothing. Error message? Stack trace? Did you >>>> try to run an >>>> example which uses VecGhost? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Can anyone help me? Either how to reach ghost >>>> values for vector created by VecCreate, or how to >>>> use VecCreateGhost properly? >>>> >>>> >>>> Kind regards, >>>> >>>> Bojan >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> -- >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. 
>>> -- Norbert Wiener >> >> >> -- >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > -- > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From knepley at gmail.com Thu Feb 23 13:36:17 2012 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Feb 2012 13:36:17 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46947F.9090200@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 1:33 PM, Bojan Niceno wrote: > Dear Matt, > > > I sent the code as an attached tarball. I sent it with case I run, so is > 2 MB big. It is now in the cue for moderator's approval. > No, you HAVE to send it to petsc-maint at mcs.anl.gov, as I said last time, for exactly this reason. Matt > Thanks. > > > Kind regards, > > > Bojan > > > On 2/23/2012 8:04 PM, Matthew Knepley wrote: > > On Thu, Feb 23, 2012 at 12:51 PM, Bojan Niceno wrote: > >> Dear Matt, >> >> >> are you sure? It is almost 4000 lines long! Shall I send only the >> function which bother me? >> >> If the entire code is what you need, shall I make a tarball and attach it? >> > > Send something the builds and runs. Don't care how long it is. > > Matt > > >> Kind regards, >> >> >> Bojan >> >> On 2/23/2012 7:44 PM, Matthew Knepley wrote: >> >> On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno wrote: >> >>> On 2/23/2012 7:24 PM, Matthew Knepley wrote: >>> >>> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno wrote: >>> >>>> Dear Matthew, >>>> >>>> >>>> thank you for your response. When I use VecCreateGhost, I get the >>>> following: >>>> >>> >>> It appears that you passed a bad communicator. Did you not initialize >>> a 'comm' variable? >>> >>> >>> I pass PETSC_COMM_WORLD to VecCreateGhost. >>> >>> I don't know what you mean by 'comm' variable :-( I called all the >>> routines to initialize PETSc. >>> >> >> Send your code to petsc-maint at mcs.anl.gov. 
>> >> Matt >> >> >>> >>> Cheers, >>> >>> >>> Bojan >>> >>> >>> Matt >>> >>> >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [0]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>> find memory corruption errors >>>> [0]PETSC ERROR: likely location of problem given in stack below >>>> [0]PETSC ERROR: --------------------- Stack Frames >>>> ------------------------------------ >>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>> available, >>>> [0]PETSC ERROR: INSTEAD the line number of the start of the >>>> function >>>> [0]PETSC ERROR: is given. >>>> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c >>>> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 >>>> src/sys/objects/inherit.c >>>> [0]PETSC ERROR: [0] VecCreate line 32 src/vec/vec/interface/veccreate.c >>>> [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 >>>> src/vec/vec/impls/mpi/pbvec.c >>>> [0]PETSC ERROR: [0] VecCreateGhost line 647 >>>> src/vec/vec/impls/mpi/pbvec.c >>>> [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>>> 09:28:45 CST 2012 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno Thu >>>> Feb 23 19:02:45 2012 >>>> [0]PETSC ERROR: Libraries linked from >>>> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >>>> [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 >>>> [0]PETSC ERROR: Configure options >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> >>>> I don't understand what could be causing it. I took very good care to >>>> match the global numbers of ghost cells when calling VecCreateGhost >>>> >>>> >>>> Kind regards, >>>> >>>> >>>> Bojan >>>> >>>> >>>> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>>> >>>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno wrote: >>>> >>>>> Hi all, >>>>> >>>>> I've never used a mailing list before, so I hope this message will >>>>> reach PETSc users and experts and someone might be willing to help me. I >>>>> am also novice in PETSc. >>>>> >>>>> I have developed an unstructured finite volume solver on top of PETSc >>>>> libraries. In sequential, it works like a charm. For the parallel >>>>> version, I do domain decomposition externally with Metis, and work out >>>>> local and global numberings, as well as communication patterns between >>>>> processor. (The latter don't seem to be needed for PETSc, though.) When I >>>>> run my program in parallel, it also works, but I miss values in vectors' >>>>> ghost points. 
>>>>> >>>>> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >>>>> >>>>> Is it possible to get the ghost values if a vector is created like >>>>> this? >>>>> >>>> >>>> I do not understand this question. By definition, "ghost values" are >>>> those not stored in the global vector. >>>> >>>> >>>>> I have tried to use VecCreateGhost, but for some reason which is >>>>> beyond my comprehension, PETSc goes berserk when it reaches the command: >>>>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) >>>>> >>>> >>>> I think you can understand that "berserk" tells me absolutely >>>> nothing. Error message? Stack trace? Did you try to run an >>>> example which uses VecGhost? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Can anyone help me? Either how to reach ghost values for vector >>>>> created by VecCreate, or how to use VecCreateGhost properly? >>>>> >>>>> >>>>> Kind regards, >>>>> >>>>> Bojan >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> -- >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> -- >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> -- >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From fpoulin at uwaterloo.ca Thu Feb 23 14:16:16 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Thu, 23 Feb 2012 15:16:16 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c Message-ID: Hello, I am learning to use PetSc but am just a notice. I have a rather basic question to ask and couldn't not find it on the achieves. I am wanting to test the scalability of a Multigrid solver to the 3D Poisson equation. I found ksp/ex22.c that seems to solve the problem that I'm interested in. I ran it on a large server using different processors. The syntax that I use to run using MPI was ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 I tested it using 2, 4, 8, 16 cpus and found that the time increases. See below. 
Clearly there is something that I don't understand since the time should be reduced. n wtime --------------------- 2 3m58s 4 3m54s 8 5m51s 16 7m23s Any advice would be greatly appreciated. Best regrads, Francis From jedbrown at mcs.anl.gov Thu Feb 23 14:27:29 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 23 Feb 2012 14:27:29 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: References: Message-ID: Always send output with -log_summary for each run that you do. On Thu, Feb 23, 2012 at 14:16, Francis Poulin wrote: > Hello, > > I am learning to use PetSc but am just a notice. I have a rather basic > question to ask and couldn't not find it on the achieves. > > I am wanting to test the scalability of a Multigrid solver to the 3D > Poisson equation. I found ksp/ex22.c that seems to solve the problem that > I'm interested in. I ran it on a large server using different processors. > > The syntax that I use to run using MPI was > > ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 > Which version of PETSc? > > I tested it using 2, 4, 8, 16 cpus and found that the time increases. See > below. Clearly there is something that I don't understand since the time > should be reduced. > > n wtime > --------------------- > 2 3m58s > 4 3m54s > 8 5m51s > 16 7m23s > > Any advice would be greatly appreciated. > > Best regrads, > Francis > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Feb 23 14:29:17 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 23 Feb 2012 14:29:17 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: References: Message-ID: 1) Run with -ksp_view to see what solver options it is really using. By default it is not likely using multigrid like you hope. 2) See http://www.mcs.anl.gov/petsc/documentation/faq.html#computers 3) Run with -log_summary to see where the time is being spent in the different cases Barry On Feb 23, 2012, at 2:16 PM, Francis Poulin wrote: > Hello, > > I am learning to use PetSc but am just a notice. I have a rather basic question to ask and couldn't not find it on the achieves. > > I am wanting to test the scalability of a Multigrid solver to the 3D Poisson equation. I found ksp/ex22.c that seems to solve the problem that I'm interested in. I ran it on a large server using different processors. > > The syntax that I use to run using MPI was > > ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 > > I tested it using 2, 4, 8, 16 cpus and found that the time increases. See below. Clearly there is something that I don't understand since the time should be reduced. > > n wtime > --------------------- > 2 3m58s > 4 3m54s > 8 5m51s > 16 7m23s > > Any advice would be greatly appreciated. > > Best regrads, > Francis > > From bojan.niceno at psi.ch Thu Feb 23 14:43:52 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 21:43:52 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> Message-ID: <4F46A508.8030207@psi.ch> Dear Matt, I have a new insight, although is not the full resolution. 
If I change my code in PETScSolver.cpp from: /*-------------------------------------------------+ | Make necessary PETSc intializations for vetors | +-------------------------------------------------*/ Int nghost = N - n; Int * ghosts = new Int(nghost); for(Int n=0; n= 0); assert( M.mesh.nodes[n].global_number < 14065); } for(Int i=n; i= 0); assert( M.mesh.nodes[i].global_number < 14065); assert( ! (M.mesh.nodes[i].global_number >= n_start && M.mesh.nodes[i].global_number < n_end) ); ghosts[i] = M.mesh.nodes[i].global_number; } VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], &x); to: /*-------------------------------------------------+ | Make necessary PETSc intializations for vetors | +-------------------------------------------------*/ Int nghost = N - n; Indices ghosts; // <---= NEW! for(Int n=0; n= 0); assert( M.mesh.nodes[n].global_number < 14065); } for(Int i=n; i= 0); assert( M.mesh.nodes[i].global_number < 14065); assert( ! (M.mesh.nodes[i].global_number >= n_start && M.mesh.nodes[i].global_number < n_end) ); ghosts.push_back( M.mesh.nodes[i].global_number ); // <---= NEW! } assert( ghosts.size() == nghost ); // <---= NEW! VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], &x); I pass the VecCreateGhost phase. "Indices" is an STL container of integers. It seems it works better than classical C array for this case. However, I still do not see the ghost values, i.e. I get the following error: [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Can only get local values, trying 3529! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Argument out of range! [1]PETSC ERROR: Can only get local values, trying 22! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [1]PETSC ERROR: See docs/index.html for manual pages. [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Can only get local values, trying 86! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 when I am trying to access values in ghost cells. What do I have to use to see them ghosts? I reckon VecGhostGetLocalForm should be used, right? Kind regards, Bojan On 2/23/2012 8:36 PM, Matthew Knepley wrote: > On Thu, Feb 23, 2012 at 1:33 PM, Bojan Niceno > wrote: > > Dear Matt, > > > I sent the code as an attached tarball. I sent it with case I > run, so is 2 MB big. It is now in the cue for moderator's approval. > > > No, you HAVE to send it to petsc-maint at mcs.anl.gov > , as I said last time, for exactly > this reason. > > Matt > > Thanks. > > > Kind regards, > > > Bojan > > > On 2/23/2012 8:04 PM, Matthew Knepley wrote: >> On Thu, Feb 23, 2012 at 12:51 PM, Bojan Niceno >> > wrote: >> >> Dear Matt, >> >> >> are you sure? It is almost 4000 lines long! Shall I send >> only the function which bother me? 
>> >> If the entire code is what you need, shall I make a tarball >> and attach it? >> >> >> Send something the builds and runs. Don't care how long it is. >> >> Matt >> >> Kind regards, >> >> >> Bojan >> >> On 2/23/2012 7:44 PM, Matthew Knepley wrote: >>> On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno >>> > wrote: >>> >>> On 2/23/2012 7:24 PM, Matthew Knepley wrote: >>>> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno >>>> > wrote: >>>> >>>> Dear Matthew, >>>> >>>> >>>> thank you for your response. When I use >>>> VecCreateGhost, I get the following: >>>> >>>> >>>> It appears that you passed a bad communicator. Did you >>>> not initialize a 'comm' variable? >>> >>> I pass PETSC_COMM_WORLD to VecCreateGhost. >>> >>> I don't know what you mean by 'comm' variable :-( I >>> called all the routines to initialize PETSc. >>> >>> >>> Send your code to petsc-maint at mcs.anl.gov >>> . >>> >>> Matt >>> >>> >>> Cheers, >>> >>> >>> Bojan >>> >>>> >>>> Matt >>>> >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: >>>> Segmentation Violation, probably memory access out >>>> of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [0]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >>>> ERROR: or try http://valgrind.org on GNU/linux and >>>> Apple Mac OS X to find memory corruption errors >>>> [0]PETSC ERROR: likely location of problem given in >>>> stack below >>>> [0]PETSC ERROR: --------------------- Stack Frames >>>> ------------------------------------ >>>> [0]PETSC ERROR: Note: The EXACT line numbers in the >>>> stack are not available, >>>> [0]PETSC ERROR: INSTEAD the line number of >>>> the start of the function >>>> [0]PETSC ERROR: is given. >>>> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 >>>> src/sys/objects/tagm.c >>>> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line >>>> 30 src/sys/objects/inherit.c >>>> [0]PETSC ERROR: [0] VecCreate line 32 >>>> src/vec/vec/interface/veccreate.c >>>> [0]PETSC ERROR: [0] VecCreateGhostWithArray line >>>> 567 src/vec/vec/impls/mpi/pbvec.c >>>> [0]PETSC ERROR: [0] VecCreateGhost line 647 >>>> src/vec/vec/impls/mpi/pbvec.c >>>> [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch >>>> 6, Wed Jan 11 09:28:45 CST 2012 >>>> [0]PETSC ERROR: See docs/changes/index.html for >>>> recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about >>>> trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named >>>> lccfd06 by niceno Thu Feb 23 19:02:45 2012 >>>> [0]PETSC ERROR: Libraries linked from >>>> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >>>> [0]PETSC ERROR: Configure run at Fri Feb 10 >>>> 10:24:13 2012 >>>> [0]PETSC ERROR: Configure options >>>> [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: User provided function() line 0 in >>>> unknown directory unknown file >>>> >>>> I don't understand what could be causing it. 
I >>>> took very good care to match the global numbers of >>>> ghost cells when calling VecCreateGhost >>>> >>>> >>>> Kind regards, >>>> >>>> >>>> Bojan >>>> >>>> >>>> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>>>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno >>>>> > >>>>> wrote: >>>>> >>>>> Hi all, >>>>> >>>>> I've never used a mailing list before, so I >>>>> hope this message will reach PETSc users and >>>>> experts and someone might be willing to help >>>>> me. I am also novice in PETSc. >>>>> >>>>> I have developed an unstructured finite volume >>>>> solver on top of PETSc libraries. In >>>>> sequential, it works like a charm. For the >>>>> parallel version, I do domain decomposition >>>>> externally with Metis, and work out local and >>>>> global numberings, as well as communication >>>>> patterns between processor. (The latter don't >>>>> seem to be needed for PETSc, though.) When I >>>>> run my program in parallel, it also works, but >>>>> I miss values in vectors' ghost points. >>>>> >>>>> I create vectors with command: >>>>> VecCreate(PETSC_COMM_WORLD, &x); >>>>> >>>>> Is it possible to get the ghost values if a >>>>> vector is created like this? >>>>> >>>>> >>>>> I do not understand this question. By definition, >>>>> "ghost values" are those not stored in the global >>>>> vector. >>>>> >>>>> I have tried to use VecCreateGhost, but for >>>>> some reason which is beyond my comprehension, >>>>> PETSc goes berserk when it reaches the >>>>> command: VecCreateGhost(PETSC_COMM_WORLD, n, >>>>> PETSC_DECIDE, nghost, ifrom, &x) >>>>> >>>>> >>>>> I think you can understand that "berserk" tells me >>>>> absolutely nothing. Error message? Stack trace? >>>>> Did you try to run an >>>>> example which uses VecGhost? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Can anyone help me? Either how to reach ghost >>>>> values for vector created by VecCreate, or how >>>>> to use VecCreateGhost properly? >>>>> >>>>> >>>>> Kind regards, >>>>> >>>>> Bojan >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before >>>>> they begin their experiments is infinitely more >>>>> interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> -- >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >> >> >> -- >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > -- > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From mirzadeh at gmail.com Thu Feb 23 15:00:28 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 23 Feb 2012 13:00:28 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46A508.8030207@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> Message-ID: Are you using local numbering when accessing the local part of ghost nodes? On Thu, Feb 23, 2012 at 12:43 PM, Bojan Niceno wrote: > Dear Matt, > > > I have a new insight, although is not the full resolution. If I change my > code in PETScSolver.cpp from: > > > /*-------------------------------------------------+ > | Make necessary PETSc intializations for vetors | > +-------------------------------------------------*/ > Int nghost = N - n; > Int * ghosts = new Int(nghost); > for(Int n=0; n assert( M.mesh.nodes[n].global_number >= 0); > assert( M.mesh.nodes[n].global_number < 14065); > } > for(Int i=n; i assert( M.mesh.nodes[i].global_number >= 0); > assert( M.mesh.nodes[i].global_number < 14065); > assert( ! (M.mesh.nodes[i].global_number >= n_start && > M.mesh.nodes[i].global_number < n_end) ); > ghosts[i] = M.mesh.nodes[i].global_number; > } > > VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], > &x); > > to: > > /*-------------------------------------------------+ > | Make necessary PETSc intializations for vetors | > +-------------------------------------------------*/ > Int nghost = N - n; > Indices ghosts; // <---= NEW! > for(Int n=0; n assert( M.mesh.nodes[n].global_number >= 0); > assert( M.mesh.nodes[n].global_number < 14065); > } > for(Int i=n; i assert( M.mesh.nodes[i].global_number >= 0); > assert( M.mesh.nodes[i].global_number < 14065); > assert( ! (M.mesh.nodes[i].global_number >= n_start && > M.mesh.nodes[i].global_number < n_end) ); > ghosts.push_back( M.mesh.nodes[i].global_number ); // <---= NEW! > > } > assert( ghosts.size() == nghost ); // <---= NEW! > > VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], > &x); > > I pass the VecCreateGhost phase. "Indices" is an STL container of > integers. It seems it works better than classical C array for this case. > > > However, I still do not see the ghost values, i.e. I get the following > error: > > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Can only get local values, trying 3529! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Argument out of range! > [1]PETSC ERROR: Can only get local values, trying 22! 
> [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 > CST 2012 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of range! > [2]PETSC ERROR: Can only get local values, trying 86! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 > CST 2012 > > when I am trying to access values in ghost cells. What do I have to use > to see them ghosts? I reckon VecGhostGetLocalForm should be used, right? > > > Kind regards, > > > Bojan > > > > On 2/23/2012 8:36 PM, Matthew Knepley wrote: > > On Thu, Feb 23, 2012 at 1:33 PM, Bojan Niceno wrote: > >> Dear Matt, >> >> >> I sent the code as an attached tarball. I sent it with case I run, so is >> 2 MB big. It is now in the cue for moderator's approval. >> > > No, you HAVE to send it to petsc-maint at mcs.anl.gov, as I said last time, > for exactly this reason. > > Matt > > >> Thanks. >> >> >> Kind regards, >> >> >> Bojan >> >> >> On 2/23/2012 8:04 PM, Matthew Knepley wrote: >> >> On Thu, Feb 23, 2012 at 12:51 PM, Bojan Niceno wrote: >> >>> Dear Matt, >>> >>> >>> are you sure? It is almost 4000 lines long! Shall I send only the >>> function which bother me? >>> >>> If the entire code is what you need, shall I make a tarball and attach >>> it? >>> >> >> Send something the builds and runs. Don't care how long it is. >> >> Matt >> >> >>> Kind regards, >>> >>> >>> Bojan >>> >>> On 2/23/2012 7:44 PM, Matthew Knepley wrote: >>> >>> On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno wrote: >>> >>>> On 2/23/2012 7:24 PM, Matthew Knepley wrote: >>>> >>>> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno wrote: >>>> >>>>> Dear Matthew, >>>>> >>>>> >>>>> thank you for your response. When I use VecCreateGhost, I get the >>>>> following: >>>>> >>>> >>>> It appears that you passed a bad communicator. Did you not initialize >>>> a 'comm' variable? >>>> >>>> >>>> I pass PETSC_COMM_WORLD to VecCreateGhost. >>>> >>>> I don't know what you mean by 'comm' variable :-( I called all the >>>> routines to initialize PETSc. >>>> >>> >>> Send your code to petsc-maint at mcs.anl.gov. >>> >>> Matt >>> >>> >>>> >>>> Cheers, >>>> >>>> >>>> Bojan >>>> >>>> >>>> Matt >>>> >>>> >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>> probably memory access out of range >>>>> [0]PETSC ERROR: Try option -start_in_debugger or >>>>> -on_error_attach_debugger >>>>> [0]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >>>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>>> find memory corruption errors >>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>> ------------------------------------ >>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>> available, >>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the >>>>> function >>>>> [0]PETSC ERROR: is given. 
>>>>> [0]PETSC ERROR: [0] PetscCommDuplicate line 140 src/sys/objects/tagm.c >>>>> [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 30 >>>>> src/sys/objects/inherit.c >>>>> [0]PETSC ERROR: [0] VecCreate line 32 src/vec/vec/interface/veccreate.c >>>>> [0]PETSC ERROR: [0] VecCreateGhostWithArray line 567 >>>>> src/vec/vec/impls/mpi/pbvec.c >>>>> [0]PETSC ERROR: [0] VecCreateGhost line 647 >>>>> src/vec/vec/impls/mpi/pbvec.c >>>>> [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> [0]PETSC ERROR: Signal received! >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >>>>> 09:28:45 CST 2012 >>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> [0]PETSC ERROR: See docs/index.html for manual pages. >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: ./PSI-Flow on a arch-linu named lccfd06 by niceno Thu >>>>> Feb 23 19:02:45 2012 >>>>> [0]PETSC ERROR: Libraries linked from >>>>> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >>>>> [0]PETSC ERROR: Configure run at Fri Feb 10 10:24:13 2012 >>>>> [0]PETSC ERROR: Configure options >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>>> unknown file >>>>> >>>>> I don't understand what could be causing it. I took very good care to >>>>> match the global numbers of ghost cells when calling VecCreateGhost >>>>> >>>>> >>>>> Kind regards, >>>>> >>>>> >>>>> Bojan >>>>> >>>>> >>>>> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>>>> >>>>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan Niceno wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I've never used a mailing list before, so I hope this message will >>>>>> reach PETSc users and experts and someone might be willing to help me. I >>>>>> am also novice in PETSc. >>>>>> >>>>>> I have developed an unstructured finite volume solver on top of PETSc >>>>>> libraries. In sequential, it works like a charm. For the parallel >>>>>> version, I do domain decomposition externally with Metis, and work out >>>>>> local and global numberings, as well as communication patterns between >>>>>> processor. (The latter don't seem to be needed for PETSc, though.) When I >>>>>> run my program in parallel, it also works, but I miss values in vectors' >>>>>> ghost points. >>>>>> >>>>>> I create vectors with command: VecCreate(PETSC_COMM_WORLD, &x); >>>>>> >>>>>> Is it possible to get the ghost values if a vector is created like >>>>>> this? >>>>>> >>>>> >>>>> I do not understand this question. By definition, "ghost values" are >>>>> those not stored in the global vector. >>>>> >>>>> >>>>>> I have tried to use VecCreateGhost, but for some reason which is >>>>>> beyond my comprehension, PETSc goes berserk when it reaches the command: >>>>>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ifrom, &x) >>>>>> >>>>> >>>>> I think you can understand that "berserk" tells me absolutely >>>>> nothing. Error message? Stack trace? Did you try to run an >>>>> example which uses VecGhost? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Can anyone help me? Either how to reach ghost values for vector >>>>>> created by VecCreate, or how to use VecCreateGhost properly? 
>>>>>> >>>>>> >>>>>> Kind regards, >>>>>> >>>>>> Bojan >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> -- >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> -- >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> -- >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From fpoulin at uwaterloo.ca Thu Feb 23 15:03:21 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Thu, 23 Feb 2012 16:03:21 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: References: Message-ID: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> Hello again, I am using v3.1 of PETSc. I changed the grid sizes slightly and I'm including 4 log_summary files. The times are shown below. I have not modified the example at all except in specifying the matrix size. Could it be that I need much larger? When I tried much larger matrices I think I might have got an error because I was using too much memory. n time 2 22s 4 29.8s 8 33.7s 16 28.3s Sorry for my first email but I hope this has more information. Cheers, Francis > On 2012-02-23, at 3:27 PM, Jed Brown wrote: > Always send output with -log_summary for each run that you do. > On Thu, Feb 23, 2012 at 14:16, Francis Poulin wrote: > Hello, > > I am learning to use PetSc but am just a notice. I have a rather basic question to ask and couldn't not find it on the achieves. > > I am wanting to test the scalability of a Multigrid solver to the 3D Poisson equation. I found ksp/ex22.c that seems to solve the problem that I'm interested in. I ran it on a large server using different processors. > > The syntax that I use to run using MPI was > > ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 > > Which version of PETSc? 
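A note on the quoted command line above: the information Jed and Barry are asking for comes from appending -ksp_view and -log_summary to that same run, and the timings are consistent with too few multigrid levels leaving a large coarse problem that is factored and solved redundantly (Barry's point later in this thread). A follow-up run would look something like the line below; -ksp_view and -log_summary are standard options, while -dmmg_nlevels is my assumption for the number-of-levels option of the DMMG-based ex22 shipped with PETSc 3.1 and should be checked against that release:

    mpiexec -n 8 ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 -dmmg_nlevels 5 -ksp_view -log_summary

With enough levels the coarse grid stays small, and the -log_summary tables then show where the remaining time actually goes.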
> > > I tested it using 2, 4, 8, 16 cpus and found that the time increases. See below. Clearly there is something that I don't understand since the time should be reduced. > > n wtime > --------------------- > 2 3m58s > 4 3m54s > 8 5m51s > 16 7m23s > > Any advice would be greatly appreciated. > > Best regrads, > Francis > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n2.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n4.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n8.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n16.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno at psi.ch Thu Feb 23 15:05:14 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 22:05:14 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> Message-ID: <4F46AA0A.4080209@psi.ch> No, I use global. for(Int i=0; i Are you using local numbering when accessing the local part of ghost > nodes? > > On Thu, Feb 23, 2012 at 12:43 PM, Bojan Niceno > wrote: > > Dear Matt, > > > I have a new insight, although is not the full resolution. If I > change my code in PETScSolver.cpp from: > > > /*-------------------------------------------------+ > | Make necessary PETSc intializations for vetors | > +-------------------------------------------------*/ > Int nghost = N - n; > Int * ghosts = new Int(nghost); > for(Int n=0; n assert( M.mesh.nodes[n].global_number >= 0); > assert( M.mesh.nodes[n].global_number < 14065); > } > for(Int i=n; i assert( M.mesh.nodes[i].global_number >= 0); > assert( M.mesh.nodes[i].global_number < 14065); > assert( ! (M.mesh.nodes[i].global_number >= n_start && > M.mesh.nodes[i].global_number < n_end) ); > ghosts[i] = M.mesh.nodes[i].global_number; > } > > VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, > &ghosts[0], &x); > > to: > > /*-------------------------------------------------+ > | Make necessary PETSc intializations for vetors | > +-------------------------------------------------*/ > Int nghost = N - n; > Indices ghosts; // <---= > NEW! > for(Int n=0; n assert( M.mesh.nodes[n].global_number >= 0); > assert( M.mesh.nodes[n].global_number < 14065); > } > for(Int i=n; i assert( M.mesh.nodes[i].global_number >= 0); > assert( M.mesh.nodes[i].global_number < 14065); > assert( ! (M.mesh.nodes[i].global_number >= n_start && > M.mesh.nodes[i].global_number < n_end) ); > ghosts.push_back( M.mesh.nodes[i].global_number ); // <---= > NEW! > } > assert( ghosts.size() == nghost ); // <---= NEW! > > VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, > &ghosts[0], &x); > > I pass the VecCreateGhost phase. "Indices" is an STL container of > integers. It seems it works better than classical C array for > this case. 
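A note on that last quoted remark: the STL container is not working better by magic. In C++, new Int(nghost) allocates a single Int whose initial value is nghost; it does not allocate an array of nghost entries. In addition, as far as the mangled listing shows, the original fill loop indexes the ghost array with the mesh index i running from n upwards rather than with i - n, so even a correctly sized array would be written out of bounds. Either overrun corrupts the heap before VecCreateGhost is called, which would be consistent with the segmentation fault reported inside it earlier in the thread. A minimal sketch of the distinction, assuming the thread's Int typedef, the M.mesh.nodes structure quoted above, and #include <vector>:

    Int *a = new Int(nghost);      // ONE Int, initialized to the value nghost
    Int *b = new Int[nghost];      // an array of nghost Ints -- what was intended
    std::vector<Int> c(nghost);    // same size, frees itself, and &c[0] can be passed to VecCreateGhost

    // filling it: the ghost array has nghost = N - n slots, so shift the index
    for (Int i = n; i < N; i++)
      c[i - n] = M.mesh.nodes[i].global_number;

    delete a;                      // scalar delete pairs with the scalar new
    delete [] b;                   // array delete pairs with the array new

The push_back version above avoids both problems, which is the real reason it gets past VecCreateGhost.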
> > > However, I still do not see the ghost values, i.e. I get the > following error: > > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error > Message ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Can only get local values, trying 3529! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Argument out of range! > [1]PETSC ERROR: Can only get local values, trying 22! > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 > 09:28:45 CST 2012 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error > Message ------------------------------------ > [2]PETSC ERROR: Argument out of range! > [2]PETSC ERROR: Can only get local values, trying 86! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 > 09:28:45 CST 2012 > > when I am trying to access values in ghost cells. What do I have > to use to see them ghosts? I reckon VecGhostGetLocalForm should > be used, right? > > > Kind regards, > > > Bojan > > > > On 2/23/2012 8:36 PM, Matthew Knepley wrote: >> On Thu, Feb 23, 2012 at 1:33 PM, Bojan Niceno >> > wrote: >> >> Dear Matt, >> >> >> I sent the code as an attached tarball. I sent it with case >> I run, so is 2 MB big. It is now in the cue for moderator's >> approval. >> >> >> No, you HAVE to send it to petsc-maint at mcs.anl.gov >> , as I said last time, for >> exactly this reason. >> >> Matt >> >> Thanks. >> >> >> Kind regards, >> >> >> Bojan >> >> >> On 2/23/2012 8:04 PM, Matthew Knepley wrote: >>> On Thu, Feb 23, 2012 at 12:51 PM, Bojan Niceno >>> > wrote: >>> >>> Dear Matt, >>> >>> >>> are you sure? It is almost 4000 lines long! Shall I >>> send only the function which bother me? >>> >>> If the entire code is what you need, shall I make a >>> tarball and attach it? >>> >>> >>> Send something the builds and runs. Don't care how long it is. >>> >>> Matt >>> >>> Kind regards, >>> >>> >>> Bojan >>> >>> On 2/23/2012 7:44 PM, Matthew Knepley wrote: >>>> On Thu, Feb 23, 2012 at 12:28 PM, Bojan Niceno >>>> > wrote: >>>> >>>> On 2/23/2012 7:24 PM, Matthew Knepley wrote: >>>>> On Thu, Feb 23, 2012 at 12:05 PM, Bojan Niceno >>>>> > >>>>> wrote: >>>>> >>>>> Dear Matthew, >>>>> >>>>> >>>>> thank you for your response. When I use >>>>> VecCreateGhost, I get the following: >>>>> >>>>> >>>>> It appears that you passed a bad communicator. Did >>>>> you not initialize a 'comm' variable? >>>> >>>> I pass PETSC_COMM_WORLD to VecCreateGhost. >>>> >>>> I don't know what you mean by 'comm' variable :-( >>>> I called all the routines to initialize PETSc. >>>> >>>> >>>> Send your code to petsc-maint at mcs.anl.gov >>>> . 
>>>> >>>> Matt >>>> >>>> >>>> Cheers, >>>> >>>> >>>> Bojan >>>> >>>>> >>>>> Matt >>>>> >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: >>>>> Segmentation Violation, probably memory access >>>>> out of range >>>>> [0]PETSC ERROR: Try option -start_in_debugger >>>>> or -on_error_attach_debugger >>>>> [0]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC >>>>> ERROR: or try http://valgrind.org on GNU/linux >>>>> and Apple Mac OS X to find memory corruption >>>>> errors >>>>> [0]PETSC ERROR: likely location of problem >>>>> given in stack below >>>>> [0]PETSC ERROR: --------------------- Stack >>>>> Frames ------------------------------------ >>>>> [0]PETSC ERROR: Note: The EXACT line numbers >>>>> in the stack are not available, >>>>> [0]PETSC ERROR: INSTEAD the line number >>>>> of the start of the function >>>>> [0]PETSC ERROR: is given. >>>>> [0]PETSC ERROR: [0] PetscCommDuplicate line >>>>> 140 src/sys/objects/tagm.c >>>>> [0]PETSC ERROR: [0] PetscHeaderCreate_Private >>>>> line 30 src/sys/objects/inherit.c >>>>> [0]PETSC ERROR: [0] VecCreate line 32 >>>>> src/vec/vec/interface/veccreate.c >>>>> [0]PETSC ERROR: [0] VecCreateGhostWithArray >>>>> line 567 src/vec/vec/impls/mpi/pbvec.c >>>>> [0]PETSC ERROR: [0] VecCreateGhost line 647 >>>>> src/vec/vec/impls/mpi/pbvec.c >>>>> [0]PETSC ERROR: --------------------- Error >>>>> Message ------------------------------------ >>>>> [0]PETSC ERROR: Signal received! >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: Petsc Release Version 3.2.0, >>>>> Patch 6, Wed Jan 11 09:28:45 CST 2012 >>>>> [0]PETSC ERROR: See docs/changes/index.html >>>>> for recent updates. >>>>> [0]PETSC ERROR: See docs/faq.html for hints >>>>> about trouble shooting. >>>>> [0]PETSC ERROR: See docs/index.html for manual >>>>> pages. >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: ./PSI-Flow on a arch-linu >>>>> named lccfd06 by niceno Thu Feb 23 19:02:45 2012 >>>>> [0]PETSC ERROR: Libraries linked from >>>>> /homecfd/niceno/PETSc-3.2-p6/arch-linux2-c-debug/lib >>>>> [0]PETSC ERROR: Configure run at Fri Feb 10 >>>>> 10:24:13 2012 >>>>> [0]PETSC ERROR: Configure options >>>>> [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> [0]PETSC ERROR: User provided function() line >>>>> 0 in unknown directory unknown file >>>>> >>>>> I don't understand what could be causing it. >>>>> I took very good care to match the global >>>>> numbers of ghost cells when calling VecCreateGhost >>>>> >>>>> >>>>> Kind regards, >>>>> >>>>> >>>>> Bojan >>>>> >>>>> >>>>> On 2/23/2012 5:53 PM, Matthew Knepley wrote: >>>>>> On Thu, Feb 23, 2012 at 10:46 AM, Bojan >>>>>> Niceno >>>>> > wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> I've never used a mailing list before, so >>>>>> I hope this message will reach PETSc >>>>>> users and experts and someone might be >>>>>> willing to help me. I am also novice in >>>>>> PETSc. >>>>>> >>>>>> I have developed an unstructured finite >>>>>> volume solver on top of PETSc libraries. >>>>>> In sequential, it works like a charm. >>>>>> For the parallel version, I do domain >>>>>> decomposition externally with Metis, and >>>>>> work out local and global numberings, as >>>>>> well as communication patterns between >>>>>> processor. 
(The latter don't seem to be >>>>>> needed for PETSc, though.) When I run my >>>>>> program in parallel, it also works, but I >>>>>> miss values in vectors' ghost points. >>>>>> >>>>>> I create vectors with command: >>>>>> VecCreate(PETSC_COMM_WORLD, &x); >>>>>> >>>>>> Is it possible to get the ghost values if >>>>>> a vector is created like this? >>>>>> >>>>>> >>>>>> I do not understand this question. By >>>>>> definition, "ghost values" are those not >>>>>> stored in the global vector. >>>>>> >>>>>> I have tried to use VecCreateGhost, but >>>>>> for some reason which is beyond my >>>>>> comprehension, PETSc goes berserk when it >>>>>> reaches the command: >>>>>> VecCreateGhost(PETSC_COMM_WORLD, n, >>>>>> PETSC_DECIDE, nghost, ifrom, &x) >>>>>> >>>>>> >>>>>> I think you can understand that "berserk" >>>>>> tells me absolutely nothing. Error message? >>>>>> Stack trace? Did you try to run an >>>>>> example which uses VecGhost? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Can anyone help me? Either how to reach >>>>>> ghost values for vector created by >>>>>> VecCreate, or how to use VecCreateGhost >>>>>> properly? >>>>>> >>>>>> >>>>>> Kind regards, >>>>>> >>>>>> Bojan >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted >>>>>> before they begin their experiments is >>>>>> infinitely more interesting than any results >>>>>> to which their experiments lead. >>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before >>>>> they begin their experiments is infinitely more >>>>> interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> -- >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >> >> >> -- >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > -- > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From jedbrown at mcs.anl.gov Thu Feb 23 15:10:18 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 23 Feb 2012 15:10:18 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46AA0A.4080209@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 15:05, Bojan Niceno wrote: > No, I use global. > The local form is just a local vector. It doesn't even know that a global problem exists. You can't index into it using global indices. (In general, there is no efficient way to look up information in the local vector (includes ghost points) using global indices.) > > for(Int i=0; i Int gi = mesh.nodes[i].global_number; > VecGetValues(x, 1, &gi, &unk[i]); > } > > "n" is defined as the number of cells inside, i.e. without buffers. "unk" > is my external array. If I try to access buffer values, I use: > > for(Int i=0; i Int gi = mesh.nodes[i].global_number; > VecGetValues(x, 1, &gi, &unk[i]); > } > > But then I end up with tons of warnings, presumably because I am going > beyond "n". Vector x was created with VecCreateGhost. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno at psi.ch Thu Feb 23 15:16:48 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 22:16:48 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> Message-ID: <4F46ACC0.2000904@psi.ch> Dear Jed, thanks. Now I have the following: - Array unk, which should hold values inside the partition and in ghost cells. It is big enough to hold both - Vec x, created by command VecCreateGhost, with proper padding for ghost cells - Successful call to linear solver in parallel. But I need to copy ghost values from x to my array unk. How can I do it? Kind regards, Bojan On 2/23/2012 10:10 PM, Jed Brown wrote: > On Thu, Feb 23, 2012 at 15:05, Bojan Niceno > wrote: > > No, I use global. > > > The local form is just a local vector. It doesn't even know that a > global problem exists. You can't index into it using global indices. > (In general, there is no efficient way to look up information in the > local vector (includes ghost points) using global indices.) > > > for(Int i=0; i Int gi = mesh.nodes[i].global_number; > VecGetValues(x, 1, &gi, &unk[i]); > } > > "n" is defined as the number of cells inside, i.e. without > buffers. "unk" is my external array. If I try to access buffer > values, I use: > > for(Int i=0; i Int gi = mesh.nodes[i].global_number; > VecGetValues(x, 1, &gi, &unk[i]); > } > > But then I end up with tons of warnings, presumably because I am > going beyond "n". Vector x was created with VecCreateGhost. > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu Feb 23 15:20:27 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 23 Feb 2012 15:20:27 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> Message-ID: <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> Still need the -ksp_view output. It is spending most of the time in the LU factorization and solve. I suspect the coarse problem is way to big (like you are using two levels of multigrid) and since it is solved redundantly that takes all the time. Run with say 5 levels. Barry On Feb 23, 2012, at 3:03 PM, Francis Poulin wrote: > Hello again, > > I am using v3.1 of PETSc. > > I changed the grid sizes slightly and I'm including 4 log_summary files. > > The times are shown below. I have not modified the example at all except in specifying the matrix size. Could it be that I need much larger? When I tried much larger matrices I think I might have got an error because I was using too much memory. > > n time > 2 22s > 4 29.8s > 8 33.7s > 16 28.3s > > Sorry for my first email but I hope this has more information. > > Cheers, Francis > > > > > > >> > > On 2012-02-23, at 3:27 PM, Jed Brown wrote: > >> Always send output with -log_summary for each run that you do. >> On Thu, Feb 23, 2012 at 14:16, Francis Poulin wrote: >> Hello, >> >> I am learning to use PetSc but am just a notice. I have a rather basic question to ask and couldn't not find it on the achieves. >> >> I am wanting to test the scalability of a Multigrid solver to the 3D Poisson equation. I found ksp/ex22.c that seems to solve the problem that I'm interested in. I ran it on a large server using different processors. >> >> The syntax that I use to run using MPI was >> >> ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 >> >> Which version of PETSc? >> >> >> I tested it using 2, 4, 8, 16 cpus and found that the time increases. See below. Clearly there is something that I don't understand since the time should be reduced. >> >> n wtime >> --------------------- >> 2 3m58s >> 4 3m54s >> 8 5m51s >> 16 7m23s >> >> Any advice would be greatly appreciated. >> >> Best regrads, >> Francis >> >> >> > From mirzadeh at gmail.com Thu Feb 23 16:23:06 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 23 Feb 2012 14:23:06 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46ACC0.2000904@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> Message-ID: just index x with the local numberings. if you have 'n' local nodes and 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: > Dear Jed, > > thanks. > > Now I have the following: > > - Array unk, which should hold values inside the partition and in ghost > cells. It is big enough to hold both > - Vec x, created by command VecCreateGhost, with proper padding for ghost > cells > - Successful call to linear solver in parallel. > > But I need to copy ghost values from x to my array unk. > > How can I do it? > > > Kind regards, > > > Bojan > > On 2/23/2012 10:10 PM, Jed Brown wrote: > > On Thu, Feb 23, 2012 at 15:05, Bojan Niceno wrote: > >> No, I use global. 
>> > > The local form is just a local vector. It doesn't even know that a > global problem exists. You can't index into it using global indices. (In > general, there is no efficient way to look up information in the local > vector (includes ghost points) using global indices.) > > >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> "n" is defined as the number of cells inside, i.e. without buffers. >> "unk" is my external array. If I try to access buffer values, I use: >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> But then I end up with tons of warnings, presumably because I am going >> beyond "n". Vector x was created with VecCreateGhost. >> > > > > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 16:32:18 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Thu, 23 Feb 2012 23:32:18 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> Message-ID: <4F46BE72.5070209@psi.ch> Dear Mohammad, it doesn't help me, or I did not understand your explanation. If I do this: /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ for(Int i=0; i > just index x with the local numberings. if you have 'n' local nodes > and 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' > > On Feb 23, 2012 1:16 PM, "Bojan Niceno" > wrote: > > Dear Jed, > > thanks. > > Now I have the following: > > - Array unk, which should hold values inside the partition and in > ghost cells. It is big enough to hold both > - Vec x, created by command VecCreateGhost, with proper padding > for ghost cells > - Successful call to linear solver in parallel. > > But I need to copy ghost values from x to my array unk. > > How can I do it? > > > Kind regards, > > > Bojan > > On 2/23/2012 10:10 PM, Jed Brown wrote: >> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno > > wrote: >> >> No, I use global. >> >> >> The local form is just a local vector. It doesn't even know that >> a global problem exists. You can't index into it using global >> indices. (In general, there is no efficient way to look up >> information in the local vector (includes ghost points) using >> global indices.) >> >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> "n" is defined as the number of cells inside, i.e. without >> buffers. "unk" is my external array. If I try to access >> buffer values, I use: >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> But then I end up with tons of warnings, presumably because I >> am going beyond "n". Vector x was created with VecCreateGhost. >> >> > > > -- > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From mirzadeh at gmail.com Thu Feb 23 16:49:50 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 23 Feb 2012 14:49:50 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46BE72.5070209@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> Message-ID: based on, VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], &x); it seems to me that x is your actual ghosted vector. If this is true, then you need to get its "local" form via VecGhostGetLocalForm(). Once you have done, you should be able to access the ghosted nodes. Are you calling this function anywhere? On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno wrote: > Dear Mohammad, > > > it doesn't help me, or I did not understand your explanation. > > If I do this: > > /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ > > for(Int i=0; i Int gi = mesh.nodes[i].global_number; > VecGetValues(x, 1, &gi, &unk[i]); > } > > /* copy ghost values (CREATES MANY WARNINGS */ > > for(Int i=n; i VecGetValues(x, 1, &i, &unk[i]); > } > > I get arnings are like this. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Can only get local values, trying 3518! > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Argument out of range! > [3]PETSC ERROR: Can only get local values, trying 3511! > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > What am I doing wrong here? > > > Cheers, > > > Bojan > > > > On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: > > just index x with the local numberings. if you have 'n' local nodes and > 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' > On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: > >> Dear Jed, >> >> thanks. >> >> Now I have the following: >> >> - Array unk, which should hold values inside the partition and in ghost >> cells. It is big enough to hold both >> - Vec x, created by command VecCreateGhost, with proper padding for ghost >> cells >> - Successful call to linear solver in parallel. >> >> But I need to copy ghost values from x to my array unk. >> >> How can I do it? >> >> >> Kind regards, >> >> >> Bojan >> >> On 2/23/2012 10:10 PM, Jed Brown wrote: >> >> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno wrote: >> >>> No, I use global. >>> >> >> The local form is just a local vector. It doesn't even know that a >> global problem exists. You can't index into it using global indices. (In >> general, there is no efficient way to look up information in the local >> vector (includes ghost points) using global indices.) >> >> >>> >>> for(Int i=0; i>> Int gi = mesh.nodes[i].global_number; >>> VecGetValues(x, 1, &gi, &unk[i]); >>> } >>> >>> "n" is defined as the number of cells inside, i.e. without buffers. >>> "unk" is my external array. If I try to access buffer values, I use: >>> >>> for(Int i=0; i>> Int gi = mesh.nodes[i].global_number; >>> VecGetValues(x, 1, &gi, &unk[i]); >>> } >>> >>> But then I end up with tons of warnings, presumably because I am going >>> beyond "n". Vector x was created with VecCreateGhost. 
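To make the VecGhostGetLocalForm suggestion above concrete: the local form is a sequential vector of length n + nghost that shares storage with x and is addressed purely by local indices. The owned entries sit at positions 0 .. n-1 in their local order, and the ghost entries follow at n .. n+nghost-1 in the order their global indices were passed to VecCreateGhost. A sketch of the read-out, reusing the thread's n, nghost and unk names and omitting error checking:

    Vec          lx;
    PetscScalar *a;
    PetscInt     i;

    VecGhostGetLocalForm(x, &lx);      /* lx has length n + nghost, local indexing only */
    VecGetArray(lx, &a);
    for (i = 0; i < n + nghost; i++)
      unk[i] = a[i];                   /* 0..n-1 owned values, n..n+nghost-1 ghosts */
    VecRestoreArray(lx, &a);
    VecGhostRestoreLocalForm(x, &lx);

One caveat: the ghost slots only contain current data after a VecGhostUpdateBegin/End, which is exactly where this thread ends up below.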
>>> >> >> >> >> -- >> > > > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 17:18:39 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Fri, 24 Feb 2012 00:18:39 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> Message-ID: <4F46C94F.9080508@psi.ch> On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: > based on, > > VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], &x); > > > it seems to me that x is your actual ghosted vector. If this is true, > then you need to get its "local" form via VecGhostGetLocalForm > (). > Once you have done, you should be able to access the ghosted nodes. > Are you calling this function anywhere? I tried that before. I did: Vec lx; VecGhostGetLocalForm(x, &lx) then I copied "lx" to my variable, like for(Int i=1; i > On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno > wrote: > > Dear Mohammad, > > > it doesn't help me, or I did not understand your explanation. > > If I do this: > > /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER > VALUES) */ > > for(Int i=0; i Int gi = mesh.nodes[i].global_number; > VecGetValues(x, 1, &gi, &unk[i]); > } > > /* copy ghost values (CREATES MANY WARNINGS */ > > for(Int i=n; i VecGetValues(x, 1, &i, &unk[i]); > } > > I get arnings are like this. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Can only get local values, trying 3518! > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Argument out of range! > [3]PETSC ERROR: Can only get local values, trying 3511! > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > What am I doing wrong here? > > > Cheers, > > > Bojan > > > > On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: >> >> just index x with the local numberings. if you have 'n' local >> nodes and 'g' ghost nodes in x, ghost nodes indecies run from 'n' >> to 'n+g-1' >> >> On Feb 23, 2012 1:16 PM, "Bojan Niceno" > > wrote: >> >> Dear Jed, >> >> thanks. >> >> Now I have the following: >> >> - Array unk, which should hold values inside the partition >> and in ghost cells. It is big enough to hold both >> - Vec x, created by command VecCreateGhost, with proper >> padding for ghost cells >> - Successful call to linear solver in parallel. >> >> But I need to copy ghost values from x to my array unk. >> >> How can I do it? >> >> >> Kind regards, >> >> >> Bojan >> >> On 2/23/2012 10:10 PM, Jed Brown wrote: >>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno >>> > wrote: >>> >>> No, I use global. >>> >>> >>> The local form is just a local vector. It doesn't even know >>> that a global problem exists. You can't index into it using >>> global indices. 
(In general, there is no efficient way to >>> look up information in the local vector (includes ghost >>> points) using global indices.) >>> >>> >>> for(Int i=0; i>> Int gi = mesh.nodes[i].global_number; >>> VecGetValues(x, 1, &gi, &unk[i]); >>> } >>> >>> "n" is defined as the number of cells inside, i.e. >>> without buffers. "unk" is my external array. If I try >>> to access buffer values, I use: >>> >>> for(Int i=0; i>> Int gi = mesh.nodes[i].global_number; >>> VecGetValues(x, 1, &gi, &unk[i]); >>> } >>> >>> But then I end up with tons of warnings, presumably >>> because I am going beyond "n". Vector x was created >>> with VecCreateGhost. >>> >>> >> >> >> -- >> > > > -- > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From mirzadeh at gmail.com Thu Feb 23 17:27:21 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 23 Feb 2012 15:27:21 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46C94F.9080508@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> Message-ID: You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() functions to update the ghost values if you change them in the global representation. See Petsc Manual 3.2 pp 55-56 Mohammad On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno wrote: > On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: > > based on, > > VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], > &x); > > > it seems to me that x is your actual ghosted vector. If this is true, > then you need to get its "local" form via VecGhostGetLocalForm(). > Once you have done, you should be able to access the ghosted nodes. Are you > calling this function anywhere? > > > I tried that before. I did: > > Vec lx; > VecGhostGetLocalForm(x, &lx) > > then I copied "lx" to my variable, like > > for(Int i=1; i unk[i] = lx[i] > > but ghost values were also zero. I am thinking that PETSc somehow clears > the ghost values after a call to KSP. Is it the case? > > > Kind regards, > > > Bojan > > > > On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno wrote: > >> Dear Mohammad, >> >> >> it doesn't help me, or I did not understand your explanation. >> >> If I do this: >> >> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> /* copy ghost values (CREATES MANY WARNINGS */ >> >> for(Int i=n; i> VecGetValues(x, 1, &i, &unk[i]); >> } >> >> I get arnings are like this. >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Argument out of range! >> [0]PETSC ERROR: Can only get local values, trying 3518! >> [3]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [3]PETSC ERROR: Argument out of range! 
>> [3]PETSC ERROR: Can only get local values, trying 3511! >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> What am I doing wrong here? >> >> >> Cheers, >> >> >> Bojan >> >> >> >> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: >> >> just index x with the local numberings. if you have 'n' local nodes and >> 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' >> On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: >> >>> Dear Jed, >>> >>> thanks. >>> >>> Now I have the following: >>> >>> - Array unk, which should hold values inside the partition and in ghost >>> cells. It is big enough to hold both >>> - Vec x, created by command VecCreateGhost, with proper padding for >>> ghost cells >>> - Successful call to linear solver in parallel. >>> >>> But I need to copy ghost values from x to my array unk. >>> >>> How can I do it? >>> >>> >>> Kind regards, >>> >>> >>> Bojan >>> >>> On 2/23/2012 10:10 PM, Jed Brown wrote: >>> >>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno wrote: >>> >>>> No, I use global. >>>> >>> >>> The local form is just a local vector. It doesn't even know that a >>> global problem exists. You can't index into it using global indices. (In >>> general, there is no efficient way to look up information in the local >>> vector (includes ghost points) using global indices.) >>> >>> >>>> >>>> for(Int i=0; i>>> Int gi = mesh.nodes[i].global_number; >>>> VecGetValues(x, 1, &gi, &unk[i]); >>>> } >>>> >>>> "n" is defined as the number of cells inside, i.e. without buffers. >>>> "unk" is my external array. If I try to access buffer values, I use: >>>> >>>> for(Int i=0; i>>> Int gi = mesh.nodes[i].global_number; >>>> VecGetValues(x, 1, &gi, &unk[i]); >>>> } >>>> >>>> But then I end up with tons of warnings, presumably because I am going >>>> beyond "n". Vector x was created with VecCreateGhost. >>>> >>> >>> >>> >>> -- >>> >> >> >> -- >> > > > > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 17:29:03 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Fri, 24 Feb 2012 00:29:03 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> Message-ID: <4F46CBBF.8030403@psi.ch> A-ha! Will do, thanks :-) On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: > You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() > functions to update the ghost values if you change them in the global > representation. See Petsc Manual 3.2 pp 55-56 > > Mohammad > > On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno > wrote: > > On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: >> based on, >> >> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, >> &ghosts[0], &x); >> >> >> it seems to me that x is your actual ghosted vector. 
If this is >> true, then you need to get its "local" form via >> VecGhostGetLocalForm >> (). >> Once you have done, you should be able to access the ghosted >> nodes. Are you calling this function anywhere? > > I tried that before. I did: > > Vec lx; > VecGhostGetLocalForm(x, &lx) > > then I copied "lx" to my variable, like > > for(Int i=1; i unk[i] = lx[i] > > but ghost values were also zero. I am thinking that PETSc > somehow clears the ghost values after a call to KSP. Is it the case? > > > Kind regards, > > > Bojan > > >> >> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno >> > wrote: >> >> Dear Mohammad, >> >> >> it doesn't help me, or I did not understand your explanation. >> >> If I do this: >> >> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER >> VALUES) */ >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> /* copy ghost values (CREATES MANY WARNINGS */ >> >> for(Int i=n; i> VecGetValues(x, 1, &i, &unk[i]); >> } >> >> I get arnings are like this. >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Argument out of range! >> [0]PETSC ERROR: Can only get local values, trying 3518! >> [3]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [3]PETSC ERROR: Argument out of range! >> [3]PETSC ERROR: Can only get local values, trying 3511! >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> What am I doing wrong here? >> >> >> Cheers, >> >> >> Bojan >> >> >> >> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: >>> >>> just index x with the local numberings. if you have 'n' >>> local nodes and 'g' ghost nodes in x, ghost nodes indecies >>> run from 'n' to 'n+g-1' >>> >>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" >> > wrote: >>> >>> Dear Jed, >>> >>> thanks. >>> >>> Now I have the following: >>> >>> - Array unk, which should hold values inside the >>> partition and in ghost cells. It is big enough to hold both >>> - Vec x, created by command VecCreateGhost, with proper >>> padding for ghost cells >>> - Successful call to linear solver in parallel. >>> >>> But I need to copy ghost values from x to my array unk. >>> >>> How can I do it? >>> >>> >>> Kind regards, >>> >>> >>> Bojan >>> >>> On 2/23/2012 10:10 PM, Jed Brown wrote: >>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno >>>> > wrote: >>>> >>>> No, I use global. >>>> >>>> >>>> The local form is just a local vector. It doesn't even >>>> know that a global problem exists. You can't index into >>>> it using global indices. (In general, there is no >>>> efficient way to look up information in the local >>>> vector (includes ghost points) using global indices.) >>>> >>>> >>>> for(Int i=0; i>>> Int gi = mesh.nodes[i].global_number; >>>> VecGetValues(x, 1, &gi, &unk[i]); >>>> } >>>> >>>> "n" is defined as the number of cells inside, i.e. >>>> without buffers. "unk" is my external array. If I >>>> try to access buffer values, I use: >>>> >>>> for(Int i=0; i>>> Int gi = mesh.nodes[i].global_number; >>>> VecGetValues(x, 1, &gi, &unk[i]); >>>> } >>>> >>>> But then I end up with tons of warnings, presumably >>>> because I am going beyond "n". Vector x was >>>> created with VecCreateGhost. >>>> >>>> >>> >>> >>> -- >>> >> >> >> -- >> >> > > > -- > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Thu Feb 23 17:38:46 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Fri, 24 Feb 2012 00:38:46 +0100 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> Message-ID: <4F46CE06.3020809@psi.ch> Yeeee-ha! VecGhostUpdateBegin() / VecGhostUpdateEnd() was indeed what I was missing. Thank you Mohammad, thank you all who helped me today! Cheers Bojan On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: > You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() > functions to update the ghost values if you change them in the global > representation. See Petsc Manual 3.2 pp 55-56 > > Mohammad > > On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno > wrote: > > On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: >> based on, >> >> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, >> &ghosts[0], &x); >> >> >> it seems to me that x is your actual ghosted vector. If this is >> true, then you need to get its "local" form via >> VecGhostGetLocalForm >> (). >> Once you have done, you should be able to access the ghosted >> nodes. Are you calling this function anywhere? > > I tried that before. I did: > > Vec lx; > VecGhostGetLocalForm(x, &lx) > > then I copied "lx" to my variable, like > > for(Int i=1; i unk[i] = lx[i] > > but ghost values were also zero. I am thinking that PETSc > somehow clears the ghost values after a call to KSP. Is it the case? > > > Kind regards, > > > Bojan > > >> >> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno >> > wrote: >> >> Dear Mohammad, >> >> >> it doesn't help me, or I did not understand your explanation. >> >> If I do this: >> >> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER >> VALUES) */ >> >> for(Int i=0; i> Int gi = mesh.nodes[i].global_number; >> VecGetValues(x, 1, &gi, &unk[i]); >> } >> >> /* copy ghost values (CREATES MANY WARNINGS */ >> >> for(Int i=n; i> VecGetValues(x, 1, &i, &unk[i]); >> } >> >> I get arnings are like this. >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Argument out of range! >> [0]PETSC ERROR: Can only get local values, trying 3518! >> [3]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [3]PETSC ERROR: Argument out of range! >> [3]PETSC ERROR: Can only get local values, trying 3511! >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> What am I doing wrong here? >> >> >> Cheers, >> >> >> Bojan >> >> >> >> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: >>> >>> just index x with the local numberings. 
if you have 'n' >>> local nodes and 'g' ghost nodes in x, ghost nodes indecies >>> run from 'n' to 'n+g-1' >>> >>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" >> > wrote: >>> >>> Dear Jed, >>> >>> thanks. >>> >>> Now I have the following: >>> >>> - Array unk, which should hold values inside the >>> partition and in ghost cells. It is big enough to hold both >>> - Vec x, created by command VecCreateGhost, with proper >>> padding for ghost cells >>> - Successful call to linear solver in parallel. >>> >>> But I need to copy ghost values from x to my array unk. >>> >>> How can I do it? >>> >>> >>> Kind regards, >>> >>> >>> Bojan >>> >>> On 2/23/2012 10:10 PM, Jed Brown wrote: >>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno >>>> > wrote: >>>> >>>> No, I use global. >>>> >>>> >>>> The local form is just a local vector. It doesn't even >>>> know that a global problem exists. You can't index into >>>> it using global indices. (In general, there is no >>>> efficient way to look up information in the local >>>> vector (includes ghost points) using global indices.) >>>> >>>> >>>> for(Int i=0; i>>> Int gi = mesh.nodes[i].global_number; >>>> VecGetValues(x, 1, &gi, &unk[i]); >>>> } >>>> >>>> "n" is defined as the number of cells inside, i.e. >>>> without buffers. "unk" is my external array. If I >>>> try to access buffer values, I use: >>>> >>>> for(Int i=0; i>>> Int gi = mesh.nodes[i].global_number; >>>> VecGetValues(x, 1, &gi, &unk[i]); >>>> } >>>> >>>> But then I end up with tons of warnings, presumably >>>> because I am going beyond "n". Vector x was >>>> created with VecCreateGhost. >>>> >>>> >>> >>> >>> -- >>> >> >> >> -- >> >> > > > -- > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From mirzadeh at gmail.com Thu Feb 23 17:52:16 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 23 Feb 2012 15:52:16 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46CE06.3020809@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> <4F46CE06.3020809@psi.ch> Message-ID: glad to hear that :) On Thu, Feb 23, 2012 at 3:38 PM, Bojan Niceno wrote: > Yeeee-ha! > > VecGhostUpdateBegin() / VecGhostUpdateEnd() was indeed what I was missing. > > Thank you Mohammad, thank you all who helped me today! > > > Cheers > > > Bojan > > > > > On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: > > You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() functions > to update the ghost values if you change them in the global representation. 
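For reference, the two scatter directions these update calls support, written out (only the forward form is needed for the situation above; x is the ghosted vector from this thread):

  /* Forward scatter: copy each owned value into the matching ghost slots on
     the neighbouring processes -- what is wanted after KSPSolve() has filled
     the owned part of x, and before reading the local form. */
  VecGhostUpdateBegin(x, INSERT_VALUES, SCATTER_FORWARD);
  VecGhostUpdateEnd(x, INSERT_VALUES, SCATTER_FORWARD);

  /* Reverse scatter: accumulate locally assembled ghost contributions back
     into the owning process's values, e.g. after adding into ghost slots. */
  VecGhostUpdateBegin(x, ADD_VALUES, SCATTER_REVERSE);
  VecGhostUpdateEnd(x, ADD_VALUES, SCATTER_REVERSE);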
> See Petsc Manual 3.2 pp 55-56 > > Mohammad > > On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno wrote: > >> On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: >> >> based on, >> >> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], >> &x); >> >> >> it seems to me that x is your actual ghosted vector. If this is true, >> then you need to get its "local" form via VecGhostGetLocalForm(). >> Once you have done, you should be able to access the ghosted nodes. Are you >> calling this function anywhere? >> >> >> I tried that before. I did: >> >> Vec lx; >> VecGhostGetLocalForm(x, &lx) >> >> then I copied "lx" to my variable, like >> >> for(Int i=1; i> unk[i] = lx[i] >> >> but ghost values were also zero. I am thinking that PETSc somehow >> clears the ghost values after a call to KSP. Is it the case? >> >> >> Kind regards, >> >> >> Bojan >> >> >> >> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno wrote: >> >>> Dear Mohammad, >>> >>> >>> it doesn't help me, or I did not understand your explanation. >>> >>> If I do this: >>> >>> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ >>> >>> for(Int i=0; i>> Int gi = mesh.nodes[i].global_number; >>> VecGetValues(x, 1, &gi, &unk[i]); >>> } >>> >>> /* copy ghost values (CREATES MANY WARNINGS */ >>> >>> for(Int i=n; i>> VecGetValues(x, 1, &i, &unk[i]); >>> } >>> >>> I get arnings are like this. >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Argument out of range! >>> [0]PETSC ERROR: Can only get local values, trying 3518! >>> [3]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [3]PETSC ERROR: Argument out of range! >>> [3]PETSC ERROR: Can only get local values, trying 3511! >>> [3]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> >>> What am I doing wrong here? >>> >>> >>> Cheers, >>> >>> >>> Bojan >>> >>> >>> >>> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: >>> >>> just index x with the local numberings. if you have 'n' local nodes and >>> 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' >>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: >>> >>>> Dear Jed, >>>> >>>> thanks. >>>> >>>> Now I have the following: >>>> >>>> - Array unk, which should hold values inside the partition and in ghost >>>> cells. It is big enough to hold both >>>> - Vec x, created by command VecCreateGhost, with proper padding for >>>> ghost cells >>>> - Successful call to linear solver in parallel. >>>> >>>> But I need to copy ghost values from x to my array unk. >>>> >>>> How can I do it? >>>> >>>> >>>> Kind regards, >>>> >>>> >>>> Bojan >>>> >>>> On 2/23/2012 10:10 PM, Jed Brown wrote: >>>> >>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno wrote: >>>> >>>>> No, I use global. >>>>> >>>> >>>> The local form is just a local vector. It doesn't even know that a >>>> global problem exists. You can't index into it using global indices. (In >>>> general, there is no efficient way to look up information in the local >>>> vector (includes ghost points) using global indices.) >>>> >>>> >>>>> >>>>> for(Int i=0; i>>>> Int gi = mesh.nodes[i].global_number; >>>>> VecGetValues(x, 1, &gi, &unk[i]); >>>>> } >>>>> >>>>> "n" is defined as the number of cells inside, i.e. without buffers. >>>>> "unk" is my external array. 
If I try to access buffer values, I use: >>>>> >>>>> for(Int i=0; i>>>> Int gi = mesh.nodes[i].global_number; >>>>> VecGetValues(x, 1, &gi, &unk[i]); >>>>> } >>>>> >>>>> But then I end up with tons of warnings, presumably because I am going >>>>> beyond "n". Vector x was created with VecCreateGhost. >>>>> >>>> >>>> >>>> >>>> -- >>>> >>> >>> >>> -- >>> >> >> >> >> -- >> > > > > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From fpoulin at uwaterloo.ca Thu Feb 23 18:58:39 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Thu, 23 Feb 2012 19:58:39 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> Message-ID: <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> Hello Barry, I can do it for each of them if that helps but I suspect the method is the same so I'm sending the information for the first 3, n = 2, 4, 8. In the mean time I will figure out how to change the number of levels.. Thanks, Francis -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n8_info.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n4_info.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: saw_log_summary_n2_info.txt URL: -------------- next part -------------- On 2012-02-23, at 4:20 PM, Barry Smith wrote: > > Still need the -ksp_view output. It is spending most of the time in the LU factorization and solve. I suspect the coarse problem is way to big (like you are using two levels of multigrid) and since it is solved redundantly that takes all the time. Run with say 5 levels. > > Barry > > On Feb 23, 2012, at 3:03 PM, Francis Poulin wrote: > >> Hello again, >> >> I am using v3.1 of PETSc. >> >> I changed the grid sizes slightly and I'm including 4 log_summary files. >> >> The times are shown below. I have not modified the example at all except in specifying the matrix size. Could it be that I need much larger? When I tried much larger matrices I think I might have got an error because I was using too much memory. >> >> n time >> 2 22s >> 4 29.8s >> 8 33.7s >> 16 28.3s >> >> Sorry for my first email but I hope this has more information. >> >> Cheers, Francis >> >> >> >> >> >> >>> >> >> On 2012-02-23, at 3:27 PM, Jed Brown wrote: >> >>> Always send output with -log_summary for each run that you do. >>> On Thu, Feb 23, 2012 at 14:16, Francis Poulin wrote: >>> Hello, >>> >>> I am learning to use PetSc but am just a notice. I have a rather basic question to ask and couldn't not find it on the achieves. 
>>> >>> I am wanting to test the scalability of a Multigrid solver to the 3D Poisson equation. I found ksp/ex22.c that seems to solve the problem that I'm interested in. I ran it on a large server using different processors. >>> >>> The syntax that I use to run using MPI was >>> >>> ./ex22 -da_grid_x 64 -da_grid_y 64 -da_grid_z 32 >>> >>> Which version of PETSc? >>> >>> >>> I tested it using 2, 4, 8, 16 cpus and found that the time increases. See below. Clearly there is something that I don't understand since the time should be reduced. >>> >>> n wtime >>> --------------------- >>> 2 3m58s >>> 4 3m54s >>> 8 5m51s >>> 16 7m23s >>> >>> Any advice would be greatly appreciated. >>> >>> Best regrads, >>> Francis >>> >>> >>> >> > From jedbrown at mcs.anl.gov Thu Feb 23 20:24:17 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 23 Feb 2012 20:24:17 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> Message-ID: On Thu, Feb 23, 2012 at 18:58, Francis Poulin wrote: > I can do it for each of them if that helps but I suspect the method is the > same so I'm sending the information for the first 3, n = 2, 4, 8. In the > mean time I will figure out how to change the number of levels.. > There are only three levels. The coarsest level has 32k degrees of freedom, which is very expensive to solve (redundantly) with a direct solver. Run this, it's higher resolution and will be much faster than what you had. mpiexec -n 2 ./ex22 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -dmmg_nlevels 6 -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary Also, please switch to a more recent release of PETSc as soon as possible and do not develop new code using DMMG since that component has been removed (and its functionality incorporated into SNES and KSP). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Thu Feb 23 20:35:30 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Thu, 23 Feb 2012 20:35:30 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46ACC0.2000904@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> Message-ID: On Thu, Feb 23, 2012 at 15:16, Bojan Niceno wrote: > - Array unk, which should hold values inside the partition and in ghost > cells. It is big enough to hold both > - Vec x, created by command VecCreateGhost, with proper padding for ghost > cells > - Successful call to linear solver in parallel. > > But I need to copy ghost values from x to my array unk. > I'm not sure what you're asking. You create the ghost array and use VecGhostGetLocalForm()/VecGhostRestoreLocalForm() to access and release access to the local vector (which includes the ghost points). You can into or out of the local form if you want, but usually people set up their indexing to use the ordering in the local form (owned values followed by ghosted values). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Feb 23 21:44:13 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 23 Feb 2012 21:44:13 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <4F46CE06.3020809@psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> <4F46CE06.3020809@psi.ch> Message-ID: I have added the following to the VecGhostGetLocalForm() manual page to make this clearer for everyone: To update the ghost values from the locations on the other processes one must call VecGhostUpdateBegin() and VecGhostUpdateEnd() before accessing the ghost values. Thus normal usage is $ VecGhostUpdateBegin(x,INSERT_VALUES,SCATTER_FORWARD); $ VecGhostUpdateEnd(x,INSERT_VALUES,SCATTER_FORWARD); $ VecGhostGetLocalForm(x,&xlocal); $ VecGetArray(xlocal,&xvalues); $ /* access the non-ghost values in locations xvalues[0:n-1] and ghost values in locations xvalues[n:n+nghost]; */ $ VecRestoreArray(xlocal,&xvalues); $ VecGhostRestoreLocalForm(x,&xlocal); On Feb 23, 2012, at 5:38 PM, Bojan Niceno wrote: > Yeeee-ha! > > VecGhostUpdateBegin() / VecGhostUpdateEnd() was indeed what I was missing. > > Thank you Mohammad, thank you all who helped me today! > > > Cheers > > > Bojan > > > > On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: >> You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() functions to update the ghost values if you change them in the global representation. See Petsc Manual 3.2 pp 55-56 >> >> Mohammad >> >> On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno wrote: >> On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: >>> based on, >>> >>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], &x); >>> >>> >>> it seems to me that x is your actual ghosted vector. If this is true, then you need to get its "local" form via VecGhostGetLocalForm(). Once you have done, you should be able to access the ghosted nodes. Are you calling this function anywhere? >> >> I tried that before. I did: >> >> Vec lx; >> VecGhostGetLocalForm(x, &lx) >> >> then I copied "lx" to my variable, like >> >> for(Int i=1; i> unk[i] = lx[i] >> >> but ghost values were also zero. I am thinking that PETSc somehow clears the ghost values after a call to KSP. Is it the case? >> >> >> Kind regards, >> >> >> Bojan >> >> >>> >>> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno wrote: >>> Dear Mohammad, >>> >>> >>> it doesn't help me, or I did not understand your explanation. >>> >>> If I do this: >>> >>> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ >>> >>> for(Int i=0; i>> Int gi = mesh.nodes[i].global_number; >>> VecGetValues(x, 1, &gi, &unk[i]); >>> } >>> >>> /* copy ghost values (CREATES MANY WARNINGS */ >>> >>> for(Int i=n; i>> VecGetValues(x, 1, &i, &unk[i]); >>> } >>> >>> I get arnings are like this. >>> >>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------ >>> [0]PETSC ERROR: Argument out of range! >>> [0]PETSC ERROR: Can only get local values, trying 3518! >>> [3]PETSC ERROR: --------------------- Error Message ------------------------------------ >>> [3]PETSC ERROR: Argument out of range! >>> [3]PETSC ERROR: Can only get local values, trying 3511! >>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>> >>> What am I doing wrong here? 
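Pulling the pieces of this thread together, a complete toy program along the lines of the usage above. All sizes and names here are invented for illustration: each of two processes owns 4 entries and ghosts the first owned entry of the other process, so it should be run on exactly 2 ranks.

  #include <petscvec.h>

  int main(int argc, char **argv)
  {
    Vec            x, lx;
    PetscInt       i, n = 4, nghost = 1, ghosts[1], rstart;
    PetscMPIInt    rank;
    PetscScalar    *xv;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, 0, 0);CHKERRQ(ierr);
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
    ghosts[0] = rank ? 0 : n;   /* global index of the one ghosted entry */

    ierr = VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, ghosts, &x);CHKERRQ(ierr);
    ierr = VecGetOwnershipRange(x, &rstart, 0);CHKERRQ(ierr);
    for (i = 0; i < n; i++) {   /* fill each owned entry with its global index */
      ierr = VecSetValue(x, rstart + i, (PetscScalar)(rstart + i), INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(x);CHKERRQ(ierr);

    /* scatter owned values into the ghost slots, then read via the local form */
    ierr = VecGhostUpdateBegin(x, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecGhostUpdateEnd(x, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecGhostGetLocalForm(x, &lx);CHKERRQ(ierr);
    ierr = VecGetArray(lx, &xv);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF, "[%d] ghost value = %g\n", (int)rank, (double)PetscRealPart(xv[n]));CHKERRQ(ierr);
    ierr = VecRestoreArray(lx, &xv);CHKERRQ(ierr);
    ierr = VecGhostRestoreLocalForm(x, &lx);CHKERRQ(ierr);

    ierr = VecDestroy(&x);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }

With real scalars this prints 4 on rank 0 and 0 on rank 1, i.e. each process sees its neighbour's first owned value in the ghost slot xv[n].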
>>> >>> >>> Cheers, >>> >>> >>> Bojan >>> >>> >>> >>> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: >>>> just index x with the local numberings. if you have 'n' local nodes and 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' >>>> >>>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: >>>> Dear Jed, >>>> >>>> thanks. >>>> >>>> Now I have the following: >>>> >>>> - Array unk, which should hold values inside the partition and in ghost cells. It is big enough to hold both >>>> - Vec x, created by command VecCreateGhost, with proper padding for ghost cells >>>> - Successful call to linear solver in parallel. >>>> >>>> But I need to copy ghost values from x to my array unk. >>>> >>>> How can I do it? >>>> >>>> >>>> Kind regards, >>>> >>>> >>>> Bojan >>>> >>>> On 2/23/2012 10:10 PM, Jed Brown wrote: >>>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno wrote: >>>>> No, I use global. >>>>> >>>>> The local form is just a local vector. It doesn't even know that a global problem exists. You can't index into it using global indices. (In general, there is no efficient way to look up information in the local vector (includes ghost points) using global indices.) >>>>> >>>>> >>>>> for(Int i=0; i>>>> Int gi = mesh.nodes[i].global_number; >>>>> VecGetValues(x, 1, &gi, &unk[i]); >>>>> } >>>>> >>>>> "n" is defined as the number of cells inside, i.e. without buffers. "unk" is my external array. If I try to access buffer values, I use: >>>>> >>>>> for(Int i=0; i>>>> Int gi = mesh.nodes[i].global_number; >>>>> VecGetValues(x, 1, &gi, &unk[i]); >>>>> } >>>>> >>>>> But then I end up with tons of warnings, presumably because I am going beyond "n". Vector x was created with VecCreateGhost. >>>>> >>>> >>>> >>>> -- >>>> >>> >>> >>> -- >>> >>> >> >> >> -- >> >> > > > -- > From bsmith at mcs.anl.gov Thu Feb 23 22:30:03 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 23 Feb 2012 22:30:03 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> Message-ID: <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> On Feb 23, 2012, at 8:24 PM, Jed Brown wrote: > On Thu, Feb 23, 2012 at 18:58, Francis Poulin wrote: > I can do it for each of them if that helps but I suspect the method is the same so I'm sending the information for the first 3, n = 2, 4, 8. In the mean time I will figure out how to change the number of levels.. > > There are only three levels. The coarsest level has 32k degrees of freedom, which is very expensive to solve (redundantly) with a direct solver. > > Run this, it's higher resolution and will be much faster than what you had. > > mpiexec -n 2 ./ex22 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -dmmg_nlevels 6 -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary > > > Also, please switch to a more recent release of PETSc as soon as possible and do not develop new code using DMMG since that component has been removed (and its functionality incorporated into SNES and KSP). Using Petsc Release Version 3.1.0, Patch 4, Fri Jul 30 14:42:02 CDT 2010 Goodness gracious, you won't run performance tests on your grandpa's desk calculator would you? 
Switch to petsc-dev immediately http://www.mcs.anl.gov/petsc/developers/index.html and use src/ksp/ksp/examples/tutorials/ex45.c and use the options -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary -da_refine 6 Barry From mirzadeh at gmail.com Fri Feb 24 03:42:46 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Fri, 24 Feb 2012 01:42:46 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> <4F46CE06.3020809@psi.ch> Message-ID: Actually now that we are at it, what would you guys recommend as the best way to find local indecies from global ones when accessing the _ghost_ values in the local form? This becomes even more complicated when your global petsc ordering is different from global application ordering. Its quite easy to access the local indecies for the local nodes -- you use ao to convert application global to petsc global and from petsc global, it is easy to get petsc local. However, when you have ghost nodes, there is no such simple relationship between petsc global and the _ghosted_ petsc local (Even though there is a simple one from petsc local to petsc global). The best thing I have came up with so far is to use a sequential second ao to map local indecies of ghost points to petsc global indecies. Is there a better way of doing this? Mohammad On Thu, Feb 23, 2012 at 7:44 PM, Barry Smith wrote: > > I have added the following to the VecGhostGetLocalForm() manual page to > make this clearer for everyone: > > To update the ghost values from the locations on the other processes > one must call > VecGhostUpdateBegin() and VecGhostUpdateEnd() before accessing the > ghost values. Thus normal > usage is > $ VecGhostUpdateBegin(x,INSERT_VALUES,SCATTER_FORWARD); > $ VecGhostUpdateEnd(x,INSERT_VALUES,SCATTER_FORWARD); > $ VecGhostGetLocalForm(x,&xlocal); > $ VecGetArray(xlocal,&xvalues); > $ /* access the non-ghost values in locations xvalues[0:n-1] and > ghost values in locations xvalues[n:n+nghost]; */ > $ VecRestoreArray(xlocal,&xvalues); > $ VecGhostRestoreLocalForm(x,&xlocal); > > > On Feb 23, 2012, at 5:38 PM, Bojan Niceno wrote: > > > Yeeee-ha! > > > > VecGhostUpdateBegin() / VecGhostUpdateEnd() was indeed what I was > missing. > > > > Thank you Mohammad, thank you all who helped me today! > > > > > > Cheers > > > > > > Bojan > > > > > > > > On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: > >> You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() > functions to update the ghost values if you change them in the global > representation. See Petsc Manual 3.2 pp 55-56 > >> > >> Mohammad > >> > >> On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno > wrote: > >> On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: > >>> based on, > >>> > >>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], > &x); > >>> > >>> > >>> it seems to me that x is your actual ghosted vector. If this is true, > then you need to get its "local" form via VecGhostGetLocalForm(). Once you > have done, you should be able to access the ghosted nodes. Are you calling > this function anywhere? > >> > >> I tried that before. 
I did: > >> > >> Vec lx; > >> VecGhostGetLocalForm(x, &lx) > >> > >> then I copied "lx" to my variable, like > >> > >> for(Int i=1; i >> unk[i] = lx[i] > >> > >> but ghost values were also zero. I am thinking that PETSc somehow > clears the ghost values after a call to KSP. Is it the case? > >> > >> > >> Kind regards, > >> > >> > >> Bojan > >> > >> > >>> > >>> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno > wrote: > >>> Dear Mohammad, > >>> > >>> > >>> it doesn't help me, or I did not understand your explanation. > >>> > >>> If I do this: > >>> > >>> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ > >>> > >>> for(Int i=0; i >>> Int gi = mesh.nodes[i].global_number; > >>> VecGetValues(x, 1, &gi, &unk[i]); > >>> } > >>> > >>> /* copy ghost values (CREATES MANY WARNINGS */ > >>> > >>> for(Int i=n; i >>> VecGetValues(x, 1, &i, &unk[i]); > >>> } > >>> > >>> I get arnings are like this. > >>> > >>> [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > >>> [0]PETSC ERROR: Argument out of range! > >>> [0]PETSC ERROR: Can only get local values, trying 3518! > >>> [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > >>> [3]PETSC ERROR: Argument out of range! > >>> [3]PETSC ERROR: Can only get local values, trying 3511! > >>> [3]PETSC ERROR: > ------------------------------------------------------------------------ > >>> > >>> What am I doing wrong here? > >>> > >>> > >>> Cheers, > >>> > >>> > >>> Bojan > >>> > >>> > >>> > >>> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: > >>>> just index x with the local numberings. if you have 'n' local nodes > and 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' > >>>> > >>>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: > >>>> Dear Jed, > >>>> > >>>> thanks. > >>>> > >>>> Now I have the following: > >>>> > >>>> - Array unk, which should hold values inside the partition and in > ghost cells. It is big enough to hold both > >>>> - Vec x, created by command VecCreateGhost, with proper padding for > ghost cells > >>>> - Successful call to linear solver in parallel. > >>>> > >>>> But I need to copy ghost values from x to my array unk. > >>>> > >>>> How can I do it? > >>>> > >>>> > >>>> Kind regards, > >>>> > >>>> > >>>> Bojan > >>>> > >>>> On 2/23/2012 10:10 PM, Jed Brown wrote: > >>>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno > wrote: > >>>>> No, I use global. > >>>>> > >>>>> The local form is just a local vector. It doesn't even know that a > global problem exists. You can't index into it using global indices. (In > general, there is no efficient way to look up information in the local > vector (includes ghost points) using global indices.) > >>>>> > >>>>> > >>>>> for(Int i=0; i >>>>> Int gi = mesh.nodes[i].global_number; > >>>>> VecGetValues(x, 1, &gi, &unk[i]); > >>>>> } > >>>>> > >>>>> "n" is defined as the number of cells inside, i.e. without buffers. > "unk" is my external array. If I try to access buffer values, I use: > >>>>> > >>>>> for(Int i=0; i >>>>> Int gi = mesh.nodes[i].global_number; > >>>>> VecGetValues(x, 1, &gi, &unk[i]); > >>>>> } > >>>>> > >>>>> But then I end up with tons of warnings, presumably because I am > going beyond "n". Vector x was created with VecCreateGhost. > >>>>> > >>>> > >>>> > >>>> -- > >>>> > >>> > >>> > >>> -- > >>> > >>> > >> > >> > >> -- > >> > >> > > > > > > -- > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bojan.niceno at psi.ch Fri Feb 24 03:54:25 2012 From: bojan.niceno at psi.ch (Niceno Bojan) Date: Fri, 24 Feb 2012 10:54:25 +0100 Subject: [petsc-users] Accessing Vector's ghost values References: <4F466D58.5020506@psi.ch><4F467FE6.40901@psi.ch><4F46853A.7070409@psi.ch><4F468A98.2010905@psi.ch><4F46947F.9090200@psi.ch><4F46A508.8030207@psi.ch><4F46AA0A.4080209@psi.ch><4F46ACC0.2000904@psi.ch><4F46BE72.5070209@psi.ch><4F46C94F.9080508@psi.ch><4F46CE06.3020809@psi.ch> Message-ID: <0C9CAFE8C4E5CA49884E636AE1D7CD69025B056D@MAILBOX0B.psi.ch> I do re-numbering during the domain decomposition stage. I re-number nodes (unknows) in the way I believe PETSc does internaly anyhow. It was a guess-work, I admit, but my communication patterns indeed work. Here is what I do: /*-------------------------------------+ | Assign global numbers to nodes | +-------------------------------------*/ Int global_number = 0; for(Int p=0; p wrote: > > I have added the following to the VecGhostGetLocalForm() manual page to > make this clearer for everyone: > > To update the ghost values from the locations on the other processes > one must call > VecGhostUpdateBegin() and VecGhostUpdateEnd() before accessing the > ghost values. Thus normal > usage is > $ VecGhostUpdateBegin(x,INSERT_VALUES,SCATTER_FORWARD); > $ VecGhostUpdateEnd(x,INSERT_VALUES,SCATTER_FORWARD); > $ VecGhostGetLocalForm(x,&xlocal); > $ VecGetArray(xlocal,&xvalues); > $ /* access the non-ghost values in locations xvalues[0:n-1] and > ghost values in locations xvalues[n:n+nghost]; */ > $ VecRestoreArray(xlocal,&xvalues); > $ VecGhostRestoreLocalForm(x,&xlocal); > > > On Feb 23, 2012, at 5:38 PM, Bojan Niceno wrote: > > > Yeeee-ha! > > > > VecGhostUpdateBegin() / VecGhostUpdateEnd() was indeed what I was > missing. > > > > Thank you Mohammad, thank you all who helped me today! > > > > > > Cheers > > > > > > Bojan > > > > > > > > On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: > >> You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() > functions to update the ghost values if you change them in the global > representation. See Petsc Manual 3.2 pp 55-56 > >> > >> Mohammad > >> > >> On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno > wrote: > >> On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: > >>> based on, > >>> > >>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], > &x); > >>> > >>> > >>> it seems to me that x is your actual ghosted vector. If this is true, > then you need to get its "local" form via VecGhostGetLocalForm(). Once you > have done, you should be able to access the ghosted nodes. Are you calling > this function anywhere? > >> > >> I tried that before. I did: > >> > >> Vec lx; > >> VecGhostGetLocalForm(x, &lx) > >> > >> then I copied "lx" to my variable, like > >> > >> for(Int i=1; i >> unk[i] = lx[i] > >> > >> but ghost values were also zero. I am thinking that PETSc somehow > clears the ghost values after a call to KSP. Is it the case? > >> > >> > >> Kind regards, > >> > >> > >> Bojan > >> > >> > >>> > >>> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno > wrote: > >>> Dear Mohammad, > >>> > >>> > >>> it doesn't help me, or I did not understand your explanation. 
> >>> > >>> If I do this: > >>> > >>> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) */ > >>> > >>> for(Int i=0; i >>> Int gi = mesh.nodes[i].global_number; > >>> VecGetValues(x, 1, &gi, &unk[i]); > >>> } > >>> > >>> /* copy ghost values (CREATES MANY WARNINGS */ > >>> > >>> for(Int i=n; i >>> VecGetValues(x, 1, &i, &unk[i]); > >>> } > >>> > >>> I get arnings are like this. > >>> > >>> [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > >>> [0]PETSC ERROR: Argument out of range! > >>> [0]PETSC ERROR: Can only get local values, trying 3518! > >>> [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > >>> [3]PETSC ERROR: Argument out of range! > >>> [3]PETSC ERROR: Can only get local values, trying 3511! > >>> [3]PETSC ERROR: > ------------------------------------------------------------------------ > >>> > >>> What am I doing wrong here? > >>> > >>> > >>> Cheers, > >>> > >>> > >>> Bojan > >>> > >>> > >>> > >>> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: > >>>> just index x with the local numberings. if you have 'n' local nodes > and 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' > >>>> > >>>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" wrote: > >>>> Dear Jed, > >>>> > >>>> thanks. > >>>> > >>>> Now I have the following: > >>>> > >>>> - Array unk, which should hold values inside the partition and in > ghost cells. It is big enough to hold both > >>>> - Vec x, created by command VecCreateGhost, with proper padding for > ghost cells > >>>> - Successful call to linear solver in parallel. > >>>> > >>>> But I need to copy ghost values from x to my array unk. > >>>> > >>>> How can I do it? > >>>> > >>>> > >>>> Kind regards, > >>>> > >>>> > >>>> Bojan > >>>> > >>>> On 2/23/2012 10:10 PM, Jed Brown wrote: > >>>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno > wrote: > >>>>> No, I use global. > >>>>> > >>>>> The local form is just a local vector. It doesn't even know that a > global problem exists. You can't index into it using global indices. (In > general, there is no efficient way to look up information in the local > vector (includes ghost points) using global indices.) > >>>>> > >>>>> > >>>>> for(Int i=0; i >>>>> Int gi = mesh.nodes[i].global_number; > >>>>> VecGetValues(x, 1, &gi, &unk[i]); > >>>>> } > >>>>> > >>>>> "n" is defined as the number of cells inside, i.e. without buffers. > "unk" is my external array. If I try to access buffer values, I use: > >>>>> > >>>>> for(Int i=0; i >>>>> Int gi = mesh.nodes[i].global_number; > >>>>> VecGetValues(x, 1, &gi, &unk[i]); > >>>>> } > >>>>> > >>>>> But then I end up with tons of warnings, presumably because I am > going beyond "n". Vector x was created with VecCreateGhost. > >>>>> > >>>> > >>>> > >>>> -- > >>>> > >>> > >>> > >>> -- > >>> > >>> > >> > >> > >> -- > >> > >> > > > > > > -- > > > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/ms-tnef Size: 6890 bytes Desc: not available URL: From mirzadeh at gmail.com Fri Feb 24 04:43:43 2012 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Fri, 24 Feb 2012 02:43:43 -0800 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: <0C9CAFE8C4E5CA49884E636AE1D7CD69025B056D@MAILBOX0B.psi.ch> References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> <4F46CE06.3020809@psi.ch> <0C9CAFE8C4E5CA49884E636AE1D7CD69025B056D@MAILBOX0B.psi.ch> Message-ID: Thanks Bojan. I suppose that would work. Problem arises, however, when the actual application numbering matters and you have to preserve it somehow. One example, is where you have a library that is currently sequential and a lot of functionality is directly based on the data structure used for the grid. For example when computing derivatives or performing interpolation and so on. Naturally you would want to reuse as much of the code as possible. So, you would think you can have a function, or a class, that, for example, takes the ghosted vector and the original global numbering and returns the value so that you could still use the old functions, say to compute gradients. Now, when you are constructing the ghosted vector, you know order you are putting the ghost nodes, lets say the first one has global index 42, next has 21, next 56 and so on ... However, when you call the function with the global index 21, you need to know what local ghost index it corresponds to in the local form of the ghosted vector. I can think of the following options to do this: 1- Search the ghost nodes indecies that you created the ghosted vector with: here, search int ghost [] = {42, 21, 56} for 21 and return the index which is 2. This, you can do at O(log n) e.g with binary search 2- Make a vector consisting of _all_ nodes on each processor that returns the ghost index, i.e. have something like ghostIndex[21] = 2, ghostIndex[42] = 1, ghostIndex[56] = 3. Using this you get the index in O(1) but at the cost of extra memory 3- Make an ao that maps {1,2,3} to {42, 21, 56}. This seems to be the best of both worlds So, in short, I'm interested in something that wont alter (at least permanently) the application global numbering so that I can use my old pieces of code and at the same time something that can compute the ghost location fast and efficient from a given global numbering. Mohammad On Fri, Feb 24, 2012 at 1:54 AM, Niceno Bojan wrote: > I do re-numbering during the domain decomposition stage. I re-number > nodes (unknows) in the way I believe PETSc does internaly anyhow. It was a > guess-work, I admit, but my communication patterns indeed work. 
Here is > what I do: > > > /*-------------------------------------+ > | Assign global numbers to nodes | > +-------------------------------------*/ > Int global_number = 0; > for(Int p=0; p for(Int n=0; n if( nodes[n].partition == p) { > nodes[n].global_number = global_number; > global_number++; > } > } > } > assert( global_number == nodes.size()); > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov on behalf of Mohammad Mirzadeh > Sent: Fri 2/24/2012 10:42 AM > To: PETSc users list > Subject: Re: [petsc-users] Accessing Vector's ghost values > > Actually now that we are at it, what would you guys recommend as the best > way to find local indecies from global ones when accessing the _ghost_ > values in the local form? This becomes even more complicated when your > global petsc ordering is different from global application ordering. > > Its quite easy to access the local indecies for the local nodes -- you use > ao to convert application global to petsc global and from petsc global, it > is easy to get petsc local. However, when you have ghost nodes, there is no > such simple relationship between petsc global and the _ghosted_ petsc local > (Even though there is a simple one from petsc local to petsc global). The > best thing I have came up with so far is to use a sequential second ao to > map local indecies of ghost points to petsc global indecies. Is there a > better way of doing this? > > Mohammad > > On Thu, Feb 23, 2012 at 7:44 PM, Barry Smith wrote: > > > > > I have added the following to the VecGhostGetLocalForm() manual page to > > make this clearer for everyone: > > > > To update the ghost values from the locations on the other processes > > one must call > > VecGhostUpdateBegin() and VecGhostUpdateEnd() before accessing the > > ghost values. Thus normal > > usage is > > $ VecGhostUpdateBegin(x,INSERT_VALUES,SCATTER_FORWARD); > > $ VecGhostUpdateEnd(x,INSERT_VALUES,SCATTER_FORWARD); > > $ VecGhostGetLocalForm(x,&xlocal); > > $ VecGetArray(xlocal,&xvalues); > > $ /* access the non-ghost values in locations xvalues[0:n-1] and > > ghost values in locations xvalues[n:n+nghost]; */ > > $ VecRestoreArray(xlocal,&xvalues); > > $ VecGhostRestoreLocalForm(x,&xlocal); > > > > > > On Feb 23, 2012, at 5:38 PM, Bojan Niceno wrote: > > > > > Yeeee-ha! > > > > > > VecGhostUpdateBegin() / VecGhostUpdateEnd() was indeed what I was > > missing. > > > > > > Thank you Mohammad, thank you all who helped me today! > > > > > > > > > Cheers > > > > > > > > > Bojan > > > > > > > > > > > > On 2/24/2012 12:27 AM, Mohammad Mirzadeh wrote: > > >> You also need calls to VecGhostUpdateBegin()/VecGhostUpdateEnd() > > functions to update the ghost values if you change them in the global > > representation. See Petsc Manual 3.2 pp 55-56 > > >> > > >> Mohammad > > >> > > >> On Thu, Feb 23, 2012 at 3:18 PM, Bojan Niceno > > wrote: > > >> On 2/23/2012 11:49 PM, Mohammad Mirzadeh wrote: > > >>> based on, > > >>> > > >>> VecCreateGhost(PETSC_COMM_WORLD, n, PETSC_DECIDE, nghost, &ghosts[0], > > &x); > > >>> > > >>> > > >>> it seems to me that x is your actual ghosted vector. If this is true, > > then you need to get its "local" form via VecGhostGetLocalForm(). Once > you > > have done, you should be able to access the ghosted nodes. Are you > calling > > this function anywhere? > > >> > > >> I tried that before. 
I did: > > >> > > >> Vec lx; > > >> VecGhostGetLocalForm(x, &lx) > > >> > > >> then I copied "lx" to my variable, like > > >> > > >> for(Int i=1; i > >> unk[i] = lx[i] > > >> > > >> but ghost values were also zero. I am thinking that PETSc somehow > > clears the ghost values after a call to KSP. Is it the case? > > >> > > >> > > >> Kind regards, > > >> > > >> > > >> Bojan > > >> > > >> > > >>> > > >>> On Thu, Feb 23, 2012 at 2:32 PM, Bojan Niceno > > wrote: > > >>> Dear Mohammad, > > >>> > > >>> > > >>> it doesn't help me, or I did not understand your explanation. > > >>> > > >>> If I do this: > > >>> > > >>> /* copy internal values (THIS WORKS, BUT COPIES NO BUFFER VALUES) > */ > > >>> > > >>> for(Int i=0; i > >>> Int gi = mesh.nodes[i].global_number; > > >>> VecGetValues(x, 1, &gi, &unk[i]); > > >>> } > > >>> > > >>> /* copy ghost values (CREATES MANY WARNINGS */ > > >>> > > >>> for(Int i=n; i > >>> VecGetValues(x, 1, &i, &unk[i]); > > >>> } > > >>> > > >>> I get arnings are like this. > > >>> > > >>> [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > >>> [0]PETSC ERROR: Argument out of range! > > >>> [0]PETSC ERROR: Can only get local values, trying 3518! > > >>> [3]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > >>> [3]PETSC ERROR: Argument out of range! > > >>> [3]PETSC ERROR: Can only get local values, trying 3511! > > >>> [3]PETSC ERROR: > > ------------------------------------------------------------------------ > > >>> > > >>> What am I doing wrong here? > > >>> > > >>> > > >>> Cheers, > > >>> > > >>> > > >>> Bojan > > >>> > > >>> > > >>> > > >>> On 2/23/2012 11:23 PM, Mohammad Mirzadeh wrote: > > >>>> just index x with the local numberings. if you have 'n' local nodes > > and 'g' ghost nodes in x, ghost nodes indecies run from 'n' to 'n+g-1' > > >>>> > > >>>> On Feb 23, 2012 1:16 PM, "Bojan Niceno" > wrote: > > >>>> Dear Jed, > > >>>> > > >>>> thanks. > > >>>> > > >>>> Now I have the following: > > >>>> > > >>>> - Array unk, which should hold values inside the partition and in > > ghost cells. It is big enough to hold both > > >>>> - Vec x, created by command VecCreateGhost, with proper padding for > > ghost cells > > >>>> - Successful call to linear solver in parallel. > > >>>> > > >>>> But I need to copy ghost values from x to my array unk. > > >>>> > > >>>> How can I do it? > > >>>> > > >>>> > > >>>> Kind regards, > > >>>> > > >>>> > > >>>> Bojan > > >>>> > > >>>> On 2/23/2012 10:10 PM, Jed Brown wrote: > > >>>>> On Thu, Feb 23, 2012 at 15:05, Bojan Niceno > > wrote: > > >>>>> No, I use global. > > >>>>> > > >>>>> The local form is just a local vector. It doesn't even know that a > > global problem exists. You can't index into it using global indices. (In > > general, there is no efficient way to look up information in the local > > vector (includes ghost points) using global indices.) > > >>>>> > > >>>>> > > >>>>> for(Int i=0; i > >>>>> Int gi = mesh.nodes[i].global_number; > > >>>>> VecGetValues(x, 1, &gi, &unk[i]); > > >>>>> } > > >>>>> > > >>>>> "n" is defined as the number of cells inside, i.e. without buffers. > > "unk" is my external array. If I try to access buffer values, I use: > > >>>>> > > >>>>> for(Int i=0; i > >>>>> Int gi = mesh.nodes[i].global_number; > > >>>>> VecGetValues(x, 1, &gi, &unk[i]); > > >>>>> } > > >>>>> > > >>>>> But then I end up with tons of warnings, presumably because I am > > going beyond "n". Vector x was created with VecCreateGhost. 
> > >>>>> > > >>>> > > >>>> > > >>>> -- > > >>>> > > >>> > > >>> > > >>> -- > > >>> > > >>> > > >> > > >> > > >> -- > > >> > > >> > > > > > > > > > -- > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 24 07:20:35 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 24 Feb 2012 07:20:35 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> <4F46CE06.3020809@psi.ch> Message-ID: On Fri, Feb 24, 2012 at 03:42, Mohammad Mirzadeh wrote: > Actually now that we are at it, what would you guys recommend as the best > way to find local indecies from global ones when accessing the _ghost_ > values in the local form? This becomes even more complicated when your > global petsc ordering is different from global application ordering. > > Its quite easy to access the local indecies for the local nodes -- you use > ao to convert application global to petsc global and from petsc global, it > is easy to get petsc local. However, when you have ghost nodes, there is no > such simple relationship between petsc global and the _ghosted_ petsc local > (Even though there is a simple one from petsc local to petsc global). The > best thing I have came up with so far is to use a sequential second ao to > map local indecies of ghost points to petsc global indecies. Is there a > better way of doing this? > Do the conversion the other way. If your nodes are labeled with global indices, do a traversal (in some ordering, probably owned then ghosted) assigning local indices. As you point out, it is not efficient to convert global indices for ghost points into local indices, so you should build the data structure in terms of local indices and just keep a local-to-global mapping to glue the subdomains together. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 24 07:22:38 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 24 Feb 2012 07:22:38 -0600 Subject: [petsc-users] Accessing Vector's ghost values In-Reply-To: References: <4F466D58.5020506@psi.ch> <4F467FE6.40901@psi.ch> <4F46853A.7070409@psi.ch> <4F468A98.2010905@psi.ch> <4F46947F.9090200@psi.ch> <4F46A508.8030207@psi.ch> <4F46AA0A.4080209@psi.ch> <4F46ACC0.2000904@psi.ch> <4F46BE72.5070209@psi.ch> <4F46C94F.9080508@psi.ch> <4F46CE06.3020809@psi.ch> <0C9CAFE8C4E5CA49884E636AE1D7CD69025B056D@MAILBOX0B.psi.ch> Message-ID: On Fri, Feb 24, 2012 at 04:43, Mohammad Mirzadeh wrote: > Thanks Bojan. I suppose that would work. Problem arises, however, when the > actual application numbering matters and you have to preserve it somehow. > One example, is where you have a library that is currently sequential and a > lot of functionality is directly based on the data structure used for the > grid. For example when computing derivatives or performing interpolation > and so on. That serial library probably needs the local numbering anyway, if you give it the global indices, it won't be able to access those locations (at least not without a translation step that they did not write). It's better to build the data structure in terms of local indices. -------------- next part -------------- An HTML attachment was scrubbed... 
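For the local numbering question, a rough sketch of the owned-then-ghosted traversal Jed describes might be the following (the field names owned and local_number and the array l2g are made up for illustration; only global_number appears in the code earlier in the thread):

   PetscInt nlocal = 0;
   /* owned nodes get local indices 0..n-1 */
   for (PetscInt i = 0; i < (PetscInt)nodes.size(); i++) {
     if (nodes[i].owned) {
       nodes[i].local_number = nlocal;
       l2g[nlocal++]         = nodes[i].global_number;
     }
   }
   /* ghost nodes follow, local indices n..n+nghost-1; this assumes the ghosts[]
      array given to VecCreateGhost() was filled in this same traversal order */
   for (PetscInt i = 0; i < (PetscInt)nodes.size(); i++) {
     if (!nodes[i].owned) {
       nodes[i].local_number = nlocal;
       l2g[nlocal++]         = nodes[i].global_number;
     }
   }

After this, per-node data and the ghosted local form are addressed with local_number, and l2g is the local-to-global map used only where global indices are required, for example when calling VecSetValues() on the global vector.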
URL: From fpoulin at uwaterloo.ca Fri Feb 24 08:08:40 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Fri, 24 Feb 2012 09:08:40 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> Message-ID: <20F1CF8C-C93B-4E7A-AD79-C9801CF301AD@uwaterloo.ca> Hello Barry and Jed, Thanks a lot for the responses, Barry's in particular made me laugh. In my defence I did install v3.2 on my desktop but the server that I'm using only has v3.1 installed. I've asked them nicely if they can update it. If not I'll figure out a way of doing it myself. In the mean time I will do the testing in my desktop to set things up before I do serious runs on the big cluster. I will use ex45.c and the options that you both suggested. Thanks again for the help, Francis On 2012-02-23, at 11:30 PM, Barry Smith wrote: > > On Feb 23, 2012, at 8:24 PM, Jed Brown wrote: > >> On Thu, Feb 23, 2012 at 18:58, Francis Poulin wrote: >> I can do it for each of them if that helps but I suspect the method is the same so I'm sending the information for the first 3, n = 2, 4, 8. In the mean time I will figure out how to change the number of levels.. >> >> There are only three levels. The coarsest level has 32k degrees of freedom, which is very expensive to solve (redundantly) with a direct solver. >> >> Run this, it's higher resolution and will be much faster than what you had. >> >> mpiexec -n 2 ./ex22 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -dmmg_nlevels 6 -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary >> >> >> Also, please switch to a more recent release of PETSc as soon as possible and do not develop new code using DMMG since that component has been removed (and its functionality incorporated into SNES and KSP). > > Using Petsc Release Version 3.1.0, Patch 4, Fri Jul 30 14:42:02 CDT 2010 > > Goodness gracious, you won't run performance tests on your grandpa's desk calculator would you? Switch to petsc-dev immediately http://www.mcs.anl.gov/petsc/developers/index.html and use src/ksp/ksp/examples/tutorials/ex45.c and use the options -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary -da_refine 6 > > Barry > From fpoulin at uwaterloo.ca Fri Feb 24 09:54:45 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Fri, 24 Feb 2012 10:54:45 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> Message-ID: <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Hello again, I am now running v3.2p6 and working with ksp/ex45.c, as suggested. When I try running ex22 like Jed suggested it works fine. When I try running ex45 like Barry suggested it runs but it does not seem to recognize two options. WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! 
Option left: name:-mg_levels_ksp_type value: richardson Option left: name:-mg_levels_pc_type value: sor Also, I am getting an message saying that it is running it with the debugger and I should rerun ./configure to turn it off and it will run 2 or 3 times faster. Does that mean that the installation always uses the debugger or never uses it? I thought I would like it keep it for testing, assuming i figure out how it works, but then turn it off for serious runs. Thanks again, Francis On 2012-02-23, at 11:30 PM, Barry Smith wrote: > > On Feb 23, 2012, at 8:24 PM, Jed Brown wrote: > >> On Thu, Feb 23, 2012 at 18:58, Francis Poulin wrote: >> I can do it for each of them if that helps but I suspect the method is the same so I'm sending the information for the first 3, n = 2, 4, 8. In the mean time I will figure out how to change the number of levels.. >> >> There are only three levels. The coarsest level has 32k degrees of freedom, which is very expensive to solve (redundantly) with a direct solver. >> >> Run this, it's higher resolution and will be much faster than what you had. >> >> mpiexec -n 2 ./ex22 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -dmmg_nlevels 6 -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary >> >> >> Also, please switch to a more recent release of PETSc as soon as possible and do not develop new code using DMMG since that component has been removed (and its functionality incorporated into SNES and KSP). > > Using Petsc Release Version 3.1.0, Patch 4, Fri Jul 30 14:42:02 CDT 2010 > > Goodness gracious, you won't run performance tests on your grandpa's desk calculator would you? Switch to petsc-dev immediately http://www.mcs.anl.gov/petsc/developers/index.html and use src/ksp/ksp/examples/tutorials/ex45.c and use the options -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary -da_refine 6 > > Barry > From bsmith at mcs.anl.gov Fri Feb 24 10:04:15 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 24 Feb 2012 10:04:15 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: <4052729B-534D-4FA5-AAA2-7243E00C8812@mcs.anl.gov> On Feb 24, 2012, at 9:54 AM, Francis Poulin wrote: > Hello again, > > I am now running v3.2p6 and working with ksp/ex45.c, as suggested. > > When I try running ex22 like Jed suggested it works fine. > > When I try running ex45 like Barry suggested it runs but it does not seem to recognize two options. > > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-mg_levels_ksp_type value: richardson > Option left: name:-mg_levels_pc_type value: sor You likely need -pc_type mg Also run with -ksp_view to see what solver it is using NEVER assume you know what solver it is using without explicitly checking with -ksp_view > > Also, I am getting an message saying that it is running it with the debugger and I should rerun ./configure to turn it off and it will run 2 or 3 times faster. Does that mean that the installation always uses the debugger or never uses it? I thought I would like it keep it for testing, assuming i figure out how it works, but then turn it off for serious runs. 
By default we build the debug version which has more error checking and all the debug symbols. You should use this for ALL code development and testing of correctness. For production runs and testing of SPEED you should run ./configure WITH A DIFFERENT string name for PETSC_ARCH with the ./configure option --with-debugging=0 Barry > > Thanks again, > Francis > > On 2012-02-23, at 11:30 PM, Barry Smith wrote: > >> >> On Feb 23, 2012, at 8:24 PM, Jed Brown wrote: >> >>> On Thu, Feb 23, 2012 at 18:58, Francis Poulin wrote: >>> I can do it for each of them if that helps but I suspect the method is the same so I'm sending the information for the first 3, n = 2, 4, 8. In the mean time I will figure out how to change the number of levels.. >>> >>> There are only three levels. The coarsest level has 32k degrees of freedom, which is very expensive to solve (redundantly) with a direct solver. >>> >>> Run this, it's higher resolution and will be much faster than what you had. >>> >>> mpiexec -n 2 ./ex22 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -dmmg_nlevels 6 -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary >>> >>> >>> Also, please switch to a more recent release of PETSc as soon as possible and do not develop new code using DMMG since that component has been removed (and its functionality incorporated into SNES and KSP). >> >> Using Petsc Release Version 3.1.0, Patch 4, Fri Jul 30 14:42:02 CDT 2010 >> >> Goodness gracious, you won't run performance tests on your grandpa's desk calculator would you? Switch to petsc-dev immediately http://www.mcs.anl.gov/petsc/developers/index.html and use src/ksp/ksp/examples/tutorials/ex45.c and use the options -ksp_monitor -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary -da_refine 6 >> >> Barry >> > From jedbrown at mcs.anl.gov Fri Feb 24 10:39:56 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 24 Feb 2012 10:39:56 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: On Fri, Feb 24, 2012 at 09:54, Francis Poulin wrote: > I am now running v3.2p6 and working with ksp/ex45.c, as suggested. > Great. > > When I try running ex22 like Jed suggested it works fine. > > When I try running ex45 like Barry suggested it runs but it does not seem > to recognize two options. > > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-mg_levels_ksp_type value: richardson > Option left: name:-mg_levels_pc_type value: sor > Use these, it will run the same method and sizes as the options I gave for ex22 before. mpiexec.hydra -n 2 ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 5 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary > > Also, I am getting an message saying that it is running it with the > debugger and I should rerun ./configure to turn it off and it will run 2 or > 3 times faster. Does that mean that the installation always uses the > debugger or never uses it? I thought I would like it keep it for testing, > assuming i figure out how it works, but then turn it off for serious runs. > Configure --with-debugging=0. 
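One common way to set up the two builds Barry describes, side by side, is along these lines (the PETSC_ARCH names are arbitrary, and each configure run would also need whatever other options the installation uses, e.g. the MPI location):

   ./configure PETSC_ARCH=arch-debug
   ./configure PETSC_ARCH=arch-opt --with-debugging=0

The build for a given compile or run is then selected by setting PETSC_ARCH to the matching name: the debug build for development and correctness testing, the optimized build for timing and production runs.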
-------------- next part -------------- An HTML attachment was scrubbed... URL: From fpoulin at uwaterloo.ca Fri Feb 24 11:49:46 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Fri, 24 Feb 2012 12:49:46 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: I don't seem to have the hydra in my bin folder but I do have petscmpiexec that I've been using. > > Use these, it will run the same method and sizes as the options I gave for ex22 before. > > mpiexec.hydra -n 2 ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 5 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary > Also, I want to install PetSc on an SGI machine that I have access to. I have been told that using MPT would give better performance compared to mpich2. When I configure petsc on this server I don't suppose -with-mpi-dir=/opt/sgi/mpt the above would work because of the different name. Do you have a suggestion as to what I could try? Francis -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Fri Feb 24 11:59:08 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 24 Feb 2012 11:59:08 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: On Fri, Feb 24, 2012 at 11:49, Francis Poulin wrote: > I don't seem to have the hydra in my bin folder but I do have petscmpiexec > that I've been using. > That's just the name of my mpiexec. Use whichever one is used with your build of PETSc. > > > Use these, it will run the same method and sizes as the options I gave for > ex22 before. > > mpiexec.hydra -n 2 ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 > -da_refine 5 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson > -mg_levels_pc_type sor -log_summary > > > > Also, I want to install PetSc on an SGI machine that I have access to. I > have been told that using MPT would give better performance compared to > mpich2. When I configure petsc on this server I don't suppose > > -with-mpi-dir=/opt/sgi/mpt > > the above would work because of the different name. Do you have a > suggestion as to what I could try? > It's just the way you launch parallel jobs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fpoulin at uwaterloo.ca Fri Feb 24 14:43:35 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Fri, 24 Feb 2012 15:43:35 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: Hello, I wanted to thank everyone for the help and say that I managed to get it running on the cluster and I have done some efficiency calculations. 
To run the code I used, mpirun -np # ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 6 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary as suggested and found the following p(#cpu) Tp (parallel) T1 (serial) Efficiency [ = Tp/(p*T1) ] -------------------------------------------------------------------------------------- 1 904 904 1.00 2 553 904 0.82 4 274 904 0.83 8 138 904 0.82 16 70 904 0.81 32 36 904 0.78 It seems to scale beautifully starting at 2 but there is a big drop from 1 to 2. I suspect there's a very good reason for this, I just don't know what. Thanks again for all of your help, Francis On 2012-02-24, at 12:59 PM, Jed Brown wrote: > On Fri, Feb 24, 2012 at 11:49, Francis Poulin wrote: > I don't seem to have the hydra in my bin folder but I do have petscmpiexec that I've been using. > > That's just the name of my mpiexec. Use whichever one is used with your build of PETSc. > > >> >> Use these, it will run the same method and sizes as the options I gave for ex22 before. >> >> mpiexec.hydra -n 2 ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 5 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary >> > > Also, I want to install PetSc on an SGI machine that I have access to. I have been told that using MPT would give better performance compared to mpich2. When I configure petsc on this server I don't suppose > > -with-mpi-dir=/opt/sgi/mpt > > the above would work because of the different name. Do you have a suggestion as to what I could try? > > It's just the way you launch parallel jobs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Feb 24 15:08:16 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 24 Feb 2012 15:08:16 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: <100B94EF-A456-4805-8111-5913D0B72526@mcs.anl.gov> On Feb 24, 2012, at 2:43 PM, Francis Poulin wrote: > Hello, > > I wanted to thank everyone for the help and say that I managed to get it running on the cluster and I have done some efficiency calculations. To run the code I used, > > mpirun -np # ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 6 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary > > as suggested and found the following > > p(#cpu) Tp (parallel) T1 (serial) Efficiency [ = Tp/(p*T1) ] > -------------------------------------------------------------------------------------- > 1 904 904 1.00 > 2 553 904 0.82 > 4 274 904 0.83 > 8 138 904 0.82 > 16 70 904 0.81 > 32 36 904 0.78 > > It seems to scale beautifully starting at 2 but there is a big drop from 1 to 2. I suspect there's a very good reason for this, I just don't know what. 
You need to understand the issues of memory bandwidth shared between cores and between processors, memory affinity and thread affinity ("binding") see http://www.mcs.anl.gov/petsc/documentation/faq.html#computers Barry > > Thanks again for all of your help, > Francis > > On 2012-02-24, at 12:59 PM, Jed Brown wrote: > >> On Fri, Feb 24, 2012 at 11:49, Francis Poulin wrote: >> I don't seem to have the hydra in my bin folder but I do have petscmpiexec that I've been using. >> >> That's just the name of my mpiexec. Use whichever one is used with your build of PETSc. >> >> >>> >>> Use these, it will run the same method and sizes as the options I gave for ex22 before. >>> >>> mpiexec.hydra -n 2 ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 5 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type sor -log_summary >>> >> >> Also, I want to install PetSc on an SGI machine that I have access to. I have been told that using MPT would give better performance compared to mpich2. When I configure petsc on this server I don't suppose >> >> -with-mpi-dir=/opt/sgi/mpt >> >> the above would work because of the different name. Do you have a suggestion as to what I could try? >> >> It's just the way you launch parallel jobs. > From fpoulin at uwaterloo.ca Fri Feb 24 15:22:00 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Fri, 24 Feb 2012 16:22:00 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: <100B94EF-A456-4805-8111-5913D0B72526@mcs.anl.gov> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> <100B94EF-A456-4805-8111-5913D0B72526@mcs.anl.gov> Message-ID: <1C244AD3-9D21-4275-9E66-161D4C18443F@uwaterloo.ca> Thanks for the link. I did the tests on an SGI SMP machine that, if I understand correctly, shares the memory much better than most other systems. I will read about these different components and see what I can figure out. Thanks, Francis > > You need to understand the issues of memory bandwidth shared between cores and between processors, memory affinity and thread affinity ("binding") see http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > Barry > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Feb 24 15:46:43 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 24 Feb 2012 15:46:43 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: <1C244AD3-9D21-4275-9E66-161D4C18443F@uwaterloo.ca> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> <100B94EF-A456-4805-8111-5913D0B72526@mcs.anl.gov> <1C244AD3-9D21-4275-9E66-161D4C18443F@uwaterloo.ca> Message-ID: <09614DB6-D4F2-4321-BD0D-06DE5226E0C3@mcs.anl.gov> On Feb 24, 2012, at 3:22 PM, Francis Poulin wrote: > Thanks for the link. I did the tests on an SGI SMP machine that, if I understand correctly, shares the memory much better than most other systems. I will read about these different components and see what I can figure out. Yes, you could be right. 
Unlike traditional linux that machine is more likely to do the right thing by default. Barry > > Thanks, > Francis > >> >> You need to understand the issues of memory bandwidth shared between cores and between processors, memory affinity and thread affinity ("binding") see http://www.mcs.anl.gov/petsc/documentation/faq.html#computers >> >> Barry >> > From jedbrown at mcs.anl.gov Fri Feb 24 20:08:42 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Fri, 24 Feb 2012 20:08:42 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: On Fri, Feb 24, 2012 at 14:43, Francis Poulin wrote: > It seems to scale beautifully starting at 2 but there is a big drop from 1 > to 2. I suspect there's a very good reason for this, I just don't know > what. Can you send the full output? The smoother is slightly different, so the number of iterations could be different by 1 (for example). The -log_summary part of the output will show us where the time is being spent, so we'll be able to say what did not scale well between 1 and 2 procs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fpoulin at uwaterloo.ca Fri Feb 24 21:20:16 2012 From: fpoulin at uwaterloo.ca (Francis Poulin) Date: Fri, 24 Feb 2012 22:20:16 -0500 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> Message-ID: <1657A66A-AC60-44E9-85CE-A1B65E002E5C@uwaterloo.ca> Hello Barry, Thanks for offering to look at this. I configured a different version that does not have the debugger and the results are a lot faster but the efficiency is still similar. Below you'll see the results. n Tp Efficiency ------------------------------------- 1 162 1 2 110 0.74 4 47 0.86 8 24 0.84 16 12 0.84 32 6 0.84 I'm also including the output of the log_summary for n =1 and 2. As I said before, this is an SMP machine so I would expect it to be better than the typical cluster. That being said I am still learning how this works. I am very happy that I managed to get some encouraging results in a day. Cheers, Francis On 2012-02-24, at 9:08 PM, Jed Brown wrote: > On Fri, Feb 24, 2012 at 14:43, Francis Poulin wrote: > It seems to scale beautifully starting at 2 but there is a big drop from 1 to 2. I suspect there's a very good reason for this, I just don't know what. > > Can you send the full output? The smoother is slightly different, so the number of iterations could be different by 1 (for example). The -log_summary part of the output will show us where the time is being spent, so we'll be able to say what did not scale well between 1 and 2 procs. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: output_n1.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: output_n2.txt URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Feb 24 21:36:46 2012 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 24 Feb 2012 21:36:46 -0600 Subject: [petsc-users] testing scalability for ksp/ex22.c (now ex45.c) In-Reply-To: <1657A66A-AC60-44E9-85CE-A1B65E002E5C@uwaterloo.ca> References: <9FAADF24-0A23-472E-A0D7-0007D64A13E9@uwaterloo.ca> <5DBCF778-354C-4F87-89AE-BEE22D65E360@mcs.anl.gov> <1EFED56D-D125-4B42-8BB6-1188BAFB5981@uwaterloo.ca> <9BF9DE18-E482-4653-8FA6-84615B451353@mcs.anl.gov> <8859B5A3-387F-4F97-9A7A-A9EE1D89EC54@uwaterloo.ca> <1657A66A-AC60-44E9-85CE-A1B65E002E5C@uwaterloo.ca> Message-ID: You need to understand FOR THIS SPECIFIC MACHINE ARCHITECTURE what happens when it gos from using one processor to two? Is the second processor another core that shares common memory with the first processor? Or is it a completely separate core on a different node that has its own memory? Consider MatSOR 48 1.0 3.8607e+01 1.0 3.27e+09 1.0 0.0e+00 0.0e+00 0.0e+00 24 45 0 0 0 24 45 0 0 0 85 MatSOR 48 1.0 2.5600e+01 1.1 1.64e+09 1.0 4.8e+01 1.2e+05 4.8e+01 22 45 19 30 8 22 45 19 30 8 128 this computation across the two cores is embarrassing parallel, hence the flop rate for two processes should be 170, not 128 (the ratio is .75 very close to the .74 efficiency you get on two processes). So why is it not 170? The most likely answer is that the two cores shared a common memory and that memory is not fast enough (does not have enough memory bandwidth) to server both cores at the individual speed (85) that each of them can run. This is the curse of virtually any shared memory systems (that most people like to gloss over). Unless the memory bandwidth grows linearly with the number of cores you use the performance cannot grow linearly with the number of cores. On almost no system with shared memory does the memory bandwidth grow in that direction. You need to find out for this machine how to direct the executable to be run so that each of the two processes runs on different NODES of the system so they don't shared memory, then you will see a number better than the .74 for the two processes. But note that once you want to use all the cores on the system you will have cores shared memory and your parallel efficiency will go down. This is all material that should be presented in the first week of a parallel computing class and is crucial to understand if one plans to do "parallel computing". Barry On Feb 24, 2012, at 9:20 PM, Francis Poulin wrote: > Hello Barry, > > Thanks for offering to look at this. > > I configured a different version that does not have the debugger and the results are a lot faster but the efficiency is still similar. Below you'll see the results. > > n Tp Efficiency > ------------------------------------- > 1 162 1 > 2 110 0.74 > 4 47 0.86 > 8 24 0.84 > 16 12 0.84 > 32 6 0.84 > > I'm also including the output of the log_summary for n =1 and 2. > > As I said before, this is an SMP machine so I would expect it to be better than the typical cluster. That being said I am still learning how this works. > > I am very happy that I managed to get some encouraging results in a day. > > Cheers, Francis > > > > > > On 2012-02-24, at 9:08 PM, Jed Brown wrote: > >> On Fri, Feb 24, 2012 at 14:43, Francis Poulin wrote: >> It seems to scale beautifully starting at 2 but there is a big drop from 1 to 2. I suspect there's a very good reason for this, I just don't know what. 
>> >> Can you send the full output? The smoother is slightly different, so the number of iterations could be different by 1 (for example). The -log_summary part of the output will show us where the time is being spent, so we'll be able to say what did not scale well between 1 and 2 procs. > From wumengda at gmail.com Sat Feb 25 18:01:13 2012 From: wumengda at gmail.com (Mengda Wu) Date: Sat, 25 Feb 2012 19:01:13 -0500 Subject: [petsc-users] Important Petsc header files to include Message-ID: Hi all, I am trying to select a set of important petsc header files to included in my installation of Petsc. Basically, I would like the header and library files to be outside of Petsc source directory. I found there are many headers under PETSC_DIR/include. Are these all necessary for a program to just use PETSC functions? Thanks, Mengda -------------- next part -------------- An HTML attachment was scrubbed... URL: From jedbrown at mcs.anl.gov Sat Feb 25 18:05:09 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Sat, 25 Feb 2012 18:05:09 -0600 Subject: [petsc-users] Important Petsc header files to include In-Reply-To: References: Message-ID: On Sat, Feb 25, 2012 at 18:01, Mengda Wu wrote: > I am trying to select a set of important petsc header files to included in > my installation of Petsc. Basically, I would like the header and library > files to be outside of Petsc source directory. I found there are many > headers under PETSC_DIR/include. Are these all necessary for a program to > just use PETSC functions? It depends which level interface you use (e.g. if you only use KSP, you wouldn't need the SNES or TS headers). But why do you want to copy these somewhere? (You're going to have endless problems if you don't know what you're doing.) It sounds like a waste of time to me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Feb 25 18:50:13 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 25 Feb 2012 18:50:13 -0600 Subject: [petsc-users] Important Petsc header files to include In-Reply-To: References: Message-ID: On Sat, Feb 25, 2012 at 6:01 PM, Mengda Wu wrote: > Hi all, > > I am trying to select a set of important petsc header files to > included in my installation of Petsc. Basically, I would like the header > and library files to be outside of Petsc source directory. I found there > are many headers under PETSC_DIR/include. Are these all necessary for a > program to just use PETSC functions? > It sounds like what you want is installation. Configure using --prefix= and then 'make; make install'. Everything you need will be in that directory. Matt > Thanks, > Mengda > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronroland at gmx.de Sun Feb 26 10:16:57 2012 From: aaronroland at gmx.de (Aron Roland) Date: Sun, 26 Feb 2012 17:16:57 +0100 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: <4F4A5AC3.6030203@wb.tu-darmstadt.de> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> Message-ID: <4F4A5AF9.3060006@gmx.de> Dear All, I hope somebody can help us on this or give at least some clearance. We have just included PETSc as an solver for our sparse matrix evolving from an unstructured mesh advection scheme. 
The problem is that we are using the mpiaij matrix type, since our matrix is naturally sparse. However it seems that PETSc has no PC for this, except the PCSOR, which showed to be not very effective for our problem. All others give the error msg. of the mail subject, where XXX are the different PC tried. The manual is a bit diffuse on this e.g. http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html it is claimed that certain PC's are running on aij matrices ... but these are to be defined either as seq. or parallel (mpiaij) matrices. Moreover in the above mentioned list are two columns parallel/seriel, what is the intention of parallel capability when not applicable to matrices stored within the parallel mpiaij framework. I guess we just not understanding the concept or have some other difficulties of understanding of all this. Any comments help is welcome Aron From mark.adams at columbia.edu Sun Feb 26 10:46:03 2012 From: mark.adams at columbia.edu (Mark F. Adams) Date: Sun, 26 Feb 2012 11:46:03 -0500 Subject: [petsc-users] [petsc-maint #107250] VecAssemblyBegin/End() In-Reply-To: References: Message-ID: yes On Feb 26, 2012, at 11:43 AM, saturday luis wrote: > Hi: > > Is it necessary to use VecAssemblyBegin/End between INSERT_VALUES and > ADD_VALUES mode of VecSetValues function call? > > Luis > > From maxwellr at gmail.com Sun Feb 26 10:48:48 2012 From: maxwellr at gmail.com (Max Rudolph) Date: Sun, 26 Feb 2012 08:48:48 -0800 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: <4F4A5AF9.3060006@gmx.de> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> Message-ID: MPIAIJ and SEQQIJ matrices are subtypes of the AIJ matrix type. Looking at that table, you should be able to use any of the PCs that supports AIJ and has an X under 'parallel'. Max On Sun, Feb 26, 2012 at 8:16 AM, Aron Roland wrote: > Dear All, > > I hope somebody can help us on this or give at least some clearance. > > We have just included PETSc as an solver for our sparse matrix evolving > from an unstructured mesh advection scheme. > > The problem is that we are using the mpiaij matrix type, since our matrix > is naturally sparse. However it seems that PETSc has no PC for this, except > the PCSOR, which showed to be not very effective for our problem. > > All others give the error msg. of the mail subject, where XXX are the > different PC tried. > > The manual is a bit diffuse on this e.g. > > http://www.mcs.anl.gov/petsc/**documentation/**linearsolvertable.html > > it is claimed that certain PC's are running on aij matrices ... but these > are to be defined either as seq. or parallel (mpiaij) matrices. Moreover in > the above mentioned list are two columns parallel/seriel, what is the > intention of parallel capability when not applicable to matrices stored > within the parallel mpiaij framework. > > I guess we just not understanding the concept or have some other > difficulties of understanding of all this. > > Any comments help is welcome > > Aron > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Feb 26 11:17:29 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 26 Feb 2012 11:17:29 -0600 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! 
In-Reply-To: References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> Message-ID: On Sun, Feb 26, 2012 at 10:48 AM, Max Rudolph wrote: > MPIAIJ and SEQQIJ matrices are subtypes of the AIJ matrix type. Looking at > that table, you should be able to use any of the PCs that supports AIJ and > has an X under 'parallel'. Max is correct. For instance, the most popular general purpose parallel solver is ASM (Additive Schwarz Method), which then has a sequential subsolver for each block, which defaults to ILU. Matt > Max > > > On Sun, Feb 26, 2012 at 8:16 AM, Aron Roland wrote: > >> Dear All, >> >> I hope somebody can help us on this or give at least some clearance. >> >> We have just included PETSc as an solver for our sparse matrix evolving >> from an unstructured mesh advection scheme. >> >> The problem is that we are using the mpiaij matrix type, since our matrix >> is naturally sparse. However it seems that PETSc has no PC for this, except >> the PCSOR, which showed to be not very effective for our problem. >> >> All others give the error msg. of the mail subject, where XXX are the >> different PC tried. >> >> The manual is a bit diffuse on this e.g. >> >> http://www.mcs.anl.gov/petsc/**documentation/**linearsolvertable.html >> >> it is claimed that certain PC's are running on aij matrices ... but these >> are to be defined either as seq. or parallel (mpiaij) matrices. Moreover in >> the above mentioned list are two columns parallel/seriel, what is the >> intention of parallel capability when not applicable to matrices stored >> within the parallel mpiaij framework. >> >> I guess we just not understanding the concept or have some other >> difficulties of understanding of all this. >> >> Any comments help is welcome >> >> Aron >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From roland at wb.tu-darmstadt.de Sun Feb 26 10:16:03 2012 From: roland at wb.tu-darmstadt.de (Aron Roland) Date: Sun, 26 Feb 2012 17:16:03 +0100 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! Message-ID: <4F4A5AC3.6030203@wb.tu-darmstadt.de> Dear All, I hope somebody can help us on this or give at least some clearance. We have just included PETSc as an solver for our sparse matrix evolving from an unstructured mesh advection scheme. The problem is that we are using the mpiaij matrix type, since our matrix is naturally sparse. However it seems that PETSc has no PC for this, except the PCSOR, which showed to be not very effective for our problem. All others give the error msg. of the mail subject, where XXX are the different PC tried. The manual is a bit diffuse on this e.g. http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html it is claimed that certain PC's are running on aij matrices ... but these are to be defined either as seq. or parallel (mpiaij) matrices. Moreover in the above mentioned list are two columns parallel/seriel, what is the intention of parallel capability when not applicable to matrices stored within the parallel mpiaij framework. I guess we just not understanding the concept or have some other difficulties of understanding of all this. 
Any comments help is welcome Aron From luis.saturday at gmail.com Sun Feb 26 10:43:31 2012 From: luis.saturday at gmail.com (saturday luis) Date: Sun, 26 Feb 2012 10:43:31 -0600 Subject: [petsc-users] VecAssemblyBegin/End() Message-ID: Hi: Is it necessary to use VecAssemblyBegin/End between INSERT_VALUES and ADD_VALUES mode of VecSetValues function call? Luis -------------- next part -------------- An HTML attachment was scrubbed... URL: From maxwellr at gmail.com Sun Feb 26 16:39:44 2012 From: maxwellr at gmail.com (Max Rudolph) Date: Sun, 26 Feb 2012 14:39:44 -0800 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: I did eventually make my test case work using the split preconditioning, first using additive schwartz as the preconditioner for the upper left (0,0) block and then using ML with asm and gmres within each multigrid level. I am posting my runtime options in case it might be helpful for someone else out there who wants to try this. The key to getting the solver to converge for me was starting with a good initial guess (the solution diverged with a zero initial guess), especially for the pressure field, using gcr as the outer ksp and, small ksp_rtol values for both of the inner ksps. Max -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type multiplicative \ -stokes_ksp_initial_guess_nonzero \ -stokes_fieldsplit_0_pc_type asm \ -stokes_fieldsplit_0_ksp_type gmres \ -stokes_fieldsplit_0_ksp_initial_guess_nonzero \ -stokes_fieldsplit_0_ksp_max_it 10 \ -stokes_fieldsplit_0_ksp_rtol 1.0e-9 \ -stokes_fieldsplit_1_pc_type jacobi \ -stokes_fieldsplit_1_ksp_type gmres \ -stokes_fieldsplit_1_ksp_max_it 10 \ -stokes_fieldsplit_1_ksp_rtol 1.0e-9 \ -stokes_ksp_type gcr \ -stokes_ksp_monitor_blocks \ -stokes_ksp_monitor \ -stokes_ksp_view \ -stokes_ksp_atol 1e-2 \ -stokes_ksp_rtol 0.0 \ On Wed, Feb 22, 2012 at 1:29 PM, Max Rudolph wrote: > > > On Tue, Feb 21, 2012 at 3:36 AM, Dave May wrote: > >> Max, >> >> > >> > The test case that I am working with is isoviscous convection, benchmark >> > case 1a from Blankenbach 1989. >> > >> >> Okay, I know this problem. >> An iso viscous problem, solved on a uniform grid using dx=dy=dz >> discretised >> via FV should be super easy to precondition. >> >> > >> > >> > I think that this is the problem. The (2,2) slot in the LHS matrix is >> all >> > zero (pressure does not appear in the continuity equation), so I think >> that >> > the preconditioner is meaningless. I am still confused as to why this >> choice >> > of preconditioner was suggested in the tutorial, and what is a better >> choice >> > of preconditioner for this block? Should I be using one of the Schur >> > complement methods instead of the additive or multiplicative field >> split? >> > >> >> No, you need to define an appropriate stokes preconditioner >> You should assemble this matrix >> B = ( K,B ; B^T, -1/eta* I ) >> as the preconditioner for stokes. >> Here eta* is a measure of the local viscosity within each pressure >> control volume. >> Unless you specify to use the real diagonal >> >> Pass this into the third argument in KSPSetOperators() (i.e. the Pmat >> variable) >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetOperators.html >> >> Not sure how you represent A and B, but if you really want to run just >> additive with fieldsplit, you don't need the off diagonal blocks, so >> B = ( K,0 ; 0, -1/eta* I ) >> would yield the same result. 
Depending on your matrix representation, >> this may save you some memory. >> >> PCFieldsplit will use the B(1,1) and B(2,2) to build the stokes >> preconditioner unless you ask for it to use the real diagonal - but >> for the stokes operator A, this makes no sense. >> >> This is the right thing to do (as Matt states). >> Try it out, and let us know how it goes. >> >> >> Cheers, >> Dave >> > > Dave and Matt, > Thanks for your help. I had some time to work on this a little more. I now > have a stokes operator A that looks like this: > A=(K B; B^T 0) and a matrix from which the preconditioner is generated > P=(K B; B^T -1/eta*I) > > I verified that I can solve this system using the default ksp and pc > settings in 77 iterations for the first timestep (initial guess zero) and > in 31 iterations for the second timestep (nonzero initial guess). > > I adopted your suggestion to use the multiplicative field split as a > starting point. My reading of the PETSc manual suggests to me that the > preconditioner formed should then look like: > > B = (ksp(K,K) 0;-B^T*ksp(K,K)*ksp(0,-1/eta*I) ksp(0,-1/eta*I)) > > My interpretation of the output suggests that the solvers within each > fieldsplit are converging nicely, but the global residual is not decreasing > after the first few iterations. Given the disparity in residual sizes, I > think that there might be a problem with the scaling of the pressure > variable (I scaled the continuity equation by eta/dx where dx is my grid > spacing). I also scaled the (1,1) block in the preconditioner by this scale > factor. Thanks again for all of your help. > > Max > > > Options used: > -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ > -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type multiplicative \ > -stokes_fieldsplit_0_pc_type ml \ > -stokes_fieldsplit_0_ksp_type gmres \ > -stokes_fieldsplit_0_ksp_monitor_true_residual \ > -stokes_fieldsplit_0_ksp_norm_type UNPRECONDITIONED \ > -stokes_fieldsplit_0_ksp_max_it 3 \ > -stokes_fieldsplit_0_ksp_type gmres \ > -stokes_fieldsplit_0_ksp_rtol 1.0e-4 \ > -stokes_fieldsplit_0_mg_levels_ksp_type gmres \ > -stokes_fieldsplit_0_mg_levels_pc_type bjacobi \ > -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 \ > -stokes_fieldsplit_1_pc_type jacobi \ > -stokes_fieldsplit_1_ksp_type preonly \ > -stokes_fieldsplit_1_ksp_max_it 3 \ > -stokes_fieldsplit_1_ksp_monitor_true_residual \ > -stokes_ksp_type gcr \ > -stokes_ksp_monitor_blocks \ > -stokes_ksp_monitor_draw \ > -stokes_ksp_view \ > -stokes_ksp_atol 1e-6 \ > -stokes_ksp_rtol 1e-6 \ > -stokes_ksp_max_it 100 \ > -stokes_ksp_norm_type UNPRECONDITIONED \ > -stokes_ksp_monitor_true_residual \ > > Output: > > 0 KSP Component U,V,P residual norm [ 0.000000000000e+00, > 1.165111661413e+06, 0.000000000000e+00 ] > Residual norms for stokes_ solve. > 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm > 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 > Residual norms for stokes_fieldsplit_0_ solve. 
> 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm > 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 3.173622513625e+05 true resid norm > 3.173622513625e+05 ||r(i)||/||b|| 2.723878421898e-01 > 2 KSP unpreconditioned resid norm 5.634119635158e+04 true resid norm > 1.725996376799e+05 ||r(i)||/||b|| 1.481399967026e-01 > 3 KSP unpreconditioned resid norm 1.218418968344e+03 true resid norm > 1.559727441168e+05 ||r(i)||/||b|| 1.338693528546e-01 > 1 KSP Component U,V,P residual norm [ 5.763380362961e+04, > 1.154490085631e+05, 3.370358145704e-12 ] > 1 KSP unpreconditioned resid norm 1.290353784783e+05 true resid norm > 1.290353784783e+05 ||r(i)||/||b|| 1.107493665644e-01 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.290353784783e+05 true resid norm > 1.290353784783e+05 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 1.655137188235e+04 true resid norm > 1.655137188235e+04 ||r(i)||/||b|| 1.282700301076e-01 > 2 KSP unpreconditioned resid norm 1.195941831181e+03 true resid norm > 4.554417355181e+03 ||r(i)||/||b|| 3.529588093508e-02 > 3 KSP unpreconditioned resid norm 8.479547025398e+01 true resid norm > 3.817072778396e+03 ||r(i)||/||b|| 2.958159865466e-02 > 2 KSP Component U,V,P residual norm [ 2.026983725663e+03, > 2.531521226429e+03, 3.419060873106e-12 ] > 2 KSP unpreconditioned resid norm 3.243032954498e+03 true resid norm > 3.243032954498e+03 ||r(i)||/||b|| 2.783452489493e-03 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 3.243032954498e+03 true resid norm > 3.243032954498e+03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 1.170090628031e+02 true resid norm > 1.170090628031e+02 ||r(i)||/||b|| 3.608013376517e-02 > 2 KSP unpreconditioned resid norm 9.782830529900e+00 true resid norm > 1.741722174777e+01 ||r(i)||/||b|| 5.370658267168e-03 > 3 KSP unpreconditioned resid norm 6.886950142735e-01 true resid norm > 1.636749336722e+01 ||r(i)||/||b|| 5.046971028932e-03 > 3 KSP Component U,V,P residual norm [ 7.515013854917e+01, > 7.515663601801e+01, 3.418919176066e-12 ] > 3 KSP unpreconditioned resid norm 1.062829396540e+02 true resid norm > 1.062829396540e+02 ||r(i)||/||b|| 9.122124786317e-05 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.062829396540e+02 true resid norm > 1.062829396540e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.373578062042e+01 true resid norm > 5.373578062042e+01 ||r(i)||/||b|| 5.055917797846e-01 > 2 KSP unpreconditioned resid norm 1.199305097134e+00 true resid norm > 3.492111756827e+01 ||r(i)||/||b|| 3.285674792393e-01 > 3 KSP unpreconditioned resid norm 9.508597255523e-02 true resid norm > 3.452079362567e+01 ||r(i)||/||b|| 3.248008922038e-01 > 4 KSP Component U,V,P residual norm [ 7.495897679790e+01, > 7.527868410560e+01, 3.418919160091e-12 ] > 4 KSP unpreconditioned resid norm 1.062343093509e+02 true resid norm > 1.062343093509e+02 ||r(i)||/||b|| 9.117950911420e-05 > Residual norms for stokes_fieldsplit_0_ solve. 
> 0 KSP unpreconditioned resid norm 1.062343093509e+02 true resid norm > 1.062343093509e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.419252803207e+01 true resid norm > 5.419252803207e+01 ||r(i)||/||b|| 5.101226558840e-01 > 2 KSP unpreconditioned resid norm 1.431134174522e+00 true resid norm > 3.339055236737e+01 ||r(i)||/||b|| 3.143104386088e-01 > 3 KSP unpreconditioned resid norm 9.760479467902e-02 true resid norm > 3.304522520358e+01 ||r(i)||/||b|| 3.110598205561e-01 > 5 KSP Component U,V,P residual norm [ 7.491128585963e+01, > 7.523275560552e+01, 3.418919008441e-12 ] > 5 KSP unpreconditioned resid norm 1.061681132221e+02 true resid norm > 1.061681132221e+02 ||r(i)||/||b|| 9.112269384837e-05 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.061681132221e+02 true resid norm > 1.061681132221e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.343215079492e+01 true resid norm > 5.343215079492e+01 ||r(i)||/||b|| 5.032787074508e-01 > 2 KSP unpreconditioned resid norm 1.288069736759e+00 true resid norm > 3.308925591301e+01 ||r(i)||/||b|| 3.116684935691e-01 > 3 KSP unpreconditioned resid norm 9.505248953960e-02 true resid norm > 3.281875055845e+01 ||r(i)||/||b|| 3.091205971589e-01 > 6 KSP Component U,V,P residual norm [ 7.481188568118e+01, > 7.527346267608e+01, 3.418918860626e-12 ] > 6 KSP unpreconditioned resid norm 1.061268694649e+02 true resid norm > 1.061268694649e+02 ||r(i)||/||b|| 9.108729487455e-05 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.061268694649e+02 true resid norm > 1.061268694649e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.300383444945e+01 true resid norm > 5.300383444945e+01 ||r(i)||/||b|| 4.994384053416e-01 > 2 KSP unpreconditioned resid norm 1.118785004087e+00 true resid norm > 3.282090953364e+01 ||r(i)||/||b|| 3.092610730828e-01 > 3 KSP unpreconditioned resid norm 9.758015489979e-02 true resid norm > 3.259718081014e+01 ||r(i)||/||b|| 3.071529479244e-01 > 7 KSP Component U,V,P residual norm [ 7.475024970669e+01, > 7.530858268154e+01, 3.418918784089e-12 ] > 7 KSP unpreconditioned resid norm 1.061083524362e+02 true resid norm > 1.061083524362e+02 ||r(i)||/||b|| 9.107140195255e-05 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.061083524362e+02 true resid norm > 1.061083524362e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.296981668051e+01 true resid norm > 5.296981668051e+01 ||r(i)||/||b|| 4.992049679820e-01 > 2 KSP unpreconditioned resid norm 9.379451887610e-01 true resid norm > 3.378466967056e+01 ||r(i)||/||b|| 3.183978348066e-01 > 3 KSP unpreconditioned resid norm 9.102580142867e-02 true resid norm > 3.360853440947e+01 ||r(i)||/||b|| 3.167378781957e-01 > 8 KSP Component U,V,P residual norm [ 7.464535615814e+01, > 7.537007679541e+01, 3.418918790515e-12 ] > 8 KSP unpreconditioned resid norm 1.060781677449e+02 true resid norm > 1.060781677449e+02 ||r(i)||/||b|| 9.104549482946e-05 > Residual norms for stokes_fieldsplit_0_ solve. 
> 0 KSP unpreconditioned resid norm 1.060781677449e+02 true resid norm > 1.060781677449e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.281972737642e+01 true resid norm > 5.281972737642e+01 ||r(i)||/||b|| 4.979321240109e-01 > 2 KSP unpreconditioned resid norm 9.224594814880e-01 true resid norm > 3.351285171891e+01 ||r(i)||/||b|| 3.159260046751e-01 > 3 KSP unpreconditioned resid norm 9.143100662935e-02 true resid norm > 3.329269756083e+01 ||r(i)||/||b|| 3.138506091177e-01 > 9 KSP Component U,V,P residual norm [ 7.451688471900e+01, > 7.544516987344e+01, 3.418918860847e-12 ] > 9 KSP unpreconditioned resid norm 1.060412172952e+02 true resid norm > 1.060412172952e+02 ||r(i)||/||b|| 9.101378074496e-05 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.060412172952e+02 true resid norm > 1.060412172952e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.275132249899e+01 true resid norm > 5.275132249899e+01 ||r(i)||/||b|| 4.974605520805e-01 > 2 KSP unpreconditioned resid norm 7.755381284769e-01 true resid norm > 3.453933285011e+01 ||r(i)||/||b|| 3.257161105001e-01 > 3 KSP unpreconditioned resid norm 7.298768665179e-02 true resid norm > 3.435179316160e+01 ||r(i)||/||b|| 3.239475558447e-01 > 10 KSP Component U,V,P residual norm [ 7.451431102619e+01, > 7.544762349626e+01, 3.418918857322e-12 ] > 10 KSP unpreconditioned resid norm 1.060411544587e+02 true resid norm > 1.060411544587e+02 ||r(i)||/||b|| 9.101372681321e-05 > Residual norms for stokes_fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.060411544587e+02 true resid norm > 1.060411544587e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.276103518337e+01 true resid norm > 5.276103518337e+01 ||r(i)||/||b|| 4.975524403961e-01 > 2 KSP unpreconditioned resid norm 7.777079890360e-01 true resid norm > 3.454373663425e+01 ||r(i)||/||b|| 3.257578325186e-01 > 3 KSP unpreconditioned resid norm 7.356028471071e-02 true resid norm > 3.435584054266e+01 ||r(i)||/||b|| 3.239859158269e-01 > 11 KSP Component U,V,P residual norm [ 7.438335197779e+01, > 7.553731959735e+01, 3.418918856471e-12 ] > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Feb 26 18:03:03 2012 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 26 Feb 2012 18:03:03 -0600 Subject: [petsc-users] Starting point for Stokes fieldsplit In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 4:39 PM, Max Rudolph wrote: > I did eventually make my test case work using the split preconditioning, > first using additive schwartz as the preconditioner for the upper left > (0,0) block and then using ML with asm and gmres within each multigrid > level. I am posting my runtime options in case it might be helpful for > someone else out there who wants to try this. The key to getting the solver > to converge for me was starting with a good initial guess (the solution > diverged with a zero initial guess), especially for the pressure field, > using gcr as the outer ksp and, small ksp_rtol values for both of the inner > ksps. This sounds wrong. I think you have a bug. The discussion motivated me to put all the simple Stokes preconditioners in ex62 (tests 30-37 in builder.py). Once I get matrix-free application going, I will put in the auxiliary operator PCs that Dave suggested. Also, always start using LU until you understand how the outer iteration behaves, then back off to ML or GAMG. 
Matt > Max > > -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ > -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type multiplicative \ > -stokes_ksp_initial_guess_nonzero \ > -stokes_fieldsplit_0_pc_type asm \ > -stokes_fieldsplit_0_ksp_type gmres \ > -stokes_fieldsplit_0_ksp_initial_guess_nonzero \ > -stokes_fieldsplit_0_ksp_max_it 10 \ > -stokes_fieldsplit_0_ksp_rtol 1.0e-9 \ > -stokes_fieldsplit_1_pc_type jacobi \ > -stokes_fieldsplit_1_ksp_type gmres \ > -stokes_fieldsplit_1_ksp_max_it 10 \ > -stokes_fieldsplit_1_ksp_rtol 1.0e-9 \ > -stokes_ksp_type gcr \ > -stokes_ksp_monitor_blocks \ > -stokes_ksp_monitor \ > -stokes_ksp_view \ > -stokes_ksp_atol 1e-2 \ > -stokes_ksp_rtol 0.0 \ > > On Wed, Feb 22, 2012 at 1:29 PM, Max Rudolph wrote: > >> >> >> On Tue, Feb 21, 2012 at 3:36 AM, Dave May wrote: >> >>> Max, >>> >>> > >>> > The test case that I am working with is isoviscous convection, >>> benchmark >>> > case 1a from Blankenbach 1989. >>> > >>> >>> Okay, I know this problem. >>> An iso viscous problem, solved on a uniform grid using dx=dy=dz >>> discretised >>> via FV should be super easy to precondition. >>> >>> > >>> > >>> > I think that this is the problem. The (2,2) slot in the LHS matrix is >>> all >>> > zero (pressure does not appear in the continuity equation), so I think >>> that >>> > the preconditioner is meaningless. I am still confused as to why this >>> choice >>> > of preconditioner was suggested in the tutorial, and what is a better >>> choice >>> > of preconditioner for this block? Should I be using one of the Schur >>> > complement methods instead of the additive or multiplicative field >>> split? >>> > >>> >>> No, you need to define an appropriate stokes preconditioner >>> You should assemble this matrix >>> B = ( K,B ; B^T, -1/eta* I ) >>> as the preconditioner for stokes. >>> Here eta* is a measure of the local viscosity within each pressure >>> control volume. >>> Unless you specify to use the real diagonal >>> >>> Pass this into the third argument in KSPSetOperators() (i.e. the Pmat >>> variable) >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetOperators.html >>> >>> Not sure how you represent A and B, but if you really want to run just >>> additive with fieldsplit, you don't need the off diagonal blocks, so >>> B = ( K,0 ; 0, -1/eta* I ) >>> would yield the same result. Depending on your matrix representation, >>> this may save you some memory. >>> >>> PCFieldsplit will use the B(1,1) and B(2,2) to build the stokes >>> preconditioner unless you ask for it to use the real diagonal - but >>> for the stokes operator A, this makes no sense. >>> >>> This is the right thing to do (as Matt states). >>> Try it out, and let us know how it goes. >>> >>> >>> Cheers, >>> Dave >>> >> >> Dave and Matt, >> Thanks for your help. I had some time to work on this a little more. I >> now have a stokes operator A that looks like this: >> A=(K B; B^T 0) and a matrix from which the preconditioner is generated >> P=(K B; B^T -1/eta*I) >> >> I verified that I can solve this system using the default ksp and pc >> settings in 77 iterations for the first timestep (initial guess zero) and >> in 31 iterations for the second timestep (nonzero initial guess). >> >> I adopted your suggestion to use the multiplicative field split as a >> starting point. 
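Concretely, the Amat/Pmat split described above would look something like this (a minimal sketch, not code from this thread; A, P, b and x are placeholder names for objects assembled elsewhere, and the four-argument KSPSetOperators() matches the PETSc 3.2 signature used elsewhere in this thread):

  Mat A, P;    /* A = (K B; B^T 0) is the true operator,              */
               /* P = (K B; B^T -1/eta* I) the preconditioning matrix */
  Vec x, b;    /* solution and right-hand side                        */
  KSP ksp;

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOptionsPrefix(ksp, "stokes_");
  /* second argument (Amat) defines the residual the Krylov method sees;
     third argument (Pmat) is what PCFieldSplit extracts its blocks from */
  KSPSetOperators(ksp, A, P, DIFFERENT_NONZERO_PATTERN);
  KSPSetFromOptions(ksp);   /* picks up the -stokes_* runtime options */
  KSPSolve(ksp, b, x);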
My reading of the PETSc manual suggests to me that the >> preconditioner formed should then look like: >> >> B = (ksp(K,K) 0;-B^T*ksp(K,K)*ksp(0,-1/eta*I) ksp(0,-1/eta*I)) >> >> My interpretation of the output suggests that the solvers within each >> fieldsplit are converging nicely, but the global residual is not decreasing >> after the first few iterations. Given the disparity in residual sizes, I >> think that there might be a problem with the scaling of the pressure >> variable (I scaled the continuity equation by eta/dx where dx is my grid >> spacing). I also scaled the (1,1) block in the preconditioner by this scale >> factor. Thanks again for all of your help. >> >> Max >> >> >> Options used: >> -stokes_pc_fieldsplit_0_fields 0,1 -stokes_pc_fieldsplit_1_fields 2 \ >> -stokes_pc_type fieldsplit -stokes_pc_fieldsplit_type multiplicative \ >> -stokes_fieldsplit_0_pc_type ml \ >> -stokes_fieldsplit_0_ksp_type gmres \ >> -stokes_fieldsplit_0_ksp_monitor_true_residual \ >> -stokes_fieldsplit_0_ksp_norm_type UNPRECONDITIONED \ >> -stokes_fieldsplit_0_ksp_max_it 3 \ >> -stokes_fieldsplit_0_ksp_type gmres \ >> -stokes_fieldsplit_0_ksp_rtol 1.0e-4 \ >> -stokes_fieldsplit_0_mg_levels_ksp_type gmres \ >> -stokes_fieldsplit_0_mg_levels_pc_type bjacobi \ >> -stokes_fieldsplit_0_mg_levels_ksp_max_it 4 \ >> -stokes_fieldsplit_1_pc_type jacobi \ >> -stokes_fieldsplit_1_ksp_type preonly \ >> -stokes_fieldsplit_1_ksp_max_it 3 \ >> -stokes_fieldsplit_1_ksp_monitor_true_residual \ >> -stokes_ksp_type gcr \ >> -stokes_ksp_monitor_blocks \ >> -stokes_ksp_monitor_draw \ >> -stokes_ksp_view \ >> -stokes_ksp_atol 1e-6 \ >> -stokes_ksp_rtol 1e-6 \ >> -stokes_ksp_max_it 100 \ >> -stokes_ksp_norm_type UNPRECONDITIONED \ >> -stokes_ksp_monitor_true_residual \ >> >> Output: >> >> 0 KSP Component U,V,P residual norm [ 0.000000000000e+00, >> 1.165111661413e+06, 0.000000000000e+00 ] >> Residual norms for stokes_ solve. >> 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm >> 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.165111661413e+06 true resid norm >> 1.165111661413e+06 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 3.173622513625e+05 true resid norm >> 3.173622513625e+05 ||r(i)||/||b|| 2.723878421898e-01 >> 2 KSP unpreconditioned resid norm 5.634119635158e+04 true resid norm >> 1.725996376799e+05 ||r(i)||/||b|| 1.481399967026e-01 >> 3 KSP unpreconditioned resid norm 1.218418968344e+03 true resid norm >> 1.559727441168e+05 ||r(i)||/||b|| 1.338693528546e-01 >> 1 KSP Component U,V,P residual norm [ 5.763380362961e+04, >> 1.154490085631e+05, 3.370358145704e-12 ] >> 1 KSP unpreconditioned resid norm 1.290353784783e+05 true resid norm >> 1.290353784783e+05 ||r(i)||/||b|| 1.107493665644e-01 >> Residual norms for stokes_fieldsplit_0_ solve. 
>> 0 KSP unpreconditioned resid norm 1.290353784783e+05 true resid norm >> 1.290353784783e+05 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 1.655137188235e+04 true resid norm >> 1.655137188235e+04 ||r(i)||/||b|| 1.282700301076e-01 >> 2 KSP unpreconditioned resid norm 1.195941831181e+03 true resid norm >> 4.554417355181e+03 ||r(i)||/||b|| 3.529588093508e-02 >> 3 KSP unpreconditioned resid norm 8.479547025398e+01 true resid norm >> 3.817072778396e+03 ||r(i)||/||b|| 2.958159865466e-02 >> 2 KSP Component U,V,P residual norm [ 2.026983725663e+03, >> 2.531521226429e+03, 3.419060873106e-12 ] >> 2 KSP unpreconditioned resid norm 3.243032954498e+03 true resid norm >> 3.243032954498e+03 ||r(i)||/||b|| 2.783452489493e-03 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 3.243032954498e+03 true resid norm >> 3.243032954498e+03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 1.170090628031e+02 true resid norm >> 1.170090628031e+02 ||r(i)||/||b|| 3.608013376517e-02 >> 2 KSP unpreconditioned resid norm 9.782830529900e+00 true resid norm >> 1.741722174777e+01 ||r(i)||/||b|| 5.370658267168e-03 >> 3 KSP unpreconditioned resid norm 6.886950142735e-01 true resid norm >> 1.636749336722e+01 ||r(i)||/||b|| 5.046971028932e-03 >> 3 KSP Component U,V,P residual norm [ 7.515013854917e+01, >> 7.515663601801e+01, 3.418919176066e-12 ] >> 3 KSP unpreconditioned resid norm 1.062829396540e+02 true resid norm >> 1.062829396540e+02 ||r(i)||/||b|| 9.122124786317e-05 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.062829396540e+02 true resid norm >> 1.062829396540e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.373578062042e+01 true resid norm >> 5.373578062042e+01 ||r(i)||/||b|| 5.055917797846e-01 >> 2 KSP unpreconditioned resid norm 1.199305097134e+00 true resid norm >> 3.492111756827e+01 ||r(i)||/||b|| 3.285674792393e-01 >> 3 KSP unpreconditioned resid norm 9.508597255523e-02 true resid norm >> 3.452079362567e+01 ||r(i)||/||b|| 3.248008922038e-01 >> 4 KSP Component U,V,P residual norm [ 7.495897679790e+01, >> 7.527868410560e+01, 3.418919160091e-12 ] >> 4 KSP unpreconditioned resid norm 1.062343093509e+02 true resid norm >> 1.062343093509e+02 ||r(i)||/||b|| 9.117950911420e-05 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.062343093509e+02 true resid norm >> 1.062343093509e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.419252803207e+01 true resid norm >> 5.419252803207e+01 ||r(i)||/||b|| 5.101226558840e-01 >> 2 KSP unpreconditioned resid norm 1.431134174522e+00 true resid norm >> 3.339055236737e+01 ||r(i)||/||b|| 3.143104386088e-01 >> 3 KSP unpreconditioned resid norm 9.760479467902e-02 true resid norm >> 3.304522520358e+01 ||r(i)||/||b|| 3.110598205561e-01 >> 5 KSP Component U,V,P residual norm [ 7.491128585963e+01, >> 7.523275560552e+01, 3.418919008441e-12 ] >> 5 KSP unpreconditioned resid norm 1.061681132221e+02 true resid norm >> 1.061681132221e+02 ||r(i)||/||b|| 9.112269384837e-05 >> Residual norms for stokes_fieldsplit_0_ solve. 
>> 0 KSP unpreconditioned resid norm 1.061681132221e+02 true resid norm >> 1.061681132221e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.343215079492e+01 true resid norm >> 5.343215079492e+01 ||r(i)||/||b|| 5.032787074508e-01 >> 2 KSP unpreconditioned resid norm 1.288069736759e+00 true resid norm >> 3.308925591301e+01 ||r(i)||/||b|| 3.116684935691e-01 >> 3 KSP unpreconditioned resid norm 9.505248953960e-02 true resid norm >> 3.281875055845e+01 ||r(i)||/||b|| 3.091205971589e-01 >> 6 KSP Component U,V,P residual norm [ 7.481188568118e+01, >> 7.527346267608e+01, 3.418918860626e-12 ] >> 6 KSP unpreconditioned resid norm 1.061268694649e+02 true resid norm >> 1.061268694649e+02 ||r(i)||/||b|| 9.108729487455e-05 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.061268694649e+02 true resid norm >> 1.061268694649e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.300383444945e+01 true resid norm >> 5.300383444945e+01 ||r(i)||/||b|| 4.994384053416e-01 >> 2 KSP unpreconditioned resid norm 1.118785004087e+00 true resid norm >> 3.282090953364e+01 ||r(i)||/||b|| 3.092610730828e-01 >> 3 KSP unpreconditioned resid norm 9.758015489979e-02 true resid norm >> 3.259718081014e+01 ||r(i)||/||b|| 3.071529479244e-01 >> 7 KSP Component U,V,P residual norm [ 7.475024970669e+01, >> 7.530858268154e+01, 3.418918784089e-12 ] >> 7 KSP unpreconditioned resid norm 1.061083524362e+02 true resid norm >> 1.061083524362e+02 ||r(i)||/||b|| 9.107140195255e-05 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.061083524362e+02 true resid norm >> 1.061083524362e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.296981668051e+01 true resid norm >> 5.296981668051e+01 ||r(i)||/||b|| 4.992049679820e-01 >> 2 KSP unpreconditioned resid norm 9.379451887610e-01 true resid norm >> 3.378466967056e+01 ||r(i)||/||b|| 3.183978348066e-01 >> 3 KSP unpreconditioned resid norm 9.102580142867e-02 true resid norm >> 3.360853440947e+01 ||r(i)||/||b|| 3.167378781957e-01 >> 8 KSP Component U,V,P residual norm [ 7.464535615814e+01, >> 7.537007679541e+01, 3.418918790515e-12 ] >> 8 KSP unpreconditioned resid norm 1.060781677449e+02 true resid norm >> 1.060781677449e+02 ||r(i)||/||b|| 9.104549482946e-05 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.060781677449e+02 true resid norm >> 1.060781677449e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.281972737642e+01 true resid norm >> 5.281972737642e+01 ||r(i)||/||b|| 4.979321240109e-01 >> 2 KSP unpreconditioned resid norm 9.224594814880e-01 true resid norm >> 3.351285171891e+01 ||r(i)||/||b|| 3.159260046751e-01 >> 3 KSP unpreconditioned resid norm 9.143100662935e-02 true resid norm >> 3.329269756083e+01 ||r(i)||/||b|| 3.138506091177e-01 >> 9 KSP Component U,V,P residual norm [ 7.451688471900e+01, >> 7.544516987344e+01, 3.418918860847e-12 ] >> 9 KSP unpreconditioned resid norm 1.060412172952e+02 true resid norm >> 1.060412172952e+02 ||r(i)||/||b|| 9.101378074496e-05 >> Residual norms for stokes_fieldsplit_0_ solve. 
>> 0 KSP unpreconditioned resid norm 1.060412172952e+02 true resid norm >> 1.060412172952e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.275132249899e+01 true resid norm >> 5.275132249899e+01 ||r(i)||/||b|| 4.974605520805e-01 >> 2 KSP unpreconditioned resid norm 7.755381284769e-01 true resid norm >> 3.453933285011e+01 ||r(i)||/||b|| 3.257161105001e-01 >> 3 KSP unpreconditioned resid norm 7.298768665179e-02 true resid norm >> 3.435179316160e+01 ||r(i)||/||b|| 3.239475558447e-01 >> 10 KSP Component U,V,P residual norm [ 7.451431102619e+01, >> 7.544762349626e+01, 3.418918857322e-12 ] >> 10 KSP unpreconditioned resid norm 1.060411544587e+02 true resid norm >> 1.060411544587e+02 ||r(i)||/||b|| 9.101372681321e-05 >> Residual norms for stokes_fieldsplit_0_ solve. >> 0 KSP unpreconditioned resid norm 1.060411544587e+02 true resid norm >> 1.060411544587e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.276103518337e+01 true resid norm >> 5.276103518337e+01 ||r(i)||/||b|| 4.975524403961e-01 >> 2 KSP unpreconditioned resid norm 7.777079890360e-01 true resid norm >> 3.454373663425e+01 ||r(i)||/||b|| 3.257578325186e-01 >> 3 KSP unpreconditioned resid norm 7.356028471071e-02 true resid norm >> 3.435584054266e+01 ||r(i)||/||b|| 3.239859158269e-01 >> 11 KSP Component U,V,P residual norm [ 7.438335197779e+01, >> 7.553731959735e+01, 3.418918856471e-12 ] >> >> >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno at psi.ch Tue Feb 28 01:47:28 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Tue, 28 Feb 2012 08:47:28 +0100 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> Message-ID: <4F4C8690.3050700@psi.ch> Dear all, Max may be correct, but I encounter the same problem as Aron. Neither PCILU nor PCICC work in parallel for me. Here is the message I get: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc ICC! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. Portion of the code which I use to set the solver contents follows (I am aiming at CG+ICC combination): /* Create KPS content */ KSPCreate(PETSC_COMM_WORLD, &ksp); KSPSetType(ksp ,KSPCG); /* Set operators */ KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); /* Linear solver defaults (can be ove-ridden) */ KSPGetPC(ksp, &pc); PCSetType(pc, PCICC); KSPSetTolerances(ksp, 1.e-5, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT); /* Run-time options (over-rides above) */ KSPSetFromOptions(ksp); What is going wrong here? Kind regards, Bojan On 2/26/2012 6:17 PM, Matthew Knepley wrote: > On Sun, Feb 26, 2012 at 10:48 AM, Max Rudolph > wrote: > > MPIAIJ and SEQQIJ matrices are subtypes of the AIJ matrix type. 
> Looking at that table, you should be able to use any of the PCs > that supports AIJ and has an X under 'parallel'. > > > Max is correct. For instance, the most popular general purpose > parallel solver is ASM (Additive Schwarz Method), which then > has a sequential subsolver for each block, which defaults to ILU. > > Matt > > Max > > > On Sun, Feb 26, 2012 at 8:16 AM, Aron Roland > wrote: > > Dear All, > > I hope somebody can help us on this or give at least some > clearance. > > We have just included PETSc as an solver for our sparse matrix > evolving from an unstructured mesh advection scheme. > > The problem is that we are using the mpiaij matrix type, since > our matrix is naturally sparse. However it seems that PETSc > has no PC for this, except the PCSOR, which showed to be not > very effective for our problem. > > All others give the error msg. of the mail subject, where XXX > are the different PC tried. > > The manual is a bit diffuse on this e.g. > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > > it is claimed that certain PC's are running on aij matrices > ... but these are to be defined either as seq. or parallel > (mpiaij) matrices. Moreover in the above mentioned list are > two columns parallel/seriel, what is the intention of parallel > capability when not applicable to matrices stored within the > parallel mpiaij framework. > > I guess we just not understanding the concept or have some > other difficulties of understanding of all this. > > Any comments help is welcome > > Aron > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Tue Feb 28 02:27:42 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Tue, 28 Feb 2012 09:27:42 +0100 Subject: [petsc-users] Configuring without debug Message-ID: <4F4C8FFE.5020401@psi.ch> Dear all, I am trying to configure PETSc with*out* debugging. I am issuing the following command: ./configure --with-mpi-dir=/opt/mpi/openmpi-1.2.6-gcc-4.1 --with-debugging=0 COPTFLAGS='-O3 -march=p4 -mtune=p4' FOPTFLAGS='-O3 -qarch=p4 -qtune=p4'. The configuration procedure starts well, saying: =============================================================================== WARNING! Compiling PETSc with no debugging, this should only be done for timing and production runs. All development should be done when configured using --with-debugging=1 =============================================================================== which is is fine, which is what I wanted, but it ends with the message: xxx=========================================================================xxx Configure stage complete. 
Now build PETSc libraries with (legacy build): make PETSC_DIR=/homecfd/niceno/PETSc-3.2-p6-O PETSC_ARCH=arch-linux2-c-debug all Judging from the proposed PETSC_ARCH, it seems that debugging got switched back on :-( Then I type make, and, after a while, get the message: Completed building libraries ========================================= Now to check if the libraries are working do: make PETSC_DIR=/homecfd/niceno/PETSc-3.2-p6-O PETSC_ARCH=arch-linux2-c-debug test ========================================= Is there a way to build PETSc with*out* debugging option? Kind regards, Bojan -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Tue Feb 28 02:56:08 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Tue, 28 Feb 2012 09:56:08 +0100 Subject: [petsc-users] Configuring without debug In-Reply-To: <4F4C8FFE.5020401@psi.ch> References: <4F4C8FFE.5020401@psi.ch> Message-ID: <4F4C96A8.4020508@psi.ch> Sorry for this post, I figured out myself what was I doing wrong. I was configuring for compilation without debugging, but my PETSC_ARCH variable was still set to arch-linux2-c-debug :-( I have the optimized version running as I write this message. Cheers, Bojan On 2/28/2012 9:27 AM, Bojan Niceno wrote: > Dear all, > > > I am trying to configure PETSc with*out* debugging. > > I am issuing the following command: > > ./configure --with-mpi-dir=/opt/mpi/openmpi-1.2.6-gcc-4.1 > --with-debugging=0 COPTFLAGS='-O3 -march=p4 -mtune=p4' FOPTFLAGS='-O3 > -qarch=p4 -qtune=p4'. > > The configuration procedure starts well, saying: > =============================================================================== > WARNING! Compiling PETSc with no debugging, this should > only be done for timing and production runs. All development should > be done when configured using > --with-debugging=1 > =============================================================================== > > > which is is fine, which is what I wanted, but it ends with the message: > > xxx=========================================================================xxx > Configure stage complete. Now build PETSc libraries with (legacy build): > make PETSC_DIR=/homecfd/niceno/PETSc-3.2-p6-O > PETSC_ARCH=arch-linux2-c-debug all > > Judging from the proposed PETSC_ARCH, it seems that debugging got > switched back on :-( > > Then I type make, and, after a while, get the message: > > Completed building libraries > ========================================= > Now to check if the libraries are working do: > make PETSC_DIR=/homecfd/niceno/PETSc-3.2-p6-O > PETSC_ARCH=arch-linux2-c-debug test > ========================================= > > Is there a way to build PETSc with*out* debugging option? > > > > Kind regards, > > > Bojan > > -- -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From roland at wb.tu-darmstadt.de Tue Feb 28 03:04:10 2012 From: roland at wb.tu-darmstadt.de (Aron Roland) Date: Tue, 28 Feb 2012 10:04:10 +0100 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! 
In-Reply-To: <4F4C8690.3050700@psi.ch> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> Message-ID: <4F4C988A.8080004@wb.tu-darmstadt.de> Hi Bojan, the PCILU package does not work with mpiaij matrices, same as PCICC. Basically only PCSOR works. You can install the hypre package then you can use the hypre solvers that also include an ILU PC names BILUT, however I did not had any success to achieve convergence, even if my SPARSKIT ILU-BCGSTAB converges very well. Hope this helped. Cheers Aron On 02/28/2012 08:47 AM, Bojan Niceno wrote: > Dear all, > > > Max may be correct, but I encounter the same problem as Aron. Neither > PCILU nor PCICC work in parallel for me. Here is the message I get: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc ICC! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 > 09:28:45 CST 2012 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > > Portion of the code which I use to set the solver contents follows (I > am aiming at CG+ICC combination): > > /* Create KPS content */ > KSPCreate(PETSC_COMM_WORLD, &ksp); > KSPSetType(ksp ,KSPCG); > > /* Set operators */ > KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); > > /* Linear solver defaults (can be ove-ridden) */ > KSPGetPC(ksp, &pc); > PCSetType(pc, PCICC); > KSPSetTolerances(ksp, 1.e-5, PETSC_DEFAULT, PETSC_DEFAULT, > PETSC_DEFAULT); > > /* Run-time options (over-rides above) */ > KSPSetFromOptions(ksp); > > What is going wrong here? > > > Kind regards, > > > Bojan > > > On 2/26/2012 6:17 PM, Matthew Knepley wrote: >> On Sun, Feb 26, 2012 at 10:48 AM, Max Rudolph > > wrote: >> >> MPIAIJ and SEQQIJ matrices are subtypes of the AIJ matrix type. >> Looking at that table, you should be able to use any of the PCs >> that supports AIJ and has an X under 'parallel'. >> >> >> Max is correct. For instance, the most popular general purpose >> parallel solver is ASM (Additive Schwarz Method), which then >> has a sequential subsolver for each block, which defaults to ILU. >> >> Matt >> >> Max >> >> >> On Sun, Feb 26, 2012 at 8:16 AM, Aron Roland > > wrote: >> >> Dear All, >> >> I hope somebody can help us on this or give at least some >> clearance. >> >> We have just included PETSc as an solver for our sparse >> matrix evolving from an unstructured mesh advection scheme. >> >> The problem is that we are using the mpiaij matrix type, >> since our matrix is naturally sparse. However it seems that >> PETSc has no PC for this, except the PCSOR, which showed to >> be not very effective for our problem. >> >> All others give the error msg. of the mail subject, where XXX >> are the different PC tried. >> >> The manual is a bit diffuse on this e.g. >> >> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >> >> it is claimed that certain PC's are running on aij matrices >> ... but these are to be defined either as seq. or parallel >> (mpiaij) matrices. 
Moreover in the above mentioned list are >> two columns parallel/seriel, what is the intention of >> parallel capability when not applicable to matrices stored >> within the parallel mpiaij framework. >> >> I guess we just not understanding the concept or have some >> other difficulties of understanding of all this. >> >> Any comments help is welcome >> >> Aron >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener > > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Tue Feb 28 03:19:55 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Tue, 28 Feb 2012 10:19:55 +0100 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: <4F4C988A.8080004@wb.tu-darmstadt.de> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> Message-ID: <4F4C9C3B.4010809@psi.ch> On 2/28/2012 10:04 AM, Aron Roland wrote: > Hi Bojan, > > the PCILU package does not work with mpiaij matrices, same as PCICC. > Basically only PCSOR works. Auch :-( When I saw your reply, I was hoping you will say: "I've found a resolution in the meantime", but you only confirmed my fears. Cheers Bojan > > You can install the hypre package then you can use the hypre solvers > that also include an ILU PC names BILUT, however I did not had any > success to achieve convergence, even if my SPARSKIT ILU-BCGSTAB > converges very well. > > Hope this helped. > > Cheers > > Aron > > > On 02/28/2012 08:47 AM, Bojan Niceno wrote: >> Dear all, >> >> >> Max may be correct, but I encounter the same problem as Aron. >> Neither PCILU nor PCICC work in parallel for me. Here is the message >> I get: >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: No support for this operation for this object type! >> [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc ICC! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 >> 09:28:45 CST 2012 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> >> Portion of the code which I use to set the solver contents follows (I >> am aiming at CG+ICC combination): >> >> /* Create KPS content */ >> KSPCreate(PETSC_COMM_WORLD, &ksp); >> KSPSetType(ksp ,KSPCG); >> >> /* Set operators */ >> KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); >> >> /* Linear solver defaults (can be ove-ridden) */ >> KSPGetPC(ksp, &pc); >> PCSetType(pc, PCICC); >> KSPSetTolerances(ksp, 1.e-5, PETSC_DEFAULT, PETSC_DEFAULT, >> PETSC_DEFAULT); >> >> /* Run-time options (over-rides above) */ >> KSPSetFromOptions(ksp); >> >> What is going wrong here? >> >> >> Kind regards, >> >> >> Bojan >> >> >> On 2/26/2012 6:17 PM, Matthew Knepley wrote: >>> On Sun, Feb 26, 2012 at 10:48 AM, Max Rudolph >> > wrote: >>> >>> MPIAIJ and SEQQIJ matrices are subtypes of the AIJ matrix type. 
>>> Looking at that table, you should be able to use any of the PCs >>> that supports AIJ and has an X under 'parallel'. >>> >>> >>> Max is correct. For instance, the most popular general purpose >>> parallel solver is ASM (Additive Schwarz Method), which then >>> has a sequential subsolver for each block, which defaults to ILU. >>> >>> Matt >>> >>> Max >>> >>> >>> On Sun, Feb 26, 2012 at 8:16 AM, Aron Roland >> > wrote: >>> >>> Dear All, >>> >>> I hope somebody can help us on this or give at least some >>> clearance. >>> >>> We have just included PETSc as an solver for our sparse >>> matrix evolving from an unstructured mesh advection scheme. >>> >>> The problem is that we are using the mpiaij matrix type, >>> since our matrix is naturally sparse. However it seems that >>> PETSc has no PC for this, except the PCSOR, which showed to >>> be not very effective for our problem. >>> >>> All others give the error msg. of the mail subject, where >>> XXX are the different PC tried. >>> >>> The manual is a bit diffuse on this e.g. >>> >>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>> >>> it is claimed that certain PC's are running on aij matrices >>> ... but these are to be defined either as seq. or parallel >>> (mpiaij) matrices. Moreover in the above mentioned list are >>> two columns parallel/seriel, what is the intention of >>> parallel capability when not applicable to matrices stored >>> within the parallel mpiaij framework. >>> >>> I guess we just not understanding the concept or have some >>> other difficulties of understanding of all this. >>> >>> Any comments help is welcome >>> >>> Aron >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >> >> >> -- > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Signature.png Type: image/png Size: 6515 bytes Desc: not available URL: From jedbrown at mcs.anl.gov Tue Feb 28 06:20:14 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Tue, 28 Feb 2012 06:20:14 -0600 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: <4F4C988A.8080004@wb.tu-darmstadt.de> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> Message-ID: On Tue, Feb 28, 2012 at 03:04, Aron Roland wrote: > the PCILU package does not work with mpiaij matrices, same as PCICC. > Basically only PCSOR works. > Hardly, there are lots of parallel preconditioners tabulated in the solver table. Since you don't seem to be clicking the links to man pages or looking at examples, here are some concrete examples. 
# Additive Schwarz with various subdomain solvers -pc_type asm -sub_pc_type icc -pc_type asm -sub_pc_type ilu -pc_type asm -sub_pc_type lu # Block Jacobi (equivalent to zero-overlap additive Schwarz) -pc_type bjacobi -sub_pc_type icc # Redundant direct solve (useful for coarse levels or for debugging) -pc_type redundant -redundant_pc_type lu # Parallel direct solve (with various packages) -pc_type lu -pc_factor_mat_solver_package mumps -pc_type lu -pc_factor_mat_solver_package superlu_dist -pc_type lu -pc_factor_mat_solver_package pastix # Parallel smoothed aggregation algebraic multigrid -pc_type gamg # requires petsc-dev -pc_type ml # --download-ml # Classical algebraic multigrid (from Hypre) -pc_type hypre -pc_hypre_type boomeramg # Parallel ILU -pc_type hypre -pc_hypre_type pilut # deprecated, but includes drop tolerance -pc_type hypre -pc_hypre_type euclid # supported, no drop tolerance # Sparse approximate inverse -pc_type spai -pc_type hypre -pc_hypre_type parasails # Field split with AMG inside (physics-blocked relaxation (Schwarz) or factorization (Schur)) -pc_type fieldsplit -fieldsplit_0_pc_type gamg -fieldsplit_1_pc_type pbjacobi # Factorization field split with automatic saddle point detection, precondition Schur complement with least squares commutator -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type schur -fieldsplit_0_pc_type gamg -fieldsplit_1_pc_type ls -fieldsplit_1_lsc_pc_type gamg and so on... -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 28 10:41:35 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Feb 2012 10:41:35 -0600 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: <4F4C9C3B.4010809@psi.ch> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> <4F4C9C3B.4010809@psi.ch> Message-ID: On Tue, Feb 28, 2012 at 3:19 AM, Bojan Niceno wrote: > On 2/28/2012 10:04 AM, Aron Roland wrote: > > Hi Bojan, > > the PCILU package does not work with mpiaij matrices, same as PCICC. > Basically only PCSOR works. > > > Auch :-( When I saw your reply, I was hoping you will say: "I've found a > resolution in the meantime", but you only confirmed my fears. > Look, I already replied to this, and now Jed had to reply again. If you are not going to read our mail, why mail the list? Matt > Cheers > > > Bojan > > > > > You can install the hypre package then you can use the hypre solvers that > also include an ILU PC names BILUT, however I did not had any success to > achieve convergence, even if my SPARSKIT ILU-BCGSTAB converges very well. > > Hope this helped. > > Cheers > > Aron > > > On 02/28/2012 08:47 AM, Bojan Niceno wrote: > > Dear all, > > > Max may be correct, but I encounter the same problem as Aron. Neither > PCILU nor PCICC work in parallel for me. Here is the message I get: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc ICC! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 > CST 2012 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. 
> [0]PETSC ERROR: See docs/index.html for manual pages. > > Portion of the code which I use to set the solver contents follows (I am > aiming at CG+ICC combination): > > /* Create KPS content */ > KSPCreate(PETSC_COMM_WORLD, &ksp); > KSPSetType(ksp ,KSPCG); > > /* Set operators */ > KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); > > /* Linear solver defaults (can be ove-ridden) */ > KSPGetPC(ksp, &pc); > PCSetType(pc, PCICC); > KSPSetTolerances(ksp, 1.e-5, PETSC_DEFAULT, PETSC_DEFAULT, > PETSC_DEFAULT); > > /* Run-time options (over-rides above) */ > KSPSetFromOptions(ksp); > > What is going wrong here? > > > Kind regards, > > > Bojan > > > On 2/26/2012 6:17 PM, Matthew Knepley wrote: > > On Sun, Feb 26, 2012 at 10:48 AM, Max Rudolph wrote: > >> MPIAIJ and SEQQIJ matrices are subtypes of the AIJ matrix type. Looking >> at that table, you should be able to use any of the PCs that supports AIJ >> and has an X under 'parallel'. > > > Max is correct. For instance, the most popular general purpose parallel > solver is ASM (Additive Schwarz Method), which then > has a sequential subsolver for each block, which defaults to ILU. > > Matt > > >> Max >> >> >> On Sun, Feb 26, 2012 at 8:16 AM, Aron Roland wrote: >> >>> Dear All, >>> >>> I hope somebody can help us on this or give at least some clearance. >>> >>> We have just included PETSc as an solver for our sparse matrix evolving >>> from an unstructured mesh advection scheme. >>> >>> The problem is that we are using the mpiaij matrix type, since our >>> matrix is naturally sparse. However it seems that PETSc has no PC for this, >>> except the PCSOR, which showed to be not very effective for our problem. >>> >>> All others give the error msg. of the mail subject, where XXX are the >>> different PC tried. >>> >>> The manual is a bit diffuse on this e.g. >>> >>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>> >>> it is claimed that certain PC's are running on aij matrices ... but >>> these are to be defined either as seq. or parallel (mpiaij) matrices. >>> Moreover in the above mentioned list are two columns parallel/seriel, what >>> is the intention of parallel capability when not applicable to matrices >>> stored within the parallel mpiaij framework. >>> >>> I guess we just not understanding the concept or have some other >>> difficulties of understanding of all this. >>> >>> Any comments help is welcome >>> >>> Aron >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > -- > > > > > -- > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 6515 bytes Desc: not available URL: From bojan.niceno at psi.ch Tue Feb 28 11:06:14 2012 From: bojan.niceno at psi.ch (Bojan Niceno) Date: Tue, 28 Feb 2012 18:06:14 +0100 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! 
In-Reply-To: References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> <4F4C9C3B.4010809@psi.ch> Message-ID: <4F4D0986.5080601@psi.ch> Hi all, On 2/28/2012 5:41 PM, Matthew Knepley wrote: > Look, I already replied to this, and now Jed had to reply again. If > you are not going to read our mail, why mail the list? I am reading your messages, all right, but I also read PETSc's errors and manuals. Look what the manual says on ICC: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCICC.html "Notes: Only implemented for some matrix formats. Not implemented in parallel." And on ILU: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCILU.html "Notes: Only implemented for some matrix formats. (for parallel see PCHYPRE for hypre's ILU)" So Matt, if I understand your answer correctly, one should use PCASM to get ILU in parallel, right? What if I want IC? Thanks, Bojan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 28 11:10:38 2012 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Feb 2012 11:10:38 -0600 Subject: [petsc-users] Matrix format mpiaij does not have a built-in PETSc XXX! In-Reply-To: <4F4D0986.5080601@psi.ch> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> <4F4C9C3B.4010809@psi.ch> <4F4D0986.5080601@psi.ch> Message-ID: On Tue, Feb 28, 2012 at 11:06 AM, Bojan Niceno wrote: > Hi all, > > On 2/28/2012 5:41 PM, Matthew Knepley wrote: > > Look, I already replied to this, and now Jed had to reply again. If you > are not going to read our mail, why mail the list? > > > I am reading your messages, all right, but I also read PETSc's errors and > manuals. > > Look what the manual says on ICC: > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCICC.html > > "Notes: Only implemented for some matrix formats. Not implemented in > parallel." > > > And on ILU: > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCILU.html > > "Notes: Only implemented for some matrix formats. (for parallel see > PCHYPREfor hypre's ILU)" > > > So Matt, if I understand your answer correctly, one should use PCASM to > get ILU in parallel, right? What if I want IC? > Or PCBJACOBI, or use Hypre for parallel ILU, or better yet do not use an unreliable preconditioner with poor scalability. How does this lead you to conclude that SOR is the only thing you can run in parallel? Matt > Thanks, > > > Bojan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanharen at nrg.eu Wed Feb 29 15:35:14 2012 From: vanharen at nrg.eu (Haren, S.W. van (Steven)) Date: Wed, 29 Feb 2012 22:35:14 +0100 Subject: [petsc-users] HDF5 installation problems References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> <4F4C9C3B.4010809@psi.ch> <4F4D0986.5080601@psi.ch> Message-ID: <1EC3BC0BF3DF4945832ECD4A46E7DE2F01A6D29A@NRGPEX.nrgnet.intra> Dear all, I try to configure Petsc to use HDF5. During the HDF5 compilation stage the memory use increases untill the computer basically stalls and almost all memory is used. 
After 2500s the compilation is stopped because of a runaway process. Has anybody encountered this before? How can I solve this? Thanks for your input! Kind regards, Steven From jedbrown at mcs.anl.gov Wed Feb 29 15:43:03 2012 From: jedbrown at mcs.anl.gov (Jed Brown) Date: Wed, 29 Feb 2012 15:43:03 -0600 Subject: [petsc-users] HDF5 installation problems In-Reply-To: <1EC3BC0BF3DF4945832ECD4A46E7DE2F01A6D29A@NRGPEX.nrgnet.intra> References: <4F4A5AC3.6030203@wb.tu-darmstadt.de> <4F4A5AF9.3060006@gmx.de> <4F4C8690.3050700@psi.ch> <4F4C988A.8080004@wb.tu-darmstadt.de> <4F4C9C3B.4010809@psi.ch> <4F4D0986.5080601@psi.ch> <1EC3BC0BF3DF4945832ECD4A46E7DE2F01A6D29A@NRGPEX.nrgnet.intra> Message-ID: On Wed, Feb 29, 2012 at 15:35, Haren, S.W. van (Steven) wrote: > Dear all, > > I try to configure PETSc to use HDF5. > > During the HDF5 compilation stage the memory use increases until the > computer basically stalls and almost all memory is used. After 2500s the > compilation is stopped because of a runaway process. > Check configure.log and/or attach a debugger to find out where it got stuck. At least use "top" to find out which process is taking all the time and memory.