From z.sheng at ewi.tudelft.nl Thu Nov 1 07:45:57 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 01 Nov 2007 13:45:57 +0100 Subject: memory usage of the unstructured grid In-Reply-To: <830561.29161.qm@web36201.mail.mud.yahoo.com> References: <830561.29161.qm@web36201.mail.mud.yahoo.com> Message-ID: <4729CA85.3070908@ewi.tudelft.nl> Shi Jin wrote: > Hi, > > I have observed that the memory usage of the petsc mesh is much higher > than my previous code, if both were to be run serially. > For example, for a simple cubic box with 750,000 tetrahedral elements, > my old code takes about 200MB for the whole array, including all the > mappings required for later use such as the inverse connectivity > table. For the same mesh, my PETSc code takes about 4GB for the mesh > alone. > > The same can be found in the provided examples. I made a few changes > to the navierStokes code to output the virtual memory usage and got > ./navierStokes -dim 3 -generate -structured 0 -refinement_limit 1e-6 > 109,283 elements, 139,030 edges , 21,523 vertexes > [0]:after mesh created:mem=574.46 MB > This is consistent with my Petsc code. > > I understand that for the mesh to scale in parallel, extra > information needs to be stored. But the current cost seems too > expensive. I am wondering if there is a way to cut the memory usage > for the mesh. > Thank you very much. > > Shi > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com Hi I am not expert on it... but I have used some FEM library beside mine, and I found out that some libraries create neighboring element, edge, and node list for every element. This is not necessary if you do use them. Similar thing could have happened in your case. Best Zhifeng From z.sheng at ewi.tudelft.nl Thu Nov 1 07:52:58 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 01 Nov 2007 13:52:58 +0100 Subject: Dynamic allocation of Non zeros? Message-ID: <4729CC2A.6010703@ewi.tudelft.nl> Dear all It is always not easy to determine the number of nonzeros in a row without any cost. I wonder if I can make dynamic allocation, for instance, if one found the nonzeros of a row overflow the pre-allocated space, then the space shall be doubled. This may waste some memory but it may not be that bad. Or maybe this is already done in Petsc? Could please tell what happens if I assign the nozero number in rows to be zeros and let the programme fills in the nonzeros ? Best regards Zhifeng Sheng From hzhang at mcs.anl.gov Thu Nov 1 09:32:31 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 1 Nov 2007 09:32:31 -0500 (CDT) Subject: Dynamic allocation of Non zeros? In-Reply-To: <4729CC2A.6010703@ewi.tudelft.nl> References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: On Thu, 1 Nov 2007, Zhifeng Sheng wrote: > Dear all > > It is always not easy to determine the number of nonzeros in a row without > any cost. > > I wonder if I can make dynamic allocation, for instance, if one found the > nonzeros of a row overflow the pre-allocated space, then the space shall be > doubled. > > This may waste some memory but it may not be that bad. Or maybe this is > already done in Petsc? This is implemented in petsc, and is the reasons that MatAssembly may become extremely slow when user does not preallocate memory. > > Could please tell what happens if I assign the nozero number in rows to be > zeros and let the programme fills in the nonzeros ? 
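For reference, the two usages being asked about look roughly like this. This is only a sketch against the 2.3.x C interface; the per-row bound of 30 and the helper name build_matrix are made-up placeholders, not recommendations.

    /* Sketch: preallocated vs. non-preallocated SeqAIJ creation. */
    #include "petscmat.h"

    PetscErrorCode build_matrix(PetscInt n, Mat *A)
    {
      PetscErrorCode ierr;
      PetscInt       i, *nnz;

      PetscFunctionBegin;
      /* Variant 1: no preallocation.  Rows that outgrow the reserved space
         force PETSc to enlarge the storage during MatSetValues(), which is
         the slowdown mentioned above.
         ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 0, PETSC_NULL, A);CHKERRQ(ierr); */

      /* Variant 2: pass an upper bound per row.  Some memory may be wasted,
         but no mallocs happen while the values are inserted. */
      ierr = PetscMalloc(n*sizeof(PetscInt), &nnz);CHKERRQ(ierr);
      for (i = 0; i < n; i++) nnz[i] = 30;            /* placeholder upper bound */
      ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 0, nnz, A);CHKERRQ(ierr);
      ierr = PetscFree(nnz);CHKERRQ(ierr);

      /* ... MatSetValues() loop, MatAssemblyBegin()/MatAssemblyEnd() ... */
      PetscFunctionReturn(0);
    }
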
It would be better that you assign a max nozero number than zero to reduce memory malloc. Hong > > Best regards > Zhifeng Sheng > > From bsmith at mcs.anl.gov Thu Nov 1 16:08:25 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Nov 2007 16:08:25 -0500 Subject: Dynamic allocation of Non zeros? In-Reply-To: <4729CC2A.6010703@ewi.tudelft.nl> References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: Zhifeng, We already do this; it helps with the time, but still results in very inefficient runs if your preallocation is zero entries. Perhaps you could describe how your matrix entries are generated and we can suggest a preallocation scheme? Barry On Nov 1, 2007, at 7:52 AM, Zhifeng Sheng wrote: > Dear all > > It is always not easy to determine the number of nonzeros in a row > without any cost. > > I wonder if I can make dynamic allocation, for instance, if one > found the nonzeros of a row overflow the pre-allocated space, then > the space shall be doubled. > > This may waste some memory but it may not be that bad. Or maybe this > is already done in Petsc? > > Could please tell what happens if I assign the nozero number in rows > to be zeros and let the programme fills in the nonzeros ? > > Best regards > Zhifeng Sheng > From berend at chalmers.se Thu Nov 1 16:17:02 2007 From: berend at chalmers.se (Berend van Wachem) Date: Thu, 01 Nov 2007 22:17:02 +0100 Subject: Dynamic allocation of Non zeros? In-Reply-To: References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: <472A424E.1000608@chalmers.se> Hi Barry, > Perhaps you could describe how your matrix entries are generated and > we can suggest a > preallocation scheme? Are there examples of preallocation schemes for PETSc? For instance, if I would like to solve Poisson equation on an irregular grid (9 point stencil in 2d and 27 in 3d) on n processors? Thanks, Berend. From bsmith at mcs.anl.gov Thu Nov 1 16:25:15 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Nov 2007 16:25:15 -0500 Subject: Dynamic allocation of Non zeros? In-Reply-To: <472A424E.1000608@chalmers.se> References: <4729CC2A.6010703@ewi.tudelft.nl> <472A424E.1000608@chalmers.se> Message-ID: <5028F943-1EF9-4378-9F3D-64A3D3910789@mcs.anl.gov> On Nov 1, 2007, at 4:17 PM, Berend van Wachem wrote: > Hi Barry, > > >> Perhaps you could describe how your matrix entries are generated >> and we can suggest a >> preallocation scheme? > > Are there examples of preallocation schemes for PETSc? For instance, > if I would like to solve Poisson equation on an irregular grid (9 > point stencil in 2d and 27 in 3d) on n processors? This would depend on how you store your grid information; is it treated as unstructured? For structured or semi-structured take a look at DAGetMatrix2d_MPIAIJ() in src/dm/da/utils/fdda.c Barry > > > Thanks, > Berend. > From knepley at gmail.com Thu Nov 1 18:05:07 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 1 Nov 2007 18:05:07 -0500 Subject: Dynamic allocation of Non zeros? In-Reply-To: <472A424E.1000608@chalmers.se> References: <4729CC2A.6010703@ewi.tudelft.nl> <472A424E.1000608@chalmers.se> Message-ID: The short answer is that preallocation looks exactly like an assembly, but individual values are not calculated, just sizes. You can look at the routines I have for this for the Mesh class in preallocateOperator() in src/dm/mesh/mesh.c. Matt On Nov 1, 2007 4:17 PM, Berend van Wachem wrote: > Hi Barry, > > > > Perhaps you could describe how your matrix entries are generated and > > we can suggest a > > preallocation scheme? 
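The "preallocation looks exactly like an assembly" pattern can be sketched as a counting pass over the same element loop used for assembly. Everything named here (nelem, nen, the connectivity array elem) is a hypothetical stand-in for whatever the application actually stores.

    /* Counting pass: same loop structure as the assembly, but only row
       counts are accumulated.  Duplicates are counted, so the result is
       an upper bound, which is all preallocation needs. */
    #include "petscmat.h"

    PetscErrorCode preallocate_from_elements(PetscInt nrows, PetscInt nelem,
                                             PetscInt nen, const PetscInt *elem,
                                             Mat *A)
    {
      PetscErrorCode ierr;
      PetscInt       e, a, b, i, *nnz;

      PetscFunctionBegin;
      ierr = PetscMalloc(nrows*sizeof(PetscInt), &nnz);CHKERRQ(ierr);
      for (i = 0; i < nrows; i++) nnz[i] = 0;
      for (e = 0; e < nelem; e++)
        for (a = 0; a < nen; a++)
          for (b = 0; b < nen; b++)
            nnz[elem[e*nen + a]]++;
      ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, nrows, nrows, 0, nnz, A);CHKERRQ(ierr);
      ierr = PetscFree(nnz);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

Counting duplicates this way only overestimates the row lengths, which is harmless for preallocation purposes.
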
> > Are there examples of preallocation schemes for PETSc? For instance, if > I would like to solve Poisson equation on an irregular grid (9 point > stencil in 2d and 27 in 3d) on n processors? > > Thanks, > Berend. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From z.sheng at ewi.tudelft.nl Fri Nov 2 07:23:40 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 13:23:40 +0100 Subject: Dynamic allocation of Non zeros? In-Reply-To: References: <4729CC2A.6010703@ewi.tudelft.nl> Message-ID: <472B16CC.7030906@ewi.tudelft.nl> Barry Smith wrote: > > > Zhifeng, > > We already do this; it helps with the time, but still results in very > inefficient runs if your preallocation is zero entries. > > Perhaps you could describe how your matrix entries are generated and > we can suggest a > preallocation scheme? > > Barry > > > On Nov 1, 2007, at 7:52 AM, Zhifeng Sheng wrote: > >> Dear all >> >> It is always not easy to determine the number of nonzeros in a row >> without any cost. >> >> I wonder if I can make dynamic allocation, for instance, if one found >> the nonzeros of a row overflow the pre-allocated space, then the >> space shall be doubled. >> >> This may waste some memory but it may not be that bad. Or maybe this >> is already done in Petsc? >> >> Could please tell what happens if I assign the nozero number in rows >> to be zeros and let the programme fills in the nonzeros ? >> >> Best regards >> Zhifeng Sheng >> > In that case I'd better do some preallocation. I am working on 3D FEM code with unstructured tetrahedron mesh, and on each node, we can have 3 to arround 12 unknowns, at this moment I assigned 50 nonzeros to each rows, and it is sufficient... just half of the memory assigned went wasted. Then I tried 25 nonzeros in a row, but this is not enough and the performance is terrible. So... do I really need to compute the nonzero pattern? Ps. my matrix is SeqSBAIJ with block size 1, so I am wondering when I specify the nonzeros in a row, does it mean the actually nonzeros numbers or the memory that is needed? (for instance, for SeqSBAIJ, the actual nonzeros in a row would twice as much as memory needed) Thank you Best regards Zhifeng From z.sheng at ewi.tudelft.nl Fri Nov 2 07:54:28 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 13:54:28 +0100 Subject: not enough memory?! Message-ID: <472B1E04.1090906@ewi.tudelft.nl> Dear all I have a problem for memory management. I implemented 3d FEM code and at this moment SeqSBAIJ is used to store the system matrix, then I used CG method and ICC(k) preconditioner. The test configuration is not very big, I succeeded in constructing the system matrix and construct the preconditioner. But when I need to setup the KSP solver, I got the error message below ( I also dumped some information about solver and system matrix): KSP Object: type: cg maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: icc ICC: 2 levels of fill ICC: factor fill ratio allocated 1 linear system matrix = precond matrix: Matrix Object: type=seqsbaij, rows=435450, cols=435450 total: nonzeros=9470450, allocated nonzeros=21772580 block size is 1 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. 
This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 375435120 Memory used by process 428597248 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. [0]PETSC ERROR: Memory requested 3108507040! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri Nov 2 13:26:39 2007 [0]PETSC ERROR: Libraries linked from /u/01/01/zhifeng/install/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --download-f-blas-lapack=1 --download-mpich=1 --prefix=/u/01/01/zhifeng/install --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/freespace.c [0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in src/mat/impls/sbaij/seq/sbaijfact2.c [0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/icc/icc.c [0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c =====================the solver convergence info====================== Convergence in 0 iterations. ===================================================================== writing solution to file temp_H.vtk ... number of unknowns >>435450 finishing the numeric solver ... LSFIM hybrid method for four domain problem deallocating memory of domain class ... I debug the code, and the error is dumped by the funciton KSPSetUp() which should not take so much memory, but still 3108507040 memory was requested .... I really could not figure out where I need them... Any help will be appreciated. Thank you all Best regards Zhifeng From z.sheng at ewi.tudelft.nl Fri Nov 2 07:56:22 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 13:56:22 +0100 Subject: PCFactorSetLevels does not work?! In-Reply-To: References: <925346A443D4E340BEB20248BAFCDBDF02D8501F@CFEVS1-IP.americas.cray.com> <4728AC10.4040500@ewi.tudelft.nl> Message-ID: <472B1E76.1010200@ewi.tudelft.nl> Barry Smith wrote: > Are you calling the PCFactorSetLevels() on the prec where the >ICC is being used. For example if your * below means >-sub_pc_factor_levels then you need to use >PCBJacobiGetSubKSP() then get the PC out of that that subksp >and call the PCFactorSetLevels() on that > > Barry > > >On Wed, 31 Oct 2007, Zhifeng Sheng wrote: > > > >>Dear all >> >>I have a weird problem. >> >>In my programme, I try to use ICC(K), while I set the factor level with >> >>PCFactorSetLevels(prec, 4); >> >>However this does not work. 
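A sketch of the block-Jacobi case Barry describes, assuming ksp already exists and its outer preconditioner is PCBJACOBI; the level count 4 just mirrors the value in the question.

    /* Set ICC(4) on each sub-KSP's PC rather than on the outer PC. */
    KSP            *subksp;
    PC             pc, subpc;
    PetscInt       nlocal, first, i;
    PetscErrorCode ierr;

    ierr = KSPSetUp(ksp);CHKERRQ(ierr);         /* sub-KSPs exist only after setup */
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCBJacobiGetSubKSP(pc, &nlocal, &first, &subksp);CHKERRQ(ierr);
    for (i = 0; i < nlocal; i++) {
      ierr = KSPGetPC(subksp[i], &subpc);CHKERRQ(ierr);
      ierr = PCSetType(subpc, PCICC);CHKERRQ(ierr);
      ierr = PCFactorSetLevels(subpc, 4);CHKERRQ(ierr);
    }

For a purely sequential run with ICC as the top-level preconditioner, the same PCFactorSetLevels() call goes on the PC obtained directly from KSPGetPC(), before the solve.
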
>> >>Then I run the executable with the option *-pc_factor_levels 4, it works... >>Can someone tell me why this is happenning? >> >>Thank you >> >>Best regards >>Zhifeng Sheng >>* >> >> >> >> > > > Dear all I figured out the reason .... DO NOT use KSPSetUP() before you configurate you solver and preconditioners, otherwise, your configuration shall be ignored. Best regards Zhifeng From knepley at gmail.com Fri Nov 2 08:25:59 2007 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Nov 2007 08:25:59 -0500 Subject: not enough memory?! In-Reply-To: <472B1E04.1090906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> Message-ID: The routine which calculates ICC requested 3G, so it seems that the matrix is too dense or large to be factored on this machine. Matt On Nov 2, 2007 7:54 AM, Zhifeng Sheng wrote: > Dear all > > I have a problem for memory management. > > I implemented 3d FEM code and at this moment SeqSBAIJ is used to store > the system matrix, then I used CG method and ICC(k) preconditioner. > > The test configuration is not very big, I succeeded in constructing the > system matrix and construct the preconditioner. But when I need to setup > the KSP solver, I got the error message below ( I also dumped some > information about solver and system matrix): > > KSP Object: > type: cg > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: icc > ICC: 2 levels of fill > ICC: factor fill ratio allocated 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqsbaij, rows=435450, cols=435450 > total: nonzeros=9470450, allocated nonzeros=21772580 > block size is 1 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 375435120 Memory used by process 428597248 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 3108507040! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 > 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri > Nov 2 13:26:39 2007 > [0]PETSC ERROR: Libraries linked from > /u/01/01/zhifeng/install/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 > --download-f-blas-lapack=1 --download-mpich=1 > --prefix=/u/01/01/zhifeng/install --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/freespace.c > [0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in > src/mat/impls/sbaij/seq/sbaijfact2.c > [0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/icc/icc.c > [0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c > =====================the solver convergence info====================== > Convergence in 0 iterations. > ===================================================================== > writing solution to file temp_H.vtk ... > number of unknowns >>435450 > finishing the numeric solver ... > LSFIM hybrid method for four domain problem > deallocating memory of domain class ... > > > I debug the code, and the error is dumped by the funciton KSPSetUp() > which should not take so much memory, but still 3108507040 memory was > requested .... I really could not figure out where I need them... > > Any help will be appreciated. > > Thank you all > Best regards > Zhifeng > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From z.sheng at ewi.tudelft.nl Fri Nov 2 09:07:09 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 15:07:09 +0100 Subject: not enough memory?! In-Reply-To: References: <472B1E04.1090906@ewi.tudelft.nl> Message-ID: <472B2F0D.4010906@ewi.tudelft.nl> Matthew Knepley wrote: >The routine which calculates ICC requested 3G, so it seems that the matrix >is too dense or large to be factored on this machine. > > Matt > >On Nov 2, 2007 7:54 AM, Zhifeng Sheng wrote: > > >>Dear all >> >>I have a problem for memory management. >> >>I implemented 3d FEM code and at this moment SeqSBAIJ is used to store >>the system matrix, then I used CG method and ICC(k) preconditioner. >> >>The test configuration is not very big, I succeeded in constructing the >>system matrix and construct the preconditioner. 
But when I need to setup >>the KSP solver, I got the error message below ( I also dumped some >>information about solver and system matrix): >> >>KSP Object: >> type: cg >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000 >> left preconditioning >>PC Object: >> type: icc >> ICC: 2 levels of fill >> ICC: factor fill ratio allocated 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqsbaij, rows=435450, cols=435450 >> total: nonzeros=9470450, allocated nonzeros=21772580 >> block size is 1 >>[0]PETSC ERROR: --------------------- Error Message >>------------------------------------ >>[0]PETSC ERROR: Out of memory. This could be due to allocating >>[0]PETSC ERROR: too large an object or bleeding by not properly >>[0]PETSC ERROR: destroying unneeded objects. >>[0]PETSC ERROR: Memory allocated 375435120 Memory used by process 428597248 >>[0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. >>[0]PETSC ERROR: Memory requested 3108507040! >>[0]PETSC ERROR: >>------------------------------------------------------------------------ >>[0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 >>19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 >>[0]PETSC ERROR: See docs/changes/index.html for recent updates. >>[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>[0]PETSC ERROR: See docs/index.html for manual pages. >>[0]PETSC ERROR: >>------------------------------------------------------------------------ >>[0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri >>Nov 2 13:26:39 2007 >>[0]PETSC ERROR: Libraries linked from >>/u/01/01/zhifeng/install/lib/linux-gnu-c-debug >>[0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 >>[0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 >>--download-f-blas-lapack=1 --download-mpich=1 >>--prefix=/u/01/01/zhifeng/install --with-shared=0 >>[0]PETSC ERROR: >>------------------------------------------------------------------------ >>[0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c >>[0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c >>[0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/freespace.c >>[0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in >>src/mat/impls/sbaij/seq/sbaijfact2.c >>[0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in >>src/mat/interface/matrix.c >>[0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/icc/icc.c >>[0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c >>[0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c >>[0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c >>=====================the solver convergence info====================== >>Convergence in 0 iterations. >>===================================================================== >>writing solution to file temp_H.vtk ... >>number of unknowns >>435450 >>finishing the numeric solver ... >>LSFIM hybrid method for four domain problem >>deallocating memory of domain class ... >> >> >>I debug the code, and the error is dumped by the funciton KSPSetUp() >>which should not take so much memory, but still 3108507040 memory was >>requested .... I really could not figure out where I need them... >> >>Any help will be appreciated. >> >>Thank you all >>Best regards >>Zhifeng >> >> >> >> > > > > > So ... 
computation of the preconditioner is actually called by KSPSetUp() instead of PCGetFactoredMatrix(prec,&M)? Could you please tell me when the system matrix be factorized (incompletely)? thank you. Zhifeng From hzhang at mcs.anl.gov Fri Nov 2 09:22:36 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 2 Nov 2007 09:22:36 -0500 (CDT) Subject: not enough memory?! In-Reply-To: <472B2F0D.4010906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> <472B2F0D.4010906@ewi.tudelft.nl> Message-ID: k>>> >>> >> >> >> >> > So ... computation of the preconditioner is actually called by KSPSetUp() > instead of PCGetFactoredMatrix(prec,&M)? Yes. > > Could you please tell me when the system matrix be factorized (incompletely)? > thank you. During a call of KSPSetUp(). Hong > > Zhifeng > > From z.sheng at ewi.tudelft.nl Fri Nov 2 10:04:48 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Fri, 02 Nov 2007 16:04:48 +0100 Subject: not enough memory?! In-Reply-To: References: <472B1E04.1090906@ewi.tudelft.nl> <472B2F0D.4010906@ewi.tudelft.nl> Message-ID: <472B3C90.2020906@ewi.tudelft.nl> Hong Zhang wrote: > > k>>> > >>>> >>> >>> >>> >>> >> So ... computation of the preconditioner is actually called by >> KSPSetUp() instead of PCGetFactoredMatrix(prec,&M)? > > > Yes. > >> >> Could you please tell me when the system matrix be factorized >> (incompletely)? thank you. > > > During a call of KSPSetUp(). > > Hong > >> >> Zhifeng >> >> > Thank you for your reply, since the problem I have is ICC took too much memory, but I think I can fix it with reorderring the matrix. My matrix is SeqSBAIJ and ICC is used as preconditioner, I think the default orderring is "natural", while I think nested-dissection is what I need. However, when I choose nested dissection with - *pc_factor_mat_ordering_type nd, this is what comes out: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] SPARSEPACKfndsep line 50 src/mat/order/fndsep.c [0]PETSC ERROR: [0] SPARSEPACKgennd line 70 src/mat/order/gennd.c [0]PETSC ERROR: [0] MatOrdering_ND line 18 src/mat/order/spnd.c [0]PETSC ERROR: [0] MatGetOrdering line 187 src/mat/order/sorder.c [0]PETSC ERROR: [0] PCSetup_ICC line 113 src/ksp/pc/impls/factor/icc/icc.c [0]PETSC ERROR: [0] PCSetUp line 778 src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSPSolve line 305 src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 [0]PETSC ERROR: See docs/changes/index.html for recent updates. 
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri Nov 2 16:00:58 2007 [0]PETSC ERROR: Libraries linked from /u/01/01/zhifeng/install/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 --download-f-blas-lapack=1 --download-mpich=1 --prefix=/u/01/01/zhifeng/install --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 PS: which reorderring schemas are symmetric? Thanks alot Best regards Zhifeng Sheng * From jens.madsen at risoe.dk Fri Nov 2 10:37:27 2007 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Fri, 02 Nov 2007 16:37:27 +0100 Subject: Larger stencils in DA Message-ID: Hi Do you have examples where stencils are wider than the DA types (box and star)? I need a ghost region twice the normal size for a fourth order derivative. Are there any plans of introducing a stencil width into DA? Kind Regards Jens MAdsen -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Nov 2 10:47:47 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 2 Nov 2007 10:47:47 -0500 (CDT) Subject: not enough memory?! In-Reply-To: <472B3C90.2020906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> <472B2F0D.4010906@ewi.tudelft.nl> <472B3C90.2020906@ewi.tudelft.nl> Message-ID: >>> >> > Thank you for your reply, since the problem I have is ICC took too much > memory, but I think I can fix it with reorderring the matrix. > > My matrix is SeqSBAIJ and ICC is used as preconditioner, I think the default > orderring is "natural", while I think nested-dissection is what I need. > > However, when I choose nested dissection with - *pc_factor_mat_ordering_type > nd, this is what comes out: We do not support Matrix reordering for sbaij matrix format because reordering sbaij matrix changes matrix data structure - very expensive. You can use aij matrix, which enables efficient matrix reordering. We support icc for aij matrix. Note: using aij matrix should not increase much of memory, only additional half of original sparse matrix entries are added. All matrix operations for aij matrix can only be faster than sbaij's. Hong > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find > memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] SPARSEPACKfndsep line 50 src/mat/order/fndsep.c > [0]PETSC ERROR: [0] SPARSEPACKgennd line 70 src/mat/order/gennd.c > [0]PETSC ERROR: [0] MatOrdering_ND line 18 src/mat/order/spnd.c > [0]PETSC ERROR: [0] MatGetOrdering line 187 src/mat/order/sorder.c > [0]PETSC ERROR: [0] PCSetup_ICC line 113 src/ksp/pc/impls/factor/icc/icc.c > [0]PETSC ERROR: [0] PCSetUp line 778 src/ksp/pc/interface/precon.c > [0]PETSC ERROR: [0] KSPSolve line 305 src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 > CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng Fri Nov 2 > 16:00:58 2007 > [0]PETSC ERROR: Libraries linked from > /u/01/01/zhifeng/install/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 > --download-f-blas-lapack=1 --download-mpich=1 > --prefix=/u/01/01/zhifeng/install --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown > file > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > PS: which reorderring schemas are symmetric? > > Thanks alot > > Best regards > Zhifeng Sheng > * > > > > > > From bsmith at mcs.anl.gov Fri Nov 2 10:53:52 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 Nov 2007 10:53:52 -0500 Subject: Larger stencils in DA In-Reply-To: References: Message-ID: Jens, This is the argument called s in the calls to DACreateXXX() I hope this serves your needs. Barry Note: with a wider stencil we still only support box and star so you may have some points that the DA thinks are in your stencil but that you do not need to use. This is harmless. On Nov 2, 2007, at 10:37 AM, jens.madsen at risoe.dk wrote: > Hi > > Do you have examples where stencils are wider than the DA types (box > and star)? I need a ghost region twice the normal size for a fourth > order derivative. > > Are there any plans of introducing a stencil width into DA? > > Kind Regards > > Jens MAdsen -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbostandoust at yahoo.com Fri Nov 2 10:50:12 2007 From: mbostandoust at yahoo.com (Mehdi Bostandoost) Date: Fri, 2 Nov 2007 08:50:12 -0700 (PDT) Subject: Larger stencils in DA In-Reply-To: Message-ID: <930783.14701.qm@web33507.mail.mud.yahoo.com> hi box or star are the type of overlapping. if you need more ghost points,you can specify it with parameter s in DACreate. mehdi jens.madsen at risoe.dk wrote: Hi Do you have examples where stencils are wider than the DA types (box and star)? I need a ghost region twice the normal size for a fourth order derivative. Are there any plans of introducing a stencil width into DA? Kind Regards Jens MAdsen __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! 
Mail has the best spam protection around http://mail.yahoo.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Nov 2 11:01:18 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 Nov 2007 11:01:18 -0500 Subject: not enough memory?! In-Reply-To: <472B1E04.1090906@ewi.tudelft.nl> References: <472B1E04.1090906@ewi.tudelft.nl> Message-ID: <94D44493-F62A-48CD-AC21-FFE6B3A14631@mcs.anl.gov> In the output: ICC: factor fill ratio allocated 1 but ICC: 2 levels of fill You should use the option -pc_factor_fill (or -sub_pc_factor_fill if using block Jacobi or ASM in parallel) and set to be the expected ratio of nonzeros in the factored matrix and the unfactored matrix; I suggest starting with at least 3 since you are using 2 levels of fill. See the manual page for PCFactorSetFill() Barry On Nov 2, 2007, at 7:54 AM, Zhifeng Sheng wrote: > Dear all > > I have a problem for memory management. > > I implemented 3d FEM code and at this moment SeqSBAIJ is used to > store the system matrix, then I used CG method and ICC(k) > preconditioner. > > The test configuration is not very big, I succeeded in constructing > the system matrix and construct the preconditioner. But when I need > to setup the KSP solver, I got the error message below ( I also > dumped some information about solver and system matrix): > > KSP Object: > type: cg > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: icc > ICC: 2 levels of fill > ICC: factor fill ratio allocated 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqsbaij, rows=435450, cols=435450 > total: nonzeros=9470450, allocated nonzeros=21772580 > block size is 1 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 375435120 Memory used by process > 428597248 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 3108507040! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 > 19:13:22 CDT 2007 HG revision: > d7298c71db7f5e767f359ae35d33cab3bed44428 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: bin/main on a linux-gnu named callisto by zhifeng > Fri Nov 2 13:26:39 2007 > [0]PETSC ERROR: Libraries linked from /u/01/01/zhifeng/install/lib/ > linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Wed Aug 8 13:46:26 2007 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=g77 -- > download-f-blas-lapack=1 --download-mpich=1 --prefix=/u/01/01/ > zhifeng/install --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/ > mtr.c > [0]PETSC ERROR: PetscFreeSpaceGet() line 14 in src/mat/utils/ > freespace.c > [0]PETSC ERROR: MatICCFactorSymbolic_SeqSBAIJ() line 1648 in src/mat/ > impls/sbaij/seq/sbaijfact2.c > [0]PETSC ERROR: MatICCFactorSymbolic() line 4611 in src/mat/ > interface/matrix.c > [0]PETSC ERROR: PCSetup_ICC() line 117 in src/ksp/pc/impls/factor/ > icc/icc.c > [0]PETSC ERROR: PCSetUp() line 801 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 338 in src/ksp/ksp/interface/itfunc.c > =====================the solver convergence info====================== > Convergence in 0 iterations. > ===================================================================== > writing solution to file temp_H.vtk ... > number of unknowns >>435450 > finishing the numeric solver ... > LSFIM hybrid method for four domain problem > deallocating memory of domain class ... > > > I debug the code, and the error is dumped by the funciton KSPSetUp() > which should not take so much memory, but still 3108507040 memory > was requested .... I really could not figure out where I need them... > > Any help will be appreciated. > > Thank you all > Best regards > Zhifeng > From z.sheng at ewi.tudelft.nl Tue Nov 6 03:29:04 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Tue, 06 Nov 2007 10:29:04 +0100 Subject: reordering does not work for ICC? Message-ID: <473033E0.6020802@ewi.tudelft.nl> Dear all I tried to reorder my the preconditioner with different reordering schema. All worked well for ILU, but does not make any difference on ICC. It seems that the reordering schema does not work for ICC at all.... Is it supposed to be like this? PS: I have a symmetric matrix and I would like to save some memory. I used SBAIJ with block=1, but some told me it's not efficient ... So... what can I do to save some memory on matrix and preconditioner? Thank you Best regards Zhifeng Sheng From bsmith at mcs.anl.gov Tue Nov 6 08:26:23 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Nov 2007 08:26:23 -0600 Subject: reordering does not work for ICC? In-Reply-To: <473033E0.6020802@ewi.tudelft.nl> References: <473033E0.6020802@ewi.tudelft.nl> Message-ID: <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> On Nov 6, 2007, at 3:29 AM, Zhifeng Sheng wrote: > Dear all > > I tried to reorder my the preconditioner with different reordering > schema. > > All worked well for ILU, but does not make any difference on ICC. It > seems that the reordering schema does not work for ICC at all.... > > Is it supposed to be like this? Yes, with sbaij only the upper triangular part of the matrix is stored; hence reordering doesn't make sense since the values needed in the reordered form are not available. 
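For the symmetric problem in this thread, the AIJ-based alternative looks roughly like the sketch below; A, b and x are assumed to exist already, and the level count, fill ratio and run-time ordering are example values echoing the suggestions earlier in the thread, not tuned recommendations.

    /* CG + ICC(k) on an AIJ matrix, so that a fill-reducing ordering can
       be applied during the symbolic factorization. */
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    ierr = KSPCreate(PETSC_COMM_SELF, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCICC);CHKERRQ(ierr);
    ierr = PCFactorSetLevels(pc, 2);CHKERRQ(ierr);     /* example value             */
    ierr = PCFactorSetFill(pc, 3.0);CHKERRQ(ierr);     /* >= 3 suggested for ICC(2) */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);       /* e.g. -pc_factor_mat_ordering_type rcm */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
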
You can use the SeqAIJ format if you want to do reorderings with the ICC. Barry > > > PS: I have a symmetric matrix and I would like to save some memory. > I used SBAIJ with block=1, but some told me it's not efficient ... SBAIJ with block=1 is just as efficient as AIJ! The is seperate code for each block size. Barry > > So... what can I do to save some memory on matrix and preconditioner? > > > Thank you > Best regards > Zhifeng Sheng > From knepley at gmail.com Tue Nov 6 08:47:23 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Nov 2007 08:47:23 -0600 Subject: Larger stencils in DA In-Reply-To: References: Message-ID: On Oct 31, 2007 8:38 AM, wrote: > > > > > Hi > > > > Do you have examples where stencils not fitting the DA types (box and star) > are being used? We have extensions for multicomponent problems, like DASetBlockFills(). > Are there any plans of introducing a stencil width into DA? We have one. It is an argument to DACreate2D(). Matt > > Kind Regards > > > > Jens MAdsen -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From jens.madsen at risoe.dk Tue Nov 6 09:03:34 2007 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Tue, 06 Nov 2007 16:03:34 +0100 Subject: Larger stencils in DA In-Reply-To: References: Message-ID: Hi Thank you for all your answer to my up to date most stupid question ever :-) I somehow missed the "s" parameter in the documentation. Sorry Jens Madsen Ph.d.-studerende Phone direct +45 4677 4560 Mobile jens.madsen at risoe.dk Optics and Plasma Research Department Ris? National Laboratory Technical University of Denmark - DTU Building 128, P.O. Box 49 DK-4000 Roskilde, Denmark Tel +45 4677 4500 Fax +45 4677 4565 www.risoe.dk >From 1 January 2007, Ris? National Laboratory, the Danish Institute for Food and Veterinary Research, the Danish Institute for Fisheries Research, the Danish National Space Center and the Danish Transport Research Institute have been merged with the Technical University of Denmark (DTU) with DTU as the continuing unit. -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Tuesday, November 06, 2007 3:47 PM To: petsc-users at mcs.anl.gov Cc: petsc-maint Subject: Re: Larger stencils in DA On Oct 31, 2007 8:38 AM, wrote: > > > > > Hi > > > > Do you have examples where stencils not fitting the DA types (box and star) > are being used? We have extensions for multicomponent problems, like DASetBlockFills(). > Are there any plans of introducing a stencil width into DA? We have one. It is an argument to DACreate2D(). Matt > > Kind Regards > > > > Jens MAdsen -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From z.sheng at ewi.tudelft.nl Tue Nov 6 09:29:43 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Tue, 06 Nov 2007 16:29:43 +0100 Subject: reordering does not work for ICC? In-Reply-To: <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> References: <473033E0.6020802@ewi.tudelft.nl> <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> Message-ID: <47308867.1010309@ewi.tudelft.nl> Barry Smith wrote: > > On Nov 6, 2007, at 3:29 AM, Zhifeng Sheng wrote: > >> Dear all >> >> I tried to reorder my the preconditioner with different reordering >> schema. 
>> >> All worked well for ILU, but does not make any difference on ICC. It >> seems that the reordering schema does not work for ICC at all.... >> >> Is it supposed to be like this? > > > Yes, with sbaij only the upper triangular part of the matrix is > stored; hence reordering doesn't make sense > since the values needed in the reordered form are not available. You > can use the SeqAIJ format if you want > to do reorderings with the ICC. > > Barry > >> >> >> PS: I have a symmetric matrix and I would like to save some memory. I >> used SBAIJ with block=1, but some told me it's not efficient ... > > > SBAIJ with block=1 is just as efficient as AIJ! The is seperate code > for each block size. > > Barry > >> >> So... what can I do to save some memory on matrix and preconditioner? >> >> >> Thank you >> Best regards >> Zhifeng Sheng >> > Thank you for your answer. Could you please tell me if I can first assemble the matrix with a sufficiently large nonzero number per row and then release the redundent memory after the assembly is done? and what does MatCompress do? I tried it on my matrix, nothing happened ... Thank you Best regards From bsmith at mcs.anl.gov Tue Nov 6 10:29:29 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Nov 2007 10:29:29 -0600 Subject: reordering does not work for ICC? In-Reply-To: <47308867.1010309@ewi.tudelft.nl> References: <473033E0.6020802@ewi.tudelft.nl> <90C0D9F9-8EEF-4B59-8DEB-CF14DE5F40C2@mcs.anl.gov> <47308867.1010309@ewi.tudelft.nl> Message-ID: There is no mechanism to release that extra memory. Thus it is important to preallocate well for large problems. Barry On Nov 6, 2007, at 9:29 AM, Zhifeng Sheng wrote: > Barry Smith wrote: > >> >> On Nov 6, 2007, at 3:29 AM, Zhifeng Sheng wrote: >> >>> Dear all >>> >>> I tried to reorder my the preconditioner with different reordering >>> schema. >>> >>> All worked well for ILU, but does not make any difference on ICC. >>> It seems that the reordering schema does not work for ICC at all.... >>> >>> Is it supposed to be like this? >> >> >> Yes, with sbaij only the upper triangular part of the matrix is >> stored; hence reordering doesn't make sense >> since the values needed in the reordered form are not available. >> You can use the SeqAIJ format if you want >> to do reorderings with the ICC. >> >> Barry >> >>> >>> >>> PS: I have a symmetric matrix and I would like to save some >>> memory. I used SBAIJ with block=1, but some told me it's not >>> efficient ... >> >> >> SBAIJ with block=1 is just as efficient as AIJ! The is seperate >> code for each block size. >> >> Barry >> >>> >>> So... what can I do to save some memory on matrix and >>> preconditioner? >>> >>> >>> Thank you >>> Best regards >>> Zhifeng Sheng >>> >> > Thank you for your answer. > > Could you please tell me if I can first assemble the matrix with a > sufficiently large nonzero number per row and then release the > redundent memory after the assembly is done? > > and what does MatCompress do? I tried it on my matrix, nothing > happened ... > > Thank you > Best regards > From z.sheng at ewi.tudelft.nl Mon Nov 12 10:36:54 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Mon, 12 Nov 2007 17:36:54 +0100 Subject: Nonzeros of SBAIJ Message-ID: <47388126.2010607@ewi.tudelft.nl> Dear all my matrix is SeqSBAIJ with block size 1, so I am wondering when I specify the nonzeros in a row, does it mean the actually nonzeros numbers or the memory that is needed? 
(for instance, for SeqSBAIJ, the actual nonzeros in a row would twice as much as memory needed) Thank you Best regards Zhifeng From hzhang at mcs.anl.gov Mon Nov 12 11:16:06 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 12 Nov 2007 11:16:06 -0600 (CST) Subject: Nonzeros of SBAIJ In-Reply-To: <47388126.2010607@ewi.tudelft.nl> References: <47388126.2010607@ewi.tudelft.nl> Message-ID: On Mon, 12 Nov 2007, Zhifeng Sheng wrote: > Dear all > > my matrix is SeqSBAIJ with block size 1, so I am wondering when I specify the > nonzeros in a row, does it mean the actually nonzeros numbers or the memory > that is needed? (for instance, for SeqSBAIJ, the actual nonzeros in a row > would twice as much as memory needed) No, you specify the nonzeros of upper triangular part. Hong > > Thank you > Best regards > Zhifeng > > From bsmith at mcs.anl.gov Mon Nov 12 13:21:51 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 12 Nov 2007 13:21:51 -0600 Subject: Nonzeros of SBAIJ In-Reply-To: References: <47388126.2010607@ewi.tudelft.nl> Message-ID: <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> Plus 1 for each row for the diagonal. Barry On Nov 12, 2007, at 11:16 AM, Hong Zhang wrote: > > > On Mon, 12 Nov 2007, Zhifeng Sheng wrote: > >> Dear all >> >> my matrix is SeqSBAIJ with block size 1, so I am wondering when I >> specify the nonzeros in a row, does it mean the actually nonzeros >> numbers or the memory that is needed? (for instance, for SeqSBAIJ, >> the actual nonzeros in a row would twice as much as memory needed) > > No, you specify the nonzeros of upper triangular part. > > Hong > >> >> Thank you >> Best regards >> Zhifeng >> >> > From grs2103 at columbia.edu Tue Nov 13 10:06:29 2007 From: grs2103 at columbia.edu (Gideon Simpson) Date: Tue, 13 Nov 2007 11:06:29 -0500 Subject: multi core os x machines Message-ID: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Has anyone had any success in getting good performance on multi-core intel os x machines with petsc? What's the right way to get MPICH up and running for such a thing? -Gideon Simpson Department of Applied Physics and Applied Mathematics Columbia University -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 13 10:14:17 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 Nov 2007 10:14:17 -0600 Subject: multi core os x machines In-Reply-To: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Message-ID: Not possible. The problem is that with one process it uses all the memory bandwidth, when you change to use 2 processes (2 cores) each core now gets only half the memory bandwidth and hence essentially half the speed. Barry Barry On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: > Has anyone had any success in getting good performance on multi-core > intel os x machines with petsc? What's the right way to get MPICH > up and running for such a thing? > > -Gideon Simpson > Department of Applied Physics and Applied Mathematics > Columbia University > > From grs2103 at columbia.edu Tue Nov 13 10:23:01 2007 From: grs2103 at columbia.edu (Gideon Simpson) Date: Tue, 13 Nov 2007 11:23:01 -0500 Subject: multi core os x machines In-Reply-To: References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Message-ID: This is also true for a multi-processor machine, or its unique to multi-core machines? -gideon On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: > > Not possible. 
The problem is that with one process it uses all > the memory > bandwidth, when you change to use 2 processes (2 cores) each core > now gets only half the memory bandwidth and hence essentially half > the speed. > > Barry > > > Barry > > On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: > >> Has anyone had any success in getting good performance on multi- >> core intel os x machines with petsc? What's the right way to get >> MPICH up and running for such a thing? >> >> -Gideon Simpson >> Department of Applied Physics and Applied Mathematics >> Columbia University >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 13 10:31:05 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 Nov 2007 10:31:05 -0600 Subject: multi core os x machines In-Reply-To: References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> Message-ID: <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> It depends on how the memory is connected to the individual cores or CPUS; for example the AMD has a different approach than Intel. If the different processors/cores have SEPERATE paths to memory then you will not see this terrible effect. Barry On Nov 13, 2007, at 10:23 AM, Gideon Simpson wrote: > This is also true for a multi-processor machine, or its unique to > multi-core machines? > -gideon > > On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: > >> >> Not possible. The problem is that with one process it uses all >> the memory >> bandwidth, when you change to use 2 processes (2 cores) each core >> now gets only half the memory bandwidth and hence essentially half >> the speed. >> >> Barry >> >> >> Barry >> >> On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: >> >>> Has anyone had any success in getting good performance on multi- >>> core intel os x machines with petsc? What's the right way to get >>> MPICH up and running for such a thing? >>> >>> -Gideon Simpson >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Nov 13 10:57:07 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Nov 2007 10:57:07 -0600 (CST) Subject: multi core os x machines In-Reply-To: <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> Message-ID: Actually the new intels are pretty similar to the AMDs these days. There are 2 things here: - multiple-cores per chip - and multiple chips For eg: one can buy 2-chip-dual-core = 4CPU machine. [with AMD each chip has a separate memory bank. With intel, there is a single controller with multiple banks. But when 1 chip is used - only half the memory banks are accessed - or something like that] So in both AMD and Intel, when both chips [each chip - a dual-core] are used, MB available scales up - as compared to 1 chip usage. However within a chip [i.e dual core] - the MB from main memory to cpu/cache is same irrespective of both cores being used or only one. So when both are used - the effective memory bandwith is not scaling up. So to get best parallel speedup - one should choose 'np' as 'no_of_memory banks' - not 'no_of_cpus'. 
So, on this 2x2 = 4CPU machine, I suspect the best performance scaling can be seen only with '-np 2' Wrt MPICH on SMP, we were sugested to use the following MPICH configure options: --with-pm=gforker --device=ch3:nemesis --enable-fast Satish On Tue, 13 Nov 2007, Barry Smith wrote: > > It depends on how the memory is connected to the individual cores or CPUS; > for example the AMD has a different approach than Intel. If the different > processors/cores > have SEPERATE paths to memory then you will not see this terrible effect. > > Barry > > > > On Nov 13, 2007, at 10:23 AM, Gideon Simpson wrote: > > > This is also true for a multi-processor machine, or its unique to multi-core > > machines? > > -gideon > > > > On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: > > > > > > > > Not possible. The problem is that with one process it uses all the memory > > > bandwidth, when you change to use 2 processes (2 cores) each core > > > now gets only half the memory bandwidth and hence essentially half > > > the speed. > > > > > > Barry > > > > > > > > > Barry > > > > > > On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: > > > > > > > Has anyone had any success in getting good performance on multi-core > > > > intel os x machines with petsc? What's the right way to get MPICH up > > > > and running for such a thing? > > > > > > > > -Gideon Simpson > > > > Department of Applied Physics and Applied Mathematics > > > > Columbia University > > > > > > > > > > > > > > From randy at geosystem.us Tue Nov 13 11:01:47 2007 From: randy at geosystem.us (Randall Mackie) Date: Tue, 13 Nov 2007 09:01:47 -0800 Subject: multi core os x machines In-Reply-To: <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> References: <5248B9BC-2C7A-4783-821C-3DFBF44A98B6@columbia.edu> <679E6589-C2EE-455B-ACB7-971A065B16A9@mcs.anl.gov> Message-ID: <4739D87B.4060300@geosystem.us> We have a 64 node cluster, each node being a quad core Intel Xeon chip, so we have a total of 256 cpus. i'm not quite sure of the chip architecture and the memory paths. With infiniband, each cpu can go at full 100% during a PETSc execution. The key for us was the infiniband and the special mpi that is tuned for the infiniband - without them, performance was much worse (ie, using mpich). Randy M. Barry Smith wrote: > > It depends on how the memory is connected to the individual cores or CPUS; > for example the AMD has a different approach than Intel. If the > different processors/cores > have SEPERATE paths to memory then you will not see this terrible effect. > > Barry > > > > On Nov 13, 2007, at 10:23 AM, Gideon Simpson wrote: > >> This is also true for a multi-processor machine, or its unique to >> multi-core machines? >> -gideon >> >> On Nov 13, 2007, at 11:14 AM, Barry Smith wrote: >> >>> >>> Not possible. The problem is that with one process it uses all the >>> memory >>> bandwidth, when you change to use 2 processes (2 cores) each core >>> now gets only half the memory bandwidth and hence essentially half >>> the speed. >>> >>> Barry >>> >>> >>> Barry >>> >>> On Nov 13, 2007, at 10:06 AM, Gideon Simpson wrote: >>> >>>> Has anyone had any success in getting good performance on multi-core >>>> intel os x machines with petsc? What's the right way to get MPICH >>>> up and running for such a thing? >>>> >>>> -Gideon Simpson >>>> Department of Applied Physics and Applied Mathematics >>>> Columbia University >>>> >>>> >>> >> > -- Randall Mackie GSY-USA, Inc. 
PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From jiaxun_hou at yahoo.com.cn Wed Nov 14 04:12:48 2007 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Wed, 14 Nov 2007 18:12:48 +0800 (CST) Subject: about MatGetRow/MatRestoreRow Message-ID: <412086.81470.qm@web15808.mail.cnb.yahoo.com> Dear all, Does anyone have examples of using MatGetRow/MatRestoreRow? I failed in using them. My code is: PetscInt ncols_A; const PetscInt** cols_A_point; const PetscScalar **vals_A_point; for (i=0;i From knepley at gmail.com Wed Nov 14 06:28:20 2007 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Nov 2007 06:28:20 -0600 Subject: about MatGetRow/MatRestoreRow In-Reply-To: <412086.81470.qm@web15808.mail.cnb.yahoo.com> References: <412086.81470.qm@web15808.mail.cnb.yahoo.com> Message-ID: On Nov 14, 2007 4:12 AM, jiaxun hou wrote: > Dear all, > > Does anyone have examples of using MatGetRow/MatRestoreRow? > I failed in using them. > > My code is: > > PetscInt ncols_A; > const PetscInt** cols_A_point; > const PetscScalar **vals_A_point; This is not proper C usage since you never allocate space for the pointers. You want const PetscInt *cols; const PetscScalar *vals; > for (i=0;i ierr = MatGetRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); MatGetRow(A,i,&ncols,&cols,&vals); and so on. Matt > //do something > ierr = MatRestoreRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); > } > > and it gets errors as below: > > Petsc Release Version 2.3.1, Patch 10, Thu Mar 9 22:48:00 CST 2006 > BK revision: balay at asterix.mcs.anrank 0 in job 61 lab_43825 caused > collective abort of all ranks > exit status of rank 0: killed by signal 9 > [cli_0]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 64) - process 0 > l.gov|ChangeSet|20060310044535|22333 > See docs/changes/index.html for recent updates. > See docs/faq.html for hints about > trouble shooting. > See docs/index.html for manual pages. > ------------------------------------------------------------------------ > ./mytest1 on a linux-gnu named lab by root Wed Nov 14 17:57:00 2007 > Libraries linked from > /home/software/petsc-2.3.1-p10/lib/linux-gnu-cxx-complex-debug > Configure run at Thu Jun 15 13:08:29 2006 > Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack=1 --with-mpi-dir=/home/software/mpich2 > --with-scalar-type=complex --with-shared=0 > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscObjectDestroy() line 88 in src/sys/objects/destroy.c > [0]PETSC ERROR: Corrupt argument: see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Corrupt! > [0]PETSC ERROR: Invalid type of object: Parameter # 1! > [0]PETSC ERROR: PetscObjectRegisterDestroyAll() line 228 in > src/sys/objects/destroy.c > [0]PETSC ERROR: PetscFinalize() line 599 in > src/sys/objects/pinit.c > [0]PETSC ERROR: main() line 329 in /home/myprogram/mypro/mytest1.c > make: [runmytest1] Error 137 (ignored) > > Can anyone tell me where is wrong? THX > > > > ________________________________ > ?????????? -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
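A minimal C sketch of the corrected loop Matt describes above: the two pointers are only declared (never allocated by the caller) and are passed by address to MatGetRow(). The loop body is a placeholder and the usual ierr/CHKERRQ error checking is kept; names are illustrative.

    #include "petscmat.h"

    PetscErrorCode DumpRows(Mat A)
    {
      PetscInt          i,m,n,ncols;
      const PetscInt    *cols;
      const PetscScalar *vals;
      PetscErrorCode    ierr;

      ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
      for (i=0; i<m; i++) {
        ierr = MatGetRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);      /* cols/vals point into PETSc's own storage */
        /* ... do something with ncols, cols[0..ncols-1], vals[0..ncols-1] ... */
        ierr = MatRestoreRow(A,i,&ncols,&cols,&vals);CHKERRQ(ierr);  /* restore before requesting the next row */
      }
      return 0;
    }

The arrays handed back by MatGetRow() belong to PETSc, so the caller allocates and frees nothing; MatRestoreRow() has to be called for each row before the next MatGetRow().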
-- Norbert Wiener From jiaxun_hou at yahoo.com.cn Wed Nov 14 07:00:15 2007 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Wed, 14 Nov 2007 21:00:15 +0800 (CST) Subject: =?gb2312?q?Re=A3=BA=20Re:=20about=20MatGetRow/MatRestoreRow?= In-Reply-To: Message-ID: <39757.46716.qm@web15814.mail.cnb.yahoo.com> Thanks a lot! Matthew Knepley ??? On Nov 14, 2007 4:12 AM, jiaxun hou wrote: > Dear all, > > Does anyone have examples of using MatGetRow/MatRestoreRow? > I failed in using them. > > My code is: > > PetscInt ncols_A; > const PetscInt** cols_A_point; > const PetscScalar **vals_A_point; This is not proper C usage since you never allocate space for the pointers. You want const PetscInt *cols; const PetscScalar *vals; > for (i=0;i > ierr = MatGetRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); MatGetRow(A,i,&ncols,&cols,&vals); and so on. Matt > //do something > ierr = MatRestoreRow(A,i,&ncols_A,cols_A_point,vals_A_point);CHKERRQ(ierr); > } > > and it gets errors as below: > > Petsc Release Version 2.3.1, Patch 10, Thu Mar 9 22:48:00 CST 2006 > BK revision: balay at asterix.mcs.anrank 0 in job 61 lab_43825 caused > collective abort of all ranks > exit status of rank 0: killed by signal 9 > [cli_0]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 64) - process 0 > l.gov|ChangeSet|20060310044535|22333 > See docs/changes/index.html for recent updates. > See docs/faq.html for hints about > trouble shooting. > See docs/index.html for manual pages. > ------------------------------------------------------------------------ > ./mytest1 on a linux-gnu named lab by root Wed Nov 14 17:57:00 2007 > Libraries linked from > /home/software/petsc-2.3.1-p10/lib/linux-gnu-cxx-complex-debug > Configure run at Thu Jun 15 13:08:29 2006 > Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack=1 --with-mpi-dir=/home/software/mpich2 > --with-scalar-type=complex --with-shared=0 > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscObjectDestroy() line 88 in src/sys/objects/destroy.c > [0]PETSC ERROR: Corrupt argument: see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Corrupt! > [0]PETSC ERROR: Invalid type of object: Parameter # 1! > [0]PETSC ERROR: PetscObjectRegisterDestroyAll() line 228 in > src/sys/objects/destroy.c > [0]PETSC ERROR: PetscFinalize() line 599 in > src/sys/objects/pinit.c > [0]PETSC ERROR: main() line 329 in /home/myprogram/mypro/mytest1.c > make: [runmytest1] Error 137 (ignored) > > Can anyone tell me where is wrong? THX > > > > ________________________________ > ?????????? -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener --------------------------------- ?????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.stitt at ichec.ie Wed Nov 14 08:13:28 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 14:13:28 +0000 Subject: AX=B Fortran Petsc Code Message-ID: <473B0288.2060002@ichec.ie> Dear PETSc Users/Developers, I have the following sequential Fortran PETSc code that I have been developing (on and off) based on the kind advice given by members of this list, with respect to solving an inverse sparse matrix problem. 
Essentially, the code reads in a square double complex matrix from external file of size (order x order) and then proceeds to do a MatMatSolve(), where A is the sparse matrix to invert, B is a dense identity matrix and X is the resultant dense matrix....hope that makes sense. My main problem is that the code stalls on the MatSetValues() for the sparse matrix A. With a trivial test matrix of (224 x 224) the program terminates successfully (by successfully I mean all instructions execute...I am not interested in the validity of X right now). Unfortunately, when I move up to a (2352 x 2352) matrix the MatSetValues() routine for matrix A is still in progress after 15 minutes on one processor of our AMD Opteron IBM Cluster. I know that people will be screaming "preallocation"...but I have tried to take this into account by running a loop over the rows in A and counting the non-zero values explicitly prior to creation. I then pass this vector into the creation routine for the nnz argument. For the large (2352 x 2352) problem that seems to be taking forever to set...at most there are only 200 elements per row that are non-zero according to the counts. Can anyone explain why the MatSetValues() routine is taking such a long time. Maybe this expected for this specific task...although it seems very long? I did notice that on the trivial (224 x 224) run that I was still getting mallocs (approx 2000) for the A assembly when I used the -info command line parameter. I thought that it should be 0 if my preallocation counts were exact? Does this hint that I am doing something wrong. I have checked the code but don't see any obvious problems in the logic...not that means anything. I would be grateful if someone could advise on this matter. Also, if you have a few seconds to spare I would be grateful if some experts could scan the remaining logic of the code (not in fine detail) to make sure that I am doing all that I need to do to get this calculation working...assuming I can resolve the MatSetValues() problem. Once again many thanks in advance, Tim. ! Initialise the PETSc MPI Harness call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) ! Read in Matrix open(321,file='Hamiltonian.bin',form='unformatted') read(321) order if (ID==0) then print * print *,processes," Processing Elements being used" print * print *,"Matrix has order ",order," rows by ",order," columns" print * end if allocate(matrix(order,order)) read(321) matrix close(321) ! Allocate array for nnz allocate(numberZero(order)) ! Count number of non-zero elements in each matrix row do row=1,order count=0 do column=1,order if (matrix(row,column).ne.(0,0)) count=count+1 end do numberZero(row)=count end do ! Declare a PETSc Matrices call MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) call MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) call MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) call MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) ! Set up zero-based array indexing for use in MatSetValues allocate(columnIndices(order)) do column=1,order columnIndices(column)=column-1 end do ! Need to transpose values array as row-major arrays are used. 
call MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) ! Assemble Matrix A call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) deallocate(matrix) ! Create Index Sets for Factorisation call ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) call MatFactorInfoInitialize(info,error);CHKERRQ(error) call ISSetPermutation(indexSet,error);CHKERRQ(error) call MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) ! A no-longer needed call MatDestroy(A,error);CHKERRQ(error) one=(1,0) ! Set Diagonal elements in Identity Matrix B do row=0,order-1 call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) end do ! Assemble B call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) ! Assemble X call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) ! Solve AX=B call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) ! Deallocate Storage deallocate(columnIndices) call MatDestroy(factorMat,error);CHKERRQ(error) call MatDestroy(B,error);CHKERRQ(error) call MatDestroy(X,error);CHKERRQ(error) call PetscFinalize(error) -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Wed Nov 14 08:29:21 2007 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Nov 2007 08:29:21 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B0288.2060002@ichec.ie> References: <473B0288.2060002@ichec.ie> Message-ID: You appear to be setting every value in the sparse matrix. We do not throw out 0 values (since sometimes they are necessary for structural reasons). Thus you are allocating a ton of times. You need to remove the 0 values before calling MatSetValues (and their associated column entires as well). Matt On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > Dear PETSc Users/Developers, > > I have the following sequential Fortran PETSc code that I have been > developing (on and off) based on the kind advice given by members of > this list, with respect to solving an inverse sparse matrix problem. > Essentially, the code reads in a square double complex matrix from > external file of size (order x order) and then proceeds to do a > MatMatSolve(), where A is the sparse matrix to invert, B is a dense > identity matrix and X is the resultant dense matrix....hope that makes > sense. > > My main problem is that the code stalls on the MatSetValues() for the > sparse matrix A. With a trivial test matrix of (224 x 224) the program > terminates successfully (by successfully I mean all instructions > execute...I am not interested in the validity of X right now). > Unfortunately, when I move up to a (2352 x 2352) matrix the > MatSetValues() routine for matrix A is still in progress after 15 > minutes on one processor of our AMD Opteron IBM Cluster. I know that > people will be screaming "preallocation"...but I have tried to take this > into account by running a loop over the rows in A and counting the > non-zero values explicitly prior to creation. I then pass this vector > into the creation routine for the nnz argument. 
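Referring to Matt's suggestion below to strip the explicit zeros before insertion, here is a minimal C sketch of the whole sequence: preallocate from a count of the true nonzeros, then insert one row at a time so that nothing outside that count is ever stored. All names are illustrative; the Fortran interface follows the same pattern.

    #include "petscmat.h"

    /* Build a SeqAIJ matrix from a dense order x order array 'a' stored by rows. */
    PetscErrorCode BuildSparseFromDense(PetscInt order,const PetscScalar *a,Mat *A)
    {
      PetscInt       i,j,n,*nnz,*cols;
      PetscScalar    *vals;
      PetscErrorCode ierr;

      ierr = PetscMalloc(order*sizeof(PetscInt),&nnz);CHKERRQ(ierr);
      for (i=0; i<order; i++) {                       /* count the true nonzeros per row */
        nnz[i] = 0;
        for (j=0; j<order; j++) if (PetscAbsScalar(a[i*order+j]) > 0.0) nnz[i]++;
      }
      ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,nnz,A);CHKERRQ(ierr);
      ierr = PetscMalloc(order*sizeof(PetscInt),&cols);CHKERRQ(ierr);
      ierr = PetscMalloc(order*sizeof(PetscScalar),&vals);CHKERRQ(ierr);
      for (i=0; i<order; i++) {                       /* insert row by row, zeros skipped */
        n = 0;
        for (j=0; j<order; j++) {
          if (PetscAbsScalar(a[i*order+j]) > 0.0) { cols[n] = j; vals[n] = a[i*order+j]; n++; }
        }
        ierr = MatSetValues(*A,1,&i,n,cols,vals,INSERT_VALUES);CHKERRQ(ierr);
      }
      ierr = MatAssemblyBegin(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(*A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = PetscFree(nnz);CHKERRQ(ierr);
      ierr = PetscFree(cols);CHKERRQ(ierr);
      ierr = PetscFree(vals);CHKERRQ(ierr);
      return 0;
    }

With the per-row counts and the inserted entries matching exactly, -info should report no mallocs during assembly.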
For the large (2352 x > 2352) problem that seems to be taking forever to set...at most there are > only 200 elements per row that are non-zero according to the counts. > > Can anyone explain why the MatSetValues() routine is taking such a long > time. Maybe this expected for this specific task...although it seems > very long? > > I did notice that on the trivial (224 x 224) run that I was still > getting mallocs (approx 2000) for the A assembly when I used the -info > command line parameter. I thought that it should be 0 if my > preallocation counts were exact? Does this hint that I am doing > something wrong. I have checked the code but don't see any obvious > problems in the logic...not that means anything. > > I would be grateful if someone could advise on this matter. Also, if you > have a few seconds to spare I would be grateful if some experts could > scan the remaining logic of the code (not in fine detail) to make sure > that I am doing all that I need to do to get this calculation > working...assuming I can resolve the MatSetValues() problem. > > Once again many thanks in advance, > > Tim. > > ! Initialise the PETSc MPI Harness > call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) > > call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) > call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) > > ! Read in Matrix > open(321,file='Hamiltonian.bin',form='unformatted') > read(321) order > if (ID==0) then > print * > print *,processes," Processing Elements being used" > print * > print *,"Matrix has order ",order," rows by ",order," columns" > print * > end if > > allocate(matrix(order,order)) > read(321) matrix > close(321) > > ! Allocate array for nnz > allocate(numberZero(order)) > > ! Count number of non-zero elements in each matrix row > do row=1,order > count=0 > do column=1,order > if (matrix(row,column).ne.(0,0)) count=count+1 > end do > numberZero(row)=count > end do > > ! Declare a PETSc Matrices > > call > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) > call > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) > call > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) > call > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) > > ! Set up zero-based array indexing for use in MatSetValues > allocate(columnIndices(order)) > > do column=1,order > columnIndices(column)=column-1 > end do > > ! Need to transpose values array as row-major arrays are used. > call > MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) > > ! Assemble Matrix A > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > deallocate(matrix) > > ! Create Index Sets for Factorisation > call > ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) > call MatFactorInfoInitialize(info,error);CHKERRQ(error) > call ISSetPermutation(indexSet,error);CHKERRQ(error) > call > MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) > call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) > > ! A no-longer needed > call MatDestroy(A,error);CHKERRQ(error) > > one=(1,0) > > ! Set Diagonal elements in Identity Matrix B > do row=0,order-1 > call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) > end do > > ! 
Assemble B > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > ! Assemble X > call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > ! Solve AX=B > call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) > > ! Deallocate Storage > deallocate(columnIndices) > > call MatDestroy(factorMat,error);CHKERRQ(error) > call MatDestroy(B,error);CHKERRQ(error) > call MatDestroy(X,error);CHKERRQ(error) > > call PetscFinalize(error) > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Wed Nov 14 08:47:01 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 14:47:01 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: References: <473B0288.2060002@ichec.ie> Message-ID: <473B0A65.4090607@ichec.ie> Matthew, OK...I see what you are saying. I initially set A a row at a time but for performance reasons I thought doing it at once would be better. I overlooked the fact that the logical 2D matrix input to MatSetValues() is non-zero values only. With hindsight I now remember that was the case for each individual row. Thanks for pointing that out.... Regards. Matthew Knepley wrote: > You appear to be setting every value in the sparse matrix. We do not > throw out 0 values (since sometimes they are necessary for structural > reasons). Thus you are allocating a ton of times. You need to remove > the 0 values before calling MatSetValues (and their associated > column entires as well). > > Matt > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > >> Dear PETSc Users/Developers, >> >> I have the following sequential Fortran PETSc code that I have been >> developing (on and off) based on the kind advice given by members of >> this list, with respect to solving an inverse sparse matrix problem. >> Essentially, the code reads in a square double complex matrix from >> external file of size (order x order) and then proceeds to do a >> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >> identity matrix and X is the resultant dense matrix....hope that makes >> sense. >> >> My main problem is that the code stalls on the MatSetValues() for the >> sparse matrix A. With a trivial test matrix of (224 x 224) the program >> terminates successfully (by successfully I mean all instructions >> execute...I am not interested in the validity of X right now). >> Unfortunately, when I move up to a (2352 x 2352) matrix the >> MatSetValues() routine for matrix A is still in progress after 15 >> minutes on one processor of our AMD Opteron IBM Cluster. I know that >> people will be screaming "preallocation"...but I have tried to take this >> into account by running a loop over the rows in A and counting the >> non-zero values explicitly prior to creation. I then pass this vector >> into the creation routine for the nnz argument. For the large (2352 x >> 2352) problem that seems to be taking forever to set...at most there are >> only 200 elements per row that are non-zero according to the counts. >> >> Can anyone explain why the MatSetValues() routine is taking such a long >> time. 
Maybe this expected for this specific task...although it seems >> very long? >> >> I did notice that on the trivial (224 x 224) run that I was still >> getting mallocs (approx 2000) for the A assembly when I used the -info >> command line parameter. I thought that it should be 0 if my >> preallocation counts were exact? Does this hint that I am doing >> something wrong. I have checked the code but don't see any obvious >> problems in the logic...not that means anything. >> >> I would be grateful if someone could advise on this matter. Also, if you >> have a few seconds to spare I would be grateful if some experts could >> scan the remaining logic of the code (not in fine detail) to make sure >> that I am doing all that I need to do to get this calculation >> working...assuming I can resolve the MatSetValues() problem. >> >> Once again many thanks in advance, >> >> Tim. >> >> ! Initialise the PETSc MPI Harness >> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >> >> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >> >> ! Read in Matrix >> open(321,file='Hamiltonian.bin',form='unformatted') >> read(321) order >> if (ID==0) then >> print * >> print *,processes," Processing Elements being used" >> print * >> print *,"Matrix has order ",order," rows by ",order," columns" >> print * >> end if >> >> allocate(matrix(order,order)) >> read(321) matrix >> close(321) >> >> ! Allocate array for nnz >> allocate(numberZero(order)) >> >> ! Count number of non-zero elements in each matrix row >> do row=1,order >> count=0 >> do column=1,order >> if (matrix(row,column).ne.(0,0)) count=count+1 >> end do >> numberZero(row)=count >> end do >> >> ! Declare a PETSc Matrices >> >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >> >> ! Set up zero-based array indexing for use in MatSetValues >> allocate(columnIndices(order)) >> >> do column=1,order >> columnIndices(column)=column-1 >> end do >> >> ! Need to transpose values array as row-major arrays are used. >> call >> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >> >> ! Assemble Matrix A >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> deallocate(matrix) >> >> ! Create Index Sets for Factorisation >> call >> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >> call ISSetPermutation(indexSet,error);CHKERRQ(error) >> call >> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >> >> ! A no-longer needed >> call MatDestroy(A,error);CHKERRQ(error) >> >> one=(1,0) >> >> ! Set Diagonal elements in Identity Matrix B >> do row=0,order-1 >> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >> end do >> >> ! Assemble B >> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! 
Assemble X >> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! Solve AX=B >> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >> >> ! Deallocate Storage >> deallocate(columnIndices) >> >> call MatDestroy(factorMat,error);CHKERRQ(error) >> call MatDestroy(B,error);CHKERRQ(error) >> call MatDestroy(X,error);CHKERRQ(error) >> >> call PetscFinalize(error) >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From balay at mcs.anl.gov Wed Nov 14 09:20:48 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 14 Nov 2007 09:20:48 -0600 (CST) Subject: AX=B Fortran Petsc Code In-Reply-To: <473B0A65.4090607@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B0A65.4090607@ichec.ie> Message-ID: You can use: MatSetOption(mat,MAT_IGNORE_ZERO_ENTRIES) So subsequent MatSetValues() will ignore these zero entries. Satish On Wed, 14 Nov 2007, Tim Stitt wrote: > Matthew, > > OK...I see what you are saying. > > I initially set A a row at a time but for performance reasons I thought doing > it at once would be better. I overlooked the fact that the logical 2D matrix > input to MatSetValues() is non-zero values only. With hindsight I now remember > that was the case for each individual row. > > Thanks for pointing that out.... > > Regards. > > Matthew Knepley wrote: > > You appear to be setting every value in the sparse matrix. We do not > > throw out 0 values (since sometimes they are necessary for structural > > reasons). Thus you are allocating a ton of times. You need to remove > > the 0 values before calling MatSetValues (and their associated > > column entires as well). > > > > Matt > > > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > > > > > Dear PETSc Users/Developers, > > > > > > I have the following sequential Fortran PETSc code that I have been > > > developing (on and off) based on the kind advice given by members of > > > this list, with respect to solving an inverse sparse matrix problem. > > > Essentially, the code reads in a square double complex matrix from > > > external file of size (order x order) and then proceeds to do a > > > MatMatSolve(), where A is the sparse matrix to invert, B is a dense > > > identity matrix and X is the resultant dense matrix....hope that makes > > > sense. > > > > > > My main problem is that the code stalls on the MatSetValues() for the > > > sparse matrix A. With a trivial test matrix of (224 x 224) the program > > > terminates successfully (by successfully I mean all instructions > > > execute...I am not interested in the validity of X right now). > > > Unfortunately, when I move up to a (2352 x 2352) matrix the > > > MatSetValues() routine for matrix A is still in progress after 15 > > > minutes on one processor of our AMD Opteron IBM Cluster. I know that > > > people will be screaming "preallocation"...but I have tried to take this > > > into account by running a loop over the rows in A and counting the > > > non-zero values explicitly prior to creation. I then pass this vector > > > into the creation routine for the nnz argument. 
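The MatSetOption() call Satish points to above offers an alternative to filtering by hand: with the option set, whole rows (zeros included) can be passed to MatSetValues() and only the true nonzeros are stored, so a preallocation that counted only true nonzeros still holds. A short C fragment, assuming the same order, nnz and zero-based columnIndices as in the code above, with denseVals holding the full order x order input stored by rows (all names illustrative):

    ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,nnz,&A);CHKERRQ(ierr);
    ierr = MatSetOption(A,MAT_IGNORE_ZERO_ENTRIES);CHKERRQ(ierr);  /* 2.3-era two-argument form; newer releases add a flag argument */
    for (i=0; i<order; i++) {
      /* the explicit zeros in this full row are now dropped rather than stored */
      ierr = MatSetValues(A,1,&i,order,columnIndices,denseVals+i*order,INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);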
For the large (2352 x > > > 2352) problem that seems to be taking forever to set...at most there are > > > only 200 elements per row that are non-zero according to the counts. > > > > > > Can anyone explain why the MatSetValues() routine is taking such a long > > > time. Maybe this expected for this specific task...although it seems > > > very long? > > > > > > I did notice that on the trivial (224 x 224) run that I was still > > > getting mallocs (approx 2000) for the A assembly when I used the -info > > > command line parameter. I thought that it should be 0 if my > > > preallocation counts were exact? Does this hint that I am doing > > > something wrong. I have checked the code but don't see any obvious > > > problems in the logic...not that means anything. > > > > > > I would be grateful if someone could advise on this matter. Also, if you > > > have a few seconds to spare I would be grateful if some experts could > > > scan the remaining logic of the code (not in fine detail) to make sure > > > that I am doing all that I need to do to get this calculation > > > working...assuming I can resolve the MatSetValues() problem. > > > > > > Once again many thanks in advance, > > > > > > Tim. > > > > > > ! Initialise the PETSc MPI Harness > > > call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) > > > > > > call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) > > > call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) > > > > > > ! Read in Matrix > > > open(321,file='Hamiltonian.bin',form='unformatted') > > > read(321) order > > > if (ID==0) then > > > print * > > > print *,processes," Processing Elements being used" > > > print * > > > print *,"Matrix has order ",order," rows by ",order," columns" > > > print * > > > end if > > > > > > allocate(matrix(order,order)) > > > read(321) matrix > > > close(321) > > > > > > ! Allocate array for nnz > > > allocate(numberZero(order)) > > > > > > ! Count number of non-zero elements in each matrix row > > > do row=1,order > > > count=0 > > > do column=1,order > > > if (matrix(row,column).ne.(0,0)) count=count+1 > > > end do > > > numberZero(row)=count > > > end do > > > > > > ! Declare a PETSc Matrices > > > > > > call > > > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) > > > call > > > MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) > > > call > > > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) > > > call > > > MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) > > > > > > ! Set up zero-based array indexing for use in MatSetValues > > > allocate(columnIndices(order)) > > > > > > do column=1,order > > > columnIndices(column)=column-1 > > > end do > > > > > > ! Need to transpose values array as row-major arrays are used. > > > call > > > MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) > > > > > > ! Assemble Matrix A > > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > > > > deallocate(matrix) > > > > > > ! 
Create Index Sets for Factorisation > > > call > > > ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) > > > call MatFactorInfoInitialize(info,error);CHKERRQ(error) > > > call ISSetPermutation(indexSet,error);CHKERRQ(error) > > > call > > > MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) > > > call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) > > > > > > ! A no-longer needed > > > call MatDestroy(A,error);CHKERRQ(error) > > > > > > one=(1,0) > > > > > > ! Set Diagonal elements in Identity Matrix B > > > do row=0,order-1 > > > call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) > > > end do > > > > > > ! Assemble B > > > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > > > > ! Assemble X > > > call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > > > > > > ! Solve AX=B > > > call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) > > > > > > ! Deallocate Storage > > > deallocate(columnIndices) > > > > > > call MatDestroy(factorMat,error);CHKERRQ(error) > > > call MatDestroy(B,error);CHKERRQ(error) > > > call MatDestroy(X,error);CHKERRQ(error) > > > > > > call PetscFinalize(error) > > > > > > -- > > > Dr. Timothy Stitt > > > HPC Application Consultant - ICHEC (www.ichec.ie) > > > > > > Dublin Institute for Advanced Studies > > > 5 Merrion Square - Dublin 2 - Ireland > > > > > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > > > > > > > > > > > > > > > > > > > From z.sheng at ewi.tudelft.nl Wed Nov 14 10:08:43 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Wed, 14 Nov 2007 17:08:43 +0100 Subject: Nonzeros of SBAIJ In-Reply-To: <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> References: <47388126.2010607@ewi.tudelft.nl> <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> Message-ID: <473B1D8B.30900@ewi.tudelft.nl> Barry Smith wrote: > > Plus 1 for each row for the diagonal. > > Barry > > > On Nov 12, 2007, at 11:16 AM, Hong Zhang wrote: > >> >> >> On Mon, 12 Nov 2007, Zhifeng Sheng wrote: >> >>> Dear all >>> >>> my matrix is SeqSBAIJ with block size 1, so I am wondering when I >>> specify the nonzeros in a row, does it mean the actually nonzeros >>> numbers or the memory that is needed? (for instance, for SeqSBAIJ, >>> the actual nonzeros in a row would twice as much as memory needed) >> >> >> No, you specify the nonzeros of upper triangular part. >> >> Hong >> >>> >>> Thank you >>> Best regards >>> Zhifeng >>> >>> >> > Dear all I allocated the memory exactly with symbolic computation, and tried preallocation on AIJ and SBAIJ, it works well for AIJ, no additional allocaiton is needed in assembly... then I did it on SBAIJ ( nonzeros of upper triangular part + 1), the performance is much worse , it seems that memory allocation was still needed... does anyone have such problem before? 
Thank you Best regards Zhifeng Sheng From hzhang at mcs.anl.gov Wed Nov 14 10:17:28 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 14 Nov 2007 10:17:28 -0600 (CST) Subject: Nonzeros of SBAIJ In-Reply-To: <473B1D8B.30900@ewi.tudelft.nl> References: <47388126.2010607@ewi.tudelft.nl> <64C31677-0447-46D7-9F6D-8EB59777E7BE@mcs.anl.gov> <473B1D8B.30900@ewi.tudelft.nl> Message-ID: >> > Dear all > > I allocated the memory exactly with symbolic computation, and tried > preallocation on AIJ and SBAIJ, it works well for AIJ, no additional > allocaiton is needed in assembly... then I did it on SBAIJ ( nonzeros of > upper triangular part + 1), the performance is much worse , it seems that > memory allocation was still needed... does anyone have such problem before? May I have the segment of your code that implements the memory allocation? I'll test it and see what is the problem. BTW, you may send this type of request to petsc-maint at mcs.anl.gov instead of petsc-users. Hong > > Thank you > Best regards > Zhifeng Sheng > > > > > > > From timothy.stitt at ichec.ie Wed Nov 14 10:37:43 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 16:37:43 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: References: <473B0288.2060002@ichec.ie> Message-ID: <473B2457.1050104@ichec.ie> Can I just ask a question about MatLUFactorSymbolic() in this context? What sizes should the 'row' and 'col' index sets be? Should they span all global rows/columns in A? Matthew Knepley wrote: > You appear to be setting every value in the sparse matrix. We do not > throw out 0 values (since sometimes they are necessary for structural > reasons). Thus you are allocating a ton of times. You need to remove > the 0 values before calling MatSetValues (and their associated > column entires as well). > > Matt > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > >> Dear PETSc Users/Developers, >> >> I have the following sequential Fortran PETSc code that I have been >> developing (on and off) based on the kind advice given by members of >> this list, with respect to solving an inverse sparse matrix problem. >> Essentially, the code reads in a square double complex matrix from >> external file of size (order x order) and then proceeds to do a >> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >> identity matrix and X is the resultant dense matrix....hope that makes >> sense. >> >> My main problem is that the code stalls on the MatSetValues() for the >> sparse matrix A. With a trivial test matrix of (224 x 224) the program >> terminates successfully (by successfully I mean all instructions >> execute...I am not interested in the validity of X right now). >> Unfortunately, when I move up to a (2352 x 2352) matrix the >> MatSetValues() routine for matrix A is still in progress after 15 >> minutes on one processor of our AMD Opteron IBM Cluster. I know that >> people will be screaming "preallocation"...but I have tried to take this >> into account by running a loop over the rows in A and counting the >> non-zero values explicitly prior to creation. I then pass this vector >> into the creation routine for the nnz argument. For the large (2352 x >> 2352) problem that seems to be taking forever to set...at most there are >> only 200 elements per row that are non-zero according to the counts. >> >> Can anyone explain why the MatSetValues() routine is taking such a long >> time. Maybe this expected for this specific task...although it seems >> very long? 
>> >> I did notice that on the trivial (224 x 224) run that I was still >> getting mallocs (approx 2000) for the A assembly when I used the -info >> command line parameter. I thought that it should be 0 if my >> preallocation counts were exact? Does this hint that I am doing >> something wrong. I have checked the code but don't see any obvious >> problems in the logic...not that means anything. >> >> I would be grateful if someone could advise on this matter. Also, if you >> have a few seconds to spare I would be grateful if some experts could >> scan the remaining logic of the code (not in fine detail) to make sure >> that I am doing all that I need to do to get this calculation >> working...assuming I can resolve the MatSetValues() problem. >> >> Once again many thanks in advance, >> >> Tim. >> >> ! Initialise the PETSc MPI Harness >> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >> >> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >> >> ! Read in Matrix >> open(321,file='Hamiltonian.bin',form='unformatted') >> read(321) order >> if (ID==0) then >> print * >> print *,processes," Processing Elements being used" >> print * >> print *,"Matrix has order ",order," rows by ",order," columns" >> print * >> end if >> >> allocate(matrix(order,order)) >> read(321) matrix >> close(321) >> >> ! Allocate array for nnz >> allocate(numberZero(order)) >> >> ! Count number of non-zero elements in each matrix row >> do row=1,order >> count=0 >> do column=1,order >> if (matrix(row,column).ne.(0,0)) count=count+1 >> end do >> numberZero(row)=count >> end do >> >> ! Declare a PETSc Matrices >> >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >> call >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >> call >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >> >> ! Set up zero-based array indexing for use in MatSetValues >> allocate(columnIndices(order)) >> >> do column=1,order >> columnIndices(column)=column-1 >> end do >> >> ! Need to transpose values array as row-major arrays are used. >> call >> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >> >> ! Assemble Matrix A >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> deallocate(matrix) >> >> ! Create Index Sets for Factorisation >> call >> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >> call ISSetPermutation(indexSet,error);CHKERRQ(error) >> call >> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >> >> ! A no-longer needed >> call MatDestroy(A,error);CHKERRQ(error) >> >> one=(1,0) >> >> ! Set Diagonal elements in Identity Matrix B >> do row=0,order-1 >> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >> end do >> >> ! Assemble B >> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! 
Assemble X >> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >> >> ! Solve AX=B >> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >> >> ! Deallocate Storage >> deallocate(columnIndices) >> >> call MatDestroy(factorMat,error);CHKERRQ(error) >> call MatDestroy(B,error);CHKERRQ(error) >> call MatDestroy(X,error);CHKERRQ(error) >> >> call PetscFinalize(error) >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Wed Nov 14 10:53:46 2007 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Nov 2007 10:53:46 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B2457.1050104@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> Message-ID: On Nov 14, 2007 10:37 AM, Tim Stitt wrote: > Can I just ask a question about MatLUFactorSymbolic() in this context? > What sizes should the 'row' and 'col' index sets be? Should they span > all global rows/columns in A? Yes, the matrix is permuted as a whole. Matt > Matthew Knepley wrote: > > You appear to be setting every value in the sparse matrix. We do not > > throw out 0 values (since sometimes they are necessary for structural > > reasons). Thus you are allocating a ton of times. You need to remove > > the 0 values before calling MatSetValues (and their associated > > column entires as well). > > > > Matt > > > > On Nov 14, 2007 8:13 AM, Tim Stitt wrote: > > > >> Dear PETSc Users/Developers, > >> > >> I have the following sequential Fortran PETSc code that I have been > >> developing (on and off) based on the kind advice given by members of > >> this list, with respect to solving an inverse sparse matrix problem. > >> Essentially, the code reads in a square double complex matrix from > >> external file of size (order x order) and then proceeds to do a > >> MatMatSolve(), where A is the sparse matrix to invert, B is a dense > >> identity matrix and X is the resultant dense matrix....hope that makes > >> sense. > >> > >> My main problem is that the code stalls on the MatSetValues() for the > >> sparse matrix A. With a trivial test matrix of (224 x 224) the program > >> terminates successfully (by successfully I mean all instructions > >> execute...I am not interested in the validity of X right now). > >> Unfortunately, when I move up to a (2352 x 2352) matrix the > >> MatSetValues() routine for matrix A is still in progress after 15 > >> minutes on one processor of our AMD Opteron IBM Cluster. I know that > >> people will be screaming "preallocation"...but I have tried to take this > >> into account by running a loop over the rows in A and counting the > >> non-zero values explicitly prior to creation. I then pass this vector > >> into the creation routine for the nnz argument. For the large (2352 x > >> 2352) problem that seems to be taking forever to set...at most there are > >> only 200 elements per row that are non-zero according to the counts. > >> > >> Can anyone explain why the MatSetValues() routine is taking such a long > >> time. Maybe this expected for this specific task...although it seems > >> very long? 
> >> > >> I did notice that on the trivial (224 x 224) run that I was still > >> getting mallocs (approx 2000) for the A assembly when I used the -info > >> command line parameter. I thought that it should be 0 if my > >> preallocation counts were exact? Does this hint that I am doing > >> something wrong. I have checked the code but don't see any obvious > >> problems in the logic...not that means anything. > >> > >> I would be grateful if someone could advise on this matter. Also, if you > >> have a few seconds to spare I would be grateful if some experts could > >> scan the remaining logic of the code (not in fine detail) to make sure > >> that I am doing all that I need to do to get this calculation > >> working...assuming I can resolve the MatSetValues() problem. > >> > >> Once again many thanks in advance, > >> > >> Tim. > >> > >> ! Initialise the PETSc MPI Harness > >> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) > >> > >> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) > >> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) > >> > >> ! Read in Matrix > >> open(321,file='Hamiltonian.bin',form='unformatted') > >> read(321) order > >> if (ID==0) then > >> print * > >> print *,processes," Processing Elements being used" > >> print * > >> print *,"Matrix has order ",order," rows by ",order," columns" > >> print * > >> end if > >> > >> allocate(matrix(order,order)) > >> read(321) matrix > >> close(321) > >> > >> ! Allocate array for nnz > >> allocate(numberZero(order)) > >> > >> ! Count number of non-zero elements in each matrix row > >> do row=1,order > >> count=0 > >> do column=1,order > >> if (matrix(row,column).ne.(0,0)) count=count+1 > >> end do > >> numberZero(row)=count > >> end do > >> > >> ! Declare a PETSc Matrices > >> > >> call > >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) > >> call > >> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) > >> call > >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) > >> call > >> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) > >> > >> ! Set up zero-based array indexing for use in MatSetValues > >> allocate(columnIndices(order)) > >> > >> do column=1,order > >> columnIndices(column)=column-1 > >> end do > >> > >> ! Need to transpose values array as row-major arrays are used. > >> call > >> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) > >> > >> ! Assemble Matrix A > >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> > >> deallocate(matrix) > >> > >> ! Create Index Sets for Factorisation > >> call > >> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) > >> call MatFactorInfoInitialize(info,error);CHKERRQ(error) > >> call ISSetPermutation(indexSet,error);CHKERRQ(error) > >> call > >> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) > >> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) > >> > >> ! A no-longer needed > >> call MatDestroy(A,error);CHKERRQ(error) > >> > >> one=(1,0) > >> > >> ! Set Diagonal elements in Identity Matrix B > >> do row=0,order-1 > >> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) > >> end do > >> > >> ! 
Assemble B > >> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> > >> ! Assemble X > >> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) > >> > >> ! Solve AX=B > >> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) > >> > >> ! Deallocate Storage > >> deallocate(columnIndices) > >> > >> call MatDestroy(factorMat,error);CHKERRQ(error) > >> call MatDestroy(B,error);CHKERRQ(error) > >> call MatDestroy(X,error);CHKERRQ(error) > >> > >> call PetscFinalize(error) > >> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Wed Nov 14 11:17:07 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 Nov 2007 11:17:07 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B2457.1050104@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> Message-ID: <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> For sequential codes the index set is as large as the matrix. In parallel the factor codes do not use the ordering, they do the ordering internally. Barry On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: > Can I just ask a question about MatLUFactorSymbolic() in this > context? What sizes should the 'row' and 'col' index sets be? Should > they span all global rows/columns in A? > > Matthew Knepley wrote: >> You appear to be setting every value in the sparse matrix. We do not >> throw out 0 values (since sometimes they are necessary for structural >> reasons). Thus you are allocating a ton of times. You need to remove >> the 0 values before calling MatSetValues (and their associated >> column entires as well). >> >> Matt >> >> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >> >>> Dear PETSc Users/Developers, >>> >>> I have the following sequential Fortran PETSc code that I have been >>> developing (on and off) based on the kind advice given by members of >>> this list, with respect to solving an inverse sparse matrix problem. >>> Essentially, the code reads in a square double complex matrix from >>> external file of size (order x order) and then proceeds to do a >>> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >>> identity matrix and X is the resultant dense matrix....hope that >>> makes >>> sense. >>> >>> My main problem is that the code stalls on the MatSetValues() for >>> the >>> sparse matrix A. With a trivial test matrix of (224 x 224) the >>> program >>> terminates successfully (by successfully I mean all instructions >>> execute...I am not interested in the validity of X right now). >>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>> MatSetValues() routine for matrix A is still in progress after 15 >>> minutes on one processor of our AMD Opteron IBM Cluster. 
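As Barry notes above, in the sequential case both index sets span the whole matrix, i.e. each is a permutation of all rows/columns. One hedged C sketch of obtaining such full-length permutations is MatGetOrdering(), rather than building an identity IS by hand (macro names as in the 2.3 series; A and ierr as elsewhere):

    IS rperm,cperm;   /* each IS is a permutation of 0..order-1, as long as the matrix */

    ierr = MatGetOrdering(A,MATORDERING_NATURAL,&rperm,&cperm);CHKERRQ(ierr);
    /* MATORDERING_ND or MATORDERING_RCM usually give less fill in the LU factors
       than the natural (identity) ordering built by hand in the code above.     */

The resulting rperm/cperm are then passed to MatLUFactorSymbolic() exactly where the hand-built indexSet is used above.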
I know that >>> people will be screaming "preallocation"...but I have tried to >>> take this >>> into account by running a loop over the rows in A and counting the >>> non-zero values explicitly prior to creation. I then pass this >>> vector >>> into the creation routine for the nnz argument. For the large >>> (2352 x >>> 2352) problem that seems to be taking forever to set...at most >>> there are >>> only 200 elements per row that are non-zero according to the counts. >>> >>> Can anyone explain why the MatSetValues() routine is taking such a >>> long >>> time. Maybe this expected for this specific task...although it seems >>> very long? >>> >>> I did notice that on the trivial (224 x 224) run that I was still >>> getting mallocs (approx 2000) for the A assembly when I used the - >>> info >>> command line parameter. I thought that it should be 0 if my >>> preallocation counts were exact? Does this hint that I am doing >>> something wrong. I have checked the code but don't see any obvious >>> problems in the logic...not that means anything. >>> >>> I would be grateful if someone could advise on this matter. Also, >>> if you >>> have a few seconds to spare I would be grateful if some experts >>> could >>> scan the remaining logic of the code (not in fine detail) to make >>> sure >>> that I am doing all that I need to do to get this calculation >>> working...assuming I can resolve the MatSetValues() problem. >>> >>> Once again many thanks in advance, >>> >>> Tim. >>> >>> ! Initialise the PETSc MPI Harness >>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>> >>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>> >>> ! Read in Matrix >>> open(321,file='Hamiltonian.bin',form='unformatted') >>> read(321) order >>> if (ID==0) then >>> print * >>> print *,processes," Processing Elements being used" >>> print * >>> print *,"Matrix has order ",order," rows by ",order," columns" >>> print * >>> end if >>> >>> allocate(matrix(order,order)) >>> read(321) matrix >>> close(321) >>> >>> ! Allocate array for nnz >>> allocate(numberZero(order)) >>> >>> ! Count number of non-zero elements in each matrix row >>> do row=1,order >>> count=0 >>> do column=1,order >>> if (matrix(row,column).ne.(0,0)) count=count+1 >>> end do >>> numberZero(row)=count >>> end do >>> >>> ! Declare a PETSc Matrices >>> >>> call >>> MatCreateSeqAIJ >>> (PETSC_COMM_SELF >>> ,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>> call >>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order, >>> 0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>> call >>> MatCreateSeqDense >>> (PETSC_COMM_SELF >>> ,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>> call >>> MatCreateSeqDense >>> (PETSC_COMM_SELF >>> ,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>> >>> ! Set up zero-based array indexing for use in MatSetValues >>> allocate(columnIndices(order)) >>> >>> do column=1,order >>> columnIndices(column)=column-1 >>> end do >>> >>> ! Need to transpose values array as row-major arrays are used. >>> call >>> MatSetValues >>> (A >>> ,order >>> ,columnIndices >>> ,order >>> ,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>> >>> ! Assemble Matrix A >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> >>> deallocate(matrix) >>> >>> ! 
Create Index Sets for Factorisation >>> call >>> ISCreateGeneral >>> (PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>> call >>> MatLUFactorSymbolic >>> (A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>> >>> ! A no-longer needed >>> call MatDestroy(A,error);CHKERRQ(error) >>> >>> one=(1,0) >>> >>> ! Set Diagonal elements in Identity Matrix B >>> do row=0,order-1 >>> call >>> MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>> end do >>> >>> ! Assemble B >>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> >>> ! Assemble X >>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>> >>> ! Solve AX=B >>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>> >>> ! Deallocate Storage >>> deallocate(columnIndices) >>> >>> call MatDestroy(factorMat,error);CHKERRQ(error) >>> call MatDestroy(B,error);CHKERRQ(error) >>> call MatDestroy(X,error);CHKERRQ(error) >>> >>> call PetscFinalize(error) >>> >>> -- >>> Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >>> >>> >> >> >> >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From timothy.stitt at ichec.ie Wed Nov 14 12:04:21 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 14 Nov 2007 18:04:21 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> Message-ID: <473B38A5.8080602@ichec.ie> OK...everything is working well now and I am getting the results I expect. Much appreciated. Saying that...I am trying to now satisfy the PC FACTOR FILL suggestion provided by the -info parameter on my sample sparse matrices. In my case I am getting: [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.56568 [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.56568 or use [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.56568); [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. So I run my code with ./foo -pc_factor_fill 2.56568 but I continually get WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-pc_factor_fill value: 2.56568 Can someone suggest how I can improve performance with the pc_factor_fill parameter in my case? As my code stands there is no MatSetFromOptions() as I set everything explicitly in the code. Thanks again, Tim. Barry Smith wrote: > > For sequential codes the index set is as large as the matrix. > > In parallel the factor codes do not use the ordering, they do the > ordering > internally. > > Barry > > On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: > >> Can I just ask a question about MatLUFactorSymbolic() in this >> context? What sizes should the 'row' and 'col' index sets be? Should >> they span all global rows/columns in A? 
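One likely reason the option is reported as unused: -pc_factor_fill (and PCFactorSetFill) are read by a PC object such as PCLU inside a KSP, and the code above factors the matrix directly with MatLUFactorSymbolic()/MatLUFactorNumeric(), so no PC ever queries the option. When factoring directly, the fill estimate can be supplied through the MatFactorInfo argument instead. A minimal C sketch in the 2.3-era calling sequence used above (rperm/cperm are the ordering index sets from the previous sketch; F is the factor matrix created by the symbolic step; the Fortran interface carries the same fill entry on its MatFactorInfo argument):

    MatFactorInfo info;
    Mat           F;

    ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
    info.fill = 2.57;   /* expected nonzeros(factor)/nonzeros(A), here taken from the -info report */
    ierr = MatLUFactorSymbolic(A,rperm,cperm,&info,&F);CHKERRQ(ierr);
    ierr = MatLUFactorNumeric(A,&info,&F);CHKERRQ(ierr);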
>> >> Matthew Knepley wrote: >>> You appear to be setting every value in the sparse matrix. We do not >>> throw out 0 values (since sometimes they are necessary for structural >>> reasons). Thus you are allocating a ton of times. You need to remove >>> the 0 values before calling MatSetValues (and their associated >>> column entires as well). >>> >>> Matt >>> >>> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >>> >>>> Dear PETSc Users/Developers, >>>> >>>> I have the following sequential Fortran PETSc code that I have been >>>> developing (on and off) based on the kind advice given by members of >>>> this list, with respect to solving an inverse sparse matrix problem. >>>> Essentially, the code reads in a square double complex matrix from >>>> external file of size (order x order) and then proceeds to do a >>>> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >>>> identity matrix and X is the resultant dense matrix....hope that makes >>>> sense. >>>> >>>> My main problem is that the code stalls on the MatSetValues() for the >>>> sparse matrix A. With a trivial test matrix of (224 x 224) the program >>>> terminates successfully (by successfully I mean all instructions >>>> execute...I am not interested in the validity of X right now). >>>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>>> MatSetValues() routine for matrix A is still in progress after 15 >>>> minutes on one processor of our AMD Opteron IBM Cluster. I know that >>>> people will be screaming "preallocation"...but I have tried to take >>>> this >>>> into account by running a loop over the rows in A and counting the >>>> non-zero values explicitly prior to creation. I then pass this vector >>>> into the creation routine for the nnz argument. For the large (2352 x >>>> 2352) problem that seems to be taking forever to set...at most >>>> there are >>>> only 200 elements per row that are non-zero according to the counts. >>>> >>>> Can anyone explain why the MatSetValues() routine is taking such a >>>> long >>>> time. Maybe this expected for this specific task...although it seems >>>> very long? >>>> >>>> I did notice that on the trivial (224 x 224) run that I was still >>>> getting mallocs (approx 2000) for the A assembly when I used the -info >>>> command line parameter. I thought that it should be 0 if my >>>> preallocation counts were exact? Does this hint that I am doing >>>> something wrong. I have checked the code but don't see any obvious >>>> problems in the logic...not that means anything. >>>> >>>> I would be grateful if someone could advise on this matter. Also, >>>> if you >>>> have a few seconds to spare I would be grateful if some experts could >>>> scan the remaining logic of the code (not in fine detail) to make >>>> sure >>>> that I am doing all that I need to do to get this calculation >>>> working...assuming I can resolve the MatSetValues() problem. >>>> >>>> Once again many thanks in advance, >>>> >>>> Tim. >>>> >>>> ! Initialise the PETSc MPI Harness >>>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>>> >>>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>>> >>>> ! 
Read in Matrix >>>> open(321,file='Hamiltonian.bin',form='unformatted') >>>> read(321) order >>>> if (ID==0) then >>>> print * >>>> print *,processes," Processing Elements being used" >>>> print * >>>> print *,"Matrix has order ",order," rows by ",order," columns" >>>> print * >>>> end if >>>> >>>> allocate(matrix(order,order)) >>>> read(321) matrix >>>> close(321) >>>> >>>> ! Allocate array for nnz >>>> allocate(numberZero(order)) >>>> >>>> ! Count number of non-zero elements in each matrix row >>>> do row=1,order >>>> count=0 >>>> do column=1,order >>>> if (matrix(row,column).ne.(0,0)) count=count+1 >>>> end do >>>> numberZero(row)=count >>>> end do >>>> >>>> ! Declare a PETSc Matrices >>>> >>>> call >>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>>> >>>> call >>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>>> >>>> call >>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>>> >>>> call >>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>>> >>>> >>>> ! Set up zero-based array indexing for use in MatSetValues >>>> allocate(columnIndices(order)) >>>> >>>> do column=1,order >>>> columnIndices(column)=column-1 >>>> end do >>>> >>>> ! Need to transpose values array as row-major arrays are used. >>>> call >>>> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>>> >>>> >>>> ! Assemble Matrix A >>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> >>>> deallocate(matrix) >>>> >>>> ! Create Index Sets for Factorisation >>>> call >>>> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >>>> >>>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>>> call >>>> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>>> >>>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>>> >>>> ! A no-longer needed >>>> call MatDestroy(A,error);CHKERRQ(error) >>>> >>>> one=(1,0) >>>> >>>> ! Set Diagonal elements in Identity Matrix B >>>> do row=0,order-1 >>>> call MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>>> end do >>>> >>>> ! Assemble B >>>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> >>>> ! Assemble X >>>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>> >>>> ! Solve AX=B >>>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>>> >>>> ! Deallocate Storage >>>> deallocate(columnIndices) >>>> >>>> call MatDestroy(factorMat,error);CHKERRQ(error) >>>> call MatDestroy(B,error);CHKERRQ(error) >>>> call MatDestroy(X,error);CHKERRQ(error) >>>> >>>> call PetscFinalize(error) >>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>> >>> >>> >>> >> >> >> --Dr. 
Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Thu Nov 15 08:20:19 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 15 Nov 2007 14:20:19 +0000 Subject: SUPERLU Type and MatMatSolve() Message-ID: <473C55A3.1070504@ichec.ie> Hi, Just wondering if it is possible to use the SUPERLU matrix type with the MatMatSolve() routine. I changed a working code (which uses MatMatSolve()) by the setting the matrix type to superlu (using MatSetType()) and now I get the following runtime errors in MatMatSolve(): [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Null Object: Parameter # 1! Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From hzhang at mcs.anl.gov Thu Nov 15 08:39:08 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 15 Nov 2007 08:39:08 -0600 (CST) Subject: SUPERLU Type and MatMatSolve() In-Reply-To: <473C55A3.1070504@ichec.ie> References: <473C55A3.1070504@ichec.ie> Message-ID: On Thu, 15 Nov 2007, Tim Stitt wrote: > Hi, > > Just wondering if it is possible to use the SUPERLU matrix type with the > MatMatSolve() routine. > > I changed a working code (which uses MatMatSolve()) by the setting the matrix > type to superlu (using MatSetType()) and now I get the following runtime > errors in MatMatSolve(): The current petsc-SUPERLU interface doesn't support MatMatSolve(). Hong > > [0]PETSC ERROR: Null argument, when expecting valid pointer! > [0]PETSC ERROR: Null Object: Parameter # 1! > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > From timothy.stitt at ichec.ie Thu Nov 15 10:26:37 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 15 Nov 2007 16:26:37 +0000 Subject: SUPERLU Type and MatMatSolve() In-Reply-To: References: <473C55A3.1070504@ichec.ie> Message-ID: <473C733D.7090306@ichec.ie> Hong, Does MatSolve() support SuperLU and MUMPS? Hong Zhang wrote: > > > On Thu, 15 Nov 2007, Tim Stitt wrote: > >> Hi, >> >> Just wondering if it is possible to use the SUPERLU matrix type with >> the MatMatSolve() routine. >> >> I changed a working code (which uses MatMatSolve()) by the setting >> the matrix type to superlu (using MatSetType()) and now I get the >> following runtime errors in MatMatSolve(): > > The current petsc-SUPERLU interface doesn't support MatMatSolve(). > > Hong > >> >> [0]PETSC ERROR: Null argument, when expecting valid pointer! >> [0]PETSC ERROR: Null Object: Parameter # 1! >> >> Thanks, >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> > -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Thu Nov 15 10:59:21 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Nov 2007 10:59:21 -0600 Subject: SUPERLU Type and MatMatSolve() In-Reply-To: <473C733D.7090306@ichec.ie> References: <473C55A3.1070504@ichec.ie> <473C733D.7090306@ichec.ie> Message-ID: Yes. Matt On Nov 15, 2007 10:26 AM, Tim Stitt wrote: > Hong, > > Does MatSolve() support SuperLU and MUMPS? > > Hong Zhang wrote: > > > > > > On Thu, 15 Nov 2007, Tim Stitt wrote: > > > >> Hi, > >> > >> Just wondering if it is possible to use the SUPERLU matrix type with > >> the MatMatSolve() routine. > >> > >> I changed a working code (which uses MatMatSolve()) by the setting > >> the matrix type to superlu (using MatSetType()) and now I get the > >> following runtime errors in MatMatSolve(): > > > > The current petsc-SUPERLU interface doesn't support MatMatSolve(). > > > > Hong > > > >> > >> [0]PETSC ERROR: Null argument, when expecting valid pointer! > >> [0]PETSC ERROR: Null Object: Parameter # 1! > >> > >> Thanks, > >> > >> Tim. > >> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From hzhang at mcs.anl.gov Thu Nov 15 11:03:30 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 15 Nov 2007 11:03:30 -0600 (CST) Subject: SUPERLU Type and MatMatSolve() In-Reply-To: <473C733D.7090306@ichec.ie> References: <473C55A3.1070504@ichec.ie> <473C733D.7090306@ichec.ie> Message-ID: On Thu, 15 Nov 2007, Tim Stitt wrote: > Hong, > > Does MatSolve() support SuperLU and MUMPS? Yes. Hong > > Hong Zhang wrote: >> >> >> On Thu, 15 Nov 2007, Tim Stitt wrote: >> >>> Hi, >>> >>> Just wondering if it is possible to use the SUPERLU matrix type with the >>> MatMatSolve() routine. >>> >>> I changed a working code (which uses MatMatSolve()) by the setting the >>> matrix type to superlu (using MatSetType()) and now I get the following >>> runtime errors in MatMatSolve(): >> >> The current petsc-SUPERLU interface doesn't support MatMatSolve(). >> >> Hong >> >>> >>> [0]PETSC ERROR: Null argument, when expecting valid pointer! >>> [0]PETSC ERROR: Null Object: Parameter # 1! >>> >>> Thanks, >>> >>> Tim. >>> >>> -- >>> Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >>> >> > > > -- > Dr. 
Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > From bsmith at mcs.anl.gov Thu Nov 15 13:43:50 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Nov 2007 13:43:50 -0600 Subject: AX=B Fortran Petsc Code In-Reply-To: <473B38A5.8080602@ichec.ie> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> <473B38A5.8080602@ichec.ie> Message-ID: <517A8A62-BDC1-495E-87C6-B2C5659A5B01@mcs.anl.gov> Tim, There is an field in MatFactorInfo that contains this fill factor called fill set it with 2.5 and you should be all set. Barry On Nov 14, 2007, at 12:04 PM, Tim Stitt wrote: > OK...everything is working well now and I am getting the results I > expect. Much appreciated. > > Saying that...I am trying to now satisfy the PC FACTOR FILL > suggestion provided by the -info parameter on my sample sparse > matrices. > > In my case I am getting: > > [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 2.56568 > [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.56568 > or use > [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.56568); > [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > So I run my code with ./foo -pc_factor_fill 2.56568 > > but I continually get > > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-pc_factor_fill value: 2.56568 > > Can someone suggest how I can improve performance with the > pc_factor_fill parameter in my case? As my code stands there is no > MatSetFromOptions() as I set everything explicitly in the code. > > Thanks again, > > Tim. > > Barry Smith wrote: >> >> For sequential codes the index set is as large as the matrix. >> >> In parallel the factor codes do not use the ordering, they do the >> ordering >> internally. >> >> Barry >> >> On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: >> >>> Can I just ask a question about MatLUFactorSymbolic() in this >>> context? What sizes should the 'row' and 'col' index sets be? >>> Should they span all global rows/columns in A? >>> >>> Matthew Knepley wrote: >>>> You appear to be setting every value in the sparse matrix. We do >>>> not >>>> throw out 0 values (since sometimes they are necessary for >>>> structural >>>> reasons). Thus you are allocating a ton of times. You need to >>>> remove >>>> the 0 values before calling MatSetValues (and their associated >>>> column entires as well). >>>> >>>> Matt >>>> >>>> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >>>> >>>>> Dear PETSc Users/Developers, >>>>> >>>>> I have the following sequential Fortran PETSc code that I have >>>>> been >>>>> developing (on and off) based on the kind advice given by >>>>> members of >>>>> this list, with respect to solving an inverse sparse matrix >>>>> problem. >>>>> Essentially, the code reads in a square double complex matrix from >>>>> external file of size (order x order) and then proceeds to do a >>>>> MatMatSolve(), where A is the sparse matrix to invert, B is a >>>>> dense >>>>> identity matrix and X is the resultant dense matrix....hope that >>>>> makes >>>>> sense. >>>>> >>>>> My main problem is that the code stalls on the MatSetValues() >>>>> for the >>>>> sparse matrix A. 
With a trivial test matrix of (224 x 224) the >>>>> program >>>>> terminates successfully (by successfully I mean all instructions >>>>> execute...I am not interested in the validity of X right now). >>>>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>>>> MatSetValues() routine for matrix A is still in progress after 15 >>>>> minutes on one processor of our AMD Opteron IBM Cluster. I know >>>>> that >>>>> people will be screaming "preallocation"...but I have tried to >>>>> take this >>>>> into account by running a loop over the rows in A and counting the >>>>> non-zero values explicitly prior to creation. I then pass this >>>>> vector >>>>> into the creation routine for the nnz argument. For the large >>>>> (2352 x >>>>> 2352) problem that seems to be taking forever to set...at most >>>>> there are >>>>> only 200 elements per row that are non-zero according to the >>>>> counts. >>>>> >>>>> Can anyone explain why the MatSetValues() routine is taking such >>>>> a long >>>>> time. Maybe this expected for this specific task...although it >>>>> seems >>>>> very long? >>>>> >>>>> I did notice that on the trivial (224 x 224) run that I was still >>>>> getting mallocs (approx 2000) for the A assembly when I used the >>>>> -info >>>>> command line parameter. I thought that it should be 0 if my >>>>> preallocation counts were exact? Does this hint that I am doing >>>>> something wrong. I have checked the code but don't see any obvious >>>>> problems in the logic...not that means anything. >>>>> >>>>> I would be grateful if someone could advise on this matter. >>>>> Also, if you >>>>> have a few seconds to spare I would be grateful if some experts >>>>> could >>>>> scan the remaining logic of the code (not in fine detail) to >>>>> make sure >>>>> that I am doing all that I need to do to get this calculation >>>>> working...assuming I can resolve the MatSetValues() problem. >>>>> >>>>> Once again many thanks in advance, >>>>> >>>>> Tim. >>>>> >>>>> ! Initialise the PETSc MPI Harness >>>>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>>>> >>>>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>>>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>>>> >>>>> ! Read in Matrix >>>>> open(321,file='Hamiltonian.bin',form='unformatted') >>>>> read(321) order >>>>> if (ID==0) then >>>>> print * >>>>> print *,processes," Processing Elements being used" >>>>> print * >>>>> print *,"Matrix has order ",order," rows by ",order," columns" >>>>> print * >>>>> end if >>>>> >>>>> allocate(matrix(order,order)) >>>>> read(321) matrix >>>>> close(321) >>>>> >>>>> ! Allocate array for nnz >>>>> allocate(numberZero(order)) >>>>> >>>>> ! Count number of non-zero elements in each matrix row >>>>> do row=1,order >>>>> count=0 >>>>> do column=1,order >>>>> if (matrix(row,column).ne.(0,0)) count=count+1 >>>>> end do >>>>> numberZero(row)=count >>>>> end do >>>>> >>>>> ! Declare a PETSc Matrices >>>>> >>>>> call >>>>> MatCreateSeqAIJ >>>>> (PETSC_COMM_SELF >>>>> ,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>>>> call >>>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order, >>>>> 0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>>>> call >>>>> MatCreateSeqDense >>>>> (PETSC_COMM_SELF >>>>> ,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>>>> call >>>>> MatCreateSeqDense >>>>> (PETSC_COMM_SELF >>>>> ,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>>>> >>>>> ! 
Set up zero-based array indexing for use in MatSetValues >>>>> allocate(columnIndices(order)) >>>>> >>>>> do column=1,order >>>>> columnIndices(column)=column-1 >>>>> end do >>>>> >>>>> ! Need to transpose values array as row-major arrays are used. >>>>> call >>>>> MatSetValues >>>>> (A >>>>> ,order >>>>> ,columnIndices >>>>> ,order >>>>> ,columnIndices >>>>> ,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>>>> >>>>> ! Assemble Matrix A >>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> >>>>> deallocate(matrix) >>>>> >>>>> ! Create Index Sets for Factorisation >>>>> call >>>>> ISCreateGeneral >>>>> (PETSC_COMM_SELF >>>>> ,order,columnIndices,indexSet,error);CHKERRQ(error) >>>>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>>>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>>>> call >>>>> MatLUFactorSymbolic >>>>> (A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>>>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>>>> >>>>> ! A no-longer needed >>>>> call MatDestroy(A,error);CHKERRQ(error) >>>>> >>>>> one=(1,0) >>>>> >>>>> ! Set Diagonal elements in Identity Matrix B >>>>> do row=0,order-1 >>>>> call >>>>> MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>>>> end do >>>>> >>>>> ! Assemble B >>>>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> >>>>> ! Assemble X >>>>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>> >>>>> ! Solve AX=B >>>>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>>>> >>>>> ! Deallocate Storage >>>>> deallocate(columnIndices) >>>>> >>>>> call MatDestroy(factorMat,error);CHKERRQ(error) >>>>> call MatDestroy(B,error);CHKERRQ(error) >>>>> call MatDestroy(X,error);CHKERRQ(error) >>>>> >>>>> call PetscFinalize(error) >>>>> >>>>> -- >>>>> Dr. Timothy Stitt >>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>> >>>>> Dublin Institute for Advanced Studies >>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>> >>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>> >>>>> >>>>> >>>> >>>> >>>> >>>> >>> >>> >>> --Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From timothy.stitt at ichec.ie Thu Nov 15 14:41:23 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 15 Nov 2007 20:41:23 +0000 Subject: AX=B Fortran Petsc Code In-Reply-To: <517A8A62-BDC1-495E-87C6-B2C5659A5B01@mcs.anl.gov> References: <473B0288.2060002@ichec.ie> <473B2457.1050104@ichec.ie> <6D457C11-14AD-4E43-B842-8AD24C64F993@mcs.anl.gov> <473B38A5.8080602@ichec.ie> <517A8A62-BDC1-495E-87C6-B2C5659A5B01@mcs.anl.gov> Message-ID: <473CAEF3.1030903@ichec.ie> Thanks Barry. Barry Smith wrote: > > Tim, > > There is an field in MatFactorInfo that contains this fill factor > called > fill set it with 2.5 and you should be all set. > > Barry > > On Nov 14, 2007, at 12:04 PM, Tim Stitt wrote: > >> OK...everything is working well now and I am getting the results I >> expect. Much appreciated. 
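A minimal sketch of what Barry's suggestion could look like in the code above. How the 2.3.3-era Fortran interface exposes the fill member of MatFactorInfo is an assumption here (shown as an indexed array with a MAT_FACTORINFO_FILL entry; in C the field is simply info.fill):

      ! Illustrative sketch only -- set the expected fill ratio before the
      ! symbolic factorization so no reallocations are needed.
      call MatFactorInfoInitialize(info,error);CHKERRQ(error)
      info(MAT_FACTORINFO_FILL) = 2.5
      call MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error)
      call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error)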
>> >> Saying that...I am trying to now satisfy the PC FACTOR FILL >> suggestion provided by the -info parameter on my sample sparse matrices. >> >> In my case I am getting: >> >> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 2.56568 >> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.56568 or >> use >> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.56568); >> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> So I run my code with ./foo -pc_factor_fill 2.56568 >> >> but I continually get >> >> WARNING! There are options you set that were not used! >> WARNING! could be spelling mistake, etc! >> Option left: name:-pc_factor_fill value: 2.56568 >> >> Can someone suggest how I can improve performance with the >> pc_factor_fill parameter in my case? As my code stands there is no >> MatSetFromOptions() as I set everything explicitly in the code. >> >> Thanks again, >> >> Tim. >> >> Barry Smith wrote: >>> >>> For sequential codes the index set is as large as the matrix. >>> >>> In parallel the factor codes do not use the ordering, they do the >>> ordering >>> internally. >>> >>> Barry >>> >>> On Nov 14, 2007, at 10:37 AM, Tim Stitt wrote: >>> >>>> Can I just ask a question about MatLUFactorSymbolic() in this >>>> context? What sizes should the 'row' and 'col' index sets be? >>>> Should they span all global rows/columns in A? >>>> >>>> Matthew Knepley wrote: >>>>> You appear to be setting every value in the sparse matrix. We do not >>>>> throw out 0 values (since sometimes they are necessary for structural >>>>> reasons). Thus you are allocating a ton of times. You need to remove >>>>> the 0 values before calling MatSetValues (and their associated >>>>> column entires as well). >>>>> >>>>> Matt >>>>> >>>>> On Nov 14, 2007 8:13 AM, Tim Stitt wrote: >>>>> >>>>>> Dear PETSc Users/Developers, >>>>>> >>>>>> I have the following sequential Fortran PETSc code that I have been >>>>>> developing (on and off) based on the kind advice given by members of >>>>>> this list, with respect to solving an inverse sparse matrix problem. >>>>>> Essentially, the code reads in a square double complex matrix from >>>>>> external file of size (order x order) and then proceeds to do a >>>>>> MatMatSolve(), where A is the sparse matrix to invert, B is a dense >>>>>> identity matrix and X is the resultant dense matrix....hope that >>>>>> makes >>>>>> sense. >>>>>> >>>>>> My main problem is that the code stalls on the MatSetValues() for >>>>>> the >>>>>> sparse matrix A. With a trivial test matrix of (224 x 224) the >>>>>> program >>>>>> terminates successfully (by successfully I mean all instructions >>>>>> execute...I am not interested in the validity of X right now). >>>>>> Unfortunately, when I move up to a (2352 x 2352) matrix the >>>>>> MatSetValues() routine for matrix A is still in progress after 15 >>>>>> minutes on one processor of our AMD Opteron IBM Cluster. I know that >>>>>> people will be screaming "preallocation"...but I have tried to >>>>>> take this >>>>>> into account by running a loop over the rows in A and counting the >>>>>> non-zero values explicitly prior to creation. I then pass this >>>>>> vector >>>>>> into the creation routine for the nnz argument. For the large >>>>>> (2352 x >>>>>> 2352) problem that seems to be taking forever to set...at most >>>>>> there are >>>>>> only 200 elements per row that are non-zero according to the counts. 
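A minimal sketch of the row-by-row insertion Matt describes above, in which only the nonzero entries of each row are passed to MatSetValues() so the assembly matches the preallocated counts exactly. The work arrays packedColumns, packedValues (length order) and rowIndex(1) are hypothetical and would have to be declared and allocated alongside the existing variables:

      ! Illustrative sketch only -- pack the nonzeros of one row at a time
      ! and insert just those, instead of the full dense row.
      do row=1,order
         count=0
         do column=1,order
            if (matrix(row,column).ne.(0,0)) then
               count=count+1
               packedColumns(count)=column-1   ! zero-based column index
               packedValues(count)=matrix(row,column)
            end if
         end do
         rowIndex(1)=row-1                     ! zero-based row index
         call MatSetValues(A,1,rowIndex,count,packedColumns,packedValues,INSERT_VALUES,error);CHKERRQ(error)
      end do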
>>>>>> >>>>>> Can anyone explain why the MatSetValues() routine is taking such >>>>>> a long >>>>>> time. Maybe this expected for this specific task...although it seems >>>>>> very long? >>>>>> >>>>>> I did notice that on the trivial (224 x 224) run that I was still >>>>>> getting mallocs (approx 2000) for the A assembly when I used the >>>>>> -info >>>>>> command line parameter. I thought that it should be 0 if my >>>>>> preallocation counts were exact? Does this hint that I am doing >>>>>> something wrong. I have checked the code but don't see any obvious >>>>>> problems in the logic...not that means anything. >>>>>> >>>>>> I would be grateful if someone could advise on this matter. Also, >>>>>> if you >>>>>> have a few seconds to spare I would be grateful if some experts >>>>>> could >>>>>> scan the remaining logic of the code (not in fine detail) to >>>>>> make sure >>>>>> that I am doing all that I need to do to get this calculation >>>>>> working...assuming I can resolve the MatSetValues() problem. >>>>>> >>>>>> Once again many thanks in advance, >>>>>> >>>>>> Tim. >>>>>> >>>>>> ! Initialise the PETSc MPI Harness >>>>>> call PetscInitialize(PETSC_NULL_CHARACTER,error);CHKERRQ(error) >>>>>> >>>>>> call MPI_COMM_SIZE(PETSC_COMM_SELF,processes,error);CHKERRQ(error) >>>>>> call MPI_COMM_RANK(PETSC_COMM_SELF,ID,error);CHKERRQ(error) >>>>>> >>>>>> ! Read in Matrix >>>>>> open(321,file='Hamiltonian.bin',form='unformatted') >>>>>> read(321) order >>>>>> if (ID==0) then >>>>>> print * >>>>>> print *,processes," Processing Elements being used" >>>>>> print * >>>>>> print *,"Matrix has order ",order," rows by ",order," columns" >>>>>> print * >>>>>> end if >>>>>> >>>>>> allocate(matrix(order,order)) >>>>>> read(321) matrix >>>>>> close(321) >>>>>> >>>>>> ! Allocate array for nnz >>>>>> allocate(numberZero(order)) >>>>>> >>>>>> ! Count number of non-zero elements in each matrix row >>>>>> do row=1,order >>>>>> count=0 >>>>>> do column=1,order >>>>>> if (matrix(row,column).ne.(0,0)) count=count+1 >>>>>> end do >>>>>> numberZero(row)=count >>>>>> end do >>>>>> >>>>>> ! Declare a PETSc Matrices >>>>>> >>>>>> call >>>>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,PETSC_NULL_INTEGER,numberZero,A,error);CHKERRQ(error) >>>>>> >>>>>> call >>>>>> MatCreateSeqAIJ(PETSC_COMM_SELF,order,order,0,PETSC_NULL_INTEGER,factorMat,error);CHKERRQ(error) >>>>>> >>>>>> call >>>>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,X,error);CHKERRQ(error) >>>>>> >>>>>> call >>>>>> MatCreateSeqDense(PETSC_COMM_SELF,order,order,PETSC_NULL_SCALAR,B,error);CHKERRQ(error) >>>>>> >>>>>> >>>>>> ! Set up zero-based array indexing for use in MatSetValues >>>>>> allocate(columnIndices(order)) >>>>>> >>>>>> do column=1,order >>>>>> columnIndices(column)=column-1 >>>>>> end do >>>>>> >>>>>> ! Need to transpose values array as row-major arrays are used. >>>>>> call >>>>>> MatSetValues(A,order,columnIndices,order,columnIndices,transpose(matrix),INSERT_VALUES,error);CHKERRQ(error) >>>>>> >>>>>> >>>>>> ! Assemble Matrix A >>>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> >>>>>> deallocate(matrix) >>>>>> >>>>>> ! 
Create Index Sets for Factorisation >>>>>> call >>>>>> ISCreateGeneral(PETSC_COMM_SELF,order,columnIndices,indexSet,error);CHKERRQ(error) >>>>>> >>>>>> call MatFactorInfoInitialize(info,error);CHKERRQ(error) >>>>>> call ISSetPermutation(indexSet,error);CHKERRQ(error) >>>>>> call >>>>>> MatLUFactorSymbolic(A,indexSet,indexSet,info,factorMat,error);CHKERRQ(error) >>>>>> >>>>>> call MatLUFactorNumeric(A,info,factorMat,error);CHKERRQ(error) >>>>>> >>>>>> ! A no-longer needed >>>>>> call MatDestroy(A,error);CHKERRQ(error) >>>>>> >>>>>> one=(1,0) >>>>>> >>>>>> ! Set Diagonal elements in Identity Matrix B >>>>>> do row=0,order-1 >>>>>> call >>>>>> MatSetValue(B,row,row,one,INSERT_VALUES,error);CHKERRQ(error) >>>>>> end do >>>>>> >>>>>> ! Assemble B >>>>>> call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> >>>>>> ! Assemble X >>>>>> call MatAssemblyBegin(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> call MatAssemblyEnd(X,MAT_FINAL_ASSEMBLY,error);CHKERRQ(error) >>>>>> >>>>>> ! Solve AX=B >>>>>> call MatMatSolve(factorMat,B,X,error);CHKERRQ(error) >>>>>> >>>>>> ! Deallocate Storage >>>>>> deallocate(columnIndices) >>>>>> >>>>>> call MatDestroy(factorMat,error);CHKERRQ(error) >>>>>> call MatDestroy(B,error);CHKERRQ(error) >>>>>> call MatDestroy(X,error);CHKERRQ(error) >>>>>> >>>>>> call PetscFinalize(error) >>>>>> >>>>>> --Dr. Timothy Stitt >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>> >>>>>> Dublin Institute for Advanced Studies >>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>> >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> --Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>> >> >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Fri Nov 16 09:12:15 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Fri, 16 Nov 2007 15:12:15 +0000 Subject: Linking Static Library to PETSc code Message-ID: <473DB34F.4060803@ichec.ie> PETSc Developers, I am trying to link a self-written static library to my PETSc code but during the compile and link phase I keep getting the following link error with my library: "could not read symbols: Bad value" Can anyone suggest how I can call external routines from my PETSc code which are packaged in an external static library without getting this error? Thanks, Tim. -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From mfhoel at ifi.uio.no Fri Nov 16 10:24:06 2007 From: mfhoel at ifi.uio.no (Mads Hoel) Date: Fri, 16 Nov 2007 17:24:06 +0100 Subject: Linking Static Library to PETSc code In-Reply-To: <473DB34F.4060803@ichec.ie> References: <473DB34F.4060803@ichec.ie> Message-ID: On Fri, 16 Nov 2007 16:12:15 +0100, Tim Stitt wrote: > could not read symbols: Bad value I haven't seen that error message before, but i looked it up in a search engine and got 4 cases that might be the solution to your problem, suggesting to recompile the static library with -fPIC: http://www.gentoo.org/proj/en/base/amd64/howtos/index.xml?part=1&chap=3 -- Mads Hoel From balay at mcs.anl.gov Fri Nov 16 12:10:49 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 16 Nov 2007 12:10:49 -0600 (CST) Subject: Linking Static Library to PETSc code In-Reply-To: References: <473DB34F.4060803@ichec.ie> Message-ID: On Fri, 16 Nov 2007, Mads Hoel wrote: > On Fri, 16 Nov 2007 16:12:15 +0100, Tim Stitt wrote: > > > could not read symbols: Bad value > > I haven't seen that error message before, but i looked it up in a search > engine and got 4 cases that might be the solution to your problem, suggesting > to recompile the static library with -fPIC: > http://www.gentoo.org/proj/en/base/amd64/howtos/index.xml?part=1&chap=3 Can you post the complete log [compiler, compiler options etc..] of: - how this static library was built - how you are attempting to link it with PETSc -complete error message Also is this linux or linux64? This additional info could give us clues as to whats going wrong. Satish From bknaepen at ulb.ac.be Sun Nov 18 03:32:23 2007 From: bknaepen at ulb.ac.be (Bernard Knaepen) Date: Sun, 18 Nov 2007 10:32:23 +0100 Subject: problem compiling PETSC on MacOS Leopard Message-ID: Hello, I would like to compile PETSC on Leopard but I am encountering a problem during configuration. The scripts stops with: dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx = = = = = = = = = ======================================================================== Configuring PETSc to compile on your system = = = = = = = = = ======================================================================== TESTING: checkFortranCompiler from config.setCompilers(python/ BuildSystem/config/setCompilers.py: 708 ) ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- Fortran compiler you provided with --with-fc=mpif90 does not work ********************************************************************************* My MPI installation is mpich2 1.0.6p1 and I have the latest ifort compiler installed (10.0.20). I have test mpif90 and it is working ok. I copy below the configure.log file. Any help would be appreciated, thanks, Bernard. 
Pushing language C Popping language C Pushing language Cxx Popping language Cxx Pushing language FC Popping language FC sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/ config/packages/config.guess Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/ BuildSystem/config/packages/config.guess sh: i686-apple-darwin9.1.0 sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/ config/packages/config.sub i686-apple-darwin9.1.0 Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/python/ BuildSystem/config/packages/config.sub i686-apple-darwin9.1.0 sh: i686-apple-darwin9.1.0 = = = = = = = = ======================================================================== = = = = = = = = ======================================================================== Starting Configure Run at Sun Nov 18 10:29:29 2007 Configure Options: --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-shared=0 --configModules=PETSc.Configure -- optionsModule=PETSc.compilerOptions Working directory: /Users/bknaepen/Unix/petsc-2.3.3-p8 Python version: 2.5.1 (r251:54863, Oct 5 2007, 21:08:09) [GCC 4.0.1 (Apple Inc. build 5465)] = = = = = = = = ======================================================================== Pushing language C Popping language C Pushing language Cxx Popping language Cxx Pushing language FC Popping language FC = = = = = = = = ======================================================================== TEST configureExternalPackagesDir from config.framework(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py:807) TESTING: configureExternalPackagesDir from config.framework(python/ BuildSystem/config/framework.py:807) = = = = = = = = ======================================================================== TEST configureLibrary from PETSc.packages.PVODE(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/packages/PVODE.py:10) TESTING: configureLibrary from PETSc.packages.PVODE(python/PETSc/ packages/PVODE.py:10) Find a PVODE installation and check if it can work with PETSc = = = = = = = = ======================================================================== TEST configureLibrary from PETSc.packages.NetCDF(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/packages/NetCDF.py:10) TESTING: configureLibrary from PETSc.packages.NetCDF(python/PETSc/ packages/NetCDF.py:10) Find a NetCDF installation and check if it can work with PETSc = = = = = = = = ======================================================================== TEST configureMercurial from config.sourceControl(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:23) TESTING: configureMercurial from config.sourceControl(python/ BuildSystem/config/sourceControl.py:23) Find the Mercurial executable Checking for program /opt/intel/fc/10.0.020/bin/hg...not found Checking for program /usr/X11R6/bin/hg...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/hg...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/hg...not found Checking for program /bin/hg...not found Checking for program /sbin/hg...not found Checking for program /usr/bin/hg...not found Checking for program /usr/sbin/hg...not found Checking for program /usr/local/bin/hg...not found Checking for program /usr/texbin/hg...not found Checking for program /Users/bknaepen/hg...not found = = = = = = = = ======================================================================== TEST configureCVS from config.sourceControl(/Users/bknaepen/Unix/ 
petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:30) TESTING: configureCVS from config.sourceControl(python/BuildSystem/ config/sourceControl.py:30) Find the CVS executable Checking for program /opt/intel/fc/10.0.020/bin/cvs...not found Checking for program /usr/X11R6/bin/cvs...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cvs...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cvs...not found Checking for program /bin/cvs...not found Checking for program /sbin/cvs...not found Checking for program /usr/bin/cvs...found Defined make macro "CVS" to "cvs" = = = = = = = = ======================================================================== TEST configureSubversion from config.sourceControl(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:35) TESTING: configureSubversion from config.sourceControl(python/ BuildSystem/config/sourceControl.py:35) Find the Subversion executable Checking for program /opt/intel/fc/10.0.020/bin/svn...not found Checking for program /usr/X11R6/bin/svn...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/svn...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/svn...not found Checking for program /bin/svn...not found Checking for program /sbin/svn...not found Checking for program /usr/bin/svn...found Defined make macro "SVN" to "svn" = = = = = = = = ======================================================================== TEST configureMkdir from config.programs(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/programs.py:21) TESTING: configureMkdir from config.programs(python/BuildSystem/config/ programs.py:21) Make sure we can have mkdir automatically make intermediate directories Checking for program /opt/intel/fc/10.0.020/bin/mkdir...not found Checking for program /usr/X11R6/bin/mkdir...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mkdir...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mkdir...not found Checking for program /bin/mkdir...found sh: /bin/mkdir -p .conftest/tmp Executing: /bin/mkdir -p .conftest/tmp sh: Adding -p flag to /bin/mkdir -p to automatically create directories Defined make macro "MKDIR" to "/bin/mkdir -p" = = = = = = = = ======================================================================== TEST configurePrograms from config.programs(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/programs.py:43) TESTING: configurePrograms from config.programs(python/BuildSystem/ config/programs.py:43) Check for the programs needed to build and run PETSc Checking for program /opt/intel/fc/10.0.020/bin/sh...not found Checking for program /usr/X11R6/bin/sh...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sh...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sh...not found Checking for program /bin/sh...found Defined make macro "SHELL" to "/bin/sh" Checking for program /opt/intel/fc/10.0.020/bin/sed...not found Checking for program /usr/X11R6/bin/sed...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sed...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sed...not found Checking for program /bin/sed...not found Checking for program /sbin/sed...not found Checking for program /usr/bin/sed...found Defined make macro "SED" to "/usr/bin/sed" Checking for program /opt/intel/fc/10.0.020/bin/mv...not found Checking for program /usr/X11R6/bin/mv...not found Checking for program 
/opt/toolworks/totalview.8.3.0-0/bin/mv...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mv...not found Checking for program /bin/mv...found Defined make macro "MV" to "/bin/mv" Checking for program /opt/intel/fc/10.0.020/bin/cp...not found Checking for program /usr/X11R6/bin/cp...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cp...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cp...not found Checking for program /bin/cp...found Defined make macro "CP" to "/bin/cp" Checking for program /opt/intel/fc/10.0.020/bin/grep...not found Checking for program /usr/X11R6/bin/grep...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/grep...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/grep...not found Checking for program /bin/grep...not found Checking for program /sbin/grep...not found Checking for program /usr/bin/grep...found Defined make macro "GREP" to "/usr/bin/grep" Checking for program /opt/intel/fc/10.0.020/bin/rm...not found Checking for program /usr/X11R6/bin/rm...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/rm...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/rm...not found Checking for program /bin/rm...found Defined make macro "RM" to "/bin/rm -f" Checking for program /opt/intel/fc/10.0.020/bin/diff...not found Checking for program /usr/X11R6/bin/diff...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/diff...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/diff...not found Checking for program /bin/diff...not found Checking for program /sbin/diff...not found Checking for program /usr/bin/diff...found sh: /usr/bin/diff -w diff1 diff2 Executing: /usr/bin/diff -w diff1 diff2 sh: Defined make macro "DIFF" to "/usr/bin/diff -w" Checking for program /usr/ucb/ps...not found Checking for program /usr/usb/ps...not found Checking for program /Users/bknaepen/ps...not found Checking for program /opt/intel/fc/10.0.020/bin/gzip...not found Checking for program /usr/X11R6/bin/gzip...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gzip...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gzip...not found Checking for program /bin/gzip...not found Checking for program /sbin/gzip...not found Checking for program /usr/bin/gzip...found Defined make macro "GZIP" to "/usr/bin/gzip" Defined "HAVE_GZIP" to "1" = = = = = = = = ======================================================================== TEST configureMake from PETSc.utilities.Make(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/utilities/Make.py:21) TESTING: configureMake from PETSc.utilities.Make(python/PETSc/ utilities/Make.py:21) Check various things about make Checking for program /opt/intel/fc/10.0.020/bin/make...not found Checking for program /usr/X11R6/bin/make...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/make...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/make...not found Checking for program /bin/make...not found Checking for program /sbin/make...not found Checking for program /usr/bin/make...found Defined make macro "MAKE" to "/usr/bin/make" sh: strings /usr/bin/make Executing: strings /usr/bin/make sh: attempt to use unsupported feature: `%s' touch: Archive `%s' does not exist touch: `%s' is not a valid archive touch: touch: Member `%s' does not exist in `%s' touch: Bad return code from ar_member_touch on `%s' ! 
ARFILENAMES/ $(MAKE) ${MAKE} *** [%s] Archive member `%s' may be bogus; not deleted *** Archive member `%s' may be bogus; not deleted *** [%s] Deleting file `%s' *** Deleting file `%s' unlink: kill # commands to execute (built-in): (from `%s', line %lu): %.*s GNUMAKE MAKEFILEPATH $(NEXT_ROOT)/Developer/Makefiles CHECKOUT,v +$(if $(wildcard $@),,$(CO) $(COFLAGS) $< $@) COFL... ... h: can't allocate %ld bytes for hash table: memory exhausted Load=%ld/%ld=%.0f%%, Rehash=%d, Collisions=%ld/%ld=%.0f%% $(VPATH) Can't do VPATH expansion on a null file. =|^();&<>*?[]:$`'"\ Using old-style VPATH substitution. Consider using automatic variable substitution instead. glob next != NULL /SourceCache/gnumake/gnumake-119/make/glob/glob.c alnum alpha blank cntrl digit graph lower print punct space upper xdigit .out .a .ln .o .c .cc .C .cpp .p .f .F .m .r .y .l .ym .lm .s .S .mod .sym .def .h .info .dvi .tex .texinfo .texi .txinfo .w .ch .web .sh .elc .el /bin/sh #;"*?[]&|<>(){}$`^~! Defined make macro "OMAKE " to "/usr/bin/make --no-print- directory" Defined make rule "libc" with dependencies "${LIBNAME}($ {OBJSC} ${SOBJSC})" and code [] Defined make rule "libf" with dependencies "${OBJSF}" and code -${AR} ${AR_FLAGS} ${LIBNAME} ${OBJSF} = = = = = = = = ======================================================================== TEST configureDebuggers from PETSc.utilities.debuggers(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/PETSc/utilities/debuggers.py:22) TESTING: configureDebuggers from PETSc.utilities.debuggers(python/ PETSc/utilities/debuggers.py:22) Find a default debugger and determine its arguments Checking for program /opt/intel/fc/10.0.020/bin/gdb...not found Checking for program /usr/X11R6/bin/gdb...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gdb...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gdb...not found Checking for program /bin/gdb...not found Checking for program /sbin/gdb...not found Checking for program /usr/bin/gdb...found Defined make macro "GDB" to "/usr/bin/gdb" Checking for program /opt/intel/fc/10.0.020/bin/dbx...not found Checking for program /usr/X11R6/bin/dbx...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/dbx...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/dbx...not found Checking for program /bin/dbx...not found Checking for program /sbin/dbx...not found Checking for program /usr/bin/dbx...not found Checking for program /usr/sbin/dbx...not found Checking for program /usr/local/bin/dbx...not found Checking for program /usr/texbin/dbx...not found Checking for program /Users/bknaepen/dbx...not found Checking for program /opt/intel/fc/10.0.020/bin/xdb...not found Checking for program /usr/X11R6/bin/xdb...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/xdb...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/xdb...not found Checking for program /bin/xdb...not found Checking for program /sbin/xdb...not found Checking for program /usr/bin/xdb...not found Checking for program /usr/sbin/xdb...not found Checking for program /usr/local/bin/xdb...not found Checking for program /usr/texbin/xdb...not found Checking for program /Users/bknaepen/xdb...not found Defined "USE_GDB_DEBUGGER" to "1" = = = = = = = = ======================================================================== TEST configureCLanguage from PETSc.utilities.languages(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:43) TESTING: configureCLanguage from 
PETSc.utilities.languages(python/ PETSc/utilities/languages.py:43) Choose between C and C++ bindings = = = = = = = = ======================================================================== TEST configureLanguageSupport from PETSc.utilities.languages(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:49) TESTING: configureLanguageSupport from PETSc.utilities.languages(python/PETSc/utilities/languages.py:49) Check c-support c++-support and other misc tests Turning off C++ support Allowing C++ name mangling C language is C Defined "CLANGUAGE_C" to "1" = = = = = = = = ======================================================================== TEST configureExternC from PETSc.utilities.languages(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:66) TESTING: configureExternC from PETSc.utilities.languages(python/PETSc/ utilities/languages.py:66) Protect C bindings from C++ mangling Defined "USE_EXTERN_CXX" to " " = = = = = = = = ======================================================================== TEST configureFortranLanguage from PETSc.utilities.languages(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:72) TESTING: configureFortranLanguage from PETSc.utilities.languages(python/PETSc/utilities/languages.py:72) Turn on Fortran bindings Using Fortran = = = = = = = = ======================================================================== TEST configureDirectories from PETSc.utilities.petscdir(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:34) TESTING: configureDirectories from PETSc.utilities.petscdir(python/ PETSc/utilities/petscdir.py:34) Checks PETSC_DIR and sets if not set Version Information: #define PETSC_VERSION_RELEASE 1 #define PETSC_VERSION_MAJOR 2 #define PETSC_VERSION_MINOR 3 #define PETSC_VERSION_SUBMINOR 3 #define PETSC_VERSION_PATCH 8 #define PETSC_VERSION_DATE "May, 23, 2007" #define PETSC_VERSION_PATCH_DATE "Fri Nov 16 17:03:40 CST 2007" #define PETSC_VERSION_HG "414581156e67e55c761739b0deb119f7590d0f4b" Defined make macro "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3- p8" Defined "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3-p8" sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.guess Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/ config.guess sh: i686-apple-darwin9.1.0 sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.sub i686-apple-darwin9.1.0 Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/ config.sub i686-apple-darwin9.1.0 sh: i686-apple-darwin9.1.0 = = = = = = = = ======================================================================== TEST configureExternalPackagesDir from PETSc.utilities.petscdir(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:112) TESTING: configureExternalPackagesDir from PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:112) = = = = = = = = ======================================================================== TEST configureInstallationMethod from PETSc.utilities.petscdir(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:119) TESTING: configureInstallationMethod from PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:119) This is a tarball installation = = = = = = = = ======================================================================== TEST configureETags from PETSc.utilities.Etags(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/PETSc/utilities/Etags.py:27) TESTING: configureETags from PETSc.utilities.Etags(python/PETSc/ 
utilities/Etags.py:27) Determine if etags files exist and try to create otherwise Found etags file = = = = = = = = ======================================================================== TEST getDatafilespath from PETSc.utilities.dataFilesPath(/Users/ bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/dataFilesPath.py:29) TESTING: getDatafilespath from PETSc.utilities.dataFilesPath(python/ PETSc/utilities/dataFilesPath.py:29) Checks what DATAFILESPATH should be Defined make macro "DATAFILESPATH" to "None" = = = = = = = = ======================================================================== TEST checkVendor from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:262) TESTING: checkVendor from config.setCompilers(python/BuildSystem/ config/setCompilers.py:262) Determine the compiler vendor Compiler vendor is "" = = = = = = = = ======================================================================== TEST checkInitialFlags from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:272) TESTING: checkInitialFlags from config.setCompilers(python/BuildSystem/ config/setCompilers.py:272) Initialize the compiler and linker flags Pushing language C Initialized CFLAGS to Initialized CFLAGS to Initialized LDFLAGS to Popping language C Pushing language Cxx Initialized CXXFLAGS to Initialized CXX_CXXFLAGS to Initialized LDFLAGS to Popping language Cxx Pushing language FC Initialized FFLAGS to Initialized FFLAGS to Initialized LDFLAGS to Popping language FC Initialized CPPFLAGS to Initialized executableFlags to [] Initialized sharedLibraryFlags to [] Initialized dynamicLibraryFlags to [] = = = = = = = = ======================================================================== TEST checkCCompiler from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:380) TESTING: checkCCompiler from config.setCompilers(python/BuildSystem/ config/setCompilers.py:380) Locate a functional C compiler Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found Checking for program /usr/X11R6/bin/mpicc...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found Defined make macro "CC" to "mpicc" Pushing language C sh: mpicc -c -o conftest.o conftest.c Executing: mpicc -c -o conftest.o conftest.c sh: sh: mpicc -c -o conftest.o conftest.c Executing: mpicc -c -o conftest.o conftest.c sh: Pushing language C Popping language C Pushing language Cxx Popping language Cxx Pushing language FC Popping language FC Pushing language C Popping language C sh: mpicc -o conftest conftest.o Executing: mpicc -o conftest conftest.o sh: sh: mpicc -c -o conftest.o conftest.c Executing: mpicc -c -o conftest.o conftest.c sh: Pushing language C Popping language C sh: mpicc -o conftest conftest.o Executing: mpicc -o conftest conftest.o sh: Executing: ./conftest sh: ./conftest Executing: ./conftest sh: Popping language C = = = = = = = = ======================================================================== TEST checkCPreprocessor from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:437) TESTING: checkCPreprocessor from config.setCompilers(python/ BuildSystem/config/setCompilers.py:437) Locate a functional C preprocessor Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found Checking for program /usr/X11R6/bin/mpicc...not found Checking for 
program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found Defined make macro "CPP" to "mpicc -E" Pushing language C sh: mpicc -E conftest.c Executing: mpicc -E conftest.c sh: # 1 "conftest.c" # 1 "" # 1 "" # 1 "conftest.c" # 1 "confdefs.h" 1 # 2 "conftest.c" 2 # 1 "conffix.h" 1 # 3 "conftest.c" 2 # 1 "/usr/include/stdlib.h" 1 3 4 # 61 "/usr/include/stdlib.h" 3 4 # 1 "/usr/include/available.h" 1 3 4 # 62 "/usr/include/stdlib.h" 2 3 4 # 1 "/usr/include/_types.h" 1 3 4 # 27 "/usr/include/_types.h" 3 4 # 1 "/usr/include/sys/_types.h" 1 3 4 # 32 "/usr/include/sys/_types.h" 3 4 # 1 "/usr/include/sys/cdefs.h" 1 3 4 # 33 "/usr/include/sys/_types.h" 2 3 4 # 1 "/usr/include/machine/_types.h" 1 3 4 # 34 "/usr/include/machine/_types.h" 3 4 # 1 "/usr/inc... ... size_t, size_t, int (*)(const void *, const void *)); void qsort_r(void *, size_t, size_t, void *, int (*)(void *, const void *, const void *)); int radixsort(const unsigned char **, int, const unsigned char *, unsigned); void setprogname(const char *); int sradixsort(const unsigned char **, int, const unsigned char *, unsigned); void sranddev(void); void srandomdev(void); void *reallocf(void *, size_t); long long strtoq(const char *, char **, int); unsigned long long strtouq(const char *, char **, int); extern char *suboptarg; void *valloc(size_t); # 3 "conftest.c" 2 Popping language C = = = = = = = = ======================================================================== TEST checkCxxCompiler from config.setCompilers(/Users/bknaepen/Unix/ petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:541) TESTING: checkCxxCompiler from config.setCompilers(python/BuildSystem/ config/setCompilers.py:541) Locate a functional Cxx compiler = = = = = = = = ======================================================================== TEST checkFortranCompiler from config.setCompilers(/Users/bknaepen/ Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:708) TESTING: checkFortranCompiler from config.setCompilers(python/ BuildSystem/config/setCompilers.py:708) Locate a functional Fortran compiler Checking for program /opt/intel/fc/10.0.020/bin/mpif90...not found Checking for program /usr/X11R6/bin/mpif90...not found Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpif90...not found Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found Defined make macro "FC" to "mpif90" Pushing language FC sh: mpif90 -c -o conftest.o conftest.F Executing: mpif90 -c -o conftest.o conftest.F sh: Possible ERROR while running compiler: ret = 256 error message = {ifort: error #10106: Fatal error in /opt/intel/fc/ 10.0.020/bin/fpp, terminated by segmentation violation } Source: program main end Popping language FC Error testing Fortran compiler: Cannot compile FC with mpicc. MPI installation mpif90 is likely incorrect. Use --with-mpi-dir to indicate an alternate MPI. 
********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- Fortran compiler you provided with --with-fc=mpif90 does not work ********************************************************************************* File "./config/configure.py", line 190, in petsc_configure framework.configure(out = sys.stdout) File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ framework.py", line 878, in configure child.configure() File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ setCompilers.py", line 1267, in configure self.executeTest(self.checkFortranCompiler) File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ base.py", line 93, in executeTest return apply(test, args,kargs) File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ setCompilers.py", line 714, in checkFortranCompiler for compiler in self.generateFortranCompilerGuesses(): File "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/ setCompilers.py", line 631, in generateFortranCompilerGuesses raise RuntimeError('Fortran compiler you provided with --with- fc='+self.framework.argDB['with-fc']+' does not work') -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 486 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun Nov 18 08:54:20 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 Nov 2007 08:54:20 -0600 (CST) Subject: problem compiling PETSC on MacOS Leopard In-Reply-To: References: Message-ID: Please direct these problems to petsc-maint instead of petsc-users. >From the log file Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found Defined make macro "FC" to "mpif90" Pushing language FC sh: mpif90 -c -o conftest.o conftest.F Executing: mpif90 -c -o conftest.o conftest.F sh: Possible ERROR while running compiler: ret =3D 256 error message =3D {ifort: error #10106: Fatal error in /opt/intel/fc/=20 10.0.020/bin/fpp, terminated by segmentation violation } Source: program main end So the mpif90 is crashing on a simple Fortran program with nothing in it. Can you try compiling exactly as above from the command line? Barry On Sun, 18 Nov 2007, Bernard Knaepen wrote: > Hello, > > I would like to compile PETSC on Leopard but I am encountering a problem > during configuration. The scripts stops with: > > dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc > --with-fc=mpif90 --with-cxx=mpicxx > > ================================================================================= > Configuring PETSc to compile on your system > ================================================================================= > TESTING: checkFortranCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:708) > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > --------------------------------------------------------------------------------------- > Fortran compiler you provided with --with-fc=mpif90 does not work > ********************************************************************************* > > > My MPI installation is mpich2 1.0.6p1 and I have the latest ifort compiler > installed (10.0.20). I have test mpif90 and it is working ok. 
I copy below > the configure.log file. > > > Any help would be appreciated, thanks, > > Bernard. > > > > > Pushing language C > Popping language C > Pushing language Cxx > Popping language Cxx > Pushing language FC > Popping language FC > sh: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.guess > Executing: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.guess > sh: i686-apple-darwin9.1.0 > > sh: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.sub > i686-apple-darwin9.1.0 > > Executing: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/packages/config.sub > i686-apple-darwin9.1.0 > > sh: i686-apple-darwin9.1.0 > > > ================================================================================ > ================================================================================ > Starting Configure Run at Sun Nov 18 10:29:29 2007 > Configure Options: --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx > --with-shared=0 --configModules=PETSc.Configure > --optionsModule=PETSc.compilerOptions > Working directory: /Users/bknaepen/Unix/petsc-2.3.3-p8 > Python version: > 2.5.1 (r251:54863, Oct 5 2007, 21:08:09) > [GCC 4.0.1 (Apple Inc. build 5465)] > ================================================================================ > Pushing language C > Popping language C > Pushing language Cxx > Popping language Cxx > Pushing language FC > Popping language FC > ================================================================================ > TEST configureExternalPackagesDir from > config.framework(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py:807) > TESTING: configureExternalPackagesDir from > config.framework(python/BuildSystem/config/framework.py:807) > ================================================================================ > TEST configureLibrary from > PETSc.packages.PVODE(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/packages/PVODE.py:10) > TESTING: configureLibrary from > PETSc.packages.PVODE(python/PETSc/packages/PVODE.py:10) > Find a PVODE installation and check if it can work with PETSc > ================================================================================ > TEST configureLibrary from > PETSc.packages.NetCDF(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/packages/NetCDF.py:10) > TESTING: configureLibrary from > PETSc.packages.NetCDF(python/PETSc/packages/NetCDF.py:10) > Find a NetCDF installation and check if it can work with PETSc > ================================================================================ > TEST configureMercurial from > config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:23) > TESTING: configureMercurial from > config.sourceControl(python/BuildSystem/config/sourceControl.py:23) > Find the Mercurial executable > Checking for program /opt/intel/fc/10.0.020/bin/hg...not found > Checking for program /usr/X11R6/bin/hg...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/hg...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/hg...not found > Checking for program /bin/hg...not found > Checking for program /sbin/hg...not found > Checking for program /usr/bin/hg...not found > Checking for program /usr/sbin/hg...not found > Checking for program /usr/local/bin/hg...not found > Checking for program /usr/texbin/hg...not found > Checking for program /Users/bknaepen/hg...not found > 
================================================================================ > TEST configureCVS from > config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:30) > TESTING: configureCVS from > config.sourceControl(python/BuildSystem/config/sourceControl.py:30) > Find the CVS executable > Checking for program /opt/intel/fc/10.0.020/bin/cvs...not found > Checking for program /usr/X11R6/bin/cvs...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cvs...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cvs...not found > Checking for program /bin/cvs...not found > Checking for program /sbin/cvs...not found > Checking for program /usr/bin/cvs...found > Defined make macro "CVS" to "cvs" > ================================================================================ > TEST configureSubversion from > config.sourceControl(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/sourceControl.py:35) > TESTING: configureSubversion from > config.sourceControl(python/BuildSystem/config/sourceControl.py:35) > Find the Subversion executable > Checking for program /opt/intel/fc/10.0.020/bin/svn...not found > Checking for program /usr/X11R6/bin/svn...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/svn...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/svn...not found > Checking for program /bin/svn...not found > Checking for program /sbin/svn...not found > Checking for program /usr/bin/svn...found > Defined make macro "SVN" to "svn" > ================================================================================ > TEST configureMkdir from > config.programs(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/programs.py:21) > TESTING: configureMkdir from > config.programs(python/BuildSystem/config/programs.py:21) > Make sure we can have mkdir automatically make intermediate directories > Checking for program /opt/intel/fc/10.0.020/bin/mkdir...not found > Checking for program /usr/X11R6/bin/mkdir...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mkdir...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mkdir...not found > Checking for program /bin/mkdir...found > sh: /bin/mkdir -p .conftest/tmp > Executing: /bin/mkdir -p .conftest/tmp > sh: > Adding -p flag to /bin/mkdir -p to automatically create directories > Defined make macro "MKDIR" to "/bin/mkdir -p" > ================================================================================ > TEST configurePrograms from > config.programs(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/programs.py:43) > TESTING: configurePrograms from > config.programs(python/BuildSystem/config/programs.py:43) > Check for the programs needed to build and run PETSc > Checking for program /opt/intel/fc/10.0.020/bin/sh...not found > Checking for program /usr/X11R6/bin/sh...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sh...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sh...not found > Checking for program /bin/sh...found > Defined make macro "SHELL" to "/bin/sh" > Checking for program /opt/intel/fc/10.0.020/bin/sed...not found > Checking for program /usr/X11R6/bin/sed...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/sed...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/sed...not found > Checking for program /bin/sed...not found > Checking for program /sbin/sed...not found > Checking for 
program /usr/bin/sed...found > Defined make macro "SED" to "/usr/bin/sed" > Checking for program /opt/intel/fc/10.0.020/bin/mv...not found > Checking for program /usr/X11R6/bin/mv...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mv...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mv...not found > Checking for program /bin/mv...found > Defined make macro "MV" to "/bin/mv" > Checking for program /opt/intel/fc/10.0.020/bin/cp...not found > Checking for program /usr/X11R6/bin/cp...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/cp...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/cp...not found > Checking for program /bin/cp...found > Defined make macro "CP" to "/bin/cp" > Checking for program /opt/intel/fc/10.0.020/bin/grep...not found > Checking for program /usr/X11R6/bin/grep...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/grep...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/grep...not found > Checking for program /bin/grep...not found > Checking for program /sbin/grep...not found > Checking for program /usr/bin/grep...found > Defined make macro "GREP" to "/usr/bin/grep" > Checking for program /opt/intel/fc/10.0.020/bin/rm...not found > Checking for program /usr/X11R6/bin/rm...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/rm...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/rm...not found > Checking for program /bin/rm...found > Defined make macro "RM" to "/bin/rm -f" > Checking for program /opt/intel/fc/10.0.020/bin/diff...not found > Checking for program /usr/X11R6/bin/diff...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/diff...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/diff...not found > Checking for program /bin/diff...not found > Checking for program /sbin/diff...not found > Checking for program /usr/bin/diff...found > sh: /usr/bin/diff -w diff1 diff2 > Executing: /usr/bin/diff -w diff1 diff2 > sh: > Defined make macro "DIFF" to "/usr/bin/diff -w" > Checking for program /usr/ucb/ps...not found > Checking for program /usr/usb/ps...not found > Checking for program /Users/bknaepen/ps...not found > Checking for program /opt/intel/fc/10.0.020/bin/gzip...not found > Checking for program /usr/X11R6/bin/gzip...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gzip...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gzip...not found > Checking for program /bin/gzip...not found > Checking for program /sbin/gzip...not found > Checking for program /usr/bin/gzip...found > Defined make macro "GZIP" to "/usr/bin/gzip" > Defined "HAVE_GZIP" to "1" > ================================================================================ > TEST configureMake from > PETSc.utilities.Make(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/Make.py:21) > TESTING: configureMake from > PETSc.utilities.Make(python/PETSc/utilities/Make.py:21) > Check various things about make > Checking for program /opt/intel/fc/10.0.020/bin/make...not found > Checking for program /usr/X11R6/bin/make...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/make...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/make...not found > Checking for program /bin/make...not found > Checking for program /sbin/make...not found > Checking for program /usr/bin/make...found > Defined make macro "MAKE" to "/usr/bin/make" > sh: strings 
/usr/bin/make > Executing: strings /usr/bin/make > sh: attempt to use unsupported feature: `%s' > touch: Archive `%s' does not exist > touch: `%s' is not a valid archive > touch: > touch: Member `%s' does not exist in `%s' > touch: Bad return code from ar_member_touch on `%s' > ! > ARFILENAMES/ > $(MAKE) > ${MAKE} > *** [%s] Archive member `%s' may be bogus; not deleted > *** Archive member `%s' may be bogus; not deleted > *** [%s] Deleting file `%s' > *** Deleting file `%s' > unlink: > kill > # commands to execute > (built-in): > (from `%s', line %lu): > %.*s > GNUMAKE > MAKEFILEPATH > $(NEXT_ROOT)/Developer/Makefiles > CHECKOUT,v > +$(if $(wildcard $@),,$(CO) $(COFLAGS) $< $@) > COFL... > ... h: > can't allocate %ld bytes for hash table: memory exhausted > Load=%ld/%ld=%.0f%%, > Rehash=%d, > Collisions=%ld/%ld=%.0f%% > $(VPATH) > Can't do VPATH expansion on a null file. > =|^();&<>*?[]:$`'"\ > Using old-style VPATH substitution. > Consider using automatic variable substitution instead. > glob > next != NULL > /SourceCache/gnumake/gnumake-119/make/glob/glob.c > alnum > alpha > blank > cntrl > digit > graph > lower > print > punct > space > upper > xdigit > .out .a .ln .o .c .cc .C .cpp .p .f .F .m .r .y .l .ym .lm .s .S .mod .sym > .def .h .info .dvi .tex .texinfo .texi .txinfo .w .ch .web .sh .elc .el > /bin/sh > #;"*?[]&|<>(){}$`^~! > > Defined make macro "OMAKE " to "/usr/bin/make --no-print-directory" > Defined make rule "libc" with dependencies "${LIBNAME}(${OBJSC} > ${SOBJSC})" and code [] > Defined make rule "libf" with dependencies "${OBJSF}" and code -${AR} > ${AR_FLAGS} ${LIBNAME} ${OBJSF} > ================================================================================ > TEST configureDebuggers from > PETSc.utilities.debuggers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/debuggers.py:22) > TESTING: configureDebuggers from > PETSc.utilities.debuggers(python/PETSc/utilities/debuggers.py:22) > Find a default debugger and determine its arguments > Checking for program /opt/intel/fc/10.0.020/bin/gdb...not found > Checking for program /usr/X11R6/bin/gdb...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/gdb...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/gdb...not found > Checking for program /bin/gdb...not found > Checking for program /sbin/gdb...not found > Checking for program /usr/bin/gdb...found > Defined make macro "GDB" to "/usr/bin/gdb" > Checking for program /opt/intel/fc/10.0.020/bin/dbx...not found > Checking for program /usr/X11R6/bin/dbx...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/dbx...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/dbx...not found > Checking for program /bin/dbx...not found > Checking for program /sbin/dbx...not found > Checking for program /usr/bin/dbx...not found > Checking for program /usr/sbin/dbx...not found > Checking for program /usr/local/bin/dbx...not found > Checking for program /usr/texbin/dbx...not found > Checking for program /Users/bknaepen/dbx...not found > Checking for program /opt/intel/fc/10.0.020/bin/xdb...not found > Checking for program /usr/X11R6/bin/xdb...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/xdb...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/xdb...not found > Checking for program /bin/xdb...not found > Checking for program /sbin/xdb...not found > Checking for program /usr/bin/xdb...not found > Checking for program /usr/sbin/xdb...not found > Checking for program 
/usr/local/bin/xdb...not found > Checking for program /usr/texbin/xdb...not found > Checking for program /Users/bknaepen/xdb...not found > Defined "USE_GDB_DEBUGGER" to "1" > ================================================================================ > TEST configureCLanguage from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:43) > TESTING: configureCLanguage from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:43) > Choose between C and C++ bindings > ================================================================================ > TEST configureLanguageSupport from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:49) > TESTING: configureLanguageSupport from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:49) > Check c-support c++-support and other misc tests > Turning off C++ support > Allowing C++ name mangling > C language is C > Defined "CLANGUAGE_C" to "1" > ================================================================================ > TEST configureExternC from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:66) > TESTING: configureExternC from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:66) > Protect C bindings from C++ mangling > Defined "USE_EXTERN_CXX" to " " > ================================================================================ > TEST configureFortranLanguage from > PETSc.utilities.languages(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/languages.py:72) > TESTING: configureFortranLanguage from > PETSc.utilities.languages(python/PETSc/utilities/languages.py:72) > Turn on Fortran bindings > Using Fortran > ================================================================================ > TEST configureDirectories from > PETSc.utilities.petscdir(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:34) > TESTING: configureDirectories from > PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:34) > Checks PETSC_DIR and sets if not set > Version Information: > #define PETSC_VERSION_RELEASE 1 > #define PETSC_VERSION_MAJOR 2 > #define PETSC_VERSION_MINOR 3 > #define PETSC_VERSION_SUBMINOR 3 > #define PETSC_VERSION_PATCH 8 > #define PETSC_VERSION_DATE "May, 23, 2007" > #define PETSC_VERSION_PATCH_DATE "Fri Nov 16 17:03:40 CST 2007" > #define PETSC_VERSION_HG > "414581156e67e55c761739b0deb119f7590d0f4b" > Defined make macro "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3-p8" > Defined "DIR" to "/Users/bknaepen/Unix/petsc-2.3.3-p8" > sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.guess > Executing: /bin/sh > /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.guess > sh: i686-apple-darwin9.1.0 > > sh: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.sub > i686-apple-darwin9.1.0 > > Executing: /bin/sh /Users/bknaepen/Unix/petsc-2.3.3-p8/bin/config/config.sub > i686-apple-darwin9.1.0 > > sh: i686-apple-darwin9.1.0 > > ================================================================================ > TEST configureExternalPackagesDir from > PETSc.utilities.petscdir(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:112) > TESTING: configureExternalPackagesDir from > PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:112) > ================================================================================ > TEST configureInstallationMethod from > 
PETSc.utilities.petscdir(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/petscdir.py:119) > TESTING: configureInstallationMethod from > PETSc.utilities.petscdir(python/PETSc/utilities/petscdir.py:119) > This is a tarball installation > ================================================================================ > TEST configureETags from > PETSc.utilities.Etags(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/Etags.py:27) > TESTING: configureETags from > PETSc.utilities.Etags(python/PETSc/utilities/Etags.py:27) > Determine if etags files exist and try to create otherwise > Found etags file > ================================================================================ > TEST getDatafilespath from > PETSc.utilities.dataFilesPath(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/PETSc/utilities/dataFilesPath.py:29) > TESTING: getDatafilespath from > PETSc.utilities.dataFilesPath(python/PETSc/utilities/dataFilesPath.py:29) > Checks what DATAFILESPATH should be > Defined make macro "DATAFILESPATH" to "None" > ================================================================================ > TEST checkVendor from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:262) > TESTING: checkVendor from > config.setCompilers(python/BuildSystem/config/setCompilers.py:262) > Determine the compiler vendor > Compiler vendor is "" > ================================================================================ > TEST checkInitialFlags from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:272) > TESTING: checkInitialFlags from > config.setCompilers(python/BuildSystem/config/setCompilers.py:272) > Initialize the compiler and linker flags > Pushing language C > Initialized CFLAGS to > Initialized CFLAGS to > Initialized LDFLAGS to > Popping language C > Pushing language Cxx > Initialized CXXFLAGS to > Initialized CXX_CXXFLAGS to > Initialized LDFLAGS to > Popping language Cxx > Pushing language FC > Initialized FFLAGS to > Initialized FFLAGS to > Initialized LDFLAGS to > Popping language FC > Initialized CPPFLAGS to > Initialized executableFlags to [] > Initialized sharedLibraryFlags to [] > Initialized dynamicLibraryFlags to [] > ================================================================================ > TEST checkCCompiler from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:380) > TESTING: checkCCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:380) > Locate a functional C compiler > Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found > Checking for program /usr/X11R6/bin/mpicc...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found > Defined make macro "CC" to "mpicc" > Pushing language C > sh: mpicc -c -o conftest.o conftest.c > Executing: mpicc -c -o conftest.o conftest.c > sh: > sh: mpicc -c -o conftest.o conftest.c > Executing: mpicc -c -o conftest.o conftest.c > sh: > Pushing language C > Popping language C > Pushing language Cxx > Popping language Cxx > Pushing language FC > Popping language FC > Pushing language C > Popping language C > sh: mpicc -o conftest conftest.o > Executing: mpicc -o conftest conftest.o > sh: > sh: mpicc -c -o conftest.o conftest.c > Executing: mpicc -c -o conftest.o conftest.c > sh: > Pushing language C > Popping language C > sh: mpicc -o conftest 
conftest.o > Executing: mpicc -o conftest conftest.o > sh: > Executing: ./conftest > sh: ./conftest > Executing: ./conftest > sh: > Popping language C > ================================================================================ > TEST checkCPreprocessor from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:437) > TESTING: checkCPreprocessor from > config.setCompilers(python/BuildSystem/config/setCompilers.py:437) > Locate a functional C preprocessor > Checking for program /opt/intel/fc/10.0.020/bin/mpicc...not found > Checking for program /usr/X11R6/bin/mpicc...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpicc...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpicc...found > Defined make macro "CPP" to "mpicc -E" > Pushing language C > sh: mpicc -E conftest.c > Executing: mpicc -E conftest.c > sh: # 1 "conftest.c" > # 1 "" > # 1 "" > # 1 "conftest.c" > # 1 "confdefs.h" 1 > # 2 "conftest.c" 2 > # 1 "conffix.h" 1 > # 3 "conftest.c" 2 > # 1 "/usr/include/stdlib.h" 1 3 4 > # 61 "/usr/include/stdlib.h" 3 4 > # 1 "/usr/include/available.h" 1 3 4 > # 62 "/usr/include/stdlib.h" 2 3 4 > # 1 "/usr/include/_types.h" 1 3 4 > # 27 "/usr/include/_types.h" 3 4 > # 1 "/usr/include/sys/_types.h" 1 3 4 > # 32 "/usr/include/sys/_types.h" 3 4 > # 1 "/usr/include/sys/cdefs.h" 1 3 4 > # 33 "/usr/include/sys/_types.h" 2 3 4 > # 1 "/usr/include/machine/_types.h" 1 3 4 > # 34 "/usr/include/machine/_types.h" 3 4 > # 1 "/usr/inc... > ... size_t, size_t, > int (*)(const void *, const void *)); > void qsort_r(void *, size_t, size_t, void *, > int (*)(void *, const void *, const void *)); > int radixsort(const unsigned char **, int, const unsigned char *, > unsigned); > void setprogname(const char *); > int sradixsort(const unsigned char **, int, const unsigned char *, > unsigned); > void sranddev(void); > void srandomdev(void); > void *reallocf(void *, size_t); > long long > strtoq(const char *, char **, int); > unsigned long long > strtouq(const char *, char **, int); > extern char *suboptarg; > void *valloc(size_t); > # 3 "conftest.c" 2 > > Popping language C > ================================================================================ > TEST checkCxxCompiler from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:541) > TESTING: checkCxxCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:541) > Locate a functional Cxx compiler > ================================================================================ > TEST checkFortranCompiler from > config.setCompilers(/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py:708) > TESTING: checkFortranCompiler from > config.setCompilers(python/BuildSystem/config/setCompilers.py:708) > Locate a functional Fortran compiler > Checking for program /opt/intel/fc/10.0.020/bin/mpif90...not found > Checking for program /usr/X11R6/bin/mpif90...not found > Checking for program /opt/toolworks/totalview.8.3.0-0/bin/mpif90...not found > Checking for program /Users/bknaepen/Unix/mpich2-106/bin/mpif90...found > Defined make macro "FC" to "mpif90" > Pushing language FC > sh: mpif90 -c -o conftest.o conftest.F > Executing: mpif90 -c -o conftest.o conftest.F > sh: > Possible ERROR while running compiler: ret = 256 > error message = {ifort: error #10106: Fatal error in > /opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation > } > Source: > program main > > end > Popping 
language FC > Error testing Fortran compiler: Cannot compile FC with mpicc. > MPI installation mpif90 is likely incorrect. > Use --with-mpi-dir to indicate an alternate MPI. > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > --------------------------------------------------------------------------------------- > Fortran compiler you provided with --with-fc=mpif90 does not work > ********************************************************************************* > File "./config/configure.py", line 190, in petsc_configure > framework.configure(out = sys.stdout) > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/framework.py", > line 878, in configure > child.configure() > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", > line 1267, in configure > self.executeTest(self.checkFortranCompiler) > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/base.py", line > 93, in executeTest > return apply(test, args,kargs) > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", > line 714, in checkFortranCompiler > for compiler in self.generateFortranCompilerGuesses(): > File > "/Users/bknaepen/Unix/petsc-2.3.3-p8/python/BuildSystem/config/setCompilers.py", > line 631, in generateFortranCompilerGuesses > raise RuntimeError('Fortran compiler you provided with > --with-fc='+self.framework.argDB['with-fc']+' does not work') > From balay at mcs.anl.gov Sun Nov 18 09:37:49 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 18 Nov 2007 09:37:49 -0600 (CST) Subject: problem compiling PETSC on MacOS Leopard In-Reply-To: References: Message-ID: >>>>>>>> Executing: mpif90 -c -o conftest.o conftest.F sh: Possible ERROR while running compiler: ret = 256 error message = {ifort: error #10106: Fatal error in /opt/intel/fc/10.0.020/bin/fpp, terminated by segmentation violation >>>>>>> ifort is giving SEGV - hence configure failed. There must be some compatibility issue with ifort and Leopard. Satish On Sun, 18 Nov 2007, Bernard Knaepen wrote: > Hello, > > I would like to compile PETSC on Leopard but I am encountering a problem > during configuration. The scripts stops with: > > dolfin:petsc-2.3.3-p8 bknaepen$ ./config/configure.py --with-cc=mpicc > --with-fc=mpif90 --with-cxx=mpicxx From timothy.stitt at ichec.ie Sun Nov 18 11:22:21 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 17:22:21 +0000 Subject: Parallel ISCreateGeneral() Message-ID: <474074CD.7050509@ichec.ie> Hi all, Just wanted to know if the "the length of the index set" for a call to ISCreateGeneral() in a parallel code, is a global length, or the length of the local elements on each process? Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 11:27:01 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 11:27:01 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <474074CD.7050509@ichec.ie> References: <474074CD.7050509@ichec.ie> Message-ID: IS are not really parallel, so all the lengths, etc. only refer to local things. 
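For concreteness, the rule Matt is describing works like this: the length handed to ISCreateGeneral() is the local count on each process, while the index entries themselves are written in the global numbering (he confirms the "local sizes, global numberings" convention further down the thread). A minimal sketch of that usage follows; it is only an illustration, assuming the 2.3.x C API (newer releases add an extra copy-mode argument to ISCreateGeneral()), and the helper name is made up:

#include "petscmat.h"
#include "petscis.h"

/* Sketch only: build an IS holding this process's rows of A, in global numbering. */
PetscErrorCode BuildLocalRowIS(Mat A, IS *rows)
{
  PetscErrorCode ierr;
  PetscInt       rstart, rend, nlocal, i, *idx;

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);  /* this process owns rows [rstart,rend) */
  nlocal = rend - rstart;
  ierr = PetscMalloc(nlocal*sizeof(PetscInt), &idx);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) idx[i] = rstart + i;              /* entries use the GLOBAL numbering */
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nlocal, idx, rows);CHKERRQ(ierr);  /* length is the LOCAL count */
  ierr = PetscFree(idx);CHKERRQ(ierr);                           /* the indices are copied, so free our buffer */
  return 0;
}
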
Matt On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > Hi all, > > Just wanted to know if the "the length of the index set" for a call to > ISCreateGeneral() in a parallel code, is a global length, or the length > of the local elements on each process? > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 11:34:32 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 17:34:32 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> Message-ID: <474077A8.7080804@ichec.ie> OK..so I should be using the aggregate length returned by MatGetOwnershipRange() routine? Thanks Matt for you help. Matthew Knepley wrote: > IS are not really parallel, so all the lengths, etc. only refer to local things. > > Matt > > On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > >> Hi all, >> >> Just wanted to know if the "the length of the index set" for a call to >> ISCreateGeneral() in a parallel code, is a global length, or the length >> of the local elements on each process? >> >> Thanks, >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 11:37:10 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 11:37:10 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <474077A8.7080804@ichec.ie> References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> Message-ID: On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > OK..so I should be using the aggregate length returned by > MatGetOwnershipRange() routine? If you are using it to permute a Mat, yes. Matt > Thanks Matt for you help. > > > Matthew Knepley wrote: > > IS are not really parallel, so all the lengths, etc. only refer to local things. > > > > Matt > > > > On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > > > >> Hi all, > >> > >> Just wanted to know if the "the length of the index set" for a call to > >> ISCreateGeneral() in a parallel code, is a global length, or the length > >> of the local elements on each process? > >> > >> Thanks, > >> > >> Tim. > >> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 11:52:32 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 17:52:32 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> Message-ID: <47407BE0.8050508@ichec.ie> Matt, It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require index sets. I have distributed my rows across the processes and now just a bit confused about the arguments to the ISCreateGeneral() routine to set up the IS sets used by the Factor routines in parallel. So my basic question is what in general is the length and integers that get passed to ISCreateGeneral() when doing this type of calculation in parallel? Are they local index values (0..#rows on process-1) or do they refer to the distributed indices of the global matrix? Tim. Matthew Knepley wrote: > On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > >> OK..so I should be using the aggregate length returned by >> MatGetOwnershipRange() routine? >> > > If you are using it to permute a Mat, yes. > > Matt > > >> Thanks Matt for you help. >> >> >> Matthew Knepley wrote: >> >>> IS are not really parallel, so all the lengths, etc. only refer to local things. >>> >>> Matt >>> >>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: >>> >>> >>>> Hi all, >>>> >>>> Just wanted to know if the "the length of the index set" for a call to >>>> ISCreateGeneral() in a parallel code, is a global length, or the length >>>> of the local elements on each process? >>>> >>>> Thanks, >>>> >>>> Tim. >>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 12:01:50 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 12:01:50 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <47407BE0.8050508@ichec.ie> References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> Message-ID: On Nov 18, 2007 11:52 AM, Tim Stitt wrote: > Matt, > > It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls > which require index sets. I have distributed my rows across the > processes and now just a bit confused about the arguments to the > ISCreateGeneral() routine to set up the IS sets used by the Factor > routines in parallel. > > So my basic question is what in general is the length and integers that > get passed to ISCreateGeneral() when doing this type of calculation in > parallel? Are they local index values (0..#rows on process-1) or do they > refer to the distributed indices of the global matrix? To be consistent, these would be local sizes and global numberings. However, I am not sure why you would be doing this. I do not believe any of the parallel LU packages accept an ordering from the user (they calculate their own), and I would really only use them from a KSP (or PC at the least). Matt > Tim. 
> > > Matthew Knepley wrote: > > On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > > > >> OK..so I should be using the aggregate length returned by > >> MatGetOwnershipRange() routine? > >> > > > > If you are using it to permute a Mat, yes. > > > > Matt > > > > > >> Thanks Matt for you help. > >> > >> > >> Matthew Knepley wrote: > >> > >>> IS are not really parallel, so all the lengths, etc. only refer to local things. > >>> > >>> Matt > >>> > >>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > >>> > >>> > >>>> Hi all, > >>>> > >>>> Just wanted to know if the "the length of the index set" for a call to > >>>> ISCreateGeneral() in a parallel code, is a global length, or the length > >>>> of the local elements on each process? > >>>> > >>>> Thanks, > >>>> > >>>> Tim. > >>>> > >>>> -- > >>>> Dr. Timothy Stitt > >>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>> > >>>> Dublin Institute for Advanced Studies > >>>> 5 Merrion Square - Dublin 2 - Ireland > >>>> > >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 12:21:37 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 18:21:37 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> Message-ID: <474082B1.4060809@ichec.ie> Oh...ok I am now officially confused. I have developed a serial code for getting the first k rows of an inverted sparse matrix..thanks to PETSC users/developers help this past week. In that code I was calling MatLUFactorSymbolic() and MatLUFactorNumeric() to factor the sparse matrix and then calling MatSolve for each of the first k columns in the identity matrix as the RHS. I then varied the matrix type from the command line to test MUMPS, SUPERLU etc. for the best performance. Now I just want to translate the code into a parallel version...so I now assemble rows in a distributed fashion and now working on translating the MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require index sets...hence my original question. Are you saying that I now shouldn't be calling those routines? Tim. Matthew Knepley wrote: > On Nov 18, 2007 11:52 AM, Tim Stitt wrote: > >> Matt, >> >> It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls >> which require index sets. I have distributed my rows across the >> processes and now just a bit confused about the arguments to the >> ISCreateGeneral() routine to set up the IS sets used by the Factor >> routines in parallel. >> >> So my basic question is what in general is the length and integers that >> get passed to ISCreateGeneral() when doing this type of calculation in >> parallel? Are they local index values (0..#rows on process-1) or do they >> refer to the distributed indices of the global matrix? 
>> > > To be consistent, these would be local sizes and global numberings. However, > I am not sure why you would be doing this. I do not believe any of the parallel > LU packages accept an ordering from the user (they calculate their own), > and I would really only use them from a KSP (or PC at the least). > > Matt > > >> Tim. >> >> >> Matthew Knepley wrote: >> >>> On Nov 18, 2007 11:34 AM, Tim Stitt wrote: >>> >>> >>>> OK..so I should be using the aggregate length returned by >>>> MatGetOwnershipRange() routine? >>>> >>>> >>> If you are using it to permute a Mat, yes. >>> >>> Matt >>> >>> >>> >>>> Thanks Matt for you help. >>>> >>>> >>>> Matthew Knepley wrote: >>>> >>>> >>>>> IS are not really parallel, so all the lengths, etc. only refer to local things. >>>>> >>>>> Matt >>>>> >>>>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: >>>>> >>>>> >>>>> >>>>>> Hi all, >>>>>> >>>>>> Just wanted to know if the "the length of the index set" for a call to >>>>>> ISCreateGeneral() in a parallel code, is a global length, or the length >>>>>> of the local elements on each process? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Tim. >>>>>> >>>>>> -- >>>>>> Dr. Timothy Stitt >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>> >>>>>> Dublin Institute for Advanced Studies >>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>> >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 18 12:37:37 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Nov 2007 12:37:37 -0600 Subject: Parallel ISCreateGeneral() In-Reply-To: <474082B1.4060809@ichec.ie> References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> <474082B1.4060809@ichec.ie> Message-ID: On Nov 18, 2007 12:21 PM, Tim Stitt wrote: > Oh...ok I am now officially confused. > > I have developed a serial code for getting the first k rows of an > inverted sparse matrix..thanks to PETSC users/developers help this past > week. > > In that code I was calling MatLUFactorSymbolic() and > MatLUFactorNumeric() to factor the sparse matrix and then calling > MatSolve for each of the first k columns in the identity matrix as the > RHS. I then varied the matrix type from the command line to test MUMPS, > SUPERLU etc. for the best performance. > > Now I just want to translate the code into a parallel version...so I now > assemble rows in a distributed fashion and now working on translating > the MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require > index sets...hence my original question. > > Are you saying that I now shouldn't be calling those routines? You can certainly do it that way, but it is much easier to just use a KSP. You set the Mat using KSPSetOperators, then KSPSetType(ksp, KSPPREONLY), and PCSetType(pc, PCLU) (or MUMPS or whatever). 
Then KSPSolve() with the identity columns. We handle everything else. Matt > Tim. > > Matthew Knepley wrote: > > On Nov 18, 2007 11:52 AM, Tim Stitt wrote: > > > >> Matt, > >> > >> It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls > >> which require index sets. I have distributed my rows across the > >> processes and now just a bit confused about the arguments to the > >> ISCreateGeneral() routine to set up the IS sets used by the Factor > >> routines in parallel. > >> > >> So my basic question is what in general is the length and integers that > >> get passed to ISCreateGeneral() when doing this type of calculation in > >> parallel? Are they local index values (0..#rows on process-1) or do they > >> refer to the distributed indices of the global matrix? > >> > > > > To be consistent, these would be local sizes and global numberings. However, > > I am not sure why you would be doing this. I do not believe any of the parallel > > LU packages accept an ordering from the user (they calculate their own), > > and I would really only use them from a KSP (or PC at the least). > > > > Matt > > > > > >> Tim. > >> > >> > >> Matthew Knepley wrote: > >> > >>> On Nov 18, 2007 11:34 AM, Tim Stitt wrote: > >>> > >>> > >>>> OK..so I should be using the aggregate length returned by > >>>> MatGetOwnershipRange() routine? > >>>> > >>>> > >>> If you are using it to permute a Mat, yes. > >>> > >>> Matt > >>> > >>> > >>> > >>>> Thanks Matt for you help. > >>>> > >>>> > >>>> Matthew Knepley wrote: > >>>> > >>>> > >>>>> IS are not really parallel, so all the lengths, etc. only refer to local things. > >>>>> > >>>>> Matt > >>>>> > >>>>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: > >>>>> > >>>>> > >>>>> > >>>>>> Hi all, > >>>>>> > >>>>>> Just wanted to know if the "the length of the index set" for a call to > >>>>>> ISCreateGeneral() in a parallel code, is a global length, or the length > >>>>>> of the local elements on each process? > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Tim. > >>>>>> > >>>>>> -- > >>>>>> Dr. Timothy Stitt > >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>>>> > >>>>>> Dublin Institute for Advanced Studies > >>>>>> 5 Merrion Square - Dublin 2 - Ireland > >>>>>> > >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> -- > >>>> Dr. Timothy Stitt > >>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>> > >>>> Dublin Institute for Advanced Studies > >>>> 5 Merrion Square - Dublin 2 - Ireland > >>>> > >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> > >> -- > >> Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > >> > >> > > > > > > > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
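Picking up Matt's recipe above (KSPPREONLY with an LU preconditioner, then one KSPSolve() per identity column), a rough sketch of the parallel driver could look like the following. This is only an illustration of that recipe, assuming the 2.3.x calling conventions (the MatStructure argument to KSPSetOperators() and object-valued destroy calls); the routine name is made up, what you do with each solution column is left as a placeholder, and the factorization package (MUMPS, SuperLU, ...) is still chosen through the usual runtime options, just as in the serial code:

#include "petscksp.h"

/* Sketch only: compute the first k columns of inv(A), one direct solve per column. */
PetscErrorCode FirstKInverseColumns(Mat A, PetscInt k)
{
  PetscErrorCode ierr;
  KSP            ksp;
  PC             pc;
  Vec            b, x;
  PetscInt       j, low, high;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);   /* no Krylov iteration, just apply the preconditioner */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);           /* direct LU factorization */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);        /* pick the factorization package at run time */

  ierr = MatGetVecs(A, &x, &b);CHKERRQ(ierr);         /* x: solution layout, b: right-hand-side layout */
  ierr = VecGetOwnershipRange(b, &low, &high);CHKERRQ(ierr);

  for (j = 0; j < k; j++) {
    ierr = VecSet(b, 0.0);CHKERRQ(ierr);              /* b = e_j, the j-th column of the identity */
    if (j >= low && j < high) {
      ierr = VecSetValue(b, j, 1.0, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = VecAssemblyBegin(b);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(b);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);         /* x now holds the j-th column of inv(A) */
    /* ... use or store x here ... */
  }

  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  return 0;
}

The symbolic and numeric factorizations are performed once, on the first KSPSolve(), and reused for the remaining columns, which is what makes this loop cheap after the first solve.
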
-- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 18 12:46:49 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 18 Nov 2007 18:46:49 +0000 Subject: Parallel ISCreateGeneral() In-Reply-To: References: <474074CD.7050509@ichec.ie> <474077A8.7080804@ichec.ie> <47407BE0.8050508@ichec.ie> <474082B1.4060809@ichec.ie> Message-ID: <47408899.6060001@ichec.ie> OK Matt...will try that out. Thanks. Matthew Knepley wrote: > On Nov 18, 2007 12:21 PM, Tim Stitt wrote: > >> Oh...ok I am now officially confused. >> >> I have developed a serial code for getting the first k rows of an >> inverted sparse matrix..thanks to PETSC users/developers help this past >> week. >> >> In that code I was calling MatLUFactorSymbolic() and >> MatLUFactorNumeric() to factor the sparse matrix and then calling >> MatSolve for each of the first k columns in the identity matrix as the >> RHS. I then varied the matrix type from the command line to test MUMPS, >> SUPERLU etc. for the best performance. >> >> Now I just want to translate the code into a parallel version...so I now >> assemble rows in a distributed fashion and now working on translating >> the MatLUFactorSymbolic() and MatLUFactorNumeric() calls which require >> index sets...hence my original question. >> >> Are you saying that I now shouldn't be calling those routines? >> > > You can certainly do it that way, but it is much easier to just use a KSP. > You set the Mat using KSPSetOperators, then KSPSetType(ksp, KSPPREONLY), > and PCSetType(pc, PCLU) (or MUMPS or whatever). Then KSPSolve() with the > identity columns. We handle everything else. > > Matt > > >> Tim. >> >> Matthew Knepley wrote: >> >>> On Nov 18, 2007 11:52 AM, Tim Stitt wrote: >>> >>> >>>> Matt, >>>> >>>> It is in setup for MatLUFactorSymbolic() and MatLUFactorNumeric() calls >>>> which require index sets. I have distributed my rows across the >>>> processes and now just a bit confused about the arguments to the >>>> ISCreateGeneral() routine to set up the IS sets used by the Factor >>>> routines in parallel. >>>> >>>> So my basic question is what in general is the length and integers that >>>> get passed to ISCreateGeneral() when doing this type of calculation in >>>> parallel? Are they local index values (0..#rows on process-1) or do they >>>> refer to the distributed indices of the global matrix? >>>> >>>> >>> To be consistent, these would be local sizes and global numberings. However, >>> I am not sure why you would be doing this. I do not believe any of the parallel >>> LU packages accept an ordering from the user (they calculate their own), >>> and I would really only use them from a KSP (or PC at the least). >>> >>> Matt >>> >>> >>> >>>> Tim. >>>> >>>> >>>> Matthew Knepley wrote: >>>> >>>> >>>>> On Nov 18, 2007 11:34 AM, Tim Stitt wrote: >>>>> >>>>> >>>>> >>>>>> OK..so I should be using the aggregate length returned by >>>>>> MatGetOwnershipRange() routine? >>>>>> >>>>>> >>>>>> >>>>> If you are using it to permute a Mat, yes. >>>>> >>>>> Matt >>>>> >>>>> >>>>> >>>>> >>>>>> Thanks Matt for you help. >>>>>> >>>>>> >>>>>> Matthew Knepley wrote: >>>>>> >>>>>> >>>>>> >>>>>>> IS are not really parallel, so all the lengths, etc. only refer to local things. 
>>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> On Nov 18, 2007 11:22 AM, Tim Stitt wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Just wanted to know if the "the length of the index set" for a call to >>>>>>>> ISCreateGeneral() in a parallel code, is a global length, or the length >>>>>>>> of the local elements on each process? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Tim. >>>>>>>> >>>>>>>> -- >>>>>>>> Dr. Timothy Stitt >>>>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>>>> >>>>>>>> Dublin Institute for Advanced Studies >>>>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>>>> >>>>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> -- >>>>>> Dr. Timothy Stitt >>>>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>>>> >>>>>> Dublin Institute for Advanced Studies >>>>>> 5 Merrion Square - Dublin 2 - Ireland >>>>>> >>>>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> -- >>>> Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From zonexo at gmail.com Sun Nov 18 14:34:32 2007 From: zonexo at gmail.com (Ben Tay) Date: Sun, 18 Nov 2007 13:34:32 -0700 Subject: Dual core performance estimate Message-ID: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> Hi, someone was talking abt core 2 duo performance on os x in some previous email. it seems that due to memory issues, it's not possible to get 2x the performance. there's also some mention of amd vs intel dual core. for computation using PETSc, is there any reason to buy one instead of the other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what sort of performance increase can we expect as compared to PETSc + nompi on the same machine? or is that too difficult an answer to give since there are too many factors? thank you regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From aja2111 at columbia.edu Sun Nov 18 14:53:13 2007 From: aja2111 at columbia.edu (Aron Ahmadia) Date: Sun, 18 Nov 2007 15:53:13 -0500 Subject: Dual core performance estimate In-Reply-To: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> Message-ID: <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> Hi Ben, You're asking a question that is very specific to the program you're running. I think the general consensus on this list has been that for the more common uses of PETSc, getting dual-cores will not speed up your performance as much as dual-processors. For OS X, dual-cores are pretty much the baseline now, so I wouldn't worry too much about it. ~A On Nov 18, 2007 3:34 PM, Ben Tay wrote: > Hi, > > someone was talking abt core 2 duo performance on os x in some previous > email. it seems that due to memory issues, it's not possible to get 2x the > performance. 
there's also some mention of amd vs intel dual core. > > for computation using PETSc, is there any reason to buy one instead of the > other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what sort > of performance increase can we expect as compared to PETSc + nompi on the > same machine? > > or is that too difficult an answer to give since there are too many factors? > > thank you > > regards From grs2103 at columbia.edu Sun Nov 18 18:59:44 2007 From: grs2103 at columbia.edu (Gideon Simpson) Date: Sun, 18 Nov 2007 19:59:44 -0500 Subject: Dual core performance estimate In-Reply-To: <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> Message-ID: <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> I asked the original question, and I have a follow up. Like it or not, multi-core CPUs have been thrust upon us by the manufacturers and many of us are more likely to have access to a shared memory, multi core/multi processor machine, than a properly built cluster with MPI in mind. So, two questions in this direction: 1. How feasible would it be to implement OpenMP in PETSc so that multi core CPUs could be properly used? 2. Even if we are building a cluster, it looks like AMD/Intel are thrusting multi core up on is. To that end, what is the feasibility of merging MPI and OpenMP so that between nodes, we use MPI, but within each node, OpenMP is used to take advantage of the multiple cores. -gideon On Nov 18, 2007, at 3:53 PM, Aron Ahmadia wrote: > Hi Ben, > > You're asking a question that is very specific to the program you're > running. I think the general consensus on this list has been that for > the more common uses of PETSc, getting dual-cores will not speed up > your performance as much as dual-processors. For OS X, dual-cores are > pretty much the baseline now, so I wouldn't worry too much about it. > > ~A > > On Nov 18, 2007 3:34 PM, Ben Tay wrote: >> Hi, >> >> someone was talking abt core 2 duo performance on os x in some >> previous >> email. it seems that due to memory issues, it's not possible to >> get 2x the >> performance. there's also some mention of amd vs intel dual core. >> >> for computation using PETSc, is there any reason to buy one >> instead of the >> other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, >> what sort >> of performance increase can we expect as compared to PETSc + nompi >> on the >> same machine? >> >> or is that too difficult an answer to give since there are too >> many factors? >> >> thank you >> >> regards > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 18 20:00:30 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 Nov 2007 20:00:30 -0600 (CST) Subject: Dual core performance estimate In-Reply-To: <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> Message-ID: Gideon, On Sun, 18 Nov 2007, Gideon Simpson wrote: > I asked the original question, and I have a follow up. Like it or not, > multi-core CPUs have been thrust upon us by the manufacturers and many of us > are more likely to have access to a shared memory, multi core/multi processor > machine, than a properly built cluster with MPI in mind. 
> > So, two questions in this direction: > > 1. How feasible would it be to implement OpenMP in PETSc so that multi core > CPUs could be properly used? > > 2. Even if we are building a cluster, it looks like AMD/Intel are thrusting > multi core up on is. To that end, what is the feasibility of merging MPI and > OpenMP so that between nodes, we use MPI, but within each node, OpenMP is > used to take advantage of the multiple cores. > > -gideon > Unfortunately using MPI+OpenMP on multi-core systems for the iterative solution of linear systems will not help AT ALL. Sparse matrix algorithms (like matrix-vector production, triangular solves) are memory bandwidth limited. The speed of the memory is not enough to support 2 (or more) processes both trying to pull sparse matrices from memory at the same time; the details of the parallelism are not the issue. Now it is possible that other parts of a PETSc code; like evaluating nonlinear functions, evaluating Jacobians and other stuff may NOT be memory bandwidth limited. Those parts of the code might benefit by using OpenMP on those pieces of the code, while only using the single thread on the linear solvers. That is, you would run PETSc with one MPI process per node, then in parts of your code you would use OpenMP loop level parallelism or OpenMP task parallelism. Barry > On Nov 18, 2007, at 3:53 PM, Aron Ahmadia wrote: > >> Hi Ben, >> >> You're asking a question that is very specific to the program you're >> running. I think the general consensus on this list has been that for >> the more common uses of PETSc, getting dual-cores will not speed up >> your performance as much as dual-processors. For OS X, dual-cores are >> pretty much the baseline now, so I wouldn't worry too much about it. >> >> ~A >> >> On Nov 18, 2007 3:34 PM, Ben Tay wrote: >>> Hi, >>> >>> someone was talking abt core 2 duo performance on os x in some previous >>> email. it seems that due to memory issues, it's not possible to get 2x the >>> performance. there's also some mention of amd vs intel dual core. >>> >>> for computation using PETSc, is there any reason to buy one instead of the >>> other? Also, supposed I use winxp + mpich2 + PETSc on a dual core, what >>> sort >>> of performance increase can we expect as compared to PETSc + nompi on the >>> same machine? >>> >>> or is that too difficult an answer to give since there are too many >>> factors? >>> >>> thank you >>> >>> regards >> > From balay at mcs.anl.gov Sun Nov 18 20:03:27 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 18 Nov 2007 20:03:27 -0600 (CST) Subject: Dual core performance estimate In-Reply-To: <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> References: <804ab5d40711181234o2c7276b8q1de066c19342f29e@mail.gmail.com> <37604ab40711181253x1031e1ddl41abe4f5a3c7e874@mail.gmail.com> <3841E758-7F54-4713-A45C-52B74E4DB9F9@columbia.edu> Message-ID: On Sun, 18 Nov 2007, Gideon Simpson wrote: > I asked the original question, and I have a follow up. Like it or not, > multi-core CPUs have been thrust upon us by the manufacturers and many of us > are more likely to have access to a shared memory, multi core/multi processor > machine, than a properly built cluster with MPI in mind. Sure they are here to stay. > 1. How feasible would it be to implement OpenMP in PETSc so that > multi core CPUs could be properly used? > 2. Even if we are building a cluster, it looks like AMD/Intel are thrusting > multi core up on is. 
To that end, what is the feasibility of merging MPI and > OpenMP so that between nodes, we use MPI, but within each node, OpenMP is > used to take advantage of the multiple cores. You are missing the point of the previous e-mails on this topic. The point was: when understanding the performance one gets on single vs dual core, one should investigate memory bandwidth behavior. With sparse matrix operations, memory bandwidth is the primary determining factor. So if you split up the same amount of memory bandwidth between 2 processors, you split up performance between them as well. Memory bandwidth affects both OpenMP & MPI. It's not as if memory bandwidth is an MPI-only issue [and OpenMP somehow avoids this problem]. So the inference "MPI is not suitable for multi-core, but OpenMP is suitable" is incorrect [if performance is limited by memory bandwidth]. So our suggestion is: be aware of this issue when analysing the performance you get. One way to look at this is: performance per dollar. Since the second core is practically free, even a 5% improvement [in a 1 vs 2 node run] is a good investment. [There could be other parts of the application that are not memory-bandwidth limited, and those benefit from the extra core.] Note-1: when folks compare MPI performance vs OpenMP, or when referring to mixed OpenMP/MPI code, they are sometimes mixing 2 things: - implementation difference [OpenMP communication could be implemented better than MPI communication on some machines] - algorithmic difference [for e.g.: if you have a 4-way SMP, an MPI impl using bjacobi with num_blocks=4, vs OpenMP which just unrolled a DirectSolver Fortran subroutine] We feel that the first one is an implementation issue, and MPI should do the right thing. Wrt the second one, OpenMP/MPI mixed mode is more of an algorithmic issue [generally a 2-level algorithm]. The same 2-level algorithm implemented with MPI/MPI should have similar behavior. PETSc currently has some support for this with "-pc_type openmp". Note-2: So multi-core hardware is the future; how does one fully utilize it? I guess one has to look at alternative algorithms that are not memory bandwidth limited, perhaps ones that can somehow reduce the memory bandwidth requirement by doing extra computation. [perhaps new research work? sorry, I don't know more on this topic..] Satish From timothy.stitt at ichec.ie Tue Nov 20 11:45:31 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Tue, 20 Nov 2007 17:45:31 +0000 Subject: Load Balancing and KSPSolve Message-ID: <47431D3B.5000309@ichec.ie> Hi all (again), I finally got some data back from the KSP PETSc code that I put together to solve this sparse inverse matrix problem I was looking into. Ideally I am aiming for an O(N) (time complexity) approach to getting the first 'k' columns of the inverse of a sparse matrix. To recap the method: I have my solver which uses KSPSolve in a loop that iterates over the first k columns of an identity matrix B and computes the corresponding x vector. I am just a bit curious about some of the timings I am obtaining...which I hope someone can explain. Here are the timings I obtained for a global sparse matrix (4704 x 4704) and solving for the first 1176 columns in the identity using P processes (processors) on our cluster. (Timings are given in seconds for each process performing work in the loop and were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic.
The MUMPS package was requested for factorisation/solving, although similar timings were obtained for both the native solver and SUPERLU) P=1 [30.92] P=2 [15.47, 15.54] P=4 [4.68, 5.49, 4.67, 5.07] P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25, 0.43, 1.09, 1.08, 1.1] Firstly, I notice very good scalability up to 16 processes...is this expected (by those people who use these solvers regularly)? Also I notice that the timings per process vary as we scale up. Is this a load-balancing problem related to more non-zero values being on a given processor than others? Once again is this expected? Please excuse my ignorance of matters relating to these solvers and their operation...as it really isn't my field of expertise. Regards, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From balay at mcs.anl.gov Tue Nov 20 12:34:09 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 20 Nov 2007 12:34:09 -0600 (CST) Subject: Load Balancing and KSPSolve In-Reply-To: <47431D3B.5000309@ichec.ie> References: <47431D3B.5000309@ichec.ie> Message-ID: Can you send the -log_summary for your runs [say p=1, p=8] Satish On Tue, 20 Nov 2007, Tim Stitt wrote: > Hi all (again), > > I finally got some data back from the KSP PETSc code that I put together to > solve this sparse inverse matrix problem I was looking into. Ideally I am > aiming for a O(N) (time complexity) approach to getting the first 'k' columns > of the inverse of a sparse matrix. > > To recap the method: I have my solver which uses KSPSolve in a loop that > iterates over the first k columns of an identity matrix B and computes the > corresponding x vector. > > I am just a bit curious about some of the timings I am obtaining...which I > hope someone can explain. Here are the timings I obtained for a global sparse > matrix (4704 x 4704) and solving for the first 1176 columns in the identity > using P processes (processors) on our cluster. > > (Timings are given in seconds for each process performing work in the loop and > were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic. > The MUMPS package was requested for factorisation/solving, although similar > timings were obtained for both the native solver and SUPERLU) > > P=1 [30.92] > P=2 [15.47, 15.54] > P=4 [4.68, 5.49, 4.67, 5.07] > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25, > 0.43, 1.09, 1.08, 1.1] > > Firstly, I notice very good scalability up to 16 processes...is this expected > (by those people who use these solvers regularly)? > > Also I notice that the timings per process vary as we scale up. Is this a > load-balancing problem related to more non-zero values being on a given > processor than others? Once again is this expected? > > Please excuse my ignorance of matters relating to these solvers and their > operation...as it really isn't my field of expertise. > > Regards, > > Tim. > > From bsmith at mcs.anl.gov Tue Nov 20 12:43:33 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 20 Nov 2007 12:43:33 -0600 Subject: Load Balancing and KSPSolve In-Reply-To: <47431D3B.5000309@ichec.ie> References: <47431D3B.5000309@ichec.ie> Message-ID: Tim, This is an unrelated comment, but may help you with scaling to many processes. 
Since the matrix is so SMALL it will be hard to get good scaling on the linear solves for a large number of processes, but since you need MANY right hand sides you might consider having different groups of processes (MPI_Comms) handle collections of right hand sides. For example if you have 64 processes you might use 4 MPI_Comm's each of size 16, or even 8 MPI_Comm's each of size 8. Coding this is easy simply use MPI to generate the appropriate communicator (for the subsets of processes) and then create the Mat, the KSP etc on that communicator instead of MPI_COMM_WORLD Barry On Nov 20, 2007, at 11:45 AM, Tim Stitt wrote: > Hi all (again), > > I finally got some data back from the KSP PETSc code that I put > together to solve this sparse inverse matrix problem I was looking > into. Ideally I am aiming for a O(N) (time complexity) approach to > getting the first 'k' columns of the inverse of a sparse matrix. > > To recap the method: I have my solver which uses KSPSolve in a loop > that iterates over the first k columns of an identity matrix B and > computes the corresponding x vector. > > I am just a bit curious about some of the timings I am > obtaining...which I hope someone can explain. Here are the timings I > obtained for a global sparse matrix (4704 x 4704) and solving for > the first 1176 columns in the identity using P processes > (processors) on our cluster. > > (Timings are given in seconds for each process performing work in > the loop and were obtained by encapsulating the loop with the > cpu_time() Fortran intrinsic. The MUMPS package was requested for > factorisation/solving, although similar timings were obtained for > both the native solver and SUPERLU) > > P=1 [30.92] > P=2 [15.47, 15.54] > P=4 [4.68, 5.49, 4.67, 5.07] > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, > 0.73, 0.25, 0.43, 1.09, 1.08, 1.1] > > Firstly, I notice very good scalability up to 16 processes...is this > expected (by those people who use these solvers regularly)? > > Also I notice that the timings per process vary as we scale up. Is > this a load-balancing problem related to more non-zero values being > on a given processor than others? Once again is this expected? > > Please excuse my ignorance of matters relating to these solvers and > their operation...as it really isn't my field of expertise. > > Regards, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From timothy.stitt at ichec.ie Tue Nov 20 14:34:59 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Tue, 20 Nov 2007 20:34:59 +0000 Subject: Load Balancing and KSPSolve In-Reply-To: References: <47431D3B.5000309@ichec.ie> Message-ID: <474344F3.2000405@ichec.ie> Satish, Logs attached...hope they help. Thanks, Tim. Satish Balay wrote: > Can you send the -log_summary for your runs [say p=1, p=8] > > Satish > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > >> Hi all (again), >> >> I finally got some data back from the KSP PETSc code that I put together to >> solve this sparse inverse matrix problem I was looking into. Ideally I am >> aiming for a O(N) (time complexity) approach to getting the first 'k' columns >> of the inverse of a sparse matrix. 
>> >> To recap the method: I have my solver which uses KSPSolve in a loop that >> iterates over the first k columns of an identity matrix B and computes the >> corresponding x vector. >> >> I am just a bit curious about some of the timings I am obtaining...which I >> hope someone can explain. Here are the timings I obtained for a global sparse >> matrix (4704 x 4704) and solving for the first 1176 columns in the identity >> using P processes (processors) on our cluster. >> >> (Timings are given in seconds for each process performing work in the loop and >> were obtained by encapsulating the loop with the cpu_time() Fortran intrinsic. >> The MUMPS package was requested for factorisation/solving, although similar >> timings were obtained for both the native solver and SUPERLU) >> >> P=1 [30.92] >> P=2 [15.47, 15.54] >> P=4 [4.68, 5.49, 4.67, 5.07] >> P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] >> P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, 0.25, >> 0.43, 1.09, 1.08, 1.1] >> >> Firstly, I notice very good scalability up to 16 processes...is this expected >> (by those people who use these solvers regularly)? >> >> Also I notice that the timings per process vary as we scale up. Is this a >> load-balancing problem related to more non-zero values being on a given >> processor than others? Once again is this expected? >> >> Please excuse my ignorance of matters relating to these solvers and their >> operation...as it really isn't my field of expertise. >> >> Regards, >> >> Tim. >> >> >> > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log.1 URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log.8 URL: From balay at mcs.anl.gov Tue Nov 20 20:17:27 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 20 Nov 2007 20:17:27 -0600 (CST) Subject: Load Balancing and KSPSolve In-Reply-To: <474344F3.2000405@ichec.ie> References: <47431D3B.5000309@ichec.ie> <474344F3.2000405@ichec.ie> Message-ID: a couple of comments: Looks like most of the time is spent in MatSolve(). [90% for np=1] However on np=8 run, you have MatSolve() taking 42% time, whereas VecAssemblyBegin() taking 32% time. Depending upon whats beeing done with VecSetValues()/VecAssembly() - you might be able to reduce this time considerably. [ If you can generate values locally - then no communication is required. If you need to communicate values - then you can explore VecScatters() for more efficient communication] Wrt MatSolve() on 8 procs, the max/min time between any 2 procs is 2.6. [i.e slowest proc is taking 16 sec, so the fastest proc would probably be taking 6 sec.]. The max/min ratio of flops across procs is 1.8. So there is indeed a load balance issue that is contributing to different times on different processors [I guess the slowest proc is doing almost twice the amount of work as the fastest proc]. Satish On Tue, 20 Nov 2007, Tim Stitt wrote: > Satish, > > Logs attached...hope they help. > > Thanks, > > Tim. 
> > Satish Balay wrote: > > Can you send the -log_summary for your runs [say p=1, p=8] > > > > Satish > > > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > > > > > > Hi all (again), > > > > > > I finally got some data back from the KSP PETSc code that I put together > > > to > > > solve this sparse inverse matrix problem I was looking into. Ideally I am > > > aiming for a O(N) (time complexity) approach to getting the first 'k' > > > columns > > > of the inverse of a sparse matrix. > > > > > > To recap the method: I have my solver which uses KSPSolve in a loop that > > > iterates over the first k columns of an identity matrix B and computes the > > > corresponding x vector. > > > > > > I am just a bit curious about some of the timings I am obtaining...which I > > > hope someone can explain. Here are the timings I obtained for a global > > > sparse > > > matrix (4704 x 4704) and solving for the first 1176 columns in the > > > identity > > > using P processes (processors) on our cluster. > > > > > > (Timings are given in seconds for each process performing work in the loop > > > and > > > were obtained by encapsulating the loop with the cpu_time() Fortran > > > intrinsic. > > > The MUMPS package was requested for factorisation/solving, although > > > similar > > > timings were obtained for both the native solver and SUPERLU) > > > > > > P=1 [30.92] > > > P=2 [15.47, 15.54] > > > P=4 [4.68, 5.49, 4.67, 5.07] > > > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > > > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, > > > 0.25, > > > 0.43, 1.09, 1.08, 1.1] > > > > > > Firstly, I notice very good scalability up to 16 processes...is this > > > expected > > > (by those people who use these solvers regularly)? > > > > > > Also I notice that the timings per process vary as we scale up. Is this a > > > load-balancing problem related to more non-zero values being on a given > > > processor than others? Once again is this expected? > > > > > > Please excuse my ignorance of matters relating to these solvers and their > > > operation...as it really isn't my field of expertise. > > > > > > Regards, > > > > > > Tim. > > > > > > > > > > > > > > > > From timothy.stitt at ichec.ie Wed Nov 21 04:52:05 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 21 Nov 2007 10:52:05 +0000 Subject: Load Balancing and KSPSolve In-Reply-To: References: <47431D3B.5000309@ichec.ie> <474344F3.2000405@ichec.ie> Message-ID: <47440DD5.7050901@ichec.ie> Satish, Thanks for your helpful comments. I am unsure why the VecAssembyBegin() routine is taking a high percentage of the wall-clock when modifications to the parallel vector should be local (all I am doing is working out which element in the RHS b vector should be 1 and setting it). Here is my loop for iterating through the RHS Identity matrix and setting the relevant element to 1...prior to the call to KSPSolve. I then reset that value to 0 after the Solve in preparation for the next iteration. ! Get vector index range per process call VecGetOwnershipRange(B,firstElement,lastElement,error); do column=0,rhs-1 ! Loop over RHS columns in Identity Matrix if ((column.ge.firstElement).and.(column.lt.lastElement)) then call VecSetValue(B,column,one,INSERT_VALUES,error) end if call VecAssemblyBegin(B,error) call VecAssemblyEnd(B,error) ! 
Solve Ax=b call KSPSolve(ksp,b,x,error);!CHKERRQ(error) if ((column.ge.firstElement).and.(column.lt.lastElement)) then call VecSetValue(B,column,zero,INSERT_VALUES,error) end if end do Can you identify if I am doing something stupid which could be compromising the efficiency of the Assembly routine? Thanks again, Tim. Satish Balay wrote: > a couple of comments: > > Looks like most of the time is spent in MatSolve(). [90% for np=1] > > However on np=8 run, you have MatSolve() taking 42% time, whereas > VecAssemblyBegin() taking 32% time. Depending upon whats beeing done > with VecSetValues()/VecAssembly() - you might be able to reduce this > time considerably. [ If you can generate values locally - then no > communication is required. If you need to communicate values - then > you can explore VecScatters() for more efficient communication] > > Wrt MatSolve() on 8 procs, the max/min time between any 2 procs is > 2.6. [i.e slowest proc is taking 16 sec, so the fastest proc would > probably be taking 6 sec.]. The max/min ratio of flops across procs is > 1.8. So there is indeed a load balance issue that is contributing to > different times on different processors [I guess the slowest proc is > doing almost twice the amount of work as the fastest proc]. > > Satish > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > >> Satish, >> >> Logs attached...hope they help. >> >> Thanks, >> >> Tim. >> >> Satish Balay wrote: >> >>> Can you send the -log_summary for your runs [say p=1, p=8] >>> >>> Satish >>> >>> On Tue, 20 Nov 2007, Tim Stitt wrote: >>> >>> >>> >>>> Hi all (again), >>>> >>>> I finally got some data back from the KSP PETSc code that I put together >>>> to >>>> solve this sparse inverse matrix problem I was looking into. Ideally I am >>>> aiming for a O(N) (time complexity) approach to getting the first 'k' >>>> columns >>>> of the inverse of a sparse matrix. >>>> >>>> To recap the method: I have my solver which uses KSPSolve in a loop that >>>> iterates over the first k columns of an identity matrix B and computes the >>>> corresponding x vector. >>>> >>>> I am just a bit curious about some of the timings I am obtaining...which I >>>> hope someone can explain. Here are the timings I obtained for a global >>>> sparse >>>> matrix (4704 x 4704) and solving for the first 1176 columns in the >>>> identity >>>> using P processes (processors) on our cluster. >>>> >>>> (Timings are given in seconds for each process performing work in the loop >>>> and >>>> were obtained by encapsulating the loop with the cpu_time() Fortran >>>> intrinsic. >>>> The MUMPS package was requested for factorisation/solving, although >>>> similar >>>> timings were obtained for both the native solver and SUPERLU) >>>> >>>> P=1 [30.92] >>>> P=2 [15.47, 15.54] >>>> P=4 [4.68, 5.49, 4.67, 5.07] >>>> P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] >>>> P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, >>>> 0.25, >>>> 0.43, 1.09, 1.08, 1.1] >>>> >>>> Firstly, I notice very good scalability up to 16 processes...is this >>>> expected >>>> (by those people who use these solvers regularly)? >>>> >>>> Also I notice that the timings per process vary as we scale up. Is this a >>>> load-balancing problem related to more non-zero values being on a given >>>> processor than others? Once again is this expected? >>>> >>>> Please excuse my ignorance of matters relating to these solvers and their >>>> operation...as it really isn't my field of expertise. >>>> >>>> Regards, >>>> >>>> Tim. 
>>>> >>>> >>>> >>>> >>> >>> >> >> > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Wed Nov 21 04:56:29 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Wed, 21 Nov 2007 10:56:29 +0000 Subject: Load Balancing and KSPSolve In-Reply-To: References: <47431D3B.5000309@ichec.ie> Message-ID: <47440EDD.5030304@ichec.ie> Definitely will do that Barry...thanks for the tip. We actually intend to use the code for larger matrices (of order 10^6) so your method would be very beneficial at that stage, on our cluster. As you say, multiple communicators are easy to implement which allows another level of parallelism on top of the parallel solver itself. Best, Tim. Barry Smith wrote: > > Tim, > > This is an unrelated comment, but may help you with scaling to > many processes. > Since the matrix is so SMALL it will be hard to get good scaling on > the linear solves > for a large number of processes, but since you need MANY right hand > sides you > might consider having different groups of processes (MPI_Comms) handle > collections > of right hand sides. For example if you have 64 processes you might > use 4 MPI_Comm's > each of size 16, or even 8 MPI_Comm's each of size 8. Coding this is easy > simply use MPI to generate the appropriate communicator (for the > subsets of processes) > and then create the Mat, the KSP etc on that communicator instead of > MPI_COMM_WORLD > > > Barry > > > > On Nov 20, 2007, at 11:45 AM, Tim Stitt wrote: > >> Hi all (again), >> >> I finally got some data back from the KSP PETSc code that I put >> together to solve this sparse inverse matrix problem I was looking >> into. Ideally I am aiming for a O(N) (time complexity) approach to >> getting the first 'k' columns of the inverse of a sparse matrix. >> >> To recap the method: I have my solver which uses KSPSolve in a loop >> that iterates over the first k columns of an identity matrix B and >> computes the corresponding x vector. >> >> I am just a bit curious about some of the timings I am >> obtaining...which I hope someone can explain. Here are the timings I >> obtained for a global sparse matrix (4704 x 4704) and solving for the >> first 1176 columns in the identity using P processes (processors) on >> our cluster. >> >> (Timings are given in seconds for each process performing work in the >> loop and were obtained by encapsulating the loop with the cpu_time() >> Fortran intrinsic. The MUMPS package was requested for >> factorisation/solving, although similar timings were obtained for >> both the native solver and SUPERLU) >> >> P=1 [30.92] >> P=2 [15.47, 15.54] >> P=4 [4.68, 5.49, 4.67, 5.07] >> P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] >> P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, >> 0.73, 0.25, 0.43, 1.09, 1.08, 1.1] >> >> Firstly, I notice very good scalability up to 16 processes...is this >> expected (by those people who use these solvers regularly)? >> >> Also I notice that the timings per process vary as we scale up. Is >> this a load-balancing problem related to more non-zero values being >> on a given processor than others? Once again is this expected? >> >> Please excuse my ignorance of matters relating to these solvers and >> their operation...as it really isn't my field of expertise. >> >> Regards, >> >> Tim. >> >> --Dr. 
Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From balay at mcs.anl.gov Wed Nov 21 10:16:25 2007 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 Nov 2007 10:16:25 -0600 (CST) Subject: Load Balancing and KSPSolve In-Reply-To: <47440DD5.7050901@ichec.ie> References: <47431D3B.5000309@ichec.ie> <474344F3.2000405@ichec.ie> <47440DD5.7050901@ichec.ie> Message-ID: If you are just setting local values, then its best to avoid calls to VecAssembyBegin()/VecAssemblyEnd(). These have calls to MPI_Allreduce() - eventhough there might not be any communication. [so with a MPI_Barrier time of 0.00820498sec, 4704 calls to MPI_Allreduce(), which is similar to a barrier - would add up to many seconds. In this case it could be most of the 12sec time taken by VecAssemblyBegin()] Normally local assembly of a vec is done by accessing the local vector data, directly and modifying the values. VecGetArray(vec,&ptr) ptr[local-dim]= val VecRestoreArray(vec) With fortran77, since pointer usageis not possible - there is a workarround. [check vec/vec/examples/tutorials/ex4f.F for VecGetArray() usage from F77]. But with F90, you can use VecGetArrayF90()/VecRestoreArrayF90() [as in ex4f90.F]. However in your case - you might be able to continuing using VecSetValue(), by just commenting out the calls to VecAssemblyBegin()/End(). [you might first want to run with -info, to make sure there is no communiation in VecAssembly] Satish On Wed, 21 Nov 2007, Tim Stitt wrote: > Satish, > > Thanks for your helpful comments. I am unsure why the VecAssembyBegin() > routine is taking a high percentage of the wall-clock when modifications to > the parallel vector should be local (all I am doing is working out which > element in the RHS b vector should be 1 and setting it). > > Here is my loop for iterating through the RHS Identity matrix and setting the > relevant element to 1...prior to the call to KSPSolve. I then reset that value > to 0 after the Solve in preparation for the next iteration. > > ! Get vector index range per process > call VecGetOwnershipRange(B,firstElement,lastElement,error); > > do column=0,rhs-1 ! Loop over RHS columns in Identity Matrix > > if ((column.ge.firstElement).and.(column.lt.lastElement)) then > call VecSetValue(B,column,one,INSERT_VALUES,error) > end if > > call VecAssemblyBegin(B,error) > call VecAssemblyEnd(B,error) > > ! Solve Ax=b > call KSPSolve(ksp,b,x,error);!CHKERRQ(error) > > if ((column.ge.firstElement).and.(column.lt.lastElement)) then > call VecSetValue(B,column,zero,INSERT_VALUES,error) > end if > > end do > > Can you identify if I am doing something stupid which could be compromising > the efficiency of the Assembly routine? > > Thanks again, > > Tim. > > Satish Balay wrote: > > a couple of comments: > > > > Looks like most of the time is spent in MatSolve(). [90% for np=1] > > > > However on np=8 run, you have MatSolve() taking 42% time, whereas > > VecAssemblyBegin() taking 32% time. Depending upon whats beeing done > > with VecSetValues()/VecAssembly() - you might be able to reduce this > > time considerably. [ If you can generate values locally - then no > > communication is required. 
If you need to communicate values - then > > you can explore VecScatters() for more efficient communication] > > > > Wrt MatSolve() on 8 procs, the max/min time between any 2 procs is > > 2.6. [i.e slowest proc is taking 16 sec, so the fastest proc would > > probably be taking 6 sec.]. The max/min ratio of flops across procs is > > 1.8. So there is indeed a load balance issue that is contributing to > > different times on different processors [I guess the slowest proc is > > doing almost twice the amount of work as the fastest proc]. > > > > Satish > > > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > > > > > > Satish, > > > > > > Logs attached...hope they help. > > > > > > Thanks, > > > > > > Tim. > > > > > > Satish Balay wrote: > > > > > > > Can you send the -log_summary for your runs [say p=1, p=8] > > > > > > > > Satish > > > > > > > > On Tue, 20 Nov 2007, Tim Stitt wrote: > > > > > > > > > > > > > Hi all (again), > > > > > > > > > > I finally got some data back from the KSP PETSc code that I put > > > > > together > > > > > to > > > > > solve this sparse inverse matrix problem I was looking into. Ideally I > > > > > am > > > > > aiming for a O(N) (time complexity) approach to getting the first 'k' > > > > > columns > > > > > of the inverse of a sparse matrix. > > > > > > > > > > To recap the method: I have my solver which uses KSPSolve in a loop > > > > > that > > > > > iterates over the first k columns of an identity matrix B and computes > > > > > the > > > > > corresponding x vector. > > > > > > > > > > I am just a bit curious about some of the timings I am > > > > > obtaining...which I > > > > > hope someone can explain. Here are the timings I obtained for a global > > > > > sparse > > > > > matrix (4704 x 4704) and solving for the first 1176 columns in the > > > > > identity > > > > > using P processes (processors) on our cluster. > > > > > > > > > > (Timings are given in seconds for each process performing work in the > > > > > loop > > > > > and > > > > > were obtained by encapsulating the loop with the cpu_time() Fortran > > > > > intrinsic. > > > > > The MUMPS package was requested for factorisation/solving, although > > > > > similar > > > > > timings were obtained for both the native solver and SUPERLU) > > > > > > > > > > P=1 [30.92] > > > > > P=2 [15.47, 15.54] > > > > > >>>> P=4 [4.68, 5.49, 4.67, 5.07] > > > > > P=8 [2.36, 4,23, 2.81, 2.54, 3.42, 2.22, 1.41, 3.15] > > > > > P=16 [1.04, 0.45, 1.08, 0.27, 0.87, 0.93, 1.1, 1.06, 0.29, 0.34, 0.73, > > > > > 0.25, > > > > > 0.43, 1.09, 1.08, 1.1] > > > > > > > > > > Firstly, I notice very good scalability up to 16 processes...is this > > > > > expected > > > > > (by those people who use these solvers regularly)? > > > > > > > > > > Also I notice that the timings per process vary as we scale up. Is > > > > > this a > > > > > load-balancing problem related to more non-zero values being on a > > > > > given > > > > > processor than others? Once again is this expected? > > > > > > > > > > Please excuse my ignorance of matters relating to these solvers and > > > > > their > > > > > operation...as it really isn't my field of expertise. > > > > > > > > > > Regards, > > > > > > > > > > Tim. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From w_subber at yahoo.com Thu Nov 22 05:46:33 2007 From: w_subber at yahoo.com (Waad Subber) Date: Thu, 22 Nov 2007 03:46:33 -0800 (PST) Subject: pc_factor_fill Message-ID: <447770.46513.qm@web38207.mail.mud.yahoo.com> Hello PETSc Users, I have a serial code to solve multiple matrices. 
I am using LU factorization. When I run the code with the (-info) option, it gives me different values for (pc_factor_fill) depending on the input matrix. I am wondering if I can set these values for the (pc_factor_fill) inside the code instead of running it with runtime option, for it is one code with multiple inputs. Thanks a lot Waad For Matrix No.1 [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.96 [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No.2 [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.42069 [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No. n [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 1.87742 [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. --------------------------------- Be a better sports nut! Let your teams follow you with Yahoo Mobile. Try it now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From z.sheng at ewi.tudelft.nl Thu Nov 22 07:17:27 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 22 Nov 2007 14:17:27 +0100 Subject: pc_factor_fill In-Reply-To: <447770.46513.qm@web38207.mail.mud.yahoo.com> References: <447770.46513.qm@web38207.mail.mud.yahoo.com> Message-ID: <47458167.5060308@ewi.tudelft.nl> Waad Subber wrote: > Hello PETSc Users, > > I have a serial code to solve multiple matrices. I am using LU > factorization. When I run the code with the (-info) option, it gives > me different values for (pc_factor_fill) depending on the input > matrix. I am wondering if I can set these values for the > (pc_factor_fill) inside the code instead of running it with runtime > option, for it is one code with multiple inputs. > > Thanks a lot > > Waad > > > For Matrix No.1 > > [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed > 2.96 > [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use > [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); > [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No.2 > > [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed > 2.42069 > [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use > [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); > [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No. n > > [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed > 1.87742 > [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use > [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); > [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > ------------------------------------------------------------------------ > Be a better sports nut! Let your teams follow you with Yahoo Mobile. > Try it now. 
> try *-pc_factor_fill * it works for me:) From z.sheng at ewi.tudelft.nl Thu Nov 22 09:10:07 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Thu, 22 Nov 2007 16:10:07 +0100 Subject: Matrix reuse Message-ID: <47459BCF.3070409@ewi.tudelft.nl> Dear all In my application, I have small matrices that are created and destroied, those matrices are of the same nozero pattern. I wonder if there is a way to reuse that matrices instead of destroy them every time. Thank you Best regards Zhifeng Sheng From timothy.stitt at ichec.ie Thu Nov 22 11:16:01 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 22 Nov 2007 17:16:01 +0000 Subject: Banded Tridiagonal Matrices in PETSc Message-ID: <4745B951.3000407@ichec.ie> Hi, I was just wondering if PETSc has any special provision for banded tridiagonal complex matrices when used in conjunction with KSPSolve(). Are there any special PETSc matrix types or factorisation/solver methods that benefit more from this matrix form? Currently I am just using standard AIJ representation in my serial/parallel codes. I would be grateful for any thoughts. Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From dalcinl at gmail.com Thu Nov 22 12:22:24 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 22 Nov 2007 15:22:24 -0300 Subject: Matrix reuse In-Reply-To: <47459BCF.3070409@ewi.tudelft.nl> References: <47459BCF.3070409@ewi.tudelft.nl> Message-ID: On 11/22/07, Zhifeng Sheng wrote: > In my application, I have small matrices that are created and destroied, > those matrices are of the same nozero pattern. > I wonder if there is a way to reuse that matrices instead of destroy > them every time. Just do not destroy them! Use the matrices, and then start again your loop with MatSetValues(), calling MatAssemblyBegin() and MatAssemblyEnd() after the loop. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Nov 22 12:38:12 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Nov 2007 12:38:12 -0600 Subject: pc_factor_fill In-Reply-To: <447770.46513.qm@web38207.mail.mud.yahoo.com> References: <447770.46513.qm@web38207.mail.mud.yahoo.com> Message-ID: <0779CEF7-E610-4ACE-B7CB-9241CD5F437D@mcs.anl.gov> PCFactorSetFill() after calling KSPSetPC() Barry If you are using multiple different KSP's you might look at KSPSetOptionsPrefix() to allow using command line options to set different values for different solvers. On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: > Hello PETSc Users, > > I have a serial code to solve multiple matrices. I am using LU > factorization. When I run the code with the (-info) option, it > gives me different values for (pc_factor_fill) depending on the > input matrix. I am wondering if I can set these values for the > (pc_factor_fill) inside the code instead of running it with runtime > option, for it is one code with multiple inputs. 
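A short C sketch of the suggestion above, setting the fill through the PC attached to the KSP and giving each solver its own options prefix; the fill value 2.96 simply echoes the -info output quoted in this thread, and the prefix "sys1_" is made up for illustration:

    KSP ksp;
    PC  pc;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);  /* 2.3.x-era signature */
    KSPSetType(ksp, KSPPREONLY);
    KSPGetPC(ksp, &pc);                   /* fetch the PC this KSP will use    */
    PCSetType(pc, PCLU);
    PCFactorSetFill(pc, 2.96);            /* expected nnz(factor)/nnz(matrix)  */

    /* With several solves in one code, a prefix lets each be tuned separately
       from the command line, e.g. -sys1_pc_factor_fill 2.96                   */
    KSPSetOptionsPrefix(ksp, "sys1_");
    KSPSetFromOptions(ksp);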
> > Thanks a lot > > Waad > > > For Matrix No.1 > > [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 2.96 > [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use > [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); > [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No.2 > > [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 2.42069 > [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 > or use > [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); > [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > For Matrix No. n > > [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 > needed 1.87742 > [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 > or use > [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); > [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. > > Be a better sports nut! Let your teams follow you with Yahoo Mobile. > Try it now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Nov 22 12:41:17 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Nov 2007 12:41:17 -0600 Subject: Banded Tridiagonal Matrices in PETSc In-Reply-To: <4745B951.3000407@ichec.ie> References: <4745B951.3000407@ichec.ie> Message-ID: There is a format Bdiag that stores by "banded diagonal". You will find that this performs slower than then the AIJ format. If your matrix has constant values along the "diagaonals" then you will benefit from using a MatShell and writing custom code. if the values along the diagonals are not constant you will not do any better than AIJ anyways. Barry On Nov 22, 2007, at 11:16 AM, Tim Stitt wrote: > Hi, > > I was just wondering if PETSc has any special provision for banded > tridiagonal complex matrices when used in conjunction with > KSPSolve(). Are there any special PETSc matrix types or > factorisation/solver methods that benefit more from this matrix form? > > Currently I am just using standard AIJ representation in my serial/ > parallel codes. > > I would be grateful for any thoughts. > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From w_subber at yahoo.com Thu Nov 22 12:55:38 2007 From: w_subber at yahoo.com (Waad Subber) Date: Thu, 22 Nov 2007 10:55:38 -0800 (PST) Subject: pc_factor_fill In-Reply-To: <0779CEF7-E610-4ACE-B7CB-9241CD5F437D@mcs.anl.gov> Message-ID: <22056.18433.qm@web38209.mail.mud.yahoo.com> Thanks Barry One more question please: Should I get and set the fill factor like this: call MatGetInfo(A,MAT_LOCAL,info,ierr) FACTFILL = info(MAT_INFO_fill_ratio_needed) call PCFactorSetFill(pc,FACTFILL,ierr) call ISSetPermutation(indexSet,IERR);CHKERRQ(IERR) call MatLUFactorSymbolic(A,indexSet,indexSet,FACTFILL,... call MatLUFactorNumeric(A,FACTFILL,factorMat,IERR) Thanks Waad Barry Smith wrote: PCFactorSetFill() after calling KSPSetPC() Barry If you are using multiple different KSP's you might look at KSPSetOptionsPrefix() to allow using command line options to set different values for different solvers. On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: Hello PETSc Users, I have a serial code to solve multiple matrices. I am using LU factorization. 
When I run the code with the (-info) option, it gives me different values for (pc_factor_fill) depending on the input matrix. I am wondering if I can set these values for the (pc_factor_fill) inside the code instead of running it with runtime option, for it is one code with multiple inputs. Thanks a lot Waad For Matrix No.1 [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.96 [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No.2 [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.42069 [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No. n [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 1.87742 [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. --------------------------------- Be a better sports nut! Let your teams follow you with Yahoo Mobile. Try it now. --------------------------------- Be a better pen pal. Text or chat with friends inside Yahoo! Mail. See how. -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.stitt at ichec.ie Thu Nov 22 12:55:54 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 22 Nov 2007 18:55:54 +0000 Subject: Banded Tridiagonal Matrices in PETSc In-Reply-To: References: <4745B951.3000407@ichec.ie> Message-ID: <4745D0BA.8030208@ichec.ie> Thanks Barry, that is all I need to know. I can stick with my current implementation then...great stuff. Barry Smith wrote: > > There is a format Bdiag that stores by "banded diagonal". You will > find that > this performs slower than then the AIJ format. > > If your matrix has constant values along the "diagaonals" then you > will benefit from > using a MatShell and writing custom code. if the values along the > diagonals are > not constant you will not do any better than AIJ anyways. > > Barry > > On Nov 22, 2007, at 11:16 AM, Tim Stitt wrote: > >> Hi, >> >> I was just wondering if PETSc has any special provision for banded >> tridiagonal complex matrices when used in conjunction with >> KSPSolve(). Are there any special PETSc matrix types or >> factorisation/solver methods that benefit more from this matrix form? >> >> Currently I am just using standard AIJ representation in my >> serial/parallel codes. >> >> I would be grateful for any thoughts. >> >> Thanks, >> >> Tim. >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From bsmith at mcs.anl.gov Thu Nov 22 18:54:47 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 Nov 2007 18:54:47 -0600 Subject: pc_factor_fill In-Reply-To: <22056.18433.qm@web38209.mail.mud.yahoo.com> References: <22056.18433.qm@web38209.mail.mud.yahoo.com> Message-ID: <7FA47E14-B041-4423-8D95-7164874E36CD@mcs.anl.gov> If you are using KSP then you should NEVER call MatGetInfo(), MatLUFactor.... just call the PCFactorSetFill(). If you are not using the KSP then you need to declare a MatFactorInfo and fill it up the way you want. Note that MatInfo is different from MatFactorInfo you need to set the value in MatFactorInfo that you pass to the factorization routines. Barry On Nov 22, 2007, at 12:55 PM, Waad Subber wrote: > Thanks Barry > > One more question please: > > Should I get and set the fill factor like this: > > call MatGetInfo(A,MAT_LOCAL,info,ierr) > FACTFILL = info(MAT_INFO_fill_ratio_needed) > call PCFactorSetFill(pc,FACTFILL,ierr) > call ISSetPermutation(indexSet,IERR);CHKERRQ(IERR) > call MatLUFactorSymbolic(A,indexSet,indexSet,FACTFILL,... > call MatLUFactorNumeric(A,FACTFILL,factorMat,IERR) > > Thanks > > Waad > > Barry Smith wrote: > > PCFactorSetFill() after calling KSPSetPC() > > Barry > > If you are using multiple different KSP's you might look at > KSPSetOptionsPrefix() > to allow using command line options to set different values for > different solvers. > > On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: > >> Hello PETSc Users, >> >> I have a serial code to solve multiple matrices. I am using LU >> factorization. When I run the code with the (-info) option, it >> gives me different values for (pc_factor_fill) depending on the >> input matrix. I am wondering if I can set these values for the >> (pc_factor_fill) inside the code instead of running it with runtime >> option, for it is one code with multiple inputs. >> >> Thanks a lot >> >> Waad >> >> >> For Matrix No.1 >> >> [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 2.96 >> [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or >> use >> [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); >> [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> For Matrix No.2 >> >> [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 2.42069 >> [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 >> or use >> [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); >> [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> For Matrix No. n >> >> [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 >> needed 1.87742 >> [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 >> or use >> [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); >> [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. >> >> Be a better sports nut! Let your teams follow you with Yahoo >> Mobile. Try it now. > > > > Be a better pen pal. Text or chat with friends inside Yahoo! Mail. > See how. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From w_subber at yahoo.com Thu Nov 22 19:29:04 2007 From: w_subber at yahoo.com (Waad Subber) Date: Thu, 22 Nov 2007 17:29:04 -0800 (PST) Subject: pc_factor_fill In-Reply-To: <7FA47E14-B041-4423-8D95-7164874E36CD@mcs.anl.gov> Message-ID: <760027.34004.qm@web38212.mail.mud.yahoo.com> Thank you Barry :o) Waad Barry Smith wrote: If you are using KSP then you should NEVER call MatGetInfo(), MatLUFactor....just call the PCFactorSetFill(). If you are not using the KSP then you need to declare a MatFactorInfo and fill it up the way you want. Note that MatInfo is different from MatFactorInfo you need to set the value in MatFactorInfo that you pass to the factorization routines. Barry On Nov 22, 2007, at 12:55 PM, Waad Subber wrote: Thanks Barry One more question please: Should I get and set the fill factor like this: call MatGetInfo(A,MAT_LOCAL,info,ierr) FACTFILL = info(MAT_INFO_fill_ratio_needed) call PCFactorSetFill(pc,FACTFILL,ierr) call ISSetPermutation(indexSet,IERR);CHKERRQ(IERR) call MatLUFactorSymbolic(A,indexSet,indexSet,FACTFILL,... call MatLUFactorNumeric(A,FACTFILL,factorMat,IERR) Thanks Waad Barry Smith wrote: PCFactorSetFill() after calling KSPSetPC() Barry If you are using multiple different KSP's you might look at KSPSetOptionsPrefix() to allow using command line options to set different values for different solvers. On Nov 22, 2007, at 5:46 AM, Waad Subber wrote: Hello PETSc Users, I have a serial code to solve multiple matrices. I am using LU factorization. When I run the code with the (-info) option, it gives me different values for (pc_factor_fill) depending on the input matrix. I am wondering if I can set these values for the (pc_factor_fill) inside the code instead of running it with runtime option, for it is one code with multiple inputs. Thanks a lot Waad For Matrix No.1 [5] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.96 [5] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.96 or use [5] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.96); [5] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No.2 [9] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 2.42069 [9] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 2.42069 or use [9] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,2.42069); [9] MatLUFactorSymbolic_SeqAIJ(): for best performance. For Matrix No. n [8] MatLUFactorSymbolic_SeqAIJ(): Reallocs 3 Fill ratio:given 0 needed 1.87742 [8] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.87742 or use [8] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.87742); [8] MatLUFactorSymbolic_SeqAIJ(): for best performance. --------------------------------- Be a better sports nut! Let your teams follow you with Yahoo Mobile. Try it now. --------------------------------- Be a better pen pal. Text or chat with friends inside Yahoo! Mail. See how. --------------------------------- Never miss a thing. Make Yahoo your homepage. -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend at chalmers.se Fri Nov 23 03:04:15 2007 From: berend at chalmers.se (Berend van Wachem) Date: Fri, 23 Nov 2007 10:04:15 +0100 Subject: Matrix reuse In-Reply-To: <47459BCF.3070409@ewi.tudelft.nl> References: <47459BCF.3070409@ewi.tudelft.nl> Message-ID: <4746978F.1080901@chalmers.se> Hi, I use MatZeroEntries on the matrix, and then re-use it. I'm not sure what the gain in time is, though. Berend. 
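A small C sketch of the reuse pattern being discussed here: one preallocated matrix that is refilled and reassembled every cycle instead of being destroyed and recreated. The sizes and the toy 1-D Laplacian fill are made up; the point is the MatZeroEntries/MatSetValues/MatAssembly sequence over an unchanged nonzero pattern:

    Mat         A;
    PetscInt    n = 100, i, cycle, ncols;
    PetscInt    cols[3];
    PetscScalar vals[3];

    MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 3, PETSC_NULL, &A);   /* preallocate 3 nonzeros per row */

    for (cycle = 0; cycle < 10; cycle++) {
      MatZeroEntries(A);                      /* zeros the values, keeps the nonzero pattern */
      for (i = 0; i < n; i++) {               /* refill: here a toy 1-D Laplacian row        */
        ncols = 0;
        if (i > 0)     { cols[ncols] = i - 1; vals[ncols++] = -1.0; }
        cols[ncols] = i; vals[ncols++] = 2.0;
        if (i < n - 1) { cols[ncols] = i + 1; vals[ncols++] = -1.0; }
        MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);
      }
      MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
      /* ... use A here (e.g. KSPSolve), then loop around and refill it ... */
    }
    MatDestroy(A);    /* destroy once, at the very end (later PETSc versions take &A) */

Because the pattern never changes after the first assembly, the later MatSetValues() calls do no extra mallocs, which is where the time savings come from.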
Zhifeng Sheng wrote: > Dear all > > In my application, I have small matrices that are created and destroied, > those matrices are of the same nozero pattern. > > I wonder if there is a way to reuse that matrices instead of destroy > them every time. > > Thank you > > Best regards > Zhifeng Sheng > From amjad11 at gmail.com Fri Nov 23 07:10:55 2007 From: amjad11 at gmail.com (amjad ali) Date: Fri, 23 Nov 2007 18:10:55 +0500 Subject: Establishing Fast Eth Beowulf Cluster for Using PETSc on that Message-ID: <428810f20711230510y508d6eeerc6a185a18363535d@mail.gmail.com> Hello everybody at PETSc-maint and PETSc-users, Being new in cluster world what I have learnt to establish a simple PC cluster is to do 1)setup network (making master node as NFS server, compute nodes as NFS clients) 2)setup login system (RSH/SSH) 3)setup parallel environment (MPI). Now I want to setup a new Beowulf Cluster of few PCs to run PETSc program on that. please tell me steps of how to build it. I do not want to install MPICH2 separately on the cluster (just want to install/use as a part of PETSc). I have successfully tested PETSc parallel programs on my PC (definately with your kind guidence). Regards, Amjad Ali. From dalcinl at gmail.com Fri Nov 23 09:01:06 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 23 Nov 2007 12:01:06 -0300 Subject: Matrix reuse In-Reply-To: <4746978F.1080901@chalmers.se> References: <47459BCF.3070409@ewi.tudelft.nl> <4746978F.1080901@chalmers.se> Message-ID: On 11/23/07, Berend van Wachem wrote: > Hi, > I use MatZeroEntries on the matrix, and then re-use it. I'm not sure > what the gain in time is, though. MatZeroEntries just zero-out in the scalar entries, but it retains the nonzero structure of the matrix. Thus, the next time you make a loop calling MatSetValues(), PETSc will fastly put the new values in the right location, avoiding any memory allocation or data movement inside the sparse structure. Spase matrix assembly is not actually a trivial task, and in the parallel case, it is even far harder. But PETSc make it really easy and it is optimized for data reuse. Then, to get good performace, you have to create and preallocate your matrix, and next reuse it as much as you can. You can gain a lot, unless your matrices are really small. > Zhifeng Sheng wrote: > > Dear all > > > > In my application, I have small matrices that are created and destroied, > > those matrices are of the same nozero pattern. > > > > I wonder if there is a way to reuse that matrices instead of destroy > > them every time. > > > > Thank you > > > > Best regards > > Zhifeng Sheng > > > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Fri Nov 23 09:38:25 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 23 Nov 2007 12:38:25 -0300 Subject: Establishing Fast Eth Beowulf Cluster for Using PETSc on that In-Reply-To: <428810f20711230510y508d6eeerc6a185a18363535d@mail.gmail.com> References: <428810f20711230510y508d6eeerc6a185a18363535d@mail.gmail.com> Message-ID: On 11/23/07, amjad ali wrote: > Now I want to setup a new Beowulf Cluster of few PCs to run PETSc > program on that. please tell me steps of how to build it. 
> I do not > want to install MPICH2 separately on the cluster (just want to > install/use as a part of PETSc). I use a similar setup. I would recommed you to do the following: * Make '/usr/local' availabe at your nodes using NFS. * Build and install MPICH2 using prefix '/usr/local/mpich2' $ ./configure --prefix=/usr/local/mpich2 ... $ make $ su -c 'make install' # or: sudo make install * Now modify your path: $ export PATH=/usr/local/mpich2/bin:$PATH * Now build PETSc, unpacking sources on '/usr/local' $ su -l $ cd /usr/local $ tar -zxf petsc-xxx.tar.gz $ cd petsc-xxx $ export PETSC_DIR=`pwd` $ export PETSC_ARCH=linux-gnu $ python config/configure.py $ make And this should be all you have to do. You will have PETSc built at '/usr/local/petsc-xxx', and all your nodes would be able to see and use it via NFS. If you have any problem, feel free to ask me again. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From ps at cs.caltech.edu Fri Nov 23 18:58:33 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Fri, 23 Nov 2007 16:58:33 -0800 Subject: dropping columns/rows from matrix? Message-ID: <47477739.3010103@cs.caltech.edu> As part of a greedy basis pursuit algorithm I drop/undrop columns/rows from a matrix and resolve. I don't want to rebuild the matrix each time. Is there a quick way to do this? Basically the setup is this. Consider a 2-manifold triangle mesh and a discretization (piecewise linear FE) of the Laplace-Beltrami operator over this mesh (symmetric positive (semi-)definite [constant vector is the only null space vector]). Fix boundary conditions (zero Dirichlet in my case). Solve for a given rhs (I am using CG and absolute Jacobi as a precon with good success). Based on the solution, take out a column (and the same row) and resolve. Repeat this a few times until, say, 10 variables are dropped. Now pick one of them, say, i, and reintroduce it. Based on the solution replace i with i_new. Now visit another variable of the original 10 and "move" it. Etc. Each one of the solves is quick (and I need to do hundreds for matrices with hundreds of thousands to millions of variables). I'd rather not rebuild the matrix each time... Any suggestions? Thanks much! Peter From bsmith at mcs.anl.gov Fri Nov 23 19:28:52 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Nov 2007 19:28:52 -0600 Subject: dropping columns/rows from matrix? In-Reply-To: <47477739.3010103@cs.caltech.edu> References: <47477739.3010103@cs.caltech.edu> Message-ID: Peter, We use MatGetSubMatrix() to do this. Rather than dropping/adding rows and columns just grab the part you want and use it then destroy it and grab another part you want. We've done this with active set methods and the time to do the MatGetSubMatrix() has been surprisingly small percentage of the time. Barry On Nov 23, 2007, at 6:58 PM, Peter Schr?der wrote: > As part of a greedy basis pursuit algorithm I drop/undrop columns/ > rows from a matrix and resolve. I don't want to rebuild the matrix > each time. Is there a quick way to do this? > > Basically the setup is this. 
Consider a 2-manifold triangle mesh and > a discretization (piecewise linear FE) of the Laplace-Beltrami > operator over this mesh (symmetric positive (semi-)definite > [constant vector is the only null space vector]). Fix boundary > conditions (zero Dirichlet in my case). Solve for a given rhs (I am > using CG and absolute Jacobi as a precon with good success). Based > on the solution, take out a column (and the same row) and resolve. > Repeat this a few times until, say, 10 variables are dropped. Now > pick one of them, say, i, and reintroduce it. Based on the solution > replace i with i_new. Now visit another variable of the original 10 > and "move" it. Etc. > > Each one of the solves is quick (and I need to do hundreds for > matrices with hundreds of thousands to millions of variables). I'd > rather not rebuild the matrix each time... Any suggestions? > > Thanks much! > > Peter > From ps at cs.caltech.edu Fri Nov 23 20:13:20 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Fri, 23 Nov 2007 18:13:20 -0800 Subject: dropping columns/rows from matrix? In-Reply-To: References: <47477739.3010103@cs.caltech.edu> Message-ID: <474788C0.8050903@cs.caltech.edu> Barry Smith wrote: > We use MatGetSubMatrix() to do this. Rather than dropping/adding > rows and columns just grab the > part you want and use it then destroy it and grab another part you > want. We've done this with active set > methods and the time to do the MatGetSubMatrix() has been surprisingly > small percentage of the time. Aaaah. Ok. Even if the submatrix is everyone but one column/row? Very well. The one other way I had thought might be to fix one of the variables (some entry of the x vector is forced to zero) and ignore the corresponding part of the residual. But this may be too hacky an approach given the overall structure of Petsc. I'll use MatGetSubMatrix() for now. peter From bsmith at mcs.anl.gov Fri Nov 23 20:38:52 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Nov 2007 20:38:52 -0600 Subject: dropping columns/rows from matrix? In-Reply-To: <474788C0.8050903@cs.caltech.edu> References: <47477739.3010103@cs.caltech.edu> <474788C0.8050903@cs.caltech.edu> Message-ID: <2EB9D0CA-5750-4AF5-A509-E8AFF015B0FB@mcs.anl.gov> On Nov 23, 2007, at 8:13 PM, Peter Schr?der wrote: > Barry Smith wrote: >> We use MatGetSubMatrix() to do this. Rather than dropping/adding >> rows and columns just grab the >> part you want and use it then destroy it and grab another part you >> want. We've done this with active set >> methods and the time to do the MatGetSubMatrix() has been >> surprisingly small percentage of the time. > Aaaah. Ok. Even if the submatrix is everyone but one column/row? > Very well. > Yup :-). Only a couple of percent of the active set solver was spent in the GetSubMatrix() > The one other way I had thought might be to fix one of the variables > (some entry of the x vector is forced to zero) and ignore the > corresponding part of the residual. But this may be too hacky an > approach given the overall structure of Petsc. > > I'll use MatGetSubMatrix() for now. > > peter > From dalcinl at gmail.com Sat Nov 24 14:39:20 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 24 Nov 2007 17:39:20 -0300 Subject: dropping columns/rows from matrix? 
In-Reply-To: <2EB9D0CA-5750-4AF5-A509-E8AFF015B0FB@mcs.anl.gov> References: <47477739.3010103@cs.caltech.edu> <474788C0.8050903@cs.caltech.edu> <2EB9D0CA-5750-4AF5-A509-E8AFF015B0FB@mcs.anl.gov> Message-ID: On 11/23/07, Barry Smith wrote: > > Barry Smith wrote: > >> We use MatGetSubMatrix() to do this. >the time to do the MatGetSubMatrix() has been > >> surprisingly small percentage of the time. Indeed, MatGetSubMatrix is surprisingly fast in my personal experience. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From timothy.stitt at ichec.ie Sun Nov 25 07:18:16 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 25 Nov 2007 13:18:16 +0000 Subject: Diagonal Elements of an Inverse Matrix Message-ID: <47497618.8010004@ichec.ie> Hi PETSc Users/Developers, I was just wondering if anyone knew of any O(N) methods for obtaining the diagonal elements of the inverse of a block tridiagonal matrix,without computing all the off-diagonal values at the same time? Actually, the general case would be most useful were selected elements in the inverse could be obtained in O(N) time. I would be grateful if anyone could shed any light on this... Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From timothy.stitt at ichec.ie Sun Nov 25 12:54:54 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 25 Nov 2007 18:54:54 +0000 Subject: Zero Pivot Row in LU Factorization Message-ID: <4749C4FE.7070602@ichec.ie> Hi all, Can anyone suggest ways of overcoming the following pivot error I keep receiving in my PETSc code during a KSPSolve(). [1]PETSC ERROR: Detected zero pivot in LU factorization see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance 0.00165189 * rowsum 1.65189e+09! From checking the documentation....the error is in row 1801, which means it is most likely not a matrix assembly issue? I tried the following prior to the solve with no luck either..... call KSPGetPC(ksp,pc,error) call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) Is there anything else I can try? Thanks, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 25 13:02:10 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Nov 2007 13:02:10 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C4FE.7070602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> Message-ID: On Nov 25, 2007 12:54 PM, Tim Stitt wrote: > Hi all, > > Can anyone suggest ways of overcoming the following pivot error I keep > receiving in my PETSc code during a KSPSolve(). > > [1]PETSC ERROR: Detected zero pivot in LU factorization > see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > 0.00165189 * rowsum 1.65189e+09! > > From checking the documentation....the error is in row 1801, which > means it is most likely not a matrix assembly issue? 
> > I tried the following prior to the solve with no luck either..... > > call KSPGetPC(ksp,pc,error) > call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) I bet you are shifting the Block-Jacobi PC, not the LU PC which is the subsolver. Matt > Is there anything else I can try? > > Thanks, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Sun Nov 25 13:10:12 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 25 Nov 2007 19:10:12 +0000 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C4FE.7070602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> Message-ID: <4749C894.5090602@ichec.ie> I should also add that the code executes without this error when using 1 processor...but then displays the error when running in parallel with more than one process. Tim Stitt wrote: > Hi all, > > Can anyone suggest ways of overcoming the following pivot error I keep > receiving in my PETSc code during a KSPSolve(). > > [1]PETSC ERROR: Detected zero pivot in LU factorization > see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > > [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > 0.00165189 * rowsum 1.65189e+09! > > From checking the documentation....the error is in row 1801, which > means it is most likely not a matrix assembly issue? > > I tried the following prior to the solve with no luck either..... > > call KSPGetPC(ksp,pc,error) > call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) > > Is there anything else I can try? > > Thanks, > > Tim. > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Nov 25 13:13:09 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Nov 2007 13:13:09 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C894.5090602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> Message-ID: This is because on many processes I believe you are running Block-Jacobi with LU on the diagonal blocks. It is easy for one of these blocks to be singular. Matt On Nov 25, 2007 1:10 PM, Tim Stitt wrote: > I should also add that the code executes without this error when using 1 > processor...but then displays the error when running in parallel with > more than one process. > > Tim Stitt wrote: > > Hi all, > > > > Can anyone suggest ways of overcoming the following pivot error I keep > > receiving in my PETSc code during a KSPSolve(). > > > > [1]PETSC ERROR: Detected zero pivot in LU factorization > > see > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > > > > [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > > 0.00165189 * rowsum 1.65189e+09! > > > > From checking the documentation....the error is in row 1801, which > > means it is most likely not a matrix assembly issue? > > > > I tried the following prior to the solve with no luck either..... > > > > call KSPGetPC(ksp,pc,error) > > call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) > > > > > Is there anything else I can try? 
> > > > Thanks, > > > > Tim. > > > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From ps at cs.caltech.edu Sun Nov 25 18:01:51 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Sun, 25 Nov 2007 16:01:51 -0800 Subject: running solvers on submatrices Message-ID: <474A0CEF.8030101@cs.caltech.edu> Following advice from Barry yesterday I am now using IS to subselect parts of my matrix. So far so good. I am having problems with the PC though on the second round of invoking the solver. Basically the PC still has the old number of variables. Here is the basic flow: KSPCreate( PETSC_COMM_SELF, &ksp ); KSPSetType( ksp, KSPCG ); KSPGetPC( ksp, &m_pc ); PCSetType( pc, PCJACOBI ); Loop over decreasing numbers of variables mess with the index set to get the correct columns/rows Mat A; MatGetSubMatrix( K, is, is, PETSC_DECIDE, MAT_INITIAL_MATRIX, &A ); KSPSetOperators( ksp, A, A, DIFFERENT_NONZERO_PATTERN); KSPSetUp( ksp ); KSPSolve( ksp, b, x ); MatDestroy( A ); (Mat K contains the entire matrix from which I am subselecting.) The first solve works fine. Then I kill one variable (A is rebuilt from K) and now I die in KSPSetUp with the Jacobi precon finding that the working array diag is still the old size while mat (A) is the new size (one variable less). What's the proper way to deal with this? I would prefer not to destroy and recreate the ksp and pc each time through the loop as this implies file I/O to read .petscrc (which I use to control what type of solver is used in this section). Peter From bsmith at mcs.anl.gov Sun Nov 25 18:12:06 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Nov 2007 18:12:06 -0600 Subject: running solvers on submatrices In-Reply-To: <474A0CEF.8030101@cs.caltech.edu> References: <474A0CEF.8030101@cs.caltech.edu> Message-ID: <02711B04-D221-4A40-B3EB-7FD2C779BA3B@mcs.anl.gov> Peter, The design of PETSc requires destroying the KSP and creating a new one. A mechanism to allow resizing would require a major redo of PETSc. The .petscrc is actually only read ONCE in PetscInitialize() so there is no extra IO for each KSP creation. Also, the time to create and destroy the KSP each time is not noticable; this is what we do in the active set code. Barry On Nov 25, 2007, at 6:01 PM, Peter Schr?der wrote: > Following advice from Barry yesterday I am now using IS to subselect > parts of my matrix. So far so good. I am having problems with the PC > though on the second round of invoking the solver. Basically the PC > still has the old number of variables. Here is the basic flow: > > KSPCreate( PETSC_COMM_SELF, &ksp ); > KSPSetType( ksp, KSPCG ); > KSPGetPC( ksp, &m_pc ); > PCSetType( pc, PCJACOBI ); > > Loop over decreasing numbers of variables > mess with the index set to get the correct columns/rows > Mat A; > MatGetSubMatrix( K, is, is, PETSC_DECIDE, MAT_INITIAL_MATRIX, &A ); > KSPSetOperators( ksp, A, A, DIFFERENT_NONZERO_PATTERN); > KSPSetUp( ksp ); > KSPSolve( ksp, b, x ); > MatDestroy( A ); > > (Mat K contains the entire matrix from which I am subselecting.) The > first solve works fine. 
Then I kill one variable (A is rebuilt from > K) and now I die in KSPSetUp with the Jacobi precon finding that the > working array diag is still the old size while mat (A) is the new > size (one variable less). > > What's the proper way to deal with this? I would prefer not to > destroy and recreate the ksp and pc each time through the loop as > this implies file I/O to read .petscrc (which I use to control what > type of solver is used in this section). > > Peter > From bsmith at mcs.anl.gov Sun Nov 25 18:15:17 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 Nov 2007 18:15:17 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4749C894.5090602@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> Message-ID: <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> KSP *subksp; KSPGetPC(ksp,pc) PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) KSPGetPC(subksp[0],&subpc); PCFactorSetxxxxxx(subpc, .... Barry On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: > I should also add that the code executes without this error when > using 1 processor...but then displays the error when running in > parallel with more than one process. > > Tim Stitt wrote: >> Hi all, >> >> Can anyone suggest ways of overcoming the following pivot error I >> keep receiving in my PETSc code during a KSPSolve(). >> >> [1]PETSC ERROR: Detected zero pivot in LU factorization >> see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot >> ! >> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >> 0.00165189 * rowsum 1.65189e+09! >> >> From checking the documentation....the error is in row 1801, which >> means it is most likely not a matrix assembly issue? >> >> I tried the following prior to the solve with no luck either..... >> >> call KSPGetPC(ksp,pc,error) >> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >> >> Is there anything else I can try? >> >> Thanks, >> >> Tim. >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From ps at cs.caltech.edu Sun Nov 25 18:26:15 2007 From: ps at cs.caltech.edu (=?ISO-8859-1?Q?Peter_Schr=F6der?=) Date: Sun, 25 Nov 2007 16:26:15 -0800 Subject: running solvers on submatrices In-Reply-To: <02711B04-D221-4A40-B3EB-7FD2C779BA3B@mcs.anl.gov> References: <474A0CEF.8030101@cs.caltech.edu> <02711B04-D221-4A40-B3EB-7FD2C779BA3B@mcs.anl.gov> Message-ID: <474A12A7.1070204@cs.caltech.edu> Barry Smith wrote: > Also, the time to create and destroy the KSP each time is not > noticable; this is what we do in the > active set code. Aye. Thanks. From z.sheng at ewi.tudelft.nl Mon Nov 26 03:49:16 2007 From: z.sheng at ewi.tudelft.nl (Zhifeng Sheng) Date: Mon, 26 Nov 2007 10:49:16 +0100 Subject: Diagonal Elements of an Inverse Matrix In-Reply-To: <47497618.8010004@ichec.ie> References: <47497618.8010004@ichec.ie> Message-ID: <474A969C.3000905@ewi.tudelft.nl> Tim Stitt wrote: > Hi PETSc Users/Developers, > > I was just wondering if anyone knew of any O(N) methods for obtaining > the diagonal elements of the inverse of a block tridiagonal > matrix,without computing all the off-diagonal values at the same time? > > Actually, the general case would be most useful were selected elements > in the inverse could be obtained in O(N) time. > > I would be grateful if anyone could shed any light on this... > > Thanks, > > Tim. 
> There is one method call approximate schur inverse . It computes the exact number of inverse matrix on diagonals without knowing the number on off-diagonal blocks. best regards Zhifeng From amjad11 at gmail.com Tue Nov 27 07:56:43 2007 From: amjad11 at gmail.com (amjad ali) Date: Tue, 27 Nov 2007 18:56:43 +0500 Subject: Better C2D or Quadcore Message-ID: <428810f20711270556x2bcf06c4me3d28b1b62dbd660@mail.gmail.com> Hello, I planned to buy 9 PCs each having one Core2Duo E6600 (networked with GiGE) to make cluster for running PETSc based applications. I got an advice that because the prices of Xeon Quadcore is going to drop next month, so I should buy 9 PCs each having one Quadcore Xeon (networked with GiGE) to make cluster for running PETSc based applications. Which is better for me to get better performance/speedup? My question is due to following as given in PETSc-FAQ: *What kind of parallel computers or clusters are needed to use PETSc?* PETSc can be used with any kind of parallel system that supports MPI. BUT for any decent performance one needs - a fast, low-latency interconnect; any ethernet, even 10 gigE simply cannot provide the needed performance. - high per-CPU memory performance. Each CPU (core in dual core systems) needs to have its own memory bandwith of roughly 2 or more gigabytes. For example, standard dual processor "PC's" will notprovide better performance when the second processor is used, that is, you will not see speed-up when you using the second processor. This is because the speed of sparse matrix computations is almost totally determined by the speed of the memory, not the speed of the CPU. regards, Amjad Ali. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 27 08:26:26 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Nov 2007 08:26:26 -0600 Subject: [Beowulf] Better C2D or Quadcore In-Reply-To: <474C2740.5050109@cse.ucdavis.edu> References: <428810f20711270556x2bcf06c4me3d28b1b62dbd660@mail.gmail.com> <474C2740.5050109@cse.ucdavis.edu> Message-ID: On Nov 27, 2007 8:18 AM, Bill Broadley wrote: > > - high per-CPU memory performance. Each CPU (core in dual core > > systems) needs to have its own memory bandwith of roughly 2 or more > > gigabytes. > > Er, presumably thats 2 or more GB/sec. > > > For example, standard dual processor "PC's" will > > notprovide better performance when the second processor is used, that > > Er, standard dual processor PCs can hit 4GB/sec. Even my $750 desktop from > dell, lousy memory, 1.8 GHz cpu gets 4GB/sec at stream add and triad: > Function Rate (MB/s) Avg time Min time Max time > Add: 3945.7460 0.0124 0.0122 0.0126 > Triad: 3951.5930 0.0124 0.0121 0.0129 FAQ can't always have a precise explanation. The issue here is balance. Machine & Peak (MF/s) & Triad (MB/s) & MF/MW & Eq. MF/s \\ Matt's Laptop & 1700 & 1122.4 & 12.1 & 93.5 (5.5\%) \\ Intel Core2 Quad & 38400 & 5312.0 & 57.8 & 442.7 (1.2\%) \\ So, yes the bandwidth goes up, but not at anywhere near the rate to keep a bandwidth hungry matvec satisfied. The first numbers are for my laptop, and the second are from the STREAMS site. Obviously those are not good percentages of peak, so yo ucan really get away with slower, cheaper processors. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From amjad11 at gmail.com Tue Nov 27 09:26:32 2007 From: amjad11 at gmail.com (amjad ali) Date: Tue, 27 Nov 2007 20:26:32 +0500 Subject: [Beowulf] Better C2D or Quadcore In-Reply-To: <474C310E.9010602@charter.net> References: <428810f20711270556x2bcf06c4me3d28b1b62dbd660@mail.gmail.com> <474C310E.9010602@charter.net> Message-ID: <428810f20711270726o433d7639v543b33654c2bba5c@mail.gmail.com> Hello, What will you be doing with PETSc? It has lots of options and capability. What type of problems will you be solving? I want to solve some CFD problems using PETSc. BTW - I'm still getting some information, but if you drop back to 8 > systems, I > know of an inexpensive 8-port Infiniband switch (SDR). You can also find > inexpensive IB SDR NICs. When you go above 8 ports you have to either > switch to a large switch or start looking at using several tiers of > small switches > (the guys at aggregate.org can help there). Yes I can drop to 8 systems. So please tell me about that. My basic question was that whether C2D E6600 nodes will be better than a Quadcore which would be somehow nearer in price? regards to all. Amjad Ali. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amjad11 at gmail.com Wed Nov 28 00:24:14 2007 From: amjad11 at gmail.com (amjad ali) Date: Wed, 28 Nov 2007 11:24:14 +0500 Subject: which MPI can we use Message-ID: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> Hello, Please name the MPI libraries (other than MPICH2) which can be used efficiently with PETSc? reagards, Ali. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Wed Nov 28 09:49:56 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 28 Nov 2007 12:49:56 -0300 Subject: which MPI can we use In-Reply-To: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> References: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> Message-ID: On 11/28/07, amjad ali wrote: > Please name the MPI libraries (other than MPICH2) which can be used > efficiently with PETSc? On Linux/GNU, surelly Open-MPI. You also have Intel-MPI (actually, it is based on MPICH2). -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Nov 29 07:38:57 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Nov 2007 07:38:57 -0600 Subject: Fwd: [PETSC #17089] PETSc on XT3 with CNL? References: Message-ID: <80886221-3BED-40FC-B9B9-C2A3CD3624EC@mcs.anl.gov> Begin forwarded message: > From: Tom Cortese > Date: November 29, 2007 7:36:06 AM CST > To: petsc-maint at mcs.anl.gov > Cc: petsc-maint at mcs.anl.gov > Subject: [PETSC #17089] PETSc on XT3 with CNL? > > > Hello, > > Has anyone there tried installing PETSc on a Cray XT3 running > compute-node linux instead of catamount on the compute nodes? > > Any recommendations? > > Thanx, > > -Tom Cortese > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbostandoust at yahoo.com Thu Nov 29 07:46:14 2007 From: mbostandoust at yahoo.com (Mehdi Bostandoost) Date: Thu, 29 Nov 2007 05:46:14 -0800 (PST) Subject: Fwd: [PETSC #17089] PETSc on XT3 with CNL? 
In-Reply-To: <80886221-3BED-40FC-B9B9-C2A3CD3624EC@mcs.anl.gov> Message-ID: <434667.64427.qm@web33502.mail.mud.yahoo.com> we have petsc on XT3 at PSC,but it is on catamount. Barry Smith wrote: Begin forwarded message: From: Tom Cortese Date: November 29, 2007 7:36:06 AM CST To: petsc-maint at mcs.anl.gov Cc: petsc-maint at mcs.anl.gov Subject: [PETSC #17089] PETSc on XT3 with CNL? Hello, Has anyone there tried installing PETSc on a Cray XT3 running compute-node linux instead of catamount on the compute nodes? Any recommendations? Thanx, -Tom Cortese --------------------------------- Get easy, one-click access to your favorites. Make Yahoo! your homepage. -------------- next part -------------- An HTML attachment was scrubbed... URL: From keita at cray.com Thu Nov 29 08:55:07 2007 From: keita at cray.com (Keita Teranishi) Date: Thu, 29 Nov 2007 08:55:07 -0600 Subject: Fwd: [PETSC #17089] PETSc on XT3 with CNL? In-Reply-To: <434667.64427.qm@web33502.mail.mud.yahoo.com> References: <80886221-3BED-40FC-B9B9-C2A3CD3624EC@mcs.anl.gov> <434667.64427.qm@web33502.mail.mud.yahoo.com> Message-ID: <925346A443D4E340BEB20248BAFCDBDF032C37B0@CFEVS1-IP.americas.cray.com> Tom, PETSc is available for XT series and runs both on Catamount and CNL. On Jaguar at ORNL, you can run PETSc on CNL. Could you tell me which XT3 you are using now? Thank you, ================================ Keita Teranishi Math Software Group Cray Inc. keita at cray.com ================================ ________________________________ From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Mehdi Bostandoost Sent: Thursday, November 29, 2007 7:46 AM To: petsc-users at mcs.anl.gov Subject: Re: Fwd: [PETSC #17089] PETSc on XT3 with CNL? we have petsc on XT3 at PSC,but it is on catamount. Barry Smith wrote: Begin forwarded message: From: Tom Cortese Date: November 29, 2007 7:36:06 AM CST To: petsc-maint at mcs.anl.gov Cc: petsc-maint at mcs.anl.gov Subject: [PETSC #17089] PETSc on XT3 with CNL? Hello, Has anyone there tried installing PETSc on a Cray XT3 running compute-node linux instead of catamount on the compute nodes? Any recommendations? Thanx, -Tom Cortese ________________________________ Get easy, one-click access to your favorites. Make Yahoo! your homepage. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwicks at cs.brown.edu Thu Nov 29 09:03:11 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Thu, 29 Nov 2007 10:03:11 -0500 Subject: PCGetFactoredMatrix In-Reply-To: Message-ID: <000c01c83298$f66bd920$0201a8c0@jwickslptp> The documentation for PCGetFactoredMatrix is not clear. What does this return for ILU(0), for example? Does it return the product LU or the in place factorization? From knepley at gmail.com Thu Nov 29 11:04:21 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Nov 2007 11:04:21 -0600 Subject: PCGetFactoredMatrix In-Reply-To: <000c01c83298$f66bd920$0201a8c0@jwickslptp> References: <000c01c83298$f66bd920$0201a8c0@jwickslptp> Message-ID: It depends on the package, but the petsc stuff stores L and U in one matrix. Matt On Nov 29, 2007 9:03 AM, John R. Wicks wrote: > The documentation for PCGetFactoredMatrix is not clear. What does this > return for ILU(0), for example? > Does it return the product LU or the in place factorization? > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From jwicks at cs.brown.edu Thu Nov 29 12:07:24 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Thu, 29 Nov 2007 13:07:24 -0500 Subject: PCGetFactoredMatrix In-Reply-To: Message-ID: <000201c832b2$b29277d0$0201a8c0@jwickslptp> I would like to compute the residual A - LU, where LU is the ILU factorization of A. What is the most convenient way of doing so? > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Thursday, November 29, 2007 12:04 PM > To: petsc-users at mcs.anl.gov > Subject: Re: PCGetFactoredMatrix > > > It depends on the package, but the petsc stuff stores L and U > in one matrix. > > Matt > > On Nov 29, 2007 9:03 AM, John R. Wicks wrote: > > The documentation for PCGetFactoredMatrix is not clear. What does > > this return for ILU(0), for example? Does it return the > product LU or > > the in place factorization? > > > > > > > > -- > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu Nov 29 14:43:25 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Nov 2007 14:43:25 -0600 Subject: PCGetFactoredMatrix In-Reply-To: <000201c832b2$b29277d0$0201a8c0@jwickslptp> References: <000201c832b2$b29277d0$0201a8c0@jwickslptp> Message-ID: John, There is no immediate way to do this. For the SeqAIJ format, we store both the LU in a single CSR format. with for each row first the part of L (below the diagonal) then 1/D_i then the part of U for that row. You can see how the triangular solves are done by looking at src/mat/impls/aij/seq/aijfact.c the routine MatSolve_SeqAIJ() Note that it is actually more complicated due to the row and column permutations (the factored matrix is stored in the ordering of the permutations). For BAIJ matrix the storage is similar except it is stored by block instead of point and the inverse of the block diagonal is stored. One could take the MatSolve_SeqAIJ() routine and modify it to do the matrix vector product without too much difficulty. If you decide to do this we would gladly include it in our distribution. Barry One can ask why we don't provide this functionality in PETSc since computing A - LU is a reasonable thing to do if one wants to understand the convergence of the method. The answer is two-fold 1) time and energy and 2) though we like everyone to use PETSc we driven more by people who are not interested in the solution algorithms etc but only in getting the answer easily and relatively efficiently. On Nov 29, 2007, at 12:07 PM, John R. Wicks wrote: > I would like to compute the residual A - LU, where LU is the ILU > factorization of A. What is the most convenient way of doing so? > >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley >> Sent: Thursday, November 29, 2007 12:04 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: PCGetFactoredMatrix >> >> >> It depends on the package, but the petsc stuff stores L and U >> in one matrix. >> >> Matt >> >> On Nov 29, 2007 9:03 AM, John R. Wicks wrote: >>> The documentation for PCGetFactoredMatrix is not clear. What does >>> this return for ILU(0), for example? Does it return the >> product LU or >>> the in place factorization? 
>>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin >> their experiments is infinitely more interesting than any >> results to which their experiments lead. >> -- Norbert Wiener >> > From hzhang at mcs.anl.gov Thu Nov 29 16:49:54 2007 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 Nov 2007 16:49:54 -0600 (CST) Subject: PCGetFactoredMatrix In-Reply-To: References: <000201c832b2$b29277d0$0201a8c0@jwickslptp> Message-ID: John, You may look at ~petsc/src/mat/examples/tests/ex74.c in which we use || y - x || as an indicator for || A - ICC ||. In ex74.c, x is a randomly generated vector, b=A*x, and ICC*y = b. If you uncomment line 318 printf("lf: %d, error: %G\n", lf,norm2); and run ex74, you get lf: -1, error: 3.33036E-15 lf: 0, error: 4.44135 lf: 1, error: 4.40183 lf: 2, error: 3.13597 lf: 3, error: 2.39443 lf: 4, error: 1.79942 lf: 5, error: 1.4183 lf: 6, error: 1.11197 lf: 7, error: 0.877789 lf: 8, error: 0.750784 lf: 9, error: 0.571567 which shows the error || y - x || for ICC(lf), lf=level of fill. Hong On Thu, 29 Nov 2007, Barry Smith wrote: > > John, > > There is no immediate way to do this. > For the SeqAIJ format, we store both the LU in a single CSR format. > with for each row first the part of L (below the diagonal) then 1/D_i > then the part of U for that row. You can see how the triangular solves > are done by looking at src/mat/impls/aij/seq/aijfact.c the routine > MatSolve_SeqAIJ() > Note that it is actually more complicated due to the row and column > permutations > (the factored matrix is stored in the ordering of the permutations). > For BAIJ matrix the storage is similar except it is stored by block instead > of point > and the inverse of the block diagonal is stored. > > One could take the MatSolve_SeqAIJ() routine and modify it to do the matrix > vector product without too much difficulty. > > If you decide to do this we would gladly include it in our distribution. > > Barry > > One can ask why we don't provide this functionality in PETSc since computing > A - LU is a reasonable thing to do if one wants to understand the convergence > of the method. The answer is two-fold 1) time and energy and 2) though we > like everyone to use PETSc we driven more by people who are not interested > in the solution algorithms etc but only in getting the answer easily and > relatively > efficiently. > > > On Nov 29, 2007, at 12:07 PM, John R. Wicks wrote: > >> I would like to compute the residual A - LU, where LU is the ILU >> factorization of A. What is the most convenient way of doing so? >> >>> -----Original Message----- >>> From: owner-petsc-users at mcs.anl.gov >>> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley >>> Sent: Thursday, November 29, 2007 12:04 PM >>> To: petsc-users at mcs.anl.gov >>> Subject: Re: PCGetFactoredMatrix >>> >>> >>> It depends on the package, but the petsc stuff stores L and U >>> in one matrix. >>> >>> Matt >>> >>> On Nov 29, 2007 9:03 AM, John R. Wicks wrote: >>>> The documentation for PCGetFactoredMatrix is not clear. What does >>>> this return for ILU(0), for example? Does it return the >>> product LU or >>>> the in place factorization? >>>> >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. 
>>> -- Norbert Wiener >>> >> > From gdiso at ustc.edu Fri Nov 30 03:09:03 2007 From: gdiso at ustc.edu (Gong Ding) Date: Fri, 30 Nov 2007 17:09:03 +0800 Subject: Help: The SNES test result of my jacobian matrix Message-ID: Hi all, I use -snes_type test to check the Jacobian matrix of my semiconductor code. And get the result as follows: Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. Run with -snes_test_display to show difference of hand-coded and finite difference Jacobian. Norm of matrix ratio 0.00277812 difference 0.0152703 Norm of matrix ratio 1.82658e-09 difference 1.01895e-08 Norm of matrix ratio 1.82964e-09 difference 1.02066e-08 [0]PETSC ERROR: SNESSolve() line 1871 in src/snes/interface/snes.c It seems PETSC check hand writing Jacobian for 3 times. The Norm is something large in the first time check but keeps small in other two. Dose this mean my Jacobian implementation correct or not? However, my code seems work well. Yours GONG DING Last mail seems lost. I send it again. From bsmith at mcs.anl.gov Fri Nov 30 08:10:36 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 30 Nov 2007 08:10:36 -0600 Subject: Help: The SNES test result of my jacobian matrix In-Reply-To: References: Message-ID: <37B6C889-190D-41B4-8119-72385A2C8725@mcs.anl.gov> The testing of the Jacobian is done for three different input vectors. Since the first number is fairly large this indicates the Jacobian computed is not correct. You can run with -snes_test_display ALSO and it will show you both your Jacobian and the one computed with differencing so you can see exactly what entries are wrong. Good luck, Barry On Nov 30, 2007, at 3:09 AM, Gong Ding wrote: > Hi all, > I use -snes_type test to check the Jacobian matrix of my > semiconductor code. > And get the result as follows: > > Testing hand-coded Jacobian, if the ratio is > O(1.e-8), the hand-coded Jacobian is probably correct. > Run with -snes_test_display to show difference > of hand-coded and finite difference Jacobian. > Norm of matrix ratio 0.00277812 difference 0.0152703 > Norm of matrix ratio 1.82658e-09 difference 1.01895e-08 > Norm of matrix ratio 1.82964e-09 difference 1.02066e-08 > [0]PETSC ERROR: SNESSolve() line 1871 in src/snes/interface/snes.c > > It seems PETSC check hand writing Jacobian for 3 times. > The Norm is something large in the first time check but keeps small > in other two. > Dose this mean my Jacobian implementation correct or not? > However, my code seems work well. > > Yours > GONG DING > > Last mail seems lost. I send it again. >
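The ratio reported for the first state above is the one to chase down; a quick way to see the offending entries, as suggested, is

  ./mycode -snes_type test -snes_test_display

A rough programmatic cross-check at a single state is sketched below; FormJacobian and ctx are placeholder names for the user's own Jacobian routine and context, snes, x, J and n are assumed to come from the existing setup, the call signatures assume a 2.3.x-era PETSc, and the dense finite-difference Jacobian makes this a debugging aid for small problems only.

/* Compare the hand-coded Jacobian with a finite-difference one at x. */
Mat          J,Jfd;
MatStructure flag;
PetscReal    nrmfd,nrmd;
MatCreateSeqDense(PETSC_COMM_SELF,n,n,PETSC_NULL,&Jfd);
FormJacobian(snes,x,&J,&J,&flag,ctx);                    /* hand-coded (user routine)  */
SNESDefaultComputeJacobian(snes,x,&Jfd,&Jfd,&flag,ctx);  /* finite-difference Jacobian */
MatNorm(Jfd,NORM_FROBENIUS,&nrmfd);                      /* || J_fd ||                 */
MatAXPY(Jfd,-1.0,J,DIFFERENT_NONZERO_PATTERN);           /* Jfd <- Jfd - J             */
MatNorm(Jfd,NORM_FROBENIUS,&nrmd);                       /* || J_fd - J ||             */
PetscPrintf(PETSC_COMM_SELF,"difference %G ratio %G\n",nrmd,nrmd/nrmfd);
MatDestroy(Jfd);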